The Definitive Guide to Power Query (M): Mastering complex data transformation with Power Query
About this ebook
Data transformation is a critical step in building data models and business intelligence reports. Power Query is an invaluable tool for anyone who wants to master data transformation, and this book will equip you with the knowledge and skills to make the most of it.
The Definitive Guide to Power Query (M) will help you build a solid foundation in the Power Query M language. As you progress through the chapters, you'll learn how to use that knowledge to implement advanced concepts and data transformations. This will set the stage for an uncompromisingly thorough exploration of the Power Query M Language.
You'll also get to grips with optimizing performance, handling errors, and implementing efficient data processing techniques. As this is a hands-on guide, the practical examples in the chapters will help you gain the skills to apply Power Query to real-world problems and improve your data analysis capabilities.
By the end of this book, you will be able to leverage all of Power Query's remarkable capabilities for data transformation.
The Definitive Guide to Power Query (M)
Mastering complex data transformation with Power Query
Gregory Deckler
Rick de Groot
Melissa de Korte
The Definitive Guide to Power Query (M)
Copyright © 2024 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
Senior Publishing Product Manager: Gebin George
Acquisition Editor – Peer Reviews: Tejas Mhasvekar
Project Editor: Rianna Rodrigues
Content Development Editors: Shruti Menon and Shazeen Iqbal
Copy Editor: Safis Editing
Technical Editors: Aneri Patel and Kushal Sharma
Proofreader: Safis Editing
Indexer: Tejal Daruwale Soni
Presentation Designer: Pranit Padwal
Developer Relations Marketing Executive: Vignesh Raju
First published: March 2024
Production reference: 1260324
Published by Packt Publishing Ltd.
Grosvenor House
11 St Paul’s Square
Birmingham
B3 1RB, UK.
ISBN 978-1-83508-972-9
www.packt.com
Foreword
Occasionally, the stars align in just the right way, and the result is something that is unexpectedly great. That’s the story behind the book you’re holding now. But I’m getting ahead of myself. Let’s turn the clock back to the start of 2022, which is where the story of this book begins.
At that point, I had been a data professional and Excel user for over 30 years and was doing a lot of Power BI development while conducting online training as well. I had long used Power Query to clean and transform my data prior to performing analysis and building reports. But almost all of that prep was done through the Power Query user interface. As well designed as the Power Query UI is, it’s also quite limiting, providing access to only a fraction of the 700+ available M functions. So, at the start of 2022, I decided that I was going to stop dabbling in M and make learning it thoroughly my top-priority training goal for the year. This was not easy in 2022, since while there were fantastic books on Power Query, they each had only a small percentage of their content dedicated to M. There were many great M resources online, but they offered deep insights into a grab bag of very specialized topics and applications, which also did not make for an ideal learning experience.
Despite these hurdles, by the end of 2022, I found that my deep dive into M had been transformative. My Power BI reports ran much faster, the DAX used in them was dramatically simpler, I was able to clean data faster than ever, and I now could easily build reusable custom functions to automate repetitive data tasks. I also found that learning how to visualize and manipulate complex data structures in M substantially improved my coding ability in other languages. However, I also realized that my experience was entirely unreplicable due to the fact that three of my friends (not coincidentally the authors of this book) happened to be among the most knowledgeable Power Query/M experts in the world, and I relied heavily on their expertise during my learning. I actually used to joke with both Melissa and Rick online about each of them being the Chosen One destined to write the M counterpart to The Definitive Guide to DAX, but I think that seemed too much of a monumental task to both of them.
During this time, I was the primary tester on a suite of apps that Gregory Deckler was developing. We would regularly talk about Power Query and M when we would discuss the potential future capabilities of those apps. One day, Greg mentioned to me that he was working on a book proposal outline for that same comprehensive guide to M that I had been joking about (but also genuinely wishing existed). He asked me who I would recommend as co-authors for that book, and I am beyond thrilled to say that my response can be found on the cover of this book.
Mastering M is neither a quick nor easy task. However, I am completely confident that whether you are a data analyst, Power BI developer, data scientist, business analyst, or even an Excel end user looking to level up your skills, having the same three clear, patient, and brilliant instructors I did will make learning M an enjoyable and game-changing experience for you too.
Brian Julius
Power BI Expert and Instructor
Contributors
About the authors
Gregory Deckler is a seven-time Microsoft MVP for Data Platform and an active blogger and Power BI community member, having authored over 6,000 solutions to community questions. Greg has authored numerous books on Power BI, including the first and second editions of Learn Power BI, the second editions of Power BI Cookbook and Mastering Power BI, and DAX Cookbook. Greg has also created several external tools for Power BI and regularly posts video content to his YouTube channels – Microsoft Hates Greg and DAX For Humans.
I would like to acknowledge my wonderful co-authors for this book; I learned so much from all of them. I’m also grateful to my son Rocket for supporting all of my endeavors. A special thanks to Brian Julius, who inspired this book and without whom this book would not exist.
Rick de Groot is a Power BI consultant, blogger, YouTuber, and author who hails from the Netherlands. With a background in finance, he has devoted 14+ years to building his expertise in data analysis, particularly in the Power Query M language. He is an independent consultant who also provides training on Power BI, Power Query, and DAX. He regularly publishes articles on his blogs, Power Query How and BI Gorilla, which have emerged as important sources for M articles and tutorials. He contributes extensively to his YouTube channel BI Gorilla as well. Rick’s commitment to sharing his expertise has earned him the Microsoft Data Platform MVP award for two consecutive years.
My deepest appreciation goes to Brian Julius. His belief in me nudged me to share more about Power Query and helped me grow as a writer for the M language. Without his support, I don’t think my name would have been on the cover of this book. He also played an important part in our discussions and helped shape the contents of this book. I am profoundly thankful for your support, Brian.
To my family – your patience and support while writing this book have meant the world to me.
Also, a huge shout-out to all of you out there. All the discussions we’ve had have been incredibly valuable.
Melissa de Korte is a passionate problem-solver known for simplifying complex problems with Power Query. With an impressive track record as an Enterprise DNA expert and super user, she has become an asset to the community, actively engaging, supporting, and inspiring others. Behind her professional persona lies a genuine dedication to empowering others through sharing knowledge. Her portfolio includes blogs, tutorials, courses, and webinars to make Power Query and M more accessible.
To Marcel and Sam for filling every moment we share(d) with joy.
My gratitude goes to my co-authors, as well as Brian Julius, for their roles in this book’s creation. To my family for their endless patience and support. And to all who generously share their knowledge, spark curiosity and promote a supportive and inclusive community. Your contributions are invaluable!
About the reviewers
Erin Ostrowsky is a digital nomad who endeavors to build healthy, happy relationships wherever she goes. She’s passionate about STEAM – don’t forget the Arts, folks! – and hopes to inspire others to join in the fun problem-solving this world needs. Erin loves learning, creating, making friends, traveling, and writing about life’s many curiosities. Her current focus is running Fit 4 Duty Data, a data and reporting consultancy and training firm that supports the global mining and energy sectors.
I’d like to thank my Creator, my friends, and my family – you all make life beautiful, especially during the hard times.
Vahid Doustimajd is a recognized Microsoft MVP and certified trainer with over 15 years of experience. He is a technophile who is passionate about project management and data analytics. Currently, he works in Australia as a Project Controls and Analytics Manager. He is an active participant in the Microsoft Fabric and Power BI communities, enjoys sharing his expertise through blogs and videos, and organizes events for the Persian Power BI User Group.
I would like to extend my heartfelt gratitude to my wife for her endless support and patience. Her encouragement has been my guiding light.
Ahmed Oyelowo, a Managing Partner, lead trainer, and consultant at Foresight BI & Analytics Global Solutions, is a Microsoft Certified Power BI Data Analyst Associate and Azure Enterprise Data Analyst. He is also a Microsoft Certified Trainer, with a best-selling Power BI course on Udemy that has been taken by 90,000+ students. A four-time Microsoft MVP in the Data Platform category, he has developed cutting-edge solutions for various corporate clients.
My sincere appreciation to my Creator for the gift of all required to complete this review. I must also extend my gratitude to my dearest wife, Kafayat, and my lovely children, Yusraa and Yaasir, for their support and understanding while I was spending some time away from them to complete this task. Finally, I’d like to thank the authors of this book for this wonderful resource!
Learn more on Discord
Join our community’s Discord space for discussions with the author and other readers:
https://discord.gg/vCSG5GBbyS
Contents
Preface
Who this book is for
What this book covers
To get the most out of this book
Get in touch
Introducing M
The history of M
Who should learn M?
Where and how is M used?
Experiences
Products and services
Why learn M?
M language basics
The let expression
The characteristics of M
Formal classification
Informal characteristics of M
Summary
Working with Power Query/M
Technical requirements
Touring the Power Query Desktop experience
A brief tour
Header
Formula bar
Ribbon
Queries pane
Query Settings pane
Preview pane
Status Bar
Your first query
Options and data source settings
Options
Data source settings
Editing experience-generated code
Creating custom columns
Adding an index column
Adding columns with examples
Math operations
Adding custom M code columns
Using the Advanced Editor
Summary
Accessing and Combining Data
Technical requirements
Accessing files and folders
File.Contents
Text/CSV
Excel
Folder
XML
Xml.Tables
Xml.Document
Azure Storage
Additional file formats
Retrieving web content
Investigating binary functions
Lines functions
Accessing databases and cubes
Cube functions
Working with standard data protocols
Addressing additional connectors
Popular software systems
Identity functions
Combining and joining data
Table.Combine
Table.NestedJoin and Table.Join
Table.FuzzyNestedJoin and Table.FuzzyJoin
Summary
Understanding Values and Expressions
Introducing the types of values
Binary values
Structure
Related functions
Special considerations
The Date/Time family of values
Date values
Time values
DateTime values
DateTimeZone values
Duration values
Logical values
Structure
Related Functions
Special considerations
Null values
Structure
Related functions
Special considerations
Number values
Structure and examples
Related functions
Special considerations
Text values
Structure
Related functions
Special considerations
List values
Structure
Related functions
Special considerations
Record values
Structure
Related functions
Special considerations
Table values
Structure
Related functions
Special considerations
Function values
Structure
Related functions
Special considerations
Type values
Operators
Expressions
Nesting let expressions
Coding best practices for expressions
Control structures
Enumerations
Summary
Understanding Data Types
What are data types?
The type system
Columns with mixed types
Column data type versus value type
Importance of types
Clarity and consistency
Data validation
Other reasons
Data types available in M
Primitive types
Abstract primitive types
Nullable primitive types
Using primitive types for data filtering
Custom types
List type
Record type
Table type
Function Type
Type detection
Retrieving data types from a data source
Automatically detecting types
Data type conversion
Converting value types
Converting column types
Avoiding data loss during conversion
Effect of locale/culture
Facets
Type Claims
Available Type Claims
Converting values using type claims
Inspecting Type Claims
Ascribing types
What is ascription?
Functions that support ascribing types
Ascribing types when creating records
Ascribing types when creating tables
Ascribing types when modifying tables
Ascribing types to any value
Errors when ascribing types
The base type is incompatible with the value
The Type Claim does not conform with the value
Ascribing incompatible types to structured values
Type equivalence, conformance, and assertion
Type equality
Type conformance
Type assertion
Summary
Structured Values
Introducing structured values
Lists
Introduction to lists
List operators
Equal
Not equal
Concatenate
Coalesce
Methods to create a list
Creating lists using the list initializer
Creating lists using functions
Referencing a table column
Using the a..b form
Accessing items in a list
Accessing list values by index
Handling non-existent index positions
Common operations with lists
Assigning data types to a list
Records
Introduction to records
Record operators
Equal
Not equal
Concatenation
Coalesce
Methods to create a record
Creating records using the record initializer
Creating records using functions
Retrieving a record by referencing a table row
Accessing fields in a record
Field selection
Record projection
Common operations with records
Structure for variables
Referencing the current row
Providing options for functions
Keeping track of intermediary results
Assigning a data type to records
Tables
Introduction to tables
Table operators
Equal
Not equal
Concatenation
Coalesce
Methods to create a table
Retrieve data from a source
Manually input data into functions
Re-use existing tables/queries
Using the Enter data functionality
Accessing elements in a table
Item access
Field access
Common operations with tables
Assigning a data type to tables
Summary
Conceptualizing M
Technical requirements
Understanding scope
Examining the global environment
Studying sections
Creating your own global environment
Understanding closures
Query folding
Managing metadata
Summary
Working with Nested Structures
Transitioning to coding
Getting started
Understanding Drill Down
The trick to getting more out of the UI
Methods for multistep value transformation
Transforming values in tables
Table.AddColumn
Table.TransformColumns
Table.ReplaceValue
Working with lists
Transforming a list
List.Transform
List.Zip
Extracting an item
Resizing a list
List.Range
List.Alternate
Filtering a list
List.FindText
List.Select
To-list conversions
Column or field names
A single column
All columns
All rows
Other operations
Expanding multiple list columns simultaneously
Flattening inconsistent multi-level nested lists
Working with records
Transforming records
Extracting a field value
Resizing records
Filtering records
To-record conversions
Table row to record
Record from table
Record from list
Conditional lookup or value replacement
Working with tables
Transforming tables
Extracting a cell value
Resizing a table in length
Resizing a table in width
Filtering tables
Approximate match
To-table conversions
Record-to-table conversion
Creating tables from columns, rows, or records
Table information
Working with mixed structures
Lists of tables, lists, or records
Tables with lists, records, or tables
Mixed structures
Flatten all
Unpacking all record fields from lists
Extracting data through lookup
Summary
Parameters and Custom Functions
Parameters
Understanding parameters
Creating parameters
Using parameters in your queries
Putting it all together
Parameterizing connection information
Dynamic file paths
Filtering a date range
Custom functions
What are custom functions?
Transforming queries into a function
What is the create function functionality?
Simplifying troubleshooting and making changes
Invoking custom functions
Manually in the advanced editor or formula bar
Using the UI
The each expression
Common use cases
Refining function definitions
Specifying data types
Making parameters optional
Referencing column and field names
Debugging custom functions
Function scope
Top-level expression
In line within a query
Putting it all together
Turning all columns into text
Merging tables based on date ranges
Summary
Dealing with Dates, Times, and Durations
Technical requirements
Dates
M calendar table
Other date formats
Julian days
Alternate date formats
Additional custom date functions
Working days
Moving average
Time
Creating a time table
Shift classification
Dates and times
Time zones
Correcting data refresh times
Duration
Working duration
Summary
Comparers, Replacers, Combiners, and Splitters
Technical requirements
Key concepts
Function invocation
Some common errors
Closures
Higher-order functions
Anonymous functions
Ordering values
Comparers
Comparer.Equals
Comparer.Ordinal
Comparer.OrdinalIgnoreCase
Comparer.FromCulture
Comparison criteria
Numeric value
Computing a sort key
List with key and order
Custom comparer with conditional logic
Custom comparer with Value.Compare
Equation criteria
Default comparers
Custom comparer
Key selectors
Combining key selectors and comparers
Replacers
Replacer.ReplaceText
Replacer.ReplaceValue
Custom replacers
Combiners
Combiner.CombineTextByDelimiter
Functionality
Example
Combiner.CombineTextByEachDelimiter
Functionality
Example
Combiner.CombineTextByLengths
Functionality
Example
Combiner.CombineTextByPositions
Functionality
Example
Combiner.CombineTextByRanges
Functionality
Example
Splitters
Splitter.SplitByNothing
Functionality
Example
Splitter.SplitTextByAnyDelimiter
Functionality
Example
Splitter.SplitTextByCharacterTransition
Functionality
Example
Splitter.SplitTextByDelimiter
Functionality
Example
Splitter.SplitTextByEachDelimiter
Functionality
Example
Splitter.SplitTextByLengths
Functionality
Example
Splitter.SplitTextByPositions
Functionality
Example
Splitter.SplitTextByRanges
Functionality
Example
Splitter.SplitTextByRepeatedLengths
Functionality
Example
Splitter.SplitTextByWhitespace
Functionality
Example
Practical examples
Removing control characters and excess spaces
Goals
A cleanTrim function
Extract email addresses from a string
Goals
Developing a getEmail function
Split combined cell values into rows
Goal
Transforming the table
Replacing multiple values
Goal
Accumulating a result
Combining rows conditionally
Goal
Group By’s comparer to the rescue
Summary
Handling Errors and Debugging
Technical requirements
What is an error?
Error containment
Error detection
Raising errors
The error expression
The … (ellipsis) operator
Error handling
Strategies for debugging
Common errors
Syntax errors
Dealing with errors – a top priority
DataSource.Error, could not find the source
An unknown or missing identifier
An unknown function
An unknown column reference
An unknown field reference
Not enough elements in the enumeration
Formula.Firewall error
Expression.Error: The key didn’t match any rows in the table
Expression.Error: The key matched more than one row in the table
Expression.Error: Evaluation resulted in a stack overflow and cannot continue
Putting it all together
Column selection
Building a custom solution
Reporting cell-level errors
Building a custom solution
Summary
Iteration and Recursion
Introduction to iteration
List.Transform
Extracting items from a list by position
Allocating a yearly budget to months
List.Accumulate
Function anatomy
Replacing multiple values
List.Generate
Advantages of List.Generate
Function anatomy
Handling variables using records
List.Generate alternatives
What are useful List.Generate scenarios
Looping through API data using List.Generate
Creating an efficient running total
Recursion
Why is recursion important?
Recursion versus iteration: a brief comparison
Recursive functions
What is the @ scoping operator?
Inclusive-identifier-reference
Using recursive functions
How to use the @ operator
Step 1: Write your initial code
Step 2: Identify the recursive call
Step 3: Add the @ operator
Step 4: Test your function
Removing consecutive spaces
Performance considerations using recursion
Summary
Troublesome Data Patterns
Pattern matching
Basics of pattern matching
Case sensitivity
Contains versus exact match
Allowed characters
Handling one or more elements
Wildcards
Extracting fixed patterns
Example 1, prefixed
Example 2, pattern
Example 3, splitters
Example 4, substitution
Example 5, regex
Combining data
Basics for combining data
Extract, transform, and combine
Get and inspect data
Location parameter
Connect to data
Filter files
Overall strategy
Choose a sample file
File parameter
Transformation pattern
Query, create function
Set up monitoring
Fine-tuning
Combine files
Summary
Optimizing Performance
Understanding memory usage when evaluating queries
Memory limit variations and adjustments
Query folding
Query folding in action
Query evaluation
Folding, not folding, and partial folding
Tools to determine foldability
View data source query
Query folding indicators
Query plan
Operations and their impact on folding
Foldable operations
Non-foldable operations
Data source privacy levels
Native database queries
Functions designed to prevent query folding
Strategies for maintaining query folding
Rearranging steps
Working with native queries
Rewriting code
Using cross-database folding
The formula firewall
What is the formula firewall?
Understanding partitions
The fundamental principle of the formula firewall
Firewall error: Referencing other partitions
Connecting to a URL using native parameters
Connecting to a URL using an Excel parameter
Resolving the firewall error
Firewall error: Accessing compatible data sources
Understanding privacy levels
Setting privacy levels
Resolving the firewall error
Optimizing query performance
Prioritize filtering rows and removing columns
Buffering versus streaming operations
Buffering operations
Streaming operations
Using the query plan
Using buffer functions
The impact of buffering and running totals
The setup
Method 1: Running total from regular table
Method 2: Running total from buffered table
Method 3: Running total from buffered column
Method 4: Running total from buffered column using List.Generate
Data source considerations
Data sources and speed
Using dataflows
Performance tips
Summary
Enabling Extensions
Technical requirements
What are Power Query extensions?
What can you do with extensions?
Preparing your environment
Getting Visual Studio Code
Getting the Power Query SDK
Setting up Internet Information Services
Setting up Discord
Discord client
Discord server
Discord app
Discord OAuth configuration
Creating a custom connector
Creating an extension project
TDGTPQM_Discord.pq
Configuring authentication
Adding client ID and client secret files
Adding Configuration Settings
Creating OAuth functions
Modifying the data source record definition
Adding a credential
Testing the connection
Configuring navigation and content
Adding API call functions for data retrieval
Adding navigation functions
Modifying the contents function
Installing and using a custom connector
Summary
Other Books You May Enjoy
Index
Preface
Over the last decade, the popularity of Power Query’s M language has grown continuously, to the point that it is now nearly ubiquitous within the Microsoft ecosystem, including Power BI, Excel, Power Platform, Dynamics 365, SQL Server, and Data Factory. Today, M and Power Query are indispensable tools for modern data professionals, such as business analysts, data scientists, and data enthusiasts.
This book seeks to make you a master of the M language. While M can be written using the Power Query Editor’s graphical user interface (GUI), this approach severely limits you to only a small fraction of the 700+ core M functions. We estimate that the GUI for authoring M queries allows you to solve only about 50% of challenges related to data transformation. However, mastery of M allows you to bring that figure close to 100%. Starting with simple concepts and code, this book progressively moves you along a path of increasing complexity. Complex concepts are explained in clear and concise language with practical examples that demonstrate the concepts in action. By the end of this book, there will be few, if any, data transformation challenges that you won’t be able to tackle head-on.
Get ready for an exciting and rewarding journey toward your mastery of the M language!
Who this book is for
If you’re new to Power Query, then this book might not be the best choice for you. There are books out there that introduce the basics through the user interface and simple button-based actions, offering a straightforward way to get started.
On the other hand, if you are serious about fully understanding the M language, then this book is for you. If this is your first experience with M, you may find certain concepts challenging. Reading through the pages the first time will provide you with lots of information, but some topics may be too complex. As you become more familiar with M, reading the book again will likely make these complex topics easier to understand.
Overall, our goal was to write a book that blends theoretical knowledge with practical examples. The first few chapters are more theoretical and provide the M language fundamentals. As the book progresses and your understanding deepens, we include projects and exercise files for you to work along with.
The Power Query M language is useful for a variety of users. Power BI users may use M to prepare their data model, Excel users may transform data and output it on their spreadsheets, while data factory users could use M to query an API and transfer the result into a database. In this book, we aim to cater to all these different audiences.
Although some subjects, such as creating custom connectors or optimizing performance, may appeal more to those with advanced knowledge due to their technical nature, we believe it’s beneficial for all users to understand the different ways the M language can be used.
Starting with the M language can be daunting, and mastering it took us several years. It’s a challenging journey that requires focus and a lot of experimentation. However, if you persevere, our book offers unprecedented depth on all aspects of M, allowing you to become a true M expert.
What this book covers
Chapter 1, Introducing M, introduces M language basics such as the let expression and also covers the history of M and the formal and informal characteristics of M. It also discusses who should learn M, why you should learn M, and where and how to use M.
Chapter 2, Working with Power Query/M, introduces the Power Query Editor, the main application for coding M. In addition, it allows you to transition from only using the Power Query Editor’s GUI to create M code to writing the code yourself via custom columns and the Advanced Editor.
Chapter 3, Accessing and Combining Data, explores the multitude of different data connectors available for M, including file and folder connectors, database and cube connectors, working with binary data, and finally, how to combine and merge data between queries.
Chapter 4, Values and Expressions in M, introduces the various kinds of values in the M language, as well as expressions, operators, control structures, and enumerations.
Chapter 5, Understanding Data Types, explores the importance of data types in the M language. It teaches you about their structure and their application in real-world scenarios. The chapter offers techniques to automatically detect data types and discusses why type conversion is important. Additionally, it presents the concepts of facets and type ascription, which often lead to errors in your queries.
Chapter 6, Structured Values, covers some of the most important values in the M language, such as lists, records, and tables. These values can store multiple values within them. You will learn the techniques for creating them, their typical uses, and the operators that work with structured values. Furthermore, the chapter explains how to access the individual values they contain.
Chapter 7, Conceptualizing M, explores some of the more abstract concepts of M that are critical to truly understanding how the M language works. The topics covered include the global environment, creating your own global environment, sections, closure, query folding, and metadata.
Chapter 8, Working with Nested Structures, presents several techniques to transform and manipulate structured values, such as lists, records, and tables, helping you move past the limitations of the GUI.
Chapter 9, Parameters and Custom Functions, begins by discussing the role of parameters in making queries flexible and dynamic. It then delves into how to create custom functions, covering their syntax, the importance of data types, and debugging techniques. Finally, you learn that the each expression is syntactic sugar for a function.
Chapter 10, Dealing with Dates, Times, and Durations, explains how temporal analysis is key to unlocking many data insights and why you must be proficient in dealing with dates, times, and durations to perform such analyses. This chapter explores these subjects and more with numerous practical examples.
Chapter 11, Comparers, Replacers, Combiners, and Splitters, explores techniques for manipulating data. It demonstrates how to customize the way values are compared, ordered, replaced, combined, or split. These methods are essential for a wide range of common data transformation tasks.
Chapter 12, Handling Errors and Debugging, focuses on what errors are and provides guidance on how to handle them in the M language and debug your code successfully, enabling you to build more robust queries. In addition, it offers techniques to report errors.
Chapter 13, Iteration and Recursion, explains recursion through the use of the @ operator. The chapter then shifts focus to iteration techniques, using List.Transform, List.Accumulate, and List.Generate. You’ll learn about memory considerations and what aspects to consider for the best performance.
Chapter 14, Troublesome Data Patterns, illustrates the versatility of the M language, covering various common text extraction techniques as well as providing a comprehensive approach to building a manageable custom solution for dealing with files in bulk.
Chapter 15, Optimizing Performance, examines factors that influence query performance. It introduces mashup containers and the importance of memory management. We’ll delve into query folding, explore the formula firewall’s mechanics, and present various methods to improve query performance.
Chapter 16, Enabling Extensions, demonstrates how to extend the M language by creating your own, reusable library of M functions, including a detailed example of creating a custom data connector.
To get the most out of this book
We expect you to have a basic knowledge of Power Query and some experience with analyzing data. Prior experience with the M language will help you grasp concepts more quickly. However, knowing M is not a requirement.
Throughout the book, there are references to SQL and Data Analysis Expressions (DAX) code. However, you don’t need to know these languages because the comparisons are simply meant to reflect on the different approaches between the languages. There’s no need to worry if you don’t understand a particular code snippet; that means the comparison is not as applicable to you.
In the more advanced sections of the book, we cover query folding, custom connectors, and memory usage. Some of you may not be familiar with these topics and that’s okay. However, this information is an important element of what the M language is used for and we think it’s good for everyone to read.
Download the example code files
The code bundle for the book is hosted on GitHub at https://github.com/PacktPublishing/The-Definitive-Guide-to-Power-Query-M-/. We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!
Download the color images
We also provide a PDF file that has color images of the screenshots/diagrams used in this book. You can download it here: https://packt.link/gbp/9781835089729.
Conventions used
There are a number of text conventions used throughout this book.
CodeInText: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. For example: "Navigate to the /ClientApp/src/app/cities folder."
A block of code is set as follows:
#date(year as number, month as number, day as number) as date
When we wish to draw your attention to a particular part of a code block, the relevant lines or items are highlighted:
#date(year as number, month as number, day as number) as date
Bold: Indicates a new term, an important word, or words that you see on the screen. For instance, words in menus or dialog boxes appear in the text like this. For example: "Navigate to the Home tab of the ribbon, click on the dropdown below the Transform data button, and select Edit parameters."
Warnings or important notes appear like this.
Tips and tricks appear like this.
Get in touch
Feedback from our readers is always welcome.
General feedback: Email
and mention the book’s title in the subject of your message. If you have questions about any aspect of this book, please email us at
.
Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you reported this to us. Please visit http://www.packtpub.com/submit-errata, click Submit Errata, and fill in the form.
Piracy: If you come across any illegal copies of our works in any form on the internet, we would be grateful if you would provide us with the location address or website name. Please contact us at
with a link to the material.
If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit http://authors.packtpub.com.
Share your thoughts
Once you’ve read The Definitive Guide to Power Query (M), we’d love to hear your thoughts! Please visit the Amazon review page for this book and share your feedback.
Your review is important to us and the tech community and will help us make sure we’re delivering excellent quality content.
Download a free PDF copy of this book
Thanks for purchasing this book!
Do you like to read on the go but are unable to carry your print books everywhere?
Is your eBook purchase not compatible with the device of your choice?
Don’t worry, now with every Packt book you get a DRM-free PDF version of that book at no cost.
Read anywhere, any place, on any device. Search, copy, and paste code from your favorite technical books directly into your application.
The perks don’t stop there. You can also get exclusive access to discounts, newsletters, and great free content in your inbox daily.
Follow these simple steps to get the benefits:
Scan the QR code or visit the link below
https://packt.link/free-ebook/9781835089729
Submit your proof of purchase
That’s it! We’ll send your free PDF and other benefits directly to your email.
1
Introducing M
M is a powerful and versatile formula language specifically designed for data manipulation and transformation. The term M is an informal designation. M’s official name is the Power Query Formula Language. For an explanation of this designation, see the History of M section later in this chapter. M is the language at the heart of Power Query, which is used in numerous applications like Microsoft Excel, Power BI, Power Platform, and Microsoft Fabric for data transformation and preparation.
The popularity of the M language has continuously grown over the last decade, and the language has been integrated into an impressive array of Microsoft tools and platforms. Today, M and Power Query are indispensable tools for modern data professionals such as business analysts, data scientists, and data enthusiasts.
This chapter is the beginning of your exciting journey that culminates in the mastery of the M language. We start with a brief history of M and then cover the basics of who, where, why, and how. Next, we introduce the absolute basics of the M language and finish with the formal and informal characteristics of M (in effect, what is M?). Overall, this chapter provides a firm foundation for the more in-depth exploration of the M language found throughout the rest of this book. Specifically, this chapter covers the following topics:
The history of M
Who should learn M?
Where and how is M used?
Why learn M?
M language basics
The characteristics of M
The history of M
The process of extracting, transforming, and loading data is a challenge as old as information technology itself. Both business users and IT professionals have historically struggled with it, and numerous software tools, such as SQL Server Integration Services (SSIS) and Alteryx, have been developed over the years to help address it.
However, many of these tools were complex and not easily portable. The M language and Power Query were created to help solve these issues.
While there might be some more speculative history regarding the origins of M, we can at least definitively trace M back to a project originally code-named Data Explorer. Data Explorer was an Azure SQL Labs project circa 2011 that aimed to simplify the process of accessing, cleaning, and preparing data from various sources. The query language was thought of as a mashup language (hence the M for mashup).
In 2013, Microsoft released Power Query as an add-in for Excel. Power Query introduced a user-friendly interface, allowing business users to perform data transformations via a visual editor. Behind the scenes, Power Query utilized the M language as the underlying formula language to drive the data transformations, and as such, these data transformations became repeatable. Instead of, for example, business users continually performing the same manual data transformations on source data received as comma-delimited files, that process could now be effectively automated.
Following the success of Power Query in Excel, Microsoft included Power Query as part of its new product, Power BI Designer, which eventually became Power BI Desktop. As Power Query gained popularity, there was a need to standardize the underlying formula language. In 2016, Microsoft submitted the Power Query Formula Language specification to the European Computer Manufacturers Association (ECMA), an international standards organization. This effort established a formal specification for the language, ensuring compatibility and interoperability between different implementations.
While the language was formally referred to as the Power Query Formula Language, it became commonly known as M among the user community. The informal name M gained widespread acceptance and is now widely used to refer to the language.
Microsoft continues to enhance and refine the M language as part of its ongoing investment in data integration and transformation technologies. New functions, features, and improvements are periodically introduced to provide users with more powerful and efficient ways to manipulate and prepare their data. In addition, Microsoft continues to introduce M within additional software tools and platforms, such as data integration within Microsoft Power Platform and dataflows within Power BI and Fabric.
Today, the M language is a key component of Microsoft’s data transformation and integration toolset. The proliferation of M, as well as its versatility and extensibility, make it an invaluable language for today’s modern data professionals.
Let’s now turn our attention to who should learn M.
Who should learn M?
M is a powerful tool for data professionals and individuals who work with data on a regular basis. The versatility and capabilities of M make it a valuable language to learn for various roles, including the following:
Data analysts: Data analysts who deal with data extraction, transformation, and preparation tasks can greatly benefit from learning M. It provides a comprehensive set of functions and operators that enable data analysts to efficiently shape and manipulate data from diverse sources. By mastering M, data analysts can automate repetitive tasks, handle complex data transformations, and ensure data quality, leading to more accurate and reliable data analysis.
Business intelligence professionals: Professionals in the business intelligence (BI) space can greatly enhance their skills by learning M. It is a core component of Power BI. By understanding M, BI professionals gain the ability to connect to various data sources, perform complex data transformations, and create reusable data preparation workflows, enabling them to provide actionable insights and drive informed decision-making.
Data engineers: Data engineers involved in the design and implementation of data pipelines and data integration processes can really benefit from learning M. It allows data engineers to efficiently extract, transform, and load (ETL) data from different sources into data warehouses or data lakes, specifically within Power BI and Microsoft Fabric. M also provides the flexibility and power to handle complex data formats, define custom transformations, and create efficient data processing workflows. By mastering M, data engineers can streamline data integration processes and ensure data consistency and quality.
Data scientists: Data scientists who perform exploratory data analysis, model development, and advanced analytics can leverage the capabilities of M to efficiently prepare their data. M provides a robust set of functions for cleaning, shaping, and aggregating data, allowing data scientists to focus on the analytical aspects of their work. By incorporating M into their data preparation workflows, data scientists can streamline the pipeline of turning raw data into insights, spending less time on data cleansing and preparation and more time on data modeling and analysis.
Power users: Power users in Excel and Microsoft’s Power Platform who work extensively with data and perform complex data manipulations can benefit from learning M. It is integrated into Excel through Power Query, empowering users to perform advanced data transformations within the familiar Excel interface. In addition, M is integrated into the Microsoft Power Platform via data integration, allowing data to be transformed and mapped between systems. By mastering M, power users can expand their data manipulation capabilities, automate repetitive tasks, and enhance the accuracy and reliability of their analyses.
Individuals in data-driven roles: Beyond the specific roles mentioned above, individuals in various data-driven roles, such as project managers, consultants, researchers, and domain experts, can benefit from learning M. Mastery of M provides the ability for individuals in these roles to independently handle data-related tasks, extract meaningful insights, and make informed decisions based on reliable data.
M is a valuable language for a wide range of data professionals and other individuals who work with data. Whether you are a data analyst, BI professional, data engineer, data scientist, power user, or someone in a data-driven role, learning M empowers you to efficiently ingest, transform, and prepare data for analysis.
Now that we understand what types of individuals would want to learn M, let’s next explore where those individuals can leverage their mastery of M.
Where and how is M used?
M is a versatile language included in various tools and platforms where data transformation and manipulation are essential. Its integration within the Power Query ecosystem enables users to leverage M’s capabilities in different environments. In this section, we explore some key areas where M is widely used.
Experiences
Before discussing specific products where M is used, it is important to understand the different experiences available for authoring M. There are two experiences available for authoring M, one intended for on-premises use and the other for cloud-based applications. These experiences are the following:
Power Query Desktop: Power Query Desktop is the experience for Power Query found in desktop applications such as Power BI Desktop and Microsoft Excel. While the experiences are similar, there are differences. For example, the artificial intelligence (AI) and machine learning (ML) integrations as well as the integrations with R and Python present in Power BI Desktop are not present in Microsoft Excel. Conversely, the Structured Column options available in Excel are not available in Power BI Desktop.
Power Query Online: Power Query Online, a cloud-based service, allows users to create and manage data transformations within a web browser. M is used extensively in Power Query Online to define data transformations, connect to data sources, and perform complex data manipulations. Users can access and edit M queries directly within the browser interface, making it convenient to collaborate and work on data transformation tasks from anywhere with an internet connection. Power Query Online is integrated into a variety of Microsoft products, including the Power BI service, Power Apps, and Power Automate.
It is important to note that while two different experiences for authoring M exist, both provide a nearly identical user experience. Even better, both provide the ability to edit the underlying M code, which is the primary focus of this book. Thus, the skills learned here apply equally to either experience used within any product or service.
Products and services
M is ubiquitous within the Microsoft ecosystem, including the following software and services:
Dataflows: Dataflows are product-agnostic, cloud-based M queries that can be reused across multiple different products. Dataflows enable users to build and manage reusable data preparation and transformation processes. Dataflows leverage the Power Query Online experience.
Power BI Desktop: M is a fundamental component of Power BI Desktop, a leading BI tool. M allows users to connect to different data sources, perform data transformations, and create interactive visualizations and reports.
M enables users to extract, clean, and shape data from diverse sources, such as databases, Excel files, web services, and more. With M, users can define data transformation steps and create reusable queries that refresh and update data automatically when the underlying source changes.
Within Power BI Desktop, M is used within the Power Query editor, a sub-program launched from within Power BI Desktop. The Power Query editor provides a powerful graphical user interface (GUI) for working with the M formula language, as shown in the following screenshot:
Figure 1.1: Power Query editor in Power BI Desktop
The Power Query editor is covered in greater detail in Chapter 2, Working with Power Query/M.
Power BI Desktop also supports the use of dataflows.
Power BI/Fabric service: The Power BI/Fabric service (powerbi.com) is the cloud-based component of Power BI that enables you to share reports, dashboards, and other content. The service supports the use of M code via the creation of dataflows, using the Power Query Online experience.
To create a dataflow in the Power BI service, navigate to any workspace other than My Workspace and choose New and then Dataflow, as shown in Figure 1.2:
Figure 1.2: Create a dataflow in the Power BI service
Power BI Report Server: Power BI Report Server (PBRS) supports the Power Query Desktop experience, allowing users to create rich data transformations via M.
Excel (Windows and Macintosh): M is seamlessly integrated into Excel, empowering users to perform advanced data transformations within the familiar Excel interface. Power Query, the engine behind Excel’s data transformation capabilities, is powered by M. Users can access the Power Query editor in Excel to apply M transformations, filter and sort data, remove duplicates, merge and append tables, and perform other data preparation tasks. M allows users to clean, reshape, and enrich data in Excel, enhancing the accuracy and reliability of their analyses.
In Excel, the Power Query editor interface can be accessed by using the Data tab of the ribbon and choosing Get Data:
Figure 1.3: Get Data in Microsoft Excel
Once the data source is chosen, the Power Query Editor interface can be accessed by choosing the Transform Data button:
Figure 1.4: Transform Data option in Microsoft Excel
Both the Windows and Macintosh versions of Excel also support accessing and using dataflows.
Power Apps: Power Apps is Microsoft’s low-code platform for creating applications. Both the Power Query Online experience as well as the use of dataflows are supported. A common use case is to leverage M either via the Power Query Online experience or dataflows, allowing users to seamlessly bring their data into the Dataverse (formerly Common Data Service).
Power Automate: Power Automate is Microsoft’s low-code platform for automating workflows. Power Automate allows users to automate repetitive workflows and processes that may involve data manipulation and integration tasks. M can be employed within Power Automate to perform data transformations and handle complex data scenarios as part of the automated workflows, via the Power Query Online experience. By incorporating M into Power Automate, users can build sophisticated data integration and automation solutions that streamline their business processes. In addition, dataflows can be leveraged in Power Automate via the Power Query Dataflows connector. This allows actions to occur once a dataflow completes and also provides the ability for a dataflow to be initiated as an action within a Power Automate flow.
Data Factory: Data Factory is a managed cloud service specifically built for complex extract-transform-load (ETL) and extract-load-transform (ELT) integration projects. Data Factory allows the creation and orchestration of data-driven workflows, data movement, and transformation at scale. Both Azure Data Factory and Data Factory in Microsoft Fabric support M code, via both the Power Query Online experience as well as dataflows.
SQL Server: SSIS supports the core M engine while SQL Server Analysis Services (SSAS) supports the Power Query Desktop experience.
Dynamics 365 Customer Insights: Customer Insights within Dynamics 365 is Microsoft’s customer data platform (CDP) that provides a holistic view of customers, enabling personalized customer experiences. Customer Insights supports both dataflows as well as the Power Query Online experience.
Visual Studio Code: Visual Studio Code allows M to be integrated as a language. This is done via the Power Query Language Service for Visual Studio Code, which is available in the Visual Studio Code Marketplace. This language service provides fuzzy autocomplete, hover, function hints, and other functionality for writing M code within Visual Studio Code.
There is also the Visual Studio Power Query Software Development Kit (SDK). This SDK consists of a set of tools designed to help create custom Power Query data source connectors. The Visual Studio Power Query SDK is covered in greater detail in Chapter 16, Enabling Extensions.
Other data integration scenarios: M is not limited to the aforementioned software and services. M can also be leveraged in custom applications and programming environments that utilize Power Query libraries.
As you can see, M is widely used in different tools and platforms within the Microsoft ecosystem, such as Power BI Desktop, Excel, the Power BI and Fabric service, Power Platform, SQL Server, and Dynamics. M enables users to connect to various data sources, perform advanced data transformations, and automate data integration workflows. The skills learned in this book deal with the M language itself and, thus, transcend both the experience as well as the specific product or service. By mastering M, users therefore gain the ability to create reusable data transformation processes and enhance their data manipulation capabilities across a wide range of data-related scenarios, as well as across any experience, product, or service that uses M as its underlying data transformation layer.
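To make the idea of defining transformation steps more concrete, here is a minimal sketch of an M query that is not tied to any particular product. The sample text, the column names, and the step names are assumptions made purely for illustration; in practice, the source would typically be a file, database, or web service rather than inline text.

```powerquery
let
    // Hypothetical sample data standing in for a comma-delimited source file
    SourceText = Text.Combine(
        {"OrderID,Item,Quantity", "1,Fishing rod,2", "2,1 lb. worms,5"},
        "#(cr)#(lf)"
    ),
    // Parse the text as CSV; each line becomes a table row
    Source = Csv.Document(SourceText),
    // Use the first row as the column headers
    PromotedHeaders = Table.PromoteHeaders(Source),
    // Assign a data type to each column
    ChangedTypes = Table.TransformColumnTypes(
        PromotedHeaders,
        {{"OrderID", Int64.Type}, {"Item", type text}, {"Quantity", Int64.Type}}
    ),
    // Keep only the rows with a quantity greater than 2
    FilteredRows = Table.SelectRows(ChangedTypes, each [Quantity] > 2)
in
    FilteredRows
```

Each step builds on the previous one, so the same logic can be rerun unchanged whenever the underlying data changes. With the sample data assumed here, only the second order should remain after the final step.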
Let’s now turn our attention to why data professionals and other individuals might want to add M to their repertoire of languages.
Why learn M?
In today’s data-driven world, efficiently and effectively transforming and analyzing data is a valuable skill. Power Query, a powerful data transformation and preparation tool, has gained immense popularity due to its seamless integration with many popular software systems as well as its ease of use. At the heart of Power Query lies M, the Power Query Formula Language. But you may be asking yourself, why should you invest the time to learn M?
Here are seven reasons why we believe data professionals and other individuals should learn M:
Tapping into the full power of Power Query: In Gil Raviv’s book, Collect, Combine, and Transform Data Using Power Query in Excel and Power BI, Mr. Raviv estimates that the GUI for authoring M queries (see Where and how is M used? in this chapter) allows you to solve only 40% of challenges related to data transformation, but mastery of M allows you to bring that figure closer to 99.99%. Later chapters demonstrate specific examples of solving data transformation challenges that cannot be done in the GUI. Since M serves as the backbone of Power Query’s data transformation capabilities, by mastering M, you gain full control over the data transformation process, allowing you to extract, clean, transform, and reshape data from diverse sources.
Automation of repetitive tasks: One of the primary reasons you should learn M is to automate repetitive data transformation tasks. Business and IT professionals are often tasked with receiving data on a recurring basis and then generating reports based on this data. Instead of manually transforming this data each time (often in Excel) to prepare it for reporting purposes, leveraging M for this data transformation allows the data transformation logic to be implemented once and then automatically run each time new data is received.
Flexibility and customization: While Power Query provides a user-friendly interface for data transformation tasks, it does have its limitations. By learning M, you can extend the capabilities of Power Query and overcome these limitations. M allows you to write custom functions, perform advanced transformations, and apply complex logic that goes beyond the built-in capabilities of the Power Query interface. This flexibility empowers you to tailor your data transformations precisely to meet the unique requirements of your data sources and analysis.
Efficiency and performance optimization: M is a highly efficient and optimized language for data transformations. The Power Query engine intelligently processes M expressions, optimizing performance by reducing unnecessary data loads and transformations. When working with large datasets or complex transformations, knowing M enables you to write efficient code that significantly speeds up your data processing, as demonstrated in Chapter 15, Optimizing Performance. By understanding the underlying principles of M and its performance considerations, you can optimize your data workflows and save valuable time. Finally, leveraging M can greatly reduce and simplify the formula and Data Analysis Expressions (DAX) code in downstream applications like Excel and Power BI Desktop.
Advanced data cleaning and transformation: M provides a comprehensive set of data cleaning and transformation functions that go far beyond the basic operations available in traditional spreadsheet applications. With M, you can easily handle data quality issues, such as removing duplicates, handling missing values, splitting columns, merging sets of data, and performing advanced calculations. Learning M enables you to tackle complex data cleaning and transformation tasks efficiently, leading to accurate and reliable data analysis.
Integration with other programming languages: M is not only a standalone language but also integrates well with other programming languages such as SQL, R, and Python. This integration allows you to leverage the capabilities of these languages within your Power Query workflows. You can combine M code with native SQL queries, call R or Python scripts, and seamlessly incorporate external libraries and functions into your data transformation process. By expanding your knowledge to include M, you unlock the potential to leverage the best features of various programming languages for data manipulation.
Career advancement: Proficiency in M and Power Query has become a sought-after skill in the data industry. As organizations increasingly rely on data for decision-making, individuals who possess the ability to efficiently transform, clean, and analyze data are in high demand. By investing time and effort in learning M, you position yourself as a valuable asset to organizations that rely on data-driven insights. The knowledge of M can open up new career opportunities, enhance your job prospects, and enable you to take on challenging data-related projects.
In summary, learning M enables you to efficiently ingest, transform, and analyze data from diverse sources. It provides flexibility, customization, and performance optimization capabilities that extend the functionality of Power Query itself. By mastering M, you gain a competitive edge in the data industry and open doors to new career possibilities.
We hope that you are now excited about learning M! Let’s turn our attention to the basics of the M language.
M language basics
As previously noted, M is a powerful language designed for data ingestion and transformation within a variety of Microsoft software and services. Understanding the basics of the M language is essential for effectively leveraging its capabilities.
Here are some important fundamentals regarding the M language:
Expressions and functions: In M, expressions form the building blocks of data transformations. An expression represents a computation or operation that evaluates to a value. M provides a wide range of built-in functions that can be used to perform operations on data. Functions in M are called using a syntax where the function name is followed by arguments within parentheses. For example, the function call Text.Start("Hello, World!", 5) returns the substring Hello from the input text. Expressions and functions are covered in more detail in Chapter 4, Understanding Values and Expressions, as well as Chapter 9, Parameters and Custom Functions.
Data types: M supports various data types, including text, numbers, dates, times, lists, tables, and records. Understanding the data types in M is crucial for performing accurate transformations. M provides functions to convert between different data types and manipulate data, based on their inherent characteristics. For example, the Text.From function converts a value to text, while the Date.Year function extracts the year component from a date or datetime value. Data types are covered in Chapter 5, Understanding Data Types.
Variables and constants: M allows you to define variables and constants to store and reuse values during data transformations. Variables are created within a let expression as a comma-separated list of variable assignments. Constants, on the other hand, are fixed values that do not change throughout the execution. Variables and constants help improve code readability, enable reuse, and make complex transformations more manageable. More about variables and constants can be found throughout this book.
Operators: M supports a variety of operators to perform mathematical calculations, logical comparisons, and text manipulations. Arithmetic operators (+, -, *, /, and so on) are used for numeric calculations, while comparison operators (>, <, =, and so on) evaluate logical conditions. The combination operator (&) is used for concatenating text values, appending lists and tables, or merging records. Operators are covered in Chapter 4, Understanding Values and Expressions.
Step-by-step transformation process: M follows a step-by-step transformation process where each step defines a data transformation operation. The Power Query editor provides a visual interface to define these steps and generates the corresponding M code. Steps can include operations such as filtering rows, removing duplicates, splitting columns, merging tables, and aggregating data. Chapter 2, Working with Power Query/M, covers this topic in more detail.
Query folding: Query folding is an optimization technique in Power Query that pushes data transformations to the data source whenever possible. When using M, it is important to be aware of query folding to ensure efficient data processing. Query folding can improve performance by reducing data transfer between the data source and Power Query. However, not all transformations can be folded, so it is essential to understand which operations can be folded and which cannot. For example, when using DirectQuery or Dual storage mode for tables, all M queries must fold, which can limit certain transformation operations. Query folding is discussed in Chapter 7, Conceptualizing M, and in Chapter 15, Optimizing Performance.
Error handling and debugging: M provides error handling mechanisms to catch and handle exceptions during data transformations. By using the try, otherwise, and error keywords, you can control the flow of execution and handle potential errors gracefully. Additionally, M supports debugging capabilities, such as the ability to step through a query's steps to identify and resolve issues in complex transformations. Error handling and debugging are covered in Chapter 12, Handling Errors and Debugging.
Case sensitivity: M is case-sensitive. This applies to all functions, expressions, variables, constants, and other aspects of the M language.
Commenting: Comments in M follow the C language commenting style. Inline comments are preceded by double slashes (//) while block comments use the slash-asterisk/asterisk-slash pattern (/* and */).
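Several of the fundamentals above can be seen together in one short query. The following is a minimal sketch rather than an excerpt from a later chapter; the step names (Greeting, OrderDate, SafeNumber, and so on) and the sample values are assumptions chosen purely for illustration:

```powerquery
let
    // Function call: the function name followed by arguments in parentheses
    Greeting = Text.Start("Hello, World!", 5),            // "Hello"

    // Data types and conversion functions (hypothetical sample date)
    OrderDate = #date(2024, 3, 15),
    OrderYear = Date.Year(OrderDate),                      // 2024
    YearAsText = Text.From(OrderYear),                     // "2024"

    // Arithmetic and comparison operators
    Total = (10 + 5) * 2 / 3,                              // 10
    IsLarge = Total > 5,                                   // true

    // The combination operator (&) on text, lists, and records
    Label = Greeting & " " & YearAsText,                   // "Hello 2024"
    Numbers = {1, 2} & {3, 4},                             // {1, 2, 3, 4}
    Person = [Name = "Ana"] & [City = "Lisbon"],           // one record with both fields

    /* Error handling: try evaluates an expression, and
       otherwise supplies a fallback value if an error is raised */
    SafeNumber = try Number.FromText("abc") otherwise 0,   // 0

    // Collect the results in a record so that they can be inspected together
    Result = [
        Greeting = Greeting,
        Label = Label,
        IsLarge = IsLarge,
        Numbers = Numbers,
        Person = Person,
        SafeNumber = SafeNumber
    ]
in
    Result
```

Because M is case-sensitive, writing text.start or Greeting with different capitalization anywhere in this query would raise an error rather than resolve to the same name.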
Now that we have a good understanding of the core components of the M language, let’s next explore the most fundamental component of the M language, the let expression.
The let expression
At the core of the M language is the let expression, which must be paired with an in expression. In simple terms, the let expression contains the input and transformations, while the in expression contains the output. A simple Hello World for M looks like the following:
let
    Hello = "Hello World"
in
    Hello
This code would return the ubiquitous Hello World text.
It is important to note that every expression within a let expression must be followed by a comma (,) except the last expression prior to the in expression. Thus, if the let expression consists of multiple sub-expressions, then the code might look like the following:
let
    Hello = "Hello",
    World = "World",
    Return = Hello & " " & World
in
    Return
This code also returns Hello World as output.
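The same let and in pattern carries over to the step-by-step table transformations mentioned earlier, where each step refers to the previous step by name. The following is a minimal sketch, assuming a small hypothetical table built inline so that the query does not depend on any external data source; the step names mirror the style of names that the Power Query editor generates but are otherwise arbitrary:

```powerquery
let
    // Hypothetical sample data, defined inline so the query is self-contained
    Source = #table(
        type table [Product = text, Amount = number],
        {{"Apples", 10}, {"Pears", 4}, {"Apples", 10}, {"Plums", 7}}
    ),
    // Step 1: remove the duplicate Apples row
    RemovedDuplicates = Table.Distinct(Source),
    // Step 2: keep only the rows where Amount is greater than 5
    FilteredRows = Table.SelectRows(RemovedDuplicates, each [Amount] > 5),
    // Step 3: aggregate the remaining Amount values (10 + 7)
    Total = List.Sum(FilteredRows[Amount])
in
    Total
```

This query returns 17. Changing the final line to return FilteredRows instead of Total would output the intermediate table, which is a convenient way to inspect the result of any individual step.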
Understanding the basics of M, including expressions, functions, data types, variables, operators, and the step-by-step transformation process, is vital for effectively manipulating and preparing data. By mastering these foundational concepts, you gain the ability to perform complex transformations, optimize data workflows, and unlock the full potential of the M language. The rest of this book is devoted to helping you master all of these foundational concepts and how to apply them to complex data transformations.
The characteristics of M
M is a programming language that serves as the backbone of Power Query, enabling users to extract, clean, and reshape data from various sources, such as databases, spreadsheets, web pages, and more. However, unlike general-purpose