The Definitive Guide to Power Query (M): Mastering complex data transformation with Power Query
Ebook, 2,242 pages, 21 hours

About this ebook

Data transformation is a critical step in building data models and business intelligence reports. Power Query is an invaluable tool for anyone who wants to master data transformation, and this book will equip you with the knowledge and skills to make the most of it.

The Definitive Guide to Power Query (M) will help you build a solid foundation in the Power Query M language. As you progress through the chapters, you'll learn how to use that knowledge to implement advanced concepts and data transformations. This will set the stage for an uncompromisingly thorough exploration of the Power Query M Language.

You'll also get to grips with optimizing performance, handling errors, and implementing efficient data processing techniques. As this is a hands-on guide, the practical examples in the chapters will help you gain the skills to apply Power Query to real-world problems and improve your data analysis capabilities.

By the end of this book, you will be able to leverage all of Power Query's remarkable capabilities for data transformation.

Language: English
Release date: Mar 29, 2024
ISBN: 9781835086162

    Book preview

    The Definitive Guide to Power Query (M) - Gregory Deckler

    The Definitive Guide to Power Query (M)

    Mastering complex data transformation with Power Query

    Gregory Deckler

    Rick de Groot

    Melissa de Korte

    The Definitive Guide to Power Query (M)

    Copyright © 2024 Packt Publishing

    All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

    Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.

    Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

    Senior Publishing Product Manager: Gebin George

    Acquisition Editor – Peer Reviews: Tejas Mhasvekar

    Project Editor: Rianna Rodrigues

    Content Development Editors: Shruti Menon and Shazeen Iqbal

    Copy Editor: Safis Editing

    Technical Editors: Aneri Patel and Kushal Sharma

    Proofreader: Safis Editing

    Indexer: Tejal Daruwale Soni

    Presentation Designer: Pranit Padwal

    Developer Relations Marketing Executive: Vignesh Raju

    First published: March 2024

    Production reference: 1260324

    Published by Packt Publishing Ltd.

    Grosvenor House

    11 St Paul’s Square

    Birmingham

    B3 1RB, UK.

    ISBN 978-1-83508-972-9

    www.packt.com

    Foreword

    Occasionally, the stars align in just the right way, and the result is something that is unexpectedly great. That’s the story behind the book you’re holding now. But I’m getting ahead of myself. Let’s turn the clock back to the start of 2022, which is where the story of this book begins.

    At that point, I had been a data professional and Excel user for over 30 years and was doing a lot of Power BI development while conducting online training as well. I had long used Power Query to clean and transform my data prior to performing analysis and building reports. But almost all of that prep was done through the Power Query user interface. As well designed as the Power Query UI is, it’s also quite limiting, providing access to only a fraction of the 700+ available M functions. So, at the start of 2022, I decided that I was going to stop dabbling in M and make learning it thoroughly my top-priority training goal for the year. This was not easy in 2022, since while there were fantastic books on Power Query, they each had only a small percentage of their content dedicated to M. There were many great M resources online, but they offered deep insights into a grab bag of very specialized topics and applications, which also did not make for an ideal learning experience.

    Despite these hurdles, by the end of 2022, I found that my deep dive into M had been transformative. My Power BI reports ran much faster, the DAX used in them was dramatically simpler, I was able to clean data faster than ever, and I now could easily build reusable custom functions to automate repetitive data tasks. I also found that learning how to visualize and manipulate complex data structures in M substantially improved my coding ability in other languages. However, I also realized that my experience was entirely unreplicable due to the fact that three of my friends (not coincidentally the authors of this book) happened to be among the most knowledgeable Power Query/M experts in the world, and I relied heavily on their expertise during my learning. I actually used to joke with both Melissa and Rick online about each of them being the Chosen One destined to write the M counterpart to The Definitive Guide to DAX, but I think that seemed too much of a monumental task to both of them.

    During this time, I was the primary tester on a suite of apps that Gregory Deckler was developing. We would regularly talk about Power Query and M when we would discuss the potential future capabilities of those apps. One day, Greg mentioned to me that he was working on a book proposal outline for that same comprehensive guide to M that I had been joking about (but also genuinely wishing existed). He asked me who I would recommend as co-authors for that book, and I am beyond thrilled to say that my response can be found on the cover of this book.

    Mastering M is neither a quick nor easy task. However, I am completely confident that whether you are a data analyst, Power BI developer, data scientist, business analyst, or even an Excel end user looking to level up your skills, having the same three clear, patient, and brilliant instructors I did will make learning M an enjoyable and game-changing experience for you too.

    Brian Julius

    Power BI Expert and Instructor

    Contributors

    About the authors

    Gregory Deckler is a seven-time Microsoft MVP for Data Platform and an active blogger and Power BI community member, having authored over 6,000 solutions to community questions. Greg has authored numerous books on Power BI, including the first and second editions of Learn Power BI, the second editions of Power BI Cookbook and Mastering Power BI, and DAX Cookbook. Greg has also created several external tools for Power BI and regularly posts video content to his YouTube channels – Microsoft Hates Greg and DAX For Humans.

    I would like to acknowledge my wonderful co-authors for this book; I learned so much from all of them. I’m also grateful to my son Rocket for supporting all of my endeavors. A special thanks to Brian Julius, who inspired this book and without whom this book would not exist.

    Rick de Groot is a Power BI consultant, blogger, YouTuber, and author who hails from the Netherlands. With a background in finance, he has devoted 14+ years to building his expertise in data analysis, particularly in the Power Query M language. He is an independent consultant who also provides training on Power BI, Power Query, and DAX. He regularly publishes articles on his blogs, Power Query How and BI Gorilla, which have emerged as important sources for M articles and tutorials. He contributes extensively to his YouTube channel BI Gorilla as well. Rick’s commitment to sharing his expertise has earned him the Microsoft Data Platform MVP award for two consecutive years.

    My deepest appreciation goes to Brian Julius. His belief in me nudged me to share more about Power Query and helped me grow as a writer for the M language. Without his support, I don’t think my name would have been on the cover of this book. He also played an important part in our discussions and helped shape the contents of this book. I am profoundly thankful for your support, Brian.

    To my family – your patience and support while writing this book have meant the world to me.

    Also, a huge shout-out to all of you out there. All the discussions we’ve had have been incredibly valuable.

    Melissa de Korte is a passionate problem-solver known for simplifying complex problems with Power Query. With an impressive track record as an Enterprise DNA expert and super user, she has become an asset to the community, actively engaging, supporting, and inspiring others. Behind her professional persona lies a genuine dedication to empowering others through sharing knowledge. Her portfolio includes blogs, tutorials, courses, and webinars to make Power Query and M more accessible.

    To Marcel and Sam for filling every moment we share(d) with joy.

    My gratitude goes to my co-authors, as well as Brian Julius, for their roles in this book’s creation. To my family for their endless patience and support. And to all who generously share their knowledge, spark curiosity, and promote a supportive and inclusive community. Your contributions are invaluable!

    About the reviewers

    Erin Ostrowsky is a digital nomad who endeavors to build healthy, happy relationships wherever she goes. She’s passionate about STEAM – don’t forget the Arts, folks! – and hopes to inspire others to join in the fun problem-solving this world needs. Erin loves learning, creating, making friends, traveling, and writing about life’s many curiosities. Her current focus is running Fit 4 Duty Data, a data and reporting consultancy and training firm that supports the global mining and energy sectors.

    I’d like to thank my Creator, my friends, and my family – you all make life beautiful, especially during the hard times.

    Vahid Doustimajd is a recognized Microsoft MVP and certified trainer with over 15 years of experience. He is a technophile who is passionate about project management and data analytics. Currently, he works in Australia as a Project Controls and Analytics Manager. He is an active participant in the Microsoft Fabric and Power BI communities, enjoys sharing his expertise through blogs and videos, and organizes events for the Persian Power BI User Group.

    I would like to extend my heartfelt gratitude to my wife for her endless support and patience. Her encouragement has been my guiding light.

    Ahmed Oyelowo, a Managing Partner, lead trainer, and consultant at Foresight BI & Analytics Global Solutions, is a Microsoft Certified Power BI Data Analyst Associate and Azure Enterprise Data Analyst. He is also a Microsoft Certified Trainer, with a best-selling Power BI course on Udemy that has been taken by 90,000+ students. A four-time Microsoft MVP in the Data Platform category, he has developed cutting-edge solutions for various corporate clients.

    My sincere appreciation to my Creator for the gift of all required to complete this review. I must also extend my gratitude to my dearest wife, Kafayat, and my lovely children, Yusraa and Yaasir, for their support and understanding while I was spending some time away from them to complete this task. Finally, I’d like to thank the authors of this book for this wonderful resource!

    Learn more on Discord

    Join our community’s Discord space for discussions with the author and other readers:

    https://discord.gg/vCSG5GBbyS

    Contents

    Preface

    Who this book is for

    What this book covers

    To get the most out of this book

    Get in touch

    Introducing M

    The history of M

    Who should learn M?

    Where and how is M used?

    Experiences

    Products and services

    Why learn M?

    M language basics

    The let expression

    The characteristics of M

    Formal classification

    Informal characteristics of M

    Summary

    Working with Power Query/M

    Technical requirements

    Touring the Power Query Desktop experience

    A brief tour

    Header

    Formula bar

    Ribbon

    Queries pane

    Query Settings pane

    Preview pane

    Status Bar

    Your first query

    Options and data source settings

    Options

    Data source settings

    Editing experience-generated code

    Creating custom columns

    Adding an index column

    Adding columns with examples

    Math operations

    Adding custom M code columns

    Using the Advanced Editor

    Summary

    Accessing and Combining Data

    Technical requirements

    Accessing files and folders

    File.Contents

    Text/CSV

    Excel

    Folder

    PDF

    XML

    Xml.Tables

    Xml.Document

    Azure Storage

    Additional file formats

    Retrieving web content

    Investigating binary functions

    Lines functions

    Accessing databases and cubes

    Cube functions

    Working with standard data protocols

    Addressing additional connectors

    Popular software systems

    Identity functions

    Combining and joining data

    Table.Combine

    Table.NestedJoin and Table.Join

    Table.FuzzyNestedJoin and Table.FuzzyJoin

    Summary

    Understanding Values and Expressions

    Introducing the types of values

    Binary values

    Structure

    Related functions

    Special considerations

    The Date/Time family of values

    Date values

    Time values

    DateTime values

    DateTimeZone values

    Duration values

    Logical values

    Structure

    Related functions

    Special considerations

    Null values

    Structure

    Related functions

    Special considerations

    Number values

    Structure and examples

    Related functions

    Special considerations

    Text values

    Structure

    Related functions

    Special considerations

    List values

    Structure

    Related functions

    Special considerations

    Record values

    Structure

    Related functions

    Special considerations

    Table values

    Structure

    Related functions

    Special considerations

    Function values

    Structure

    Related functions

    Special considerations

    Type values

    Operators

    Expressions

    Nesting let expressions

    Coding best practices for expressions

    Control structures

    Enumerations

    Summary

    Understanding Data Types

    What are data types?

    The type system

    Columns with mixed types

    Column data type versus value type

    Importance of types

    Clarity and consistency

    Data validation

    Other reasons

    Data types available in M

    Primitive types

    Abstract primitive types

    Nullable primitive types

    Using primitive types for data filtering

    Custom types

    List type

    Record type

    Table type

    Function type

    Type detection

    Retrieving data types from a data source

    Automatically detecting types

    Data type conversion

    Converting value types

    Converting column types

    Avoiding data loss during conversion

    Effect of locale/culture

    Facets

    Type Claims

    Available Type Claims

    Converting values using type claims

    Inspecting Type Claims

    Ascribing types

    What is ascription?

    Functions that support ascribing types

    Ascribing types when creating records

    Ascribing types when creating tables

    Ascribing types when modifying tables

    Ascribing types to any value

    Errors when ascribing types

    The base type is incompatible with the value

    The Type Claim does not conform with the value

    Ascribing incompatible types to structured values

    Type equivalence, conformance, and assertion

    Type equality

    Type conformance

    Type assertion

    Summary

    Structured Values

    Introducing structured values

    Lists

    Introduction to lists

    List operators

    Equal

    Not equal

    Concatenate

    Coalesce

    Methods to create a list

    Creating lists using the list initializer

    Creating lists using functions

    Referencing a table column

    Using the a..b form

    Accessing items in a list

    Accessing list values by index

    Handling non-existent index positions

    Common operations with lists

    Assigning data types to a list

    Records

    Introduction to records

    Record operators

    Equal

    Not equal

    Concatenation

    Coalesce

    Methods to create a record

    Creating records using the record initializer

    Creating records using functions

    Retrieving a record by referencing a table row

    Accessing fields in a record

    Field selection

    Record projection

    Common operations with records

    Structure for variables

    Referencing the current row

    Providing options for functions

    Keeping track of intermediary results

    Assigning a data type to records

    Tables

    Introduction to tables

    Table operators

    Equal

    Not equal

    Concatenation

    Coalesce

    Methods to create a table

    Retrieve data from a source

    Manually input data into functions

    Re-use existing tables/queries

    Using the Enter data functionality

    Accessing elements in a table

    Item access

    Field access

    Common operations with tables

    Assigning a data type to tables

    Summary

    Conceptualizing M

    Technical requirements

    Understanding scope

    Examining the global environment

    Studying sections

    Creating your own global environment

    Understanding closures

    Query folding

    Managing metadata

    Summary

    Working with Nested Structures

    Transitioning to coding

    Getting started

    Understanding Drill Down

    The trick to getting more out of the UI

    Methods for multistep value transformation

    Transforming values in tables

    Table.AddColumn

    Table.TransformColumns

    Table.ReplaceValue

    Working with lists

    Transforming a list

    List.Transform

    List.Zip

    Extracting an item

    Resizing a list

    List.Range

    List.Alternate

    Filtering a list

    List.FindText

    List.Select

    To-list conversions

    Column or field names

    A single column

    All columns

    All rows

    Other operations

    Expanding multiple list columns simultaneously

    Flattening inconsistent multi-level nested lists

    Working with records

    Transforming records

    Extracting a field value

    Resizing records

    Filtering records

    To-record conversions

    Table row to record

    Record from table

    Record from list

    Conditional lookup or value replacement

    Working with tables

    Transforming tables

    Extracting a cell value

    Resizing a table in length

    Resizing a table in width

    Filtering tables

    Approximate match

    To-table conversions

    Record-to-table conversion

    Creating tables from columns, rows, or records

    Table information

    Working with mixed structures

    Lists of tables, lists, or records

    Tables with lists, records, or tables

    Mixed structures

    Flatten all

    Unpacking all record fields from lists

    Extracting data through lookup

    Summary

    Parameters and Custom Functions

    Parameters

    Understanding parameters

    Creating parameters

    Using parameters in your queries

    Putting it all together

    Parameterizing connection information

    Dynamic file paths

    Filtering a date range

    Custom functions

    What are custom functions?

    Transforming queries into a function

    What is the create function functionality?

    Simplifying troubleshooting and making changes

    Invoking custom functions

    Manually in the advanced editor or formula bar

    Using the UI

    The each expression

    Common use cases

    Refining function definitions

    Specifying data types

    Making parameters optional

    Referencing column and field names

    Debugging custom functions

    Function scope

    Top-level expression

    In line within a query

    Putting it all together

    Turning all columns into text

    Merging tables based on date ranges

    Summary

    Dealing with Dates, Times, and Durations

    Technical requirements

    Dates

    M calendar table

    Other date formats

    Julian days

    Alternate date formats

    Additional custom date functions

    Working days

    Moving average

    Time

    Creating a time table

    Shift classification

    Dates and times

    Time zones

    Correcting data refresh times

    Duration

    Working duration

    Summary

    Comparers, Replacers, Combiners, and Splitters

    Technical requirements

    Key concepts

    Function invocation

    Some common errors

    Closures

    Higher-order functions

    Anonymous functions

    Ordering values

    Comparers

    Comparer.Equals

    Comparer.Ordinal

    Comparer.OrdinalIgnoreCase

    Comparer.FromCulture

    Comparison criteria

    Numeric value

    Computing a sort key

    List with key and order

    Custom comparer with conditional logic

    Custom comparer with Value.Compare

    Equation criteria

    Default comparers

    Custom comparer

    Key selectors

    Combining key selectors and comparers

    Replacers

    Replacer.ReplaceText

    Replacer.ReplaceValue

    Custom replacers

    Combiners

    Combiner.CombineTextByDelimiter

    Functionality

    Example

    Combiner.CombineTextByEachDelimiter

    Functionality

    Example

    Combiner.CombineTextByLengths

    Functionality

    Example

    Combiner.CombineTextByPositions

    Functionality

    Example

    Combiner.CombineTextByRanges

    Functionality

    Example

    Splitters

    Splitter.SplitByNothing

    Functionality

    Example

    Splitter.SplitTextByAnyDelimiter

    Functionality

    Example

    Splitter.SplitTextByCharacterTransition

    Functionality

    Example

    Splitter.SplitTextByDelimiter

    Functionality

    Example

    Splitter.SplitTextByEachDelimiter

    Functionality

    Example

    Splitter.SplitTextByLengths

    Functionality

    Example

    Splitter.SplitTextByPositions

    Functionality

    Example

    Splitter.SplitTextByRanges

    Functionality

    Example

    Splitter.SplitTextByRepeatedLengths

    Functionality

    Example

    Splitter.SplitTextByWhitespace

    Functionality

    Example

    Practical examples

    Removing control characters and excess spaces

    Goals

    A cleanTrim function

    Extract email addresses from a string

    Goals

    Developing a getEmail function

    Split combined cell values into rows

    Goal

    Transforming the table

    Replacing multiple values

    Goal

    Accumulating a result

    Combining rows conditionally

    Goal

    Group By’s comparer to the rescue

    Summary

    Handling Errors and Debugging

    Technical requirements

    What is an error?

    Error containment

    Error detection

    Raising errors

    The error expression

    The … (ellipsis) operator

    Error handling

    Strategies for debugging

    Common errors

    Syntax errors

    Dealing with errors – a top priority

    DataSource.Error, could not find the source

    An unknown or missing identifier

    An unknown function

    An unknown column reference

    An unknown field reference

    Not enough elements in the enumeration

    Formula.Firewall error

    Expression.Error: The key didn’t match any rows in the table

    Expression.Error: The key matched more than one row in the table

    Expression.Error: Evaluation resulted in a stack overflow and cannot continue

    Putting it all together

    Column selection

    Building a custom solution

    Reporting cell-level errors

    Building a custom solution

    Summary

    Iteration and Recursion

    Introduction to iteration

    List.Transform

    Extracting items from a list by position

    Allocating a yearly budget to months

    List.Accumulate

    Function anatomy

    Replacing multiple values

    List.Generate

    Advantages of List.Generate

    Function anatomy

    Handling variables using records

    List.Generate alternatives

    What are useful List.Generate scenarios?

    Looping through API data using List.Generate

    Creating an efficient running total

    Recursion

    Why is recursion important?

    Recursion versus iteration: a brief comparison

    Recursive functions

    What is the @ scoping operator?

    Inclusive-identifier-reference

    Using recursive functions

    How to use the @ operator

    Step 1: Write your initial code

    Step 2: Identify the recursive call

    Step 3: Add the @ operator

    Step 4: Test your function

    Removing consecutive spaces

    Performance considerations using recursion

    Summary

    Troublesome Data Patterns

    Pattern matching

    Basics of pattern matching

    Case sensitivity

    Contains versus exact match

    Allowed characters

    Handling one or more elements

    Wildcards

    Extracting fixed patterns

    Example 1, prefixed

    Example 2, pattern

    Example 3, splitters

    Example 4, substitution

    Example 5, regex

    Combining data

    Basics for combining data

    Extract, transform, and combine

    Get and inspect data

    Location parameter

    Connect to data

    Filter files

    Overall strategy

    Choose a sample file

    File parameter

    Transformation pattern

    Query, create function

    Set up monitoring

    Fine-tuning

    Combine files

    Summary

    Optimizing Performance

    Understanding memory usage when evaluating queries

    Memory limit variations and adjustments

    Query folding

    Query folding in action

    Query evaluation

    Folding, not folding, and partial folding

    Tools to determine foldability

    View data source query

    Query folding indicators

    Query plan

    Operations and their impact on folding

    Foldable operations

    Non-foldable operations

    Data source privacy levels

    Native database queries

    Functions designed to prevent query folding

    Strategies for maintaining query folding

    Rearranging steps

    Working with native queries

    Rewriting code

    Using cross-database folding

    The formula firewall

    What is the formula firewall?

    Understanding partitions

    The fundamental principle of the formula firewall

    Firewall error: Referencing other partitions

    Connecting to a URL using native parameters

    Connecting to a URL using an Excel parameter

    Resolving the firewall error

    Firewall error: Accessing compatible data sources

    Understanding privacy levels

    Setting privacy levels

    Resolving the firewall error

    Optimizing query performance

    Prioritize filtering rows and removing columns

    Buffering versus streaming operations

    Buffering operations

    Streaming operations

    Using the query plan

    Using buffer functions

    The impact of buffering and running totals

    The setup

    Method 1: Running total from regular table

    Method 2: Running total from buffered table

    Method 3: Running total from buffered column

    Method 4: Running total from buffered column using List.Generate

    Data source considerations

    Data sources and speed

    Using dataflows

    Performance tips

    Summary

    Enabling Extensions

    Technical requirements

    What are Power Query extensions?

    What can you do with extensions?

    Preparing your environment

    Getting Visual Studio Code

    Getting the Power Query SDK

    Setting up Internet Information Services

    Setting up Discord

    Discord client

    Discord server

    Discord app

    Discord OAuth configuration

    Creating a custom connector

    Creating an extension project

    TDGTPQM_Discord.pq

    Configuring authentication

    Adding client ID and client secret files

    Adding Configuration Settings

    Creating OAuth functions

    Modifying the data source record definition

    Adding a credential

    Testing the connection

    Configuring navigation and content

    Adding API call functions for data retrieval

    Adding navigation functions

    Modifying the contents function

    Installing and using a custom connector

    Summary

    Other Books You May Enjoy

    Index

    Preface

    Over the last decade, the popularity of Power Query’s M language has continuously grown, such that it has become nearly ubiquitous within the Microsoft ecosystem, including Power BI, Excel, Power Platform, Dynamics 365, SQL Server, and Data Factory. Today, M and Power Query are indispensable tools for modern data professionals, such as business analysts, data scientists, and data enthusiasts.

    This book seeks to make you a master of the M language. While M can be written using the Power Query Editor’s graphical user interface (GUI), this approach severely limits you to only a small fraction of the 700+ core M functions. We estimate that the GUI for authoring M queries allows you to solve only about 50% of challenges related to data transformation. However, mastery of M allows you to bring that figure close to 100%. Starting with simple concepts and code, this book progressively moves you along a path of increasing complexity. Complex concepts are explained in clear and concise language with practical examples that demonstrate the concepts in action. By the end of this book, there will be few, if any, data transformation challenges that you won’t be able to tackle head-on.

    Get ready for an exciting and rewarding journey toward your mastery of the M language!

    Who this book is for

    If you’re new to Power Query, then this book might not be the best choice for you. There are books out there that introduce the basics through the user interface and simple button-based actions, offering a straightforward way to get started.

    On the other hand, if you are serious about fully understanding the M language, then this book is for you. If this is your first experience with M, you may find certain concepts challenging. Reading through the pages the first time will provide you with lots of information, but some topics may be too complex. As you become more familiar with M, reading the book again will likely make these complex topics easier to understand.

    Overall, our goal was to write a book that blends theoretical knowledge with practical examples. The first few chapters are more theoretical and provide the M language fundamentals. As the book progresses and your understanding deepens, we include projects and exercise files for you to work along with.

    The Power Query M language is useful for a variety of users. Power BI users may use M to prepare their data model, Excel users may transform data and output it on their spreadsheets, while Data Factory users could use M to query an API and transfer the result into a database. In this book, we aim to cater to all these different audiences.

    Although some subjects, such as creating custom connectors or optimizing performance, may appeal more to those with advanced knowledge due to their technical nature, we believe it’s beneficial for all users to understand the different ways the M language can be used.

    Starting with the M language can be daunting, and mastering it took us several years. It’s a challenging journey that requires focus and a lot of experimentation. However, if you persevere, our book covers all aspects of M in unprecedented depth, allowing you to become a true M expert.

    What this book covers

    Chapter 1, Introducing M, introduces M language basics such as the let expression and also covers the history of M and the formal and informal characteristics of M. It also discusses who should learn M, why you should learn M, and where and how to use M.

    Chapter 2, Working with Power Query/M, introduces the Power Query Editor, the main application for coding M. In addition, it allows you to transition from only using the Power Query Editor’s GUI to create M code to writing the code yourself via custom columns and the Advanced Editor.

    Chapter 3, Accessing and Combining Data, explores the multitude of different data connectors available for M, including file and folder connectors, database and cube connectors, working with binary data, and finally, how to combine and merge data between queries.

    Chapter 4, Values and Expressions in M, introduces the various kinds of values in the M language, as well as expressions, operators, control structures, and enumerations.

    Chapter 5, Understanding Data Types, explores the importance of data types in the M language. It teaches you about their structure and their application in real-world scenarios. The chapter offers techniques to automatically detect data types and discusses why type conversion is important. Additionally, it presents the concept of facets and type ascription, which often lead to errors in your queries.

    Chapter 6, Structured Values, covers some of the most important values in the M language, such as lists, records, and tables. These values can store multiple values within them. You will learn the techniques for creating them, their typical uses, and the operators that work with structured values. Furthermore, the chapter explains how to access the individual values they contain.

    Chapter 7, Conceptualizing M, explores some of the more abstract concepts of M that are critical to truly understanding how the M language works. The topics covered include the global environment, creating your own global environment, sections, closure, query folding, and metadata.

    Chapter 8, Working with Nested Structures, presents several techniques to transform and manipulate structured values, such as lists, records, and tables, helping you move past the limitations of the GUI.

    Chapter 9, Parameters and Custom Functions, begins by discussing the role of parameters in making queries flexible and dynamic. It then delves into how to create custom functions, covering their syntax, the importance of data types, and debugging techniques. Finally, you learn that the each expression is syntax sugar for a function.

    Chapter 10, Dealing with Dates, Times, and Durations, explains how temporal analysis is key to unlocking many data insights and why you must be proficient in dealing with dates, times, and durations to perform such analyses. This chapter explores these subjects and more with numerous practical examples.

    Chapter 11, Comparers, Replacers, Combiners, and Splitters, explores techniques for manipulating data. It demonstrates how to customize the way values are compared, ordered, replaced, combined, or split. These methods are essential for a wide range of common data transformation tasks.

    Chapter 12, Handling Errors and Debugging, focuses on what errors are and provides guidance on how to handle them in the M language and debug your code successfully, enabling you to build more robust queries. In addition, it offers techniques to report errors.

    Chapter 13, Iteration and Recursion, explains recursion through the use of the @ operator. The chapter then shifts focus to iteration techniques, using List.Transform, List.Accumulate, and List.Generate. You’ll learn about memory considerations and what aspects to consider for the best performance.

    Chapter 14, Troublesome Data Patterns, illustrates the versatility of the M language, covering various common text extraction techniques as well as providing a comprehensive approach to building a manageable custom solution for dealing with files in bulk.

    Chapter 15, Optimizing Performance, examines factors that influence query performance. It introduces mashup containers and the importance of memory management. We’ll delve into query folding, explore the formula firewall’s mechanics, and present various methods to improve query performance.

    Chapter 16, Enabling Extensions, demonstrates how to extend the M language by creating your own, reusable library of M functions, including a detailed example of creating a custom data connector.

    To get the most out of this book

    We expect you to have a basic knowledge of Power Query and some experience with analyzing data. If you have experience with the M language, that will help you grasp concepts more quickly. However, knowing M is not a requirement.

    Throughout the book, there are references to SQL and Data Analysis Expressions (DAX) code. However, you don’t need to know these languages because the comparisons are simply meant to reflect on the different approaches between the languages. There’s no need to worry if you don’t understand a particular code snippet; that means the comparison is not as applicable to you.

    In the more advanced sections of the book, we cover query folding, custom connectors, and memory usage. Some of you may not be familiar with these topics and that’s okay. However, this information is an important element of what the M language is used for and we think it’s good for everyone to read.

    Download the example code files

    The code bundle for the book is hosted on GitHub at https://github.com/PacktPublishing/The-Definitive-Guide-to-Power-Query-M-/. We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

    Download the color images

    We also provide a PDF file that has color images of the screenshots/diagrams used in this book. You can download it here: https://packt.link/gbp/9781835089729.

    Conventions used

    There are a number of text conventions used throughout this book.

    CodeInText: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. For example: "Navigate to the /ClientApp/src/app/cities folder."

    A block of code is set as follows:

    #date(year as number, month as number, day as number) as date

    When we wish to draw your attention to a particular part of a code block, the relevant lines or items are highlighted:

    #date(year as number, month as number, day as number) as date

    Bold: Indicates a new term, an important word, or words that you see on the screen. For instance, words in menus or dialog boxes appear in the text like this. For example: "Navigate to the Home tab of the ribbon, click on the dropdown below the Transform data button, and select Edit parameters."

    Warnings or important notes appear like this.

    Tips and tricks appear like this.

    Get in touch

    Feedback from our readers is always welcome.

    General feedback: Email [email protected] and mention the book’s title in the subject of your message. If you have questions about any aspect of this book, please email us at [email protected].

    Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you reported this to us. Please visit http://www.packtpub.com/submit-errata, click Submit Errata, and fill in the form.

    Piracy: If you come across any illegal copies of our works in any form on the internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.

    If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit http://authors.packtpub.com.

    Share your thoughts

    Once you’ve read The Definitive Guide to Power Query (M), we’d love to hear your thoughts! Please click here to go straight to the Amazon review page for this book and share your feedback.

    Your review is important to us and the tech community and will help us make sure we’re delivering excellent quality content.

    Download a free PDF copy of this book

    Thanks for purchasing this book!

    Do you like to read on the go but are unable to carry your print books everywhere?

    Is your eBook purchase not compatible with the device of your choice?

    Don’t worry, now with every Packt book you get a DRM-free PDF version of that book at no cost.

    Read anywhere, any place, on any device. Search, copy, and paste code from your favorite technical books directly into your application. 

    The perks don’t stop there. You can get exclusive access to discounts, newsletters, and great free content in your inbox daily.

    Follow these simple steps to get the benefits:

    Scan the QR code or visit the link below

    https://packt.link/free-ebook/9781835089729

    Submit your proof of purchase

    That’s it! We’ll send your free PDF and other benefits directly to your email.

    1

    Introducing M

    M is a powerful and versatile formula language specifically designed for data manipulation and transformation. The term M is an informal designation. M’s official name is the Power Query Formula Language. For an explanation of this designation, see the History of M section later in this chapter. M is the language at the heart of Power Query, which is used in numerous applications like Microsoft Excel, Power BI, Power Platform, and Microsoft Fabric for data transformation and preparation.

    The popularity of the M language has continuously grown over the last decade, and the language has been integrated into an impressive array of Microsoft tools and platforms. Today, M and Power Query are indispensable tools for modern data professionals such as business analysts, data scientists, and data enthusiasts.

    This chapter is the beginning of your exciting journey that culminates in the mastery of the M language. We start with a brief history of M and then cover the basics of who, where, why, and how. Next, we introduce the absolute basics of the M language and finish with the formal and informal characteristics of M (in effect, what is M?). Overall, this chapter provides a firm foundation for the more in-depth exploration of the M language found throughout the rest of this book. Specifically, this chapter covers the following topics:

    The history of M

    Who should learn M?

    Where and how is M used?

    Why learn M?

    M language basics

    The characteristics of M

    The history of M

    The process of extracting, transforming, and loading data is a challenge as old as information technology itself. Both business users and IT professionals have historically struggled with it, and numerous software tools, such as SQL Server Integration Services (SSIS) and Alteryx, have been developed over the years to help.

    However, many of these tools were complex and not easily portable. The M language and Power Query were created to help solve these issues.

    While there might be some more speculative history regarding the origins of M, we can at least definitively trace M back to a project originally code-named Data Explorer. Data Explorer was an Azure SQL Labs project circa 2011 that aimed to simplify the process of accessing, cleaning, and preparing data from various sources. The query language was thought of as a mashup language (hence the M for mashup).

    In 2013, Microsoft released Power Query as an add-in for Excel. Power Query introduced a user-friendly interface, allowing business users to perform data transformations via a visual editor. Behind the scenes, Power Query utilized the M language as the underlying formula language to drive the data transformations, and as such, these data transformations became repeatable. Instead of, for example, business users continually performing the same manual data transformations on source data received as comma-delimited files, that process could now be effectively automated.

    Following the success of Power Query in Excel, Microsoft included Power Query as part of its new product, Power BI Designer, which eventually became Power BI Desktop. As Power Query gained popularity, there was a need to standardize the underlying formula language. In 2016, Microsoft submitted the Power Query Formula Language specification to the European Computer Manufacturers Association (ECMA), an international standards organization. This effort established a formal specification for the language, ensuring compatibility and interoperability between different implementations.

    While the language was formally referred to as the Power Query Formula Language, it became commonly known as M among the user community. The informal name M gained widespread acceptance and is now widely used to refer to the language.

    Microsoft continues to enhance and refine the M language as part of its ongoing investment in data integration and transformation technologies. New functions, features, and improvements are periodically introduced to provide users with more powerful and efficient ways to manipulate and prepare their data. In addition, Microsoft continues to introduce M within additional software tools and platforms, such as data integration within Microsoft Power Platform and dataflows within Power BI and Fabric.

    Today, the M language is a key component of Microsoft’s data transformation and integration toolset. The proliferation of M, as well as its versatility and extensibility, make it an invaluable language for today’s modern data professionals.

    Let’s now turn our attention to who should learn M.

    Who should learn M?

    M is a powerful tool for data professionals and individuals who work with data on a regular basis. The versatility and capabilities of M make it a valuable language to learn for various roles, including the following:

    Data analysts: Data analysts who deal with data extraction, transformation, and preparation tasks can greatly benefit from learning M. It provides a comprehensive set of functions and operators that enable data analysts to efficiently shape and manipulate data from diverse sources. By mastering M, data analysts can automate repetitive tasks, handle complex data transformations, and ensure data quality, leading to more accurate and reliable data analysis.

    Business intelligence professionals: Professionals in the business intelligence (BI) space can greatly enhance their skills by learning M. It is a core component of Power BI. By understanding M, BI professionals gain the ability to connect to various data sources, perform complex data transformations, and create reusable data preparation workflows, enabling them to provide actionable insights and drive informed decision-making.

    Data engineers: Data engineers involved in the design and implementation of data pipelines and data integration processes can really benefit from learning M. It allows data engineers to efficiently extract, transform, and load (ETL) data from different sources into data warehouses or data lakes, specifically within Power BI and Microsoft Fabric. M also provides the flexibility and power to handle complex data formats, define custom transformations, and create efficient data processing workflows. By mastering M, data engineers can streamline data integration processes and ensure data consistency and quality.

    Data scientists: Data scientists who perform exploratory data analysis, model development, and advanced analytics can leverage the capabilities of M to efficiently prepare their data. M provides a robust set of functions for cleaning, shaping, and aggregating data, allowing data scientists to focus on the analytical aspects of their work. By incorporating M into their data preparation workflows, data scientists can streamline the pipeline of turning raw data into insights, spending less time on data cleansing and preparation and more time on data modeling and analysis.

    Power users: Power users in Excel and Microsoft’s Power Platform who work extensively with data and perform complex data manipulations can benefit from learning M. It is integrated into Excel through Power Query, empowering users to perform advanced data transformations within the familiar Excel interface. In addition, M is integrated into the Microsoft Power Platform via data integration, allowing data to be transformed and mapped between systems. By mastering M, power users can expand their data manipulation capabilities, automate repetitive tasks, and enhance the accuracy and reliability of their analyses.

    Individuals in data-driven roles: Beyond the specific roles mentioned above, individuals in various data-driven roles, such as project managers, consultants, researchers, and domain experts, can benefit from learning M. Mastery of M provides the ability for individuals in these roles to independently handle data-related tasks, extract meaningful insights, and make informed decisions based on reliable data.

    M is a valuable language for a wide range of data professionals and other individuals who work with data. Whether you are a data analyst, BI professional, data engineer, data scientist, power user, or someone in a data-driven role, learning M empowers you to efficiently ingest, transform, and prepare data for analysis.

    Now that we understand what types of individuals would want to learn M, let’s next explore where those individuals can leverage their mastery of M.

    Where and how is M used?

    M is a versatile language included in various tools and platforms where data transformation and manipulation are essential. Its integration within the Power Query ecosystem enables users to leverage M’s capabilities in different environments. In this section, we explore some key areas where M is widely used.

    Experiences

    Before discussing specific products where M is used, it is important to understand the different experiences available for authoring M. There are two experiences available for authoring M, one intended for on-premises use and the other for cloud-based applications. These experiences are the following:

    Power Query Desktop: Power Query Desktop is the experience for Power Query found in desktop applications such as Power BI Desktop and Microsoft Excel. While the experiences are similar, there are differences. For example, the artificial intelligence (AI) and machine learning (ML) integrations as well as the integrations with R and Python present in Power BI Desktop are not present in Microsoft Excel. Conversely, the Structured Column options available in Excel are not available in Power BI Desktop.

    Power Query Online: Power Query Online, a cloud-based service, allows users to create and manage data transformations within a web browser. M is used extensively in Power Query Online to define data transformations, connect to data sources, and perform complex data manipulations. Users can access and edit M queries directly within the browser interface, making it convenient to collaborate and work on data transformation tasks from anywhere with an internet connection. Power Query Online is integrated into a variety of Microsoft products, including the Power BI service, Power Apps, Power Automate, etc.

    It is important to note that while two different experiences for authoring M exist, both provide nearly the exact same user experience. Even better, both provide the ability to edit the underlying M code, which is the primary focus of this book. Thus, the skills learned here apply equally to either experience used within any product or service.

    Products and services

    M is ubiquitous within the Microsoft ecosystem, including the following software and services:

    Dataflows: Dataflows are product-agnostic, cloud-based M queries that can be reused across multiple different products. Dataflows enable users to build and manage reusable data preparation and transformation processes. Dataflows leverage the Power Query Online experience.

    Power BI Desktop: M is a fundamental component of Power BI Desktop, a leading BI tool. M allows users to connect to different data sources, perform data transformations, and create interactive visualizations and reports.

    M enables users to extract, clean, and shape data from diverse sources, such as databases, Excel files, web services, and more. With M, users can define data transformation steps and create reusable queries that refresh and update data automatically when the underlying source changes.

    Within Power BI Desktop, M is used within the Power Query editor, a sub-program launched from within Power BI Desktop. The Power Query editor provides a powerful graphical user interface (GUI) for working with the M formula language, as shown in the following screenshot:

    Figure 1.1: Power Query editor in Power BI Desktop

    The Power Query editor is covered in greater detail in Chapter 2, Working with Power Query/M.

    Power BI Desktop also supports the use of dataflows.

    Power BI/Fabric service: The Power BI/Fabric service (powerbi.com) is the cloud-based component of Power BI that enables you to share reports, dashboards, and other content. The service supports the use of M code via the creation of dataflows, using the Power Query Online experience.

    To create a dataflow in the Power BI service, navigate to any workspace other than My Workspace and choose New and then Dataflow, as shown in Figure 1.2:

    Figure 1.2: Create a dataflow in the Power BI service

    Power BI Report Server: Power BI Report Server (PBRS) supports the Power Query Desktop experience, allowing users to create rich data transformations via M.

    Excel (Windows and Macintosh): M is seamlessly integrated into Excel, empowering users to perform advanced data transformations within the familiar Excel interface. Power Query, the engine behind Excel’s data transformation capabilities, is powered by M. Users can access the Power Query editor in Excel to apply M transformations, filter and sort data, remove duplicates, merge and append tables, and perform other data preparation tasks. M allows users to clean, reshape, and enrich data in Excel, enhancing the accuracy and reliability of their analyses.

    In Excel, the Power Query editor interface can be accessed by using the Data tab of the ribbon and choosing Get Data:

    Figure 1.3: Get Data in Microsoft Excel

    Once the data source is chosen, the Power Query Editor interface can be accessed by choosing the Transform Data button:

    Figure 1.4: Transform Data option in Microsoft Excel

    Both the Windows and Macintosh versions of Excel also support accessing and using dataflows.

    Power Apps: Power Apps is Microsoft’s low-code platform for creating applications. Both the Power Query Online experience and dataflows are supported. A common use case is to leverage M via either the Power Query Online experience or dataflows, allowing users to seamlessly bring their data into the Dataverse (formerly Common Data Service).

    Power Automate: Power Automate is Microsoft’s low-code platform for automating workflows. Power Automate allows users to automate repetitive workflows and processes that may involve data manipulation and integration tasks. M can be employed within Power Automate to perform data transformations and handle complex data scenarios as part of the automated workflows, via the Power Query Online experience. By incorporating M into Power Automate, users can build sophisticated data integration and automation solutions that streamline their business processes. In addition, dataflows can be leveraged in Power Automate via the Power Query Dataflows connector. This allows actions to occur once a dataflow completes and also provides the ability for a dataflow to be initiated as an action within a Power Automate flow.

    Data Factory: Data Factory is a managed cloud service specifically built for complex extract-transform-load (ETL) and extract-load-transform (ELT) integration projects. Data Factory allows the creation and orchestration of data-driven workflows, data movement, and transformation at scale. Both Azure Data Factory and Data Factory in Microsoft Fabric support M code, via both the Power Query Online experience as well as dataflows.

    SQL Server: SSIS supports the core M engine while SQL Server Analysis Services (SSAS) supports the Power Query Desktop experience.

    Dynamics 365 Customer Insights: Customer Insights within Dynamics 365 is Microsoft’s customer data platform (CDP) that provides a holistic view of customers, enabling personalized customer experiences. Customer Insights supports both dataflows as well as the Power Query Online experience.

    Visual Studio: Visual Studio Code allows M to be integrated as a language. This is done via the Power Query Language Service for Visual Studio Code, which is available in the Visual Studio Code Marketplace. This language service provides fuzzy autocomplete, hover, function hints, and other functionality for writing M code within Visual Studio Code.

    There is also the Visual Studio Power Query Software Development Kit (SDK). This SDK consists of a set of tools designed to help create custom Power Query data source connectors. The Visual Studio Power Query SDK is covered in greater detail in Chapter 16, Enabling Extensions.

    Other data integration scenarios: M is not limited to the aforementioned software and services. M can also be leveraged in custom applications and programming environments that utilize Power Query libraries.

    As you can see, M is widely used in different tools and platforms within the Microsoft ecosystem, such as Power BI Desktop, Excel, the Power BI and Fabric service, Power Platform, SQL Server, and Dynamics. M enables users to connect to various data sources, perform advanced data transformations, and automate data integration workflows. Because the skills learned in this book concern the M language itself, they transcend both the experience and the specific product or service. By mastering M, users gain the ability to create reusable data transformation processes and enhance their data manipulation capabilities across a wide range of data-related scenarios, in any experience, product, or service that uses M as its underlying data transformation layer.
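    To make this portability concrete, here is a minimal sketch of a query that does not depend on any particular product. It builds a small inline table (the Sales data, column names, and step names are hypothetical, chosen only for illustration), filters it, and aggregates it. Because it uses only standard library functions and no external data source, we would expect it to behave the same way when pasted into a blank query in either the Power Query Desktop or the Power Query Online experience.

```powerquery
let
    // Hypothetical inline sample data, so the query needs no external source.
    Sales = #table(
        type table [Region = text, Units = number],
        {
            {"North", 10},
            {"South", 4},
            {"North", 6}
        }
    ),
    // Step 1: keep only the rows for one region.
    NorthOnly = Table.SelectRows(Sales, each [Region] = "North"),
    // Step 2: aggregate the remaining rows into a single total per region.
    Summary = Table.Group(
        NorthOnly,
        {"Region"},
        {{"TotalUnits", each List.Sum([Units]), type number}}
    )
in
    Summary
```

    Each named step builds on the previous one, which is the same shape of code the GUI generates behind the scenes.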

    Let’s now turn our attention to why data professionals and other individuals might want to add M to their repertoire of languages.

    Why learn M?

    In today’s data-driven world, efficiently and effectively transforming and analyzing data is a valuable skill. Power Query, a powerful data transformation and preparation tool, has gained immense popularity due to its seamless integration with many popular software systems as well as its ease of use. At the heart of Power Query lies M, the Power Query Formula Language. But you may be asking yourself, why should you invest the time to learn M?

    Here are seven reasons why we believe data professionals and other individuals should learn M:

    Tapping into the full power of Power Query: In Gil Raviv’s book, Collect, Combine, and Transform Data Using Power Query in Excel and Power BI, Mr. Raviv estimates that the GUI for authoring M queries (see Where and how is M used? in this chapter) allows you to solve only 40% of challenges related to data transformation, but mastery of M allows you to bring that figure closer to 99.99%. Later chapters demonstrate specific examples of solving data transformation challenges that cannot be done in the GUI. Since M serves as the backbone of Power Query’s data transformation capabilities, by mastering M, you gain full control over the data transformation process, allowing you to extract, clean, transform, and reshape data from diverse sources.

    Automation of repetitive tasks: One of the primary reasons you should learn M is to automate repetitive data transformation tasks. Business and IT professionals are often tasked with receiving data on a recurring basis and then generating reports based on this data. Instead of manually transforming this data each time (often in Excel) to prepare it for reporting purposes, leveraging M for this data transformation allows the data transformation logic to be implemented once and then automatically run each time new data is received.

    Flexibility and customization: While Power Query provides a user-friendly interface for data transformation tasks, it does have its limitations. By learning M, you can extend the capabilities of Power Query and overcome these limitations. M allows you to write custom functions, perform advanced transformations, and apply complex logic that goes beyond the built-in capabilities of the Power Query interface. This flexibility empowers you to tailor your data transformations precisely to meet the unique requirements of your data sources and analysis.

    Efficiency and performance optimization: M is a highly efficient and optimized language for data transformations. The Power Query engine intelligently processes M expressions, optimizing performance by reducing unnecessary data loads and transformations. When working with large datasets or complex transformations, knowing M enables you to write efficient code that significantly speeds up your data processing, as demonstrated in Chapter 15, Optimizing Performance. By understanding the underlying principles of M and its performance considerations, you can optimize your data workflows and save valuable time. Finally, leveraging M can greatly reduce and simplify the formula and Data Analysis Expressions (DAX) code in downstream applications like Excel and Power BI Desktop.

    Advanced data cleaning and transformation: M provides a comprehensive set of data cleaning and transformation functions that go far beyond the basic operations available in traditional spreadsheet applications. With M, you can easily handle data quality issues, such as removing duplicates, handling missing values, splitting columns, merging sets of data, and performing advanced calculations. Learning M enables you to tackle complex data cleaning and transformation tasks efficiently, leading to accurate and reliable data analysis.

    Integration with other programming languages: M is not only a standalone language but also integrates well with other programming languages such as SQL, R, and Python. This integration allows you to leverage the capabilities of these languages within your Power Query workflows. You can combine M code with native SQL queries, call R or Python scripts, and seamlessly incorporate external libraries and functions into your data transformation process. By expanding your knowledge to include M, you unlock the potential to leverage the best features of various programming languages for data manipulation.

    Career advancement: Proficiency in M and Power Query has become a sought-after skill in the data industry. As organizations increasingly rely on data for decision-making, individuals who possess the ability to efficiently transform, clean, and analyze data are in high demand. By investing time and effort in learning M, you position yourself as a valuable asset to organizations that rely on data-driven insights. The knowledge of M can open up new career opportunities, enhance your job prospects, and enable you to take on challenging data-related projects.

    In summary, learning M enables you to efficiently ingest, transform, and analyze data from diverse sources. It provides flexibility, customization, and performance optimization capabilities that extend the functionality of Power Query itself. By mastering M, you gain a competitive edge in the data industry and open doors to new career possibilities.

    We hope that you are now excited about learning M! Let’s turn our attention to the basics of the M language.

    M language basics

    As previously noted, M is a powerful language designed for data ingest and transformation within a variety of Microsoft software and services. Understanding the basics of the M language is essential for effectively leveraging its capabilities.

    Here are some important fundamentals regarding the M language:

    Expressions and functions: In M, expressions form the building blocks of data transformations. An expression represents a computation or operation that evaluates to a value. M provides a wide range of built-in functions that can be used to perform operations on data. A function is called by writing its name followed by its arguments in parentheses. For example, Text.Start("Hello, World!", 5) returns the substring Hello from the input text. More about expressions and functions is covered in Chapter 4, Understanding Values and Expressions, as well as Chapter 9, Parameters and Custom Functions.

    Data types: M supports various data types, including text, numbers, dates, times, lists, tables, and records. Understanding the data types in M is crucial for performing accurate transformations. M provides functions to convert between different data types and manipulate data based on their inherent characteristics. For example, the Text.From function converts a value to text, while the Date.Year function extracts the year component from a date or datetime value. Data types are covered in Chapter 5, Understanding Data Types.

    Variables and constants: M allows you to define variables and constants to store and reuse values during data transformations. Variables are defined within a let expression as a comma-separated list of assignments. Constants, on the other hand, are fixed values that do not change during execution. Variables and constants help improve code readability, enable reuse, and make complex transformations more manageable. More about variables and constants can be found throughout this book.

    Operators: M supports a variety of operators to perform mathematical calculations, logical comparisons, and text manipulations. Arithmetic operators (+, -, *, /, and so on) are used for numeric calculations, while comparison operators (>, <, =, and so on) evaluate logical conditions. The combination operator (&) is used for concatenating text values, appending lists and tables, or merging records. Operators are covered in Chapter 4, Understanding Values and Expressions.

    Step-by-step transformation process: M follows a step-by-step transformation process where each step defines a data transformation operation. The Power Query editor provides a visual interface to define these steps and generates the corresponding M code. Steps can include operations such as filtering rows, removing duplicates, splitting columns, merging tables, and aggregating data. Chapter 2, Working with Power Query/M, covers this topic in more detail.

    Query folding: Query folding is an optimization technique in Power Query that pushes data transformations to the data source whenever possible. When using M, it is important to be aware of query folding to ensure efficient data processing. Query folding can improve performance by reducing data transfer between the data source and Power Query. However, not all transformations can be folded, so it is essential to understand which operations can be folded and which cannot. For example, when using DirectQuery or Dual storage mode for tables, all M queries must fold, which can limit certain transformation operations. Query folding is discussed in Chapter 7, Conceptualizing M, and in Chapter 15, Optimizing Performance.

    Error handling and debugging: M provides error handling mechanisms to catch and handle exceptions during data transformations. By using the try, otherwise, and error keywords, you can control the flow of execution and handle potential errors gracefully. Additionally, M supports debugging capabilities, such as the ability to step through the code to identify and resolve issues in complex transformations. Error handling and debugging are covered in Chapter 12, Handling Errors and Debugging.

    Case sensitivity: M is case-sensitive. This applies to all functions, expressions, variables, constants, and other aspects of the M language.

    Commenting: Comments in M follow the C language commenting style. Single-line comments are preceded by double slashes (//), while block comments are enclosed in the slash-asterisk/asterisk-slash pattern (/* and */).
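    Several of the fundamentals above can be sketched together in one small query. This is only an illustrative sketch, not an example from the book: the step names and the field names of the final record are invented for this illustration, and the values in the comments are what each step should evaluate to.

```powerquery
let
    // Single-line comment: a built-in function called with two arguments
    Greeting = Text.Start("Hello, World!", 5),                 // "Hello"

    /* Block comment: Date.Year extracts the year from a date value,
       and Text.From converts the resulting number to text */
    YearAsText = Text.From(Date.Year(#date(2024, 3, 29))),     // "2024"

    // Arithmetic and comparison operators
    Half = 7 / 2,                                              // 3.5
    IsLarger = Half > 3,                                       // true

    // The combination operator (&) works on text, lists, and records
    FullName = "Power" & " " & "Query",                        // "Power Query"
    Combined = {1, 2} & {3},                                   // {1, 2, 3}
    Merged = [a = 1] & [b = 2],                                // [a = 1, b = 2]

    // try ... otherwise supplies a fallback when an expression raises an error
    Safe = try Number.FromText("abc") otherwise 0,             // 0

    // The record's field names deliberately differ from the step names,
    // because a field such as [Greeting = Greeting] would refer to itself
    Result = [
        FirstFive = Greeting,
        ReleaseYear = YearAsText,
        Quotient = Half,
        QuotientAboveThree = IsLarger,
        Joined = FullName,
        Appended = Combined,
        MergedRecord = Merged,
        Parsed = Safe
    ]
in
    Result
```

    Pasting this into a blank query in the Advanced Editor should return a single record with eight fields. Because M is case-sensitive, writing text.start instead of Text.Start would fail with an unknown-identifier error.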

    Now that we have a good understanding of the core components of the M language, let’s next explore the most fundamental component of the M language, the let expression.

    The let expression

    At the core of the M language is the let expression, which must be paired with an in expression. In simple terms, the let expression contains the input and transformations, while the in expression contains the output. A simple Hello World for M looks like the following:

    let
        Hello = "Hello World"
    in
        Hello

    This code would return the ubiquitous Hello World text.

    It is important to note that every expression within a let expression must be followed by a comma (,) except the last expression prior to the in expression. Thus, if the let expression consists of multiple sub-expressions, then the code might look like the following:

    let
        Hello = "Hello",
        World = "World",
        Return = Hello & " " & World
    in
        Return

    This code also returns Hello World as output.
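    The same let and in pattern carries over to table transformations, where each named step builds on an earlier one. The following is a minimal sketch under stated assumptions: the in-memory table, its column names, and the step names are all made up for illustration and do not come from the book.

```powerquery
let
    // Input: a small in-memory table (hypothetical sample data)
    Source = #table(
        type table [Product = text, Amount = number],
        {{"A", 10}, {"B", 25}, {"A", 5}}
    ),

    // Transformation: keep only the rows whose Amount is greater than 5
    Filtered = Table.SelectRows(Source, each [Amount] > 5),

    // Last step, so no trailing comma: count the remaining rows
    RowCount = Table.RowCount(Filtered)
in
    // Output: the value of the step named after "in"
    RowCount
```

    With this sample data, the query should return 2, since two of the three rows have an Amount above 5. Changing the name after in to Filtered would return the filtered table instead of the count, which is a convenient way to inspect an intermediate step.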

    Understanding the basics of M, including expressions, functions, data types, variables, operators, and the step-by-step transformation process, is vital for effectively manipulating and preparing data. By mastering these foundational concepts, you gain the ability to perform complex transformations, optimize data workflows, and unlock the full potential of the M language. The rest of this book is devoted to helping you master all of these foundational concepts and how to apply them to complex data transformations.

    The characteristics of M

    M is a programming language that serves as the backbone of Power Query, enabling users to extract, clean, and reshape data from various sources, such as databases, spreadsheets, web pages, and more. However, unlike general-purpose
