Power Query Record as Dictionary: Part 1

#PowerQuery #ThinkTank #Record.Field #List.Accumulate #level300

With this publication I begin the series of posts about Power Query performance and challenges should be solved to gain the best results from the queries.

Initially these posts were written by Mikhail Muzykin for Telegram channel “For those from the tank” (https://t.me/pbi_pq_from_tank). Mikhail (also known as @buchlotnik) is an acknowledged Power Query master, slightly obsessed 😊 with the Power Query performance tweaks and the beauty of M language. That channel’s name has the second meaning, close to “I will explain this to you (again), in simple words”. Indeed, Mikhail’s simple and clear style absolutely reflects the name of the channel.

Here is the first post of the “Power Query Think Tank” series (I suppose this name perfectly describes the goal and style of the posts). It was slightly rewritten and enriched with pictures for the best reading experience.

Let’s start.

List.Accumulate + Record.Field vs Table.Join, or Power Query Dictionaries with Records

Let us have two tables: table “A” with two columns:

  • Name
  • Occupation
Pic. 1. Table “A”

and table “B” with the next columns:

  • Position
  • Salary
  • Bonus
Pic. 2. Table B

We want to get the table with the next set of columns:

  • Name
  • Occupation
  • Salary
  • Bonus

Of course, the first solution to get it is using Join:

Pic. 3. Simple inner join (merge) by Occupation and Position columns

But you always can say that we are cheating (a little) with Table.Join here:

  • columns, used as join keys, have different names while it often will be the same names
  • there are always (almost) unnecessary columns in the right table, which must be deleted before join, etc.

Well, then we’ll use Table.NestedJoin:

I think you can easily recognize this pattern.

So, what we are talking about there? We just want to provide a more performant alternative to join:

What changed?

  • A new strange step Dict with List.Accumilate and an empty record as its second argument.

Here we:

  • took a lookup table (B),
    • split it into list of rows (Table.ToRows), de facto – list of lists, where each inner list is the list of values from the table row),
    • consequent added fields to the empty record: name of the field is the value from the first column (c{0}), value of the field is all other values from the row (List.Skip(c)) in form of list.

In the other words, we have created a record, representing a dictionary (or collection), where names are lookup keys for corresponding values.

Purpose of this record is very simple: calling a field from this record by its name we’ll get this field value. This kind of lookup for values works very fast.

Pic. 4. Lookup record collected all row values per name, as list
  • The next step – named Generator – represents the function, which we’ll use for lookup on the next step.
    • Its only argument is the list X, and the function output is also a list.
    • Function takes this list X and add some elements to it. Which ones? This is the values of lookup record (Dict) from the field with the name taken from the second element of X: X{1}. We take the second element because it is the second column of the table which holds the “Occupation” value for each row. If you still confused, more detailed explanation is just a few paragraphs below.
  • At the end on the step E we pass function d to Table.FromList – the fastest table constructor function.
    • First argument is our initial table A, transformed to the list of lists (rows values).
  • Second argument – function Generator, and there’s the main magic happens. It takes row values as list (for example, {"John","Player"}), lookup for the second element ("Player" – do you remember that X{1}?) in the names of record Dict fields, and adds value of this field "Player" : {1000,100} to the initial values of the row ({"John","Player"}):
  • Third argument of Table.FromList sets column names for the resulting table
Pic. 5. Here we get the desired table, looking up values in our dictionary record

Looks unusual, isn’t it? But it could be really faster with the big tables. Try it!

Follow me: Facebooktwitterlinkedinmail
Share this: Facebooktwitterredditpinterestlinkedinmail
Share this
Еще раз о двунаправленных связях, неоднозначности модели и USERELATIONSHIP
Power Query Record as Dictionary: Part 2