Correctly Sum Two Or More Columns in Power Query and Power BI
Let’s say you have a few numerical columns [A], [B] and [C] in your table and want to sum them to the new column in Power Query or Query Editor in Power BI.
In Power Query we have special buttons for this:
For example, we want to sum columns [A] and [C]. Just click (holding Ctrl button) column headers you want to sum, then go to “Add Column” – “Standard” – “Add”, and you’ll get a new column named “Addition” with the row-by-row sum of desired columns:
If we want to add three columns at a time, then we’ll also get a desired result:
But if in this table we want so sum columns [A] and [B], we are not expecting a pitfall, aren’t we?
The reason of this behaviour is simple and it reveals itself when we look at our data a little bit close: there is a null
in column [B] in that row. In Power Query formula language (M) the expression null + value
always returns a null (see this excellent post of Ben Gribaudo about null type
and operations with null
values).
But why we get a correct result when we sum up three columns? It is because Power Query uses different formulas when we sum two columns or three and more columns:
List.Sum
function used in this case ignores null values and sums up only numerical values. Indeed, it gives more intuitive result, but on the contrary has not such intuitive syntax of simple addition.
I do not know what is the reason of such difference, and already complained to the development team. But if you rely on the buttons there, then you have to be aware of such behaviour.
What is the possible solutions there? It depends on what you want to get as a result, but in any case you should take a look at the formula bar and decide what to correct there:
- If the logic of your calculations assume that
value + null = null
, then you should use simple+
symbol between column names. - If you want to get
value + null = value
, then you should use List.Sum finction, like in that example:List.Sum({[A], [B], [C]})
THE SAME BEHAVIOR Power Query shows when you’ll try to multiply two columns and three or more columns: with two columns there will be the simple *
symbol, with three or more columns there will be List.Product
function used.
Ok, it is a really short post which I planned to (and ought to) write a long time ago…
Follow me: