Classical Incremental Refresh For Cloud Data Sources in Power BI Service (for Pro accounts)

[TL;DR]

In this post I describe how to implement the classical incremental refresh scenario for the cloud data sources in Power BI Service for Pro accounts. Step by step. It worth to read.

Foreword

As I wrote in the previous post, we can implement a semi-incremental refresh for Pro accounts in Power BI, using just dataflows.

The method I described has a main lack: although the “historical” dataflow remains unchanged and will never load data from the source again, the “refreshing” dataflow will load the whole “fresh” part of data repetitively until you change the date intervals manually.

Initially – there’s small amount of data in the “fresh” part…

Fresh data become not so fresh…

…but, after some consequential refreshes, it could become significant, and not so fresh.

You can again split “fresh” it in the two parts – “new historical” and “fresh”, and so on. But this is only SEMI-incremental refresh, and, of course, is not a good solution.

It seemed that implement a complete, classic incremental update using just dataflows is impossible.

But, after some investigations, I found a solution which helps to implement the classical incremental refresh scenario, where the fresh data part remains small and fresh, and historical part become updated without querying a data source.

Fresh data remains fresh 🙂

At least for the cloud data sources.

Continue Reading

Video: Semi-incremental refresh for Pro accounts in Power BI

After I published previous blog post about an incremental refresh for Pro accounts in Power BI, Microsoft MVP Parker Stevens from BI Elite channel kindly asked me to record a video for his channel. So here it is.

Here I not only introduce Power BI dataflows and describe the semi-incremental refresh concept, but also show how it works in Power BI Service.
Have a fun!

To my surprise, it has almost 1500 views in two days – not so bad 🙂 I understand that this happened because of hype topic, but, well… now I know how to remove some limitations and perform a CLASSICAL incremental refresh for some types of data sources. Blog post follows.


Incremental Refresh For Pro Accounts With Power BI Service Dataflows

Incremental refresh is a high-demand option in Power BI. Microsoft already provided it for the Premium capacities, but for the Pro accounts it is still in waiting list.

However, with introduction of dataflows in Power BI Service, an incremental refresh implementation becomes available for Pro accounts too.

I won’t to describe dataflows in detail here since there is a lot of blogs and resources about it (but I’ll provide a few links in the bottom of the post).

Concept

All you need to know now on how to implement an incremental refresh, is that

  1. Dataflow in Power BI Service is a set of web-based Power Query queries (named as ‘entities’).
  2. Each dataflow could be refreshed manually or by the its own schedule.
  3. The result of evaluation of a dataflow’s entities then stored in Azure Data Lake Storage Gen2 as tables (more precisely as CSV files).
  4. Then you can use dataflows (their entities) as a data sources in your Power BI dataset.

Let’s start from this point.

What is the incremental refresh at all? In simple words, it means that in the single data import action we are refreshing (updating) only the part of data instead of loading all the data again and again. In the other words, we are dividing data in two parts (partitions): first part does not need refresh and should remain untouched, second part must be refreshed to bring in updates and corrections.

Continue Reading