Author Archives: roel.pluijmen

Power BI dataflows: The BI self-service game changer

What is Dataflows?

Dataflows is the new self-service data preparation feature within Power BI. Power BI Desktop has long been able to run relatively simple ETL operations with Power Query: importing data from multiple sources and reshaping it all happens within the report itself. Dataflows now separates the preparation of the data from its use in the report. This makes it possible to work around the complicated ETL processes within your organization, or to structure them in more workable ways. It also makes it easy to share the model you create with other Power BI users within your organization.

Figure 1: Dataflows in the context of your BI solution

Just as in Power BI Desktop, in Dataflows you can connect to multiple data sources and transform them into a data set with its own structure, which you can combine in any way you want. The model you create with Dataflows can then be imported by other Power BI Desktop users into their own environments.

Figure 2: Importing: The available connectors

Figure 3: Modelling: This entity has now been made available by Dataflows.

Figure 4: Sharing: With other users within your organization who use Power BI Desktop

What are the Premium features?

Power BI comes in two variants: Power BI Pro and Power BI Premium. The most significant differences between them are:

  • Incremental refreshing of the data. Normally, your entire data set is reprocessed every time you refresh it in Power BI. That approach can be very time- and resource-intensive for large volumes of data. In Power BI Premium, you can configure the refresh for each data set so that only part of the data is reprocessed.
  • Linked entities. Premium allows you to refer to entities that you have already created and to add operations on top of those existing entities.
  • 100 TB storage. A Power BI Pro user can “only” use 10 GB for sharing dataflows with other users.

Common Data Model support

The Common Data Model (CDM) is a collection of standardized data schemas and a metadata system. It gives you consistency of data and its meaning across applications and business processes. Dataflows supports the CDM by letting you easily map data in any form onto standard CDM entities such as Account and Contact. Business analysts can benefit from the standard schema and its semantic consistency, or adjust entities to their own unique requirements.
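To give a feel for what mapping source data onto a standard entity means, here is a minimal Python sketch. It is only an illustration of the idea, not how Dataflows performs the mapping. The function `map_to_entity`, the source column names and the target attribute names are all assumptions made up for this example, and they are not taken from the official CDM schema.

```python
# Hypothetical mapping from source column names to attribute names of a
# standard "Account"-like entity. All names are illustrative assumptions,
# not the official CDM schema.
ACCOUNT_MAPPING = {
    "cust_no": "accountNumber",
    "company": "name",
    "mail": "emailAddress",
}


def map_to_entity(rows, mapping):
    """Rename mapped columns and drop columns that have no mapping."""
    return [
        {target: row[source] for source, target in mapping.items() if source in row}
        for row in rows
    ]


# Example source rows with their own, non-standard column names.
source_rows = [
    {"cust_no": "A-001", "company": "Contoso", "mail": "info@contoso.example", "internal_flag": True},
]

accounts = map_to_entity(source_rows, ACCOUNT_MAPPING)
print(accounts)
```

Once every source is mapped onto the same attribute names, reports and other consumers no longer need to know how each source system happened to label its columns.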

What does this mean for your organization?

This method of sharing prepared data has a major impact on your way of working. Because the data that your dataflows generate is actually stored in Azure Data Lake Storage Gen2, you can reuse it throughout your whole organization and beyond for all kinds of purposes, such as analytics with solutions like Azure Databricks and Azure Machine Learning. The data and the models that this generates can then form the basis for BI processes and reporting across your business.

Figure 5: Dataflows within your organization

Power BI Update – May 2018

Incremental refresh (preview Power BI Premium)

If there is one new feature I love the most this month, it’s incremental refresh, hands down. Unfortunately it’s only available for Power BI Premium and not for Pro users. This has been discussed for a while now in the Power BI community, and the preview is finally here. Before this feature came out, refreshing was an all-or-nothing thing in which you had to load the entire data model every time. If you had a large model – of say 15 GB or more – refreshing could take some time. The idea of incremental refresh is that you only refresh a segment of your data. You already have the older data, so you just want to focus on the data that is new or has changed in the meantime. Needless to say, incremental refresh will save you a lot of time when reports have to be refreshed on – for example – an hourly basis.

To use this, you have to enable it as a preview feature in the Power BI Desktop settings and then create two parameters of type Date/Time, named RangeStart and RangeEnd, and give each of them a value. These are reserved names, so they have to be spelled exactly like this.

When these parameters have been set up, we can select the table on which we want to apply incremental refresh and define the rules that will govern it.

We select the table name, how far back in time we want to store data, and which period of data we want to refresh. For example, you can refresh all rows from the last few days, weeks or years. It’s all up to you.
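The rules described above can be sketched in a few lines of Python. This is a conceptual illustration only, not how Power BI implements the feature. The function `incremental_refresh`, the `(date, value)` row layout and the day-based windows are assumptions made for this example; only the names RangeStart and RangeEnd come from Power BI itself.

```python
from datetime import datetime, timedelta


def incremental_refresh(stored_rows, source_rows, now, store_days, refresh_days):
    """Return the stored rows after one incremental refresh.

    stored_rows / source_rows: lists of (date, value) tuples (hypothetical layout).
    store_days:   how far back data is kept at all.
    refresh_days: how far back data is reloaded from the source.
    """
    store_start = now - timedelta(days=store_days)
    range_start = now - timedelta(days=refresh_days)  # plays the role of RangeStart
    range_end = now                                   # plays the role of RangeEnd

    # Older rows inside the storage window are kept as they are, without reloading.
    kept = [r for r in stored_rows if store_start <= r[0] < range_start]

    # Only rows with RangeStart <= date < RangeEnd are fetched from the source again.
    reloaded = [r for r in source_rows if range_start <= r[0] < range_end]

    return sorted(kept + reloaded)


now = datetime(2018, 5, 31)
stored = [(datetime(2018, 5, 1), 10), (datetime(2018, 5, 29), 20)]
source = [
    (datetime(2018, 5, 1), 999),  # changed at the source, but outside the refresh window
    (datetime(2018, 5, 29), 25),  # changed inside the refresh window
    (datetime(2018, 5, 30), 30),  # new row
]

result = incremental_refresh(stored, source, now, store_days=365, refresh_days=3)
for row_date, value in result:
    print(row_date.date(), value)
```

Note the trade-off this makes visible: the row of 1 May keeps its old value because it falls outside the refresh window, so the window should be chosen wide enough to cover the period in which your source data can still change.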

Conditional Formatting by Different Field

Until now, we could only set colors on items based on the value of the field or aggregation that was shown in the visual. Value columns that we hadn’t added to the visual couldn’t be used for conditional formatting. From now on this is possible. You can show one summarized field in your visual and base its formatting on a different one, let’s say an average value.

Web Connector by Example Data (preview)

This connector was already available before this release. It extracts data from HTML tables on a web page so that you can import the data into Power BI Desktop and create a model from it. Usually, though, the web pages you are trying to get data from don’t have simple HTML tables. For those cases, the new feature adds a button called ‘Extract table using examples’. You can use it after you have entered the web page you want to retrieve data from.

This button pulls up a page where you can see a preview of the site and a number of auto-detected HTML tables. You can now specify sample values for the data you want to extract. All you need to do is start typing the data you see on the page, and Power BI’s algorithm automatically retrieves the data for the entire column you are trying to capture.

Common Data Service for Apps Connector

Common Data Service for Apps allows you to securely store and manage data that’s used in apps you’ve developed or in apps from Microsoft and other app providers. Data within CDS for Apps is stored in a set of standard and custom entities. You can safely connect to your CDS from Power BI Desktop and use the model as a source for your reports. I will not go into detail about what CDS for Apps actually is, but if you would like to know more about this topic I suggest taking a quick look at this article.