What is Dataflows?
Dataflows is the new self-service data preparation feature within Power BI. Power BI Desktop has long been able to run relatively simple ETL operations with Power Query, but importing data from multiple sources and reshaping it all happened within the report itself. Dataflows separates the preparation of the data from its use in the report. This makes it possible to work around complicated ETL processes within your organization, or to structure them in more workable ways. It also makes it easy to share the model you create with other Power BI users within your organization.
Just like in Power BI Desktop, in Dataflows you can connect to multiple data sources and convert them into a data set with its own structure, which you can combine any way you want. The result is a model that other Power BI Desktop users can import into their own environments.
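As a rough illustration of that idea, here is a minimal Python sketch that joins two sources into one prepared data set. It is not how Dataflows is implemented (Dataflows uses Power Query for this); the source names, column names and the build_customer_revenue helper are all hypothetical and only show the shape of the operation.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]

# Hypothetical source 1: customers exported from a CRM system.
crm_customers: List[Row] = [
    {"customer_id": 1, "name": "Contoso"},
    {"customer_id": 2, "name": "Fabrikam"},
]

# Hypothetical source 2: order lines from a sales database.
sales_orders: List[Row] = [
    {"customer_id": 1, "amount": 120.0},
    {"customer_id": 1, "amount": 80.0},
    {"customer_id": 2, "amount": 45.5},
]


def build_customer_revenue(customers: List[Row], orders: List[Row]) -> List[Row]:
    """Combine both sources into one data set with its own structure."""
    # Sum the order amounts per customer.
    totals: Dict[int, float] = {}
    for order in orders:
        key = order["customer_id"]
        totals[key] = totals.get(key, 0.0) + order["amount"]
    # Attach the total to each customer; customers without orders get 0.0.
    return [
        {
            "customer_id": c["customer_id"],
            "name": c["name"],
            "revenue": totals.get(c["customer_id"], 0.0),
        }
        for c in customers
    ]


# The prepared data set that report authors would consume.
customer_revenue = build_customer_revenue(crm_customers, sales_orders)
```

The point of the sketch is the separation: the preparation step runs once and produces customer_revenue, and any number of reports can then use that result without repeating the join.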
What are the Premium functions?
Power BI comes in two variants: Power BI Pro and Power BI Premium. There are a number of differences between the two. The most significant are:
- Incremental refreshing of the data. Normally, your entire data set is reprocessed every time you refresh it in Power BI. That approach can be very time- and resource-intensive for large volumes of data. In Power BI Premium, you can configure the refresh for each data set so that only part of the data is reprocessed.
- Linked entities. Premium allows you to reference entities that you have already created and to add operations on top of them.
- 100 TB storage. A Power BI Pro user can “only” use 10 GB for sharing dataflows with other users.
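To make the difference between a full and an incremental refresh concrete, here is a minimal Python sketch. It is a conceptual model only, not the Power BI implementation: the order_date column, the refresh_from cut-off and both function names are assumptions made for the example.

```python
from datetime import date
from typing import Any, Dict, List

Row = Dict[str, Any]


def full_refresh(source: List[Row]) -> List[Row]:
    """Reprocess every row from the source, regardless of age."""
    return list(source)


def incremental_refresh(stored: List[Row], source: List[Row], refresh_from: date) -> List[Row]:
    """Keep stored rows older than refresh_from; reload only newer rows."""
    kept = [r for r in stored if r["order_date"] < refresh_from]
    reloaded = [r for r in source if r["order_date"] >= refresh_from]
    return kept + reloaded


# Hypothetical data: what was stored after the previous refresh...
stored_rows = [
    {"id": 1, "order_date": date(2024, 1, 15), "amount": 100.0},
    {"id": 2, "order_date": date(2024, 2, 10), "amount": 50.0},
]
# ...and what the source holds now (row 2 was corrected, row 3 is new).
source_rows = [
    {"id": 1, "order_date": date(2024, 1, 15), "amount": 100.0},
    {"id": 2, "order_date": date(2024, 2, 10), "amount": 55.0},
    {"id": 3, "order_date": date(2024, 3, 5), "amount": 70.0},
]

refreshed = incremental_refresh(stored_rows, source_rows, date(2024, 2, 1))
```

In this example only the two rows dated on or after 1 February are read from the source again; the January row is reused from storage. With large historical volumes, that reused portion is where the time and resource savings come from.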
Common Data Model support
The Common Data Model (CDM) is a collection of standardized data schemas and a metadata system. It gives you consistency of data and its meaning across applications and business processes. Dataflows supports the CDM by letting you map data in any form to standard CDM entities such as Account and Contact. Business analysts can benefit from the standard entities and their semantic consistency, or adjust the entities to their own requirements.
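The mapping step can be pictured with a short Python sketch. In Dataflows itself the mapping is done in the Power BI interface, not in code; the source column names and the target attribute names below are illustrative placeholders and have not been checked against the actual CDM Account schema.

```python
from typing import Any, Dict

# Hypothetical mapping from source columns to attributes of a
# standard entity. The target names are placeholders, not verified
# CDM attribute names.
ACCOUNT_MAPPING: Dict[str, str] = {
    "CustName": "name",
    "CustNo": "accountNumber",
    "Phone": "telephone",
}


def map_to_entity(row: Dict[str, Any], mapping: Dict[str, str]) -> Dict[str, Any]:
    """Rename mapped columns to entity attributes; drop unmapped columns."""
    return {
        target: row[source]
        for source, target in mapping.items()
        if source in row
    }


# A row as it might arrive from a source system.
source_row = {
    "CustName": "Contoso",
    "CustNo": "A-1001",
    "Phone": "555-0100",
    "InternalFlag": "x",  # no counterpart in the mapping, so it is dropped
}

account = map_to_entity(source_row, ACCOUNT_MAPPING)
```

Once every source is mapped to the same attribute names, anything that consumes the entity can rely on one consistent meaning for each field, whichever system the data came from.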
What does this mean for your organization?
This method of sharing prepared data has a major impact on your way of working. Because the data that your dataflows generate is actually saved in an Azure Data Lake Store, you can reuse it throughout your whole organization and beyond, for all kinds of purposes, such as analytics with solutions like Azure Databricks and Azure Machine Learning. The data and the models built on it can then be the basis for many more BI processes and reports for your business.