Microsoft recently released a new feature in Power BI called “Dataflows”. You may have seen it pop up in a recent release of Power BI Desktop under Home\Get Data\Power BI Dataflows (beta) as shown below.
It is also visible when you log into PowerBI.com and go to an App Workspace as shown in 1 below. Note it is not visible in “My Workspace” as shown in 2 below. App Workspaces are a pro feature and it, therefore, follows that you must have a pro account to see dataflows.
What are Dataflows?
I’m glad you asked because I have been scratching my head asking the same thing. I have been confused to date because there are lots of cogs in this machine. Based on my reading and research, here is my simplified explanation.
Dataflows are:
- An online service provided by Microsoft as part of Power BI (software as a service, or SaaS).
- In effect, dataflows are an online data collection and storage tool.
- Collection: It uses Power Query to connect to the data at the source and transform that data as needed.
- You will need to be able to access the data either through a cloud service (such as Dynamics 365) or to your PC/Network via a gateway.
- You can also use Power Query to write queries from scratch, such as my Power BI calendar table.
- Storage: Dataflows then stores that data in a table in the cloud so it can be used directly inside PowerBI.com, but more importantly (from my view) directly from Power BI Desktop.
- Collection: It uses Power Query to connect to the data at the source and transform that data as needed.
- Dataflows leverage the Power Query skills you have learnt (or are learning) using other tools (like Power BI Desktop, Power Query for Excel) allowing you to reuse those same skills in this online tool.
- Tables that are created as a result of the dataflow are stored in an Azure Data Lake.
- If you don’t know what that is, don’t worry – I don’t understand it either. The point is it doesn’t matter because it is all done automatically for you by the tool.
- Dataflows include the concept of the common data service (CDS) or common data model directly in the tool and you don’t have to know what it is, nor care.
- If you don’t know what that is, don’t worry – it doesn’t matter now/yet.
- This will become very important in the future as it will make the process of getting data out of complex databases (such as MS Dynamics 365) much easier in the future.
Dataflows are (and will be) much more than this, but for now, I think this is enough detail for the average business user.
Uses For Dataflows
The immediate use that comes to mind for me is the ability to standardise access to source data. Up until now, when I create a new Power BI Desktop file, I have always connected back to the original data source and loaded the data I needed for the new PBIX file.
Often the data needed to be transformed, so I performed some transformation steps in Power Query before loading.
Then, later on, when I needed to create a new workbook, I had to either copy the Power Query steps or re-write the steps again from scratch.
Each person in the same organisation with the same need for the same data had to do this too hence multiplying the rework. You can imagine that there is the potential to have many versions of the same thing.
With dataflows all that changes. One person can create a single dataflow and make it available via an App Workspace.
Anyone with access to that App Workspace can then directly access the table of data directly from that dataflow (demo to follow).
Other benefits include:
- Effectively creating an online centralised data mart/data warehouse for storage of data in a format that is better suited to reporting and analytics than the source data.
- As new dataflow connectors become available, it will be easier than ever to connect to important data such as “customers” from services such as MS Dynamics 365. I have tried to do this myself in the past and found it far to complex for the average business user. This is all about to change.
- Dataflows have incremental refresh (currently just for premium) meaning that large data refreshes that currently can take a very long time to complete can be configured to refresh just the changes.
Dataflows Video Demo
I have produced a video demonstrating how to create a dataflow. see 2 notes below:
Note 1: In the video I state that I can’t work out how to create a dataflow without first loading a PBIX file. It can be done (see image below), but it is not intuitive (Microsoft agrees and will change it shortly).
Note 2: There is a bug at the moment (Microsoft agrees) where the changes I made to the Locale for the date column didn’t stick. This will also be fixed in the near future.
Steps to Create Dataflow
I have also documented the steps to create a dataflow to get you started below.
- Login to PowerBI.com
- Create an App Workspace
- If no data is loaded in the workspace, click “skip” in the bottom right of the screen (as shown above)
- Click on the name of the app workspace on the left hand side (shown as 1 below, then Create (2), Dataflow (3).
- Click Add new entity (1 below) and proceed from there.
What Benefits do you see?
I would be really interested to hear from you about how you might use dataflows in your business. Please post your comments below.
Source: Matt Allington of Excelerator BI.
I am sure this post has touched all the internet users, its really really nice piece of writing on building up new website.