Tag: Dataflows

Power BI datasets and dataflows are the two native data sources for Power BI reports. Connecting to a datasets allows a report to be built against an existing Power BI dataset in place, and dataflows represent a source of data that has had transformations applied to it. When connecting to Power BI dataflows, data is imported, into a data model but the connection to a Power BI dataset is a direct connection.

The
two sources handle identity in drastically different ways, and this can lead to
confusion when dealing with multiple accounts and tenants. This post is an
attempt to help clarify this confusion

Connecting
to a dataset

To connect to a Power BI dataset, select the “Get Data” button from the ribbon, select the “Power BI” tab, select Power BI dataset, and finally the “Connect” button.

Next,
if a user is signed into Power BI Desktop a list of workspaces and their
datasets are presented. If not, the user is prompted to sign in. The dataset
can then be connected.

The important thing to notice here is the list of workspaces itself. The list presented is a list of the workspaces available in the tenant belonging to the currently signed in user. It is the same list of workspaces that can be chosen as a publishing target. It should also be noted that the identity of the user is displayed in the upper right corner of the dialog box, and the identity can be changed directly from there.

Connecting
to a dataset in a different tenant

The signed in user can be changed by selecting “Sign in” at the upper right of the Power BI Desktop client, or within the connection dialog itself. If the user signs into a different account (in a different tenant), a different list of workspaces and datasets will then be offered. The dataset source is hard linked to the currently signed in user. In this way, the Power BI dataset source behaves differently than all other data sources, which maintain connection credentials separately.

Connecting
to a Dataflow

Connecting to a dataflow follows the same steps as a dataset, with the exception that the “Power BI dataflows” option is chosen.

At this point, a Power BI data connection dialog will be shown.

There is only one authentication option
because Power BI dataflows only support one authentication option.

Unlike datasets, dataflows are NOT linked to the currently signed in user. The connection is authenticated, not the current user. The “Sign In” button must be selected, and authentication completed to connect to a Power BI dataflow.

Once signed in, selecting the “Connect” button will display a list of workspaces that contain dataflows. Expanding the workspace and then the dataflow will expose a list of entities that can be imported into the Power BI data model.

The connection information for the dataflow is cached with Power BI Desktop, and subsequent connections to dataflows will not require the user to sign in. The same authentication credentials will be used.

it should be noted that unlike the dataset connection dialog, this one does not show the current credentials and does not allow those credentials to be changed. This makes the process of changing credentials to use dataflows in multiple tenant somewhat less than intuitive.

Connecting to a dataflow in a different tenant

With datasets, changing the currently
signed in user will result in a different set of datasets being presented when
the dataset option is chosen. This is different with dataflows. No matter what
user is currently logged in, the cached credentials will be used.

This behaviour can be confusing when
multiple tenants need to be accessed. With most other data sources, the cached
credentials are linked with the specific data source. For example, when two
different SQL databases are connected, Power BI caches two different sets of
credentials.

To connect to dataflows in a different
tenant, the current connection information needs to be cleared. This can be
done with any data source, but it is particularly important to dataflows as it
is the only way to switch connection credentials.

To clear the credentials for the dataset, select “File”, “Options and Settings” and the “Data Source Settings”. The Data source settings dialog will then be presented.

Unlike most other data sources which can have multiple entries in the list, one for each unique data source, there will only be one source for dataflows. It is named “Power BI dataflows. For example, if the current instance of Power BI Desktop has authenticated to 3 different SQL servers, there will be three SQL Server connections in this list, but there will only be one for dataflows, no matter how many tenants that have been connected.

To switch tenants, the current credentials must be either cleared, or edited. The cached credentials can be fully removed by selecting “Clear Permissions” or they can be changed by selecting “Edit Permissions”. If cleared, the user will be prompted for credentials the next time the dataflow option is selected. If edited, the new credentials will be stored.

Conclusions,
and recommendations

It is possible to work with multiple
tenants for both connected datasets and dataflows. However, the methods for
doing so are completely different for either option. This can obviously lead to
some confusion.

It is my opinion that this behaviour should
be changed, and that the behaviour or connected datasets is the more intuitive.
If the credentials for the currently logged in user were user for both types of
connection, it would be much more intuitive, and also easier to user for report
designers.

Power BI dataflows are the first place that many users will encounter the new Power Query web based interface. Until now Power Query has been restricted to Power BI Desktop and Excel. This new web interface is, well, new, and it doesn’t contain all of the capabilities of the more mature client based interface. The good news is that you can take advantage of both the reusability of dataflows, and the maturity of the Power BI Desktop interface.

Two Interfaces

A quick glance at the user interfaces for Power Query on the web and in Power BI Desktop reveals the feature difference.

Power Query editor in dataflows

Power Query editor in Power BI Desktop

The Desktop editor has a full ribbon interface with a wide array of capabilities, while the web interface has a simple button bar with a subset of features. In the images above (which show the exact same set of queries in the two interfaces) it is easy to see that the combine binaries, or expand tables feature is not there for the “Content” column.

Advanced editor

The key to bringing all of these capabilities in Power BI Desktop to Power Query in dataflows is the Advanced editor. Power Query is at its essence an interface that is used to construct queries using the M language. This core code is available to you in both Desktop and dataflows.

In Desktop, the Advanced editor is available from the ribbon both in the Home tab and in the View tab. In the web based editor it is available by right clicking on an entity (query) and selecting Advanced editor.

Advanced editor in Power BI Desktop – Home Tab

Advanced editor in Power BI Desktop – View Tab

Advanced editor in dataflow editor

The code revealed by this editor can appear rather daunting for a complex query, but all you really need to understand is how to copy and paste it. Build up whatever query you need using Desktop (or Excel!), open the Advanced editor and copy it to your clipboard. Then, either create a new dataflow or add an entity to a new dataflow using the Blank Query data source. Once the editor is open, right click on the query, open the Advanced editor, and paste the query from your clipboard.

Done. Well, almost. While both environments execute M code, there are a few differences to be aware of.

Some functions may not work

There are subtle differences between the M engine in Excel and the one in Power BI Desktop. This approach works very well with these two products, but occasionally an incompatibility can crop up. This is no different with the M engine for dataflows. If you do encounter an incompatibility, try achieving the same thing a different way in Desktop and trying again.

Data Sources

Dataflows do not support all of the data sources that Power BI Desktop and Excel do. This will of course change over time, but as of this writing, dataflows are in preview, and currently support 24 data sources compared to the almost 100 in Power BI Desktop.

Data sources supported in dataflows – Nov 2018

Queries posted into a dataflow that use an unsupported data source will therefore likely not work. However, there’s nothing stopping you from trying, I’ve been pleasantly surprised by a few.

Functions

Functions ARE supported in dataflows. They can be created using a blank query (and copying function from Power BI Desktop). However, if that’s all that you do, you may receive an error like “This dataflow contains computed entities, which require Premium to refresh” or “We cannot convert a value of type Table to Function”.

You do NOT need Premium to use functions, but a function must its “Enable Load” function disabled. This is done by right clicking on the function and toggling the Enable Load item to off.

Turn off Enable Load for a function

Computed Entities

Computed entities (or calculated queries) are supported by dataflows but because the type of calculation can’t be predicted, they require the isolation that dedicated capacity (Premium) provides.

Referenced tables are an example of computed entities. If you are in the habit of designing a base query that does not load data, and then creating variants of that table that do in your reports, you will need to change that design in dataflows in order to avoid the Premium requirement.

In Power BI Desktop, this is the difference between Duplicate and Reference when creating a new query from a base query. Duplicate will simply create a new query with the same steps, while Reference will create a computed entity. If you want to avoid Premium, you’ll need to use Duplicate.

New query create options in Power BI Desktop

The Power Query capabilities in dataflows are more powerful that they might appear at first glance. Power BI Desktop is the key to unlocking them, unless you’re already a total wizard at writing M code. Even then, the new editing features in Power BI Desktop likely put it over the top as an editor.

For now we need to cut and paste, but I would love to see a day when Power BI Desktop could connect directly to a dataflow and edit it in place.