Using Custom Functions in Power BI Dataflows

You can use custom Power Query “M” functions in Power BI dataflows, even though they’re not exposed and supported in the preview Power Query Online editor to the same extent they are supported in Power BI Desktop.[1]

As mentioned in a recent post on Authoring Power BI Dataflows in Power BI Desktop, the Power Query “M” queries that define your dataflow entities can contain a lot more than what can be created in Power Query Online. One example of this is support for custom functions in a dataflow. Functions work the same way in dataflows as they work in Power Query Desktop – there’s just not the same UX support.

Let’s see how this works. Specifically, let’s build a dataflow that contains a custom function and which invokes it in one of the dataflow entities. Here’s what we’ll do:

We’ll pull in sales order data from the SalesOrderHeader table in the AdventureWorks sample database to be an “Order” entity in the dataflow.

We’ll use the min and max of the various date columns in the SalesOrderHeader table to get the parameter values to pass into the custom function. We’ll then call the custom function to build a Date entity in the dataflow.

We’ll close our eyes and imagine doing the rest of the work to load other entities in the dataflow to make what we’d need to build a full star schema in a Power BI dataset, but we won’t actually do the work.

Let’s go. Since we’re just copying the code from Matt‘s blog, we’ll skip the code here, but the result in Power Query Online is worth looking at.

Even though Power Query Online doesn’t have a dedicated “create function” option, it does recognize when a query is a function, and does include a familiar UX for working with a function. You will, however, need to clear the “Enable load” option for the query, since a function can’t be loaded directly.

The Order entity is super simple – we’re just pulling in a table and removing the complex columns that Power Query adds to represent related tables in the database. Here’s the script:

Now we need to put the two of them together. Let’s begin by duplicating the Order entity. If we referenced the Order entity instead of duplicating it, we would end up with a computed entity, which would require Power BI Premium capacity to refresh.

This is what the query looks like before we invoke the custom function. With all of the awesome options on the “Add Column” tab in Power BI Desktop, implementing this logic was surprisingly easy.

Most of the complexity in this approach is in the work required to get min and max values from three columns in a single table. The topic of the post – calling a custom function inside a dataflow entity definition – is trivial.

When we’re done, the list of entities only shows Order and Date, because these are the only two queries that are being loaded into the dataflow’s CDM folder storage. But the definition of the Date query includes the use of a custom function, which allows us to have rich and possibly complex functionality included in the dataflow code, and referenced by one or more entities as necessary.

[1] I was inspired to write this post when I saw this idea on ideas.powerbi.com. If this capability is obscure enough to get its own feature request and over a dozen votes, it probably justifies a blog post.