Microsoft Analysis Services, MDX, DAX, Power Pivot, Power Query and Power BI

Author: Chris Webb

My name’s Chris Webb, and I’m an independent consultant specialising in Analysis Services, MDX, PowerPivot and DAX. If you need help or training on any of these subjects please check out my company website at http://www.crossjoin.co.uk/
I’ve also co-written some books too:
Expert Cube Development with SQL Server Analysis Services 2008
MDX Solutions (2nd Edition)

When you are writing an MDX expression, everywhere you use a set you can give that set a name and then reference the name later on. This is known as creating an inline named set, something I have blogged about a few times (see here and here) over the years. When you are iterating over a set using a function like Generate() or Filter(), if you give that set a name you can then use the Current and CurrentOrdinal functions to find out more about the item in the set returned at the current iteration.

It returns a set of four tuples on rows: every combination of Gender and Marital Status:

If you pass the set on rows to the Filter() function and give it a name (for example MySet) you can then use the CurrentOrdinal function to find the 1-based ordinal of the current iteration. This query uses the CurrentOrdinal function to filter the set shown above so only the first and third items in the set are returned:

With an inline named set you can also use the Current function to return the tuple at the current iteration. Here’s another query that uses the Current function to remove the tuple (Female, Single) from the set:

I still love MDX, but I’m aware that I blog about it less and less – which is a shame, I know. Therefore I’ve decided that for the next four weeks I’m going to write about some obscure MDX topics that hopefully will make all you SSAS MD diehards out there feel less neglected… even if they don’t have much practical use.

Let’s start off with recreating my ever-popular DAX star-ratings measure in MDX. Well, not exactly pure MDX, but did you know that in MDX you can call some Excel functions (in the same way you can call some VBA functions)? It’s a really, really bad thing to do from a query performance point of view, but it does allow you to do some useful calculations that might otherwise be impossible. Here’s a query on the Adventure Works cube that uses the Excel Rept() and Unichar() functions (functions that do not exist in MDX proper) to recreate my start-ratings measure:

Yesterday there was a webinar on how Power Query is going to be used as the way to load data into the Microsoft Common Data Service. You can watch it online here (if you’re in a hurry, skip to 24 minutes in for the details on the Power Query integration):

I don’t have much to add to what’s in the webinar, but there are a few things that occurred to me:

This is Power Query in a browser. If they can build a web interface for Power Query for CDS, why not for Power BI? It would give us the full power of Power BI Desktop in the browser, on any platform (I know a few people have been asking for Power BI Desktop for Mac), with no tedious manual updating.

In the demo at around the 54 minute mark, Miguel shows a screen where there are two Database Load options:
The “Only load new or modified rows for existing entities” options is… incremental load! This makes me wonder whether Power BI users who want incremental load should be using using the CDS as a staging area (a super-simple data warehouse…?) and then connecting Power BI to it?

I’ll be honest, I’ve not done anything with the CDS so I can’t really say how useful this new functionality will actually be – and I’ve heard mixed reports about the CDS, if I’m honest. Certainly, as someone (I suspect Meagan), mentions in a question, the only way Power BI can connect to the CDS right now is via DirectQuery and not Import, which seems pretty crazy. Still… I’m very curious and will be paying close attention to how it develops. More Power Query in the world can only be a good thing!

Last week I had the chance to do something I have not done before: build a Power BI report to be displayed on a big screen hanging on a wall. To make up for the loss of user interactivity, I used the new Drilldown Player custom visual to cycle through different selections and display a new slice of data every few seconds; Devin Knight’s blog post here has a great summary of how to use it. However I wasn’t happy about the look of the Drilldown Player visual in this particular report: the play/stop/pause buttons aren’t much use if you can’t click on them and the visual doesn’t show all of the values that it is cycling through. As a result I hid the visual behind another one and came up with a different way of displaying the currently-displayed selection.

Here’s a simple example of what I did. Imagine you have two identical tables called Table1 and Table2 loaded into your dataset that contain a list of the 24 hours in a day:

With no relationship between these tables in the dataset, you can display the 24Hour column from one in a table in your report and then use the Drilldown Player to cycle through the values in the 24Hour column in the other. At this point, because there’s no relationship between the tables, the Drilldown Player visual has no effect on the table. Next create a measure called Displayed as follows:

…and add it to the table in the report. This measure uses my old favourite the Unichar() function to display an arrow against the row in the table that matches the currently selected hour in the Drilldown Player. The result is this:

This got me thinking about other fun stuff that I could do with this technique. After adding some more columns to my source data:

…I created the following measure:

Clock = UNICHAR(128335 + MAX('Table1'[Hour]))

This takes the hour selected by the Drilldown Player and displays the corresponding Unicode character for a clock face showing that hour. Here’s what the measure looks like when displayed in a card:

I also had a go at an animation showing the sun and moon rising and setting – I did this by displaying the Unicode characters as data labels in a scatter chart, then using colour to hide everything apart from the data labels – but by this stage I thought things were getting too silly…

Anyway, you can download the report with these animations in here, and view it online here. Have fun!

Over the past few weeks I’ve spent some time playing around with Power BI custom data connectors and while I don’t have anything to share publicly yet (other people are way ahead of me in this respect – see the work of Igor Cotruta, Miguel Escobar and Kasper de Jonge among others) I have learned some interesting things that are worth blogging about.

First of all, the data privacy rules around combining data from different data sources do not apply in custom data connector code. As the docs say here:

Data combination checks do not occur when accessing multiple data sources from within an extension. Since all data source calls made from within the extension inherit the same authorization context, it is assumed they are “safe” to combine. Your extension will always be treated as a single data source when it comes to data combination rules. Users would still receive the regular privacy prompts when combining your source with other M sources.

Those of you who have followed my recent series on this topic, or who have struggled with the Formula.Firewall error, will appreciate how much easier this makes combining data from different sources.

Secondly, you have a lot more flexibility when it comes to different types of authentication for web services. As I showed in my session on web services and M at the Data Insights Summit, there are a lot of limitations when it comes to working with web services in Power BI or Excel. Within a custom data connector, however, you can connect to web services that use OAuth for authentication, you can make POST requests to web services that require authentication and you can pass a web API key from the credentials store through an HTTP custom header and not just through a query parameter – none of which are possible in Power BI or Excel.

I’m sure there are a lot of other useful bits of functionality or behaviour that are only available in custom data connectors – I know I’ve only just begun to learn what’s possible. Even with what I’ve listed here, though, I get the feeling that there will be a lot of cases where you will have no choice but to build a custom data connector just to be able to access certain data sources, even if you only need to create a single report. There may also be cases where it’s preferable to build a custom data connector rather than embed lots of complex M code in a Power BI report or Excel workbook, perhaps to make code portability easier. It’s a bit of a pain to have to have Visual Studio and the SDK installed in order to do this, but building a custom data connector is fairly easy if you already know M and the development experience in Visual Studio (with intellisense!) is much better than in the Advanced Query Editor window.

Recently I’ve been building a lot of Power BI reports from csv and Excel files, and to make sure that scheduled refresh works I have been setting up data sources in an On Premises Data Gateway (what used to be called the Enterprise Gateway). I had assumed that if I was connecting to file-based data sources in my Power BI dataset then, in the gateway, I would need to set up one data source for each file that I’m connecting to – which is a bit of a pain. In fact it turns out that you can set up a gateway data source for the folder that the files are in instead.

Let me give you an example. Imagine that you have three Excel files in a folder called C:\Sales Data:

Now imagine that you have three queries in Power BI that get data from these three files:

However, once the report has been published only one data source needs to be set up in the On Premises Data Gateway for it to refresh successfully, even though the report connects to three different files. Here’s a screenshot of the gateway data source I set up in the Power BI service:

Two things to point out:

The data source type is set to Folder

The full path property is set to the path of the folder that the files used by the report are in, ie C:\Sales Data

Setting up a single gateway data source for a folder is obviously a much better option than setting up multiple data sources for all the files in the folder. Did everyone else know this but me? I guess this is all related to the inheritance of data privacy settings that I blogged about here.

Something I didn’t understand at all when I started writing this series was how the “None” data privacy level worked. Now, however, the ever- helpful Curt Hagenlocher of the Power Query dev team has explained it to me and in this post I’ll demonstrate how it behaves and show how data privacy levels can be inherited from other data sources.

Let’s go back to the original example I used in part 1 of this series where I showed how data from an Excel workbook can be combined with data from SQL Server, and how the data privacy settings on each data source determine whether query folding takes place or not (I suggest you read that post before continuing to get some background). Now, imagine that the Excel workbook is in a folder called C:\Data Privacy Demo, and a query called FilterDay is used to get data from it:

…and the query is run for the first time, then you will get prompted for credentials to access SQL Server and after that you’ll get prompted to set data privacy levels on both data sources used:

The dropdown boxes in the second column allow you to set the data privacy settings for each data source, but look at the data sources listed in the first column. There are two things to point out:

The data sources the two queries are accessing are the DimDate table in the Adventure Works DW database on localhost, and the file C:\Data Privacy Demo\FilterParameter.xlsx. However you’re not being prompted to set data privacy levels on those exact data sources, you’re being prompted to set data privacy levels on the localhost instance and the c:\ drive

The data source names are displayed in dropdown boxes, so there are other options to select here

Clicking each dropdown box is revealing:

For the SQL Server database you can set the data privacy level at two places: the localhost instance (the default), or the Adventure Works DW database on that instance. For the Excel workbook you get set the data privacy level at three places: the c:\ drive (the default), the folder c:\Data Privacy Demo that the Excel workbook is in, or the Excel workbook itself.

Let’s say you accept the defaults and set the data privacy settings to Public on localhost and the c:\ drive:

As you would expect after reading part 1 of this series, the query runs and query folding takes place:

Now, let’s say you copy the Excel file up to the root of the c:\ drive and rename it to filterparameter2.xlsx, then update the FilterDay query above to load data from this new Excel file instead:

At this point, when you click the Data Source Settings button and look at the permissions for the file c:\filterparameter2.xlsx you will see that the privacy level is set to None:

However, it behaves as if it has a data privacy level of Public: the second query that gets data from SQL Server runs successfully, query folding still takes place and you are not prompted to set a data privacy level for this data source. Why?

The “None” data privacy level means that no privacy level has been set for this exact data source. However, when this happens the engine checks to see if a data privacy level has been set for the folder that this file is in and then for all folders up to the root. In this case, since the data privacy level has been set to Public for the c:\ drive, all files in all folders on that drive that have a data privacy level set to None (like this one) will inherit the c:\ drive’s setting of Public:

The same goes for databases on a SQL Server instance: they can inherit the data privacy settings set for the instance. The same is also true for web services, where data privacy settings can be set for different parts of a URL; for example, here’s the list of options for a call to the https://data.gov.uk/api/3/action/package_search web service described in part 2 of this series:

The general rule is that the engine looks for permissions for the exact data source that it’s trying to access, and if none are set then it keeps looking for more general permissions until it runs out of places to look.

In my opinion, I don’t think the way the “None” privacy level and inheritance works is very clear right now – it makes sense now I’ve had it explained to me, but the UI does nothing to help you understand what’s going on. Luckily it sounds like the dev team are considering some changes to make it more transparent. I would like to see the fact that data privacy levels have been inherited for a data source, and where they have been inherited from, called out in the Edit Permissions dialog.

Posts navigation

Follow Blog via Email

Enter your email address to follow this blog and receive notifications of new posts by email.

Join 7,729 other followers

Public Power BI and SSAS Training Courses

I'm running several SSAS, MDX and Power BI-related training courses through Technitrain in 2017. Check out the Technitrain course catalogue for full details, and to see other upcoming courses from the likes of Alberto Ferrari, Andy Leonard, Allan Hirt and Alex Yates.

Need some help?

As well as being a blogger, I'm an independent consultant specialising in Analysis Services, MDX, DAX, Power BI, Power Query and Power Pivot. I work with customers from all round the world solving design problems, performance tuning queries and delivering training courses, and I am happy to work on short-term engagements. For more details see http://www.crossjoin.co.uk