The SQL PASS Austria chapter is organizing a SQL Saturday in Vienna. It will be held on the March 6th 2014. (For those of you how are checking their calendars right now – yes it is actually a Thursday!) So register and be part of the first SQLSaturday that is held on a Thursday !!!

Also the Call for Speakers is opened until January 5th, so make sure to submit your sessions by then.

I just published a new post at http://blog.gbrueckl.at about how to use Power BI to analyze the size of your multidimensional Analysis Services cube down to partition-, aggregation- and attribute-level!

While browsing the web for any BI related topics I recently came across this blog post about SAP HANA and how it can be used to analyze Big Data. Its actually pretty cool, SAP together with Amazon Web Services (AWS) offer a free try out of their tools for 4 hours which you can use to rebuild a predefined demo. The demo itself is very well explained and document with videos and scripts and gives some good insights on how to deal with Big Data in SAP HANA. Basically it is divided into 3 steps: 1) Load Wikipedia data (Pagehits, etc.) from Hadoop/Hive (~2GB of flatfiles) 2) Create a data mart with SAP HANA 3) Analyze results with SAP Lumira

Having done a lot recently with Power BI and its tools I asked myself if this would also be possible with Power BI? So I first did the demo on SAP HANA and afterwards I was quite sure that I could do the same also with Power BI.

And this was the initiation of this blog post where we will do the same demo but instead of SAP HANA we will use only tools of the Power BI suite. Basically we only use 3 of our Power tools: 1) Power Query to load the data 2) Power Pivot to build the "data mart" 3) Power View to analyze the data

Once we have downloaded the files we can start loading them using Power Query’s "Load from Folder" source. It is basically textdata which needs to be split into several columns first. The delimiter used is 0x0001 which in Power Query’s M-language needs to be resolved to "#(0001)":

The columns are defined as PROJECTCODE, PAGENAME, YEAR, MONTH, DAY, HOUR, PAGEHITCOUNTFORHOUR and BYTESDOWNLOADEDFORHOUR where PROJECTCODE can be further split into language code and the real project code. There is some more logic which I will not explain in detail like error handling, data conversions, etc., which are similar to what is done in the SAP HANA demo.

If you also did the demo on SAP HANA you found some quality issues in the data why you needed to remove rows with more than 53,000,000 PageHitsPerHour as this 5 rows mess up the whole analysis. The above M-script already handles this (last statement).

There are some more important things to mention here. First of all we are dealing with about 37M rows, so loading to datasheet will not work. Instead we need to load the data directly into Power Pivot. I had several memory issues when using 32bit Excel but after switching to a 64bit Excel everything went just fine. The import itself takes about 15 minutes which I think is OK for roughly 2GB of data and 37M rows.

Both are very simple tables with only two columns. Again we can use Power Query to import them and add them to our Power Pivot data model.

The last thing to add is a time-dimension.SAP HANA has its predefined time-dimension. In the case of Power BI I simply used a linked table in Excel that holds all necessary days or actually all days of 2013. Adding all those tables and linking them we end up with this pretty simple Power Pivot model:

This is very similar to what you create during the demo as an analytical view in SAP HANA so both approaches are very similar here.

Now that the model is set up we can do our analysis using Power View and classic Excel Pivot Tables:

All together it took me about 2 hours to build the whole solution of which it took about 1h to download and process the data. The final workbook containing all the data has ~500MB which is mainly because of the PageName column which contains a high number of unique values which cannot be compressed very well. After removing the PageName column and some further tuning of the datamodel for compression I could bring the size down to 148 MB. This is quite OK – in numbers this is a compression of 12 from originally 1.74 GB

I hope you understand that I could not attach the whole workbook, instead I created a smaller workbook with only 50k rows by adding a TOP filter in the Power Query. This workbook can be downloaded below.

As I showed in this post, Power BI is capable of handling this kind of data very well. Both, in terms of data volume and also in terms of the type of data (unstructured/semi-structured data). Once the data is loaded into Power Pivot it can be analyzed just like any other data source using all Excel reporting capabilities.

Recently I faced the requirement to give specific users of my Analysis Services Multidimensional cube the opportunity to process the cube whenever they want. There are several ways to achieve this: – start a SQL Agent that processes the cube – create some kind of WebService that does the processing – create some custom .NET application that does the processing – run some script (PowerShell, Command Line, ascmd.exe, …)

From a usability point-of-view none of the above is really convenient as all options involve some kind of external process or application. In this post I will show how to process a cube using cube actions which can be triggered from Excel Pivot Tables natively. So this is what we want to achieve:

First of all we need to create a new "Visual C#" project of type "Class Library". To work with Analysis Services we need to add two references:

"Microsoft.AnalysisServices" refers to Microsoft.AnalysisServices.dll which can be found in the shared features folder of your installation (default is c:\Program Files\Microsoft SQL Server\110\SDK\Assemblies\)

"msmgdsrv" refers to msmgdsrv.dll which can found be in the \OLAP\bin-folder of your SSAS instance (default is c:\Program Files\Microsoft SQL Server\MSAS11.MSSQLSERVER\OLAP\bin\)

Once these references are added, we can start to create our processing code:

There are two things here that are worth to point out. First of all we need to somehow establish a connection to the server that hosts the SSAS database in order to process an object. The first thing that comes into mind would be to create a new connection to the server and run some processing XMLA. Unfortunately this does not work here because as this would result in a deadlock. The assembly will be called as a Rowset-Action which in terms of locking is similar to a query. So if we run separate processing command within our Rowset-Action this processing command can never be committed as there is always a query running on the database. The query (=Rowset-Action) will wait until the function is finished and the function waits until the processing is committed resulting in a classical deadlock!

To avoid this we need to connect to the current users sessions using "*" as our connection string:

//connect to the current session… server.Connect("*");

The second thing to point out is the return value of our function. In order to be used in a Rowset-action, the function has to return a DataTable object. The results would then be displayed similar to a drill through and a separate worksheet would pop up in Excel showing the result:

From my point-of-view this can be a bit annoying as after each processing this new worksheet pops up and you loose the focus of the pivot table. So I investigated a bit and found another way to display the output of the processing.

When a drill through / action throws an exception, this will also be displayed in Excel without actually displaying the result of the drill through / action in a separate worksheet. By default it may look like this message:

Well, not really much information here right?

To display more information, e.g. that the processing was successful (or not) we can throw our own exception in the code after processing is done:

I added a separate parameter to control this behavior, in the code above the default would be option 3 – custom Exception.

Once this project is built the DLL can be added to the SSAS database. It is necessary to grant the assembly "Unrestricted" permission set:

The last step would be to create the appropriate cube actions that call the assembly:

It is necessary that the action is of Type=Rowset, otherwise it would not be displayed in Excel! The Action Target is up the you, in this example it will be displayed on every cell. The Action expression is the actual call to our assembly passing the cube name as a parameter. The second parameter controls how the result is displayed and could also be omitted. In this example I added 3 separate actions, one for each result-type (as described above).

And that’s it! This simple steps allow you and your users to process any cube (or any other database object) from within your favorite client application, assuming it supports cube actions.

This is the final result where the three cube actions can be triggered from an Excel pivot table:

The attached zip-file includes everything that is necessary to recreate this example: – SSAS project – ASSP_Processing project

Security is always an important aspect of any BI solution. Especially for big enterprise solutions the security-concept can become very complex. Analysis Services Multidimensional in general offers two option on how to define security: Cell Data Permissions and Dimension Data Permissions. Because of the huge impact on performance Cell Data Permissions are barely used. Dimension Data Permissions are usually the way to go and can cover probably 98% of the security requirements. Though, also Dimension Data Permissions can get quite complex, especially for large-scale cubes with 100+ or even 1000+ users. Just think of an Analysis Services that holds sales data associated to KeyAccounts – further assume that we have 100 KeyAccounts. Each of this KeyAccounts is managed by one user, so we would end up creating 100 roles – one for each KeyAccount and its manager. In terms of maintenance this can get quite complex and in the long run unmanageable.

To address this issue Analysis Services offers the concept of Dynamic Security. Dynamic Security is evaluated at runtime for each user and allows you to consolidate your roles. So for our example from above instead of having 100 different roles we would end up having one dynamic role serving all 100 users. Basically, when a user connects it queries internal data (within the cube using MDX) or external data (using Analysis Services Stored Procedure (ASSP)/Assemblies) to get the security settings for the current user. Especially in combination with ASSP this can be quite powerful as you can use any datasource that you can query using .Net-code to derive your security from.

To make a role "dynamic" you basically have two options: 1) Username() – returns the current users in format "MyDomain\MyUser" 2) CustomData() – returns whatever was specified in the connectionstring’s CustomData-property

Both are native MDX-functions and can be used in an expression to build your Allowed Set, Denied Set or DefaultMember.

This works just fine, as long as the expressions returns a valid MDX set/member of course. But what about the Visual Totals – setting? You can only check or uncheck the checkbox and you are not able to use any dynamic expressions. That’s what the UI offers you – no more no less. But there might be some rare scenarios where you also need to make the Visual Totals dynamically based on Username() or CustomData() – and this is what this post is about.

So, as we have already seen above, it is not possible to make the Visual Totals dynamically using the UI. Fortunately there are also other ways besides the UI to work with the cube and modify its structure. I am talking about XMLA (XML for Analysis Services) here, which is the protocol that is used for all communications between a client and Analysis Services. Instead of writing XMLA on your own there is also a programmatically way to do this: AMO (Analysis Management Object). AMO is a .Net library that allows you to integrate calls to Analysis Services within any .Net code. After modifying any object in the code AMO generates the according XMLA for you and sends it to the server which is a much more convenient way than writing plane XMLA on your own.

For this little example I create a slimmed down version of Adventure works. The model contains only the Date and the Product dimension. We will put our dynamic security on the Category attribute of our Product dimension. The model also already contains a predefined empty role called "DynamicVisualTotals" that we will modify later (after the model has been deployed) using AMO. Please note here that AMO only work with the Analysis Service directly (online) but not at design time in Visual Studio. Once everything is deployed this is the AMO code to use to modify the predefined role:

// can be done in the UI oAttrPermission.AllowedSet = "DynamicSecurity.DynamicAllowedSet(CustomData())";// can not be done in the UI oAttrPermission.VisualTotals = "DynamicSecurity.DynamicVisualTotals(CustomData())";

oDimPermission.Update(); } } } } }

1) connect to server "localhost" 2) connect to database "DynamicVisualTotals" 3) get the Role-object for role "DynamicVisualTotals" from the database 4) get the Dimension-object for dimension "Product" 5) get the DimensionAttribute-object for attribute "Category" 6) get the DimensionPermission that is associated to our Role for dimension "Product" 7) get the AttributePermission that is associated to our attribute "Category"

So far this is straight forward. The AttributePermission object actually represents the UI that we have seen above. It has properties for all the fields and textboxes on the UI. The most important of course are: – AllowedSet – DeniedSet – Defaultmember – and VisualTotals

For the first three I was already aware that they are of type String but the VisualTotals I expected to be Boolean which turned out to be wrong – it is also a String!

Naïve as I am I simply put a MDX expression in there that resolves to "1" or "0" (Boolean, similar to the Checkbox in the UI, it could probably also resolve to "true" or "false" or a Boolean type directly but I did not further test this) – and, it worked like a charm! The MDX expression I used refers to an ASSP called DynamicSecurity which contains a function called "DynamicVisualTotals" that returns "1" or "0" based on what was passed in the connectionstring using CustomData():

Please note that XMLA always uses the internal IDs which may not be the same as the name of the objects!

In this post I showed how to use AMO and XMLA to modify a security role and use an expression to make the VisualTotals setting dynamic which is not possible using the UI only. This is just a very simplified example but I used this approach on some of my enterprise deployments and it works just fine also for very complex scenarios.

Recently when browsing the MSDN PowerPivot Forums I came across this thread where the question was ask on how to show the value which is directly linked to an element in a parent-child hierarchy instead of the aggregation of all "children". In this post I am going to address this problem finally showing a proper solution.

First of all I would like to start with some background to so called "datamembers". The term "datamember" originates from parent-child hierarchies in multidimensional models. It is a dummy-member that is created automatically below each hierarchy-node to hold values that are linked to a node-member directly. This is of course only possible for parent-child hierarchies.

As we can see, the value of "Amy E. Alberts" is not the Sum of its children but has a difference of $108,000.00 This difference is because of the "datamember" that is also a child of every parent-child node which may be hidden. For multidimensional models his behavior can be changed by setting the MembersWithData-Property to "NonLeafDataVisible":

Executing the same query again we now get this results:

Now we see "Amy E. Alberts" twice – the node and the datamember.

Doing a similar query on a tabular model we will get this result by default:

As you can see, we still have the datamember but without any caption. Though, this can of course be changed when you flatten out the hierarchy using DAX by simply replacing BLANK() by "<my datamember caption>" in your calculated columns.

Anyway, this is not the topic of this post but just some background information. Now we will focus on how to query data that is directly associated to a node or its corresponding datamember in MDX and DAX.

MDX:

For MDX this is quite easy as the datamember can be referenced by simply using ".datamember" of a given hierarchy node:

The DataMember-Measure only shows values that are directly linked to the current node of the parent-child hierarchy. This is pretty straight forward and very easy to accomplish.

DAX:

In this DAX this problem is a bit more tricky as we do not have that built-in ".datamember" function. Also navigation within hierarchies and parent-child hierarchies itself are not really supported in DAX. Though, there are several blogs out there that describe how to handle parent-child hierarchies by Alberto Ferrari (here) and Kasper de Jonge (here). Based on these solutions we can create our datamember-calculation.

First (and only) thing we need is to add a calculated column that stores the path depth of the current row. This can be achieved by using this formula:

Using our previous example and selecting "Amy E. Alberts" together with our new column we get this:

As we can see there are two different Levels below "Amy E. Alberts" where Level=3 holds our datamember-value and Level=4 holds all the other values (=real child members). Lets add a calculated measure to help you make the final calculation more understandable:

MinLevel:=MIN(Employee[Level])

So, for our [DataMember]-calculation we simply have to extend the context to only include rows where [Level] = [MinLevel]

Well, obviously our calculation is not returning the expected results. This is because of a very common pitfall which I am also very prone to. Whenever a calculated measure is used within an other calculated measure, a CALCULATE() gets wrapped around it thereby taking over the current context. In our case this means that ‘Employee'[Level] will always be the same as [MinLevel] as [MinLevel] gets evaluated for each row and doing MIN() on only one row of course always returns the value of the current row which we compared it to!

To solve this issue we have to place the original [MinLevel]-calculation directly into our [DataMember]-calculation:

This change finally makes the calculation return the correct results also when browsing using the hierarchy:

Even though DAX is not very good in handling hierarchies, the [DataMember]-calculation can be accomplished quite easily. This calculation also works for hierarchies of any depth without needing to adopt it. In terms of performance it may be further improved to the following formula – though the impact will be minor: