Santa Loves Power BI and R

This past week I was talking with the big guy up north, jolly old fella, and the discussion came up about his toy production levels. Santa was complaining about how hard it was to measure the performance of all his elves. Naturally I started babbling about how much I enjoy Power BI and that I use it on all kinds of sources of data, google analytics, excel sheets, sharepoint, and SQL data warehouses just to name a few. Now by this point most people would have wandered off looking for another conversation, but I must have struck a chord with Santa. He jumped right in the conversation and told me how he had just moved all his local data centers into Azure and more specifically SQL data warehouses. It was saving him loads of money in addition it has freed up all his I.T. elves to move to more important tasks, building the NES Classic for Nintendo, they are way behind in production. To make a long story longer, I was able to convince Santa to give me a small sample of data so I could show him how to use R to visualize his data in PowerBI. Here is what I came up with:

Santa Production Levels

Needless to say he was very pleased. I explained the chart to Santa, each bar represents the average production volume for each elf. Then the whiskers at the end of the bar represent the +1 and -1 standard deviation away from that mean. It essentially tells you how consistent each elf is able to produce products and what is the average production rate. For example, Buddy the Elf can produce an average 148 items in a day, he has a daily variance of 10 items. Charlie can produce on average more items but has a wider daily variance. Snowflake has the lowest average production level but is one of the more consistent producers. Santa gave me a big smile and said “nice job.”

Let’s walk through how I did this.

Open up PowerBI Desktop, Click the Get Data button on the Home ribbon and select Blank Query. Click Connect to open the Query Editor. ClickAdvancedEditor on the View ribbon. While in the Advanced Editor paste the following code into the editor window, click Done to complete the data load.

Before you exit the query editor Rename the query to Production. It should look similar to the following:

Production Query

Click Close & Apply on the home ribbon.

Add the following measures by click on the New Measure button on the Home ribbon.

Avg = AVERAGE(Production[Prodution Volume])

The Avg measure will determine the height of each bar in the bar chart.

StdDev = STDEV.P('Production'[Prodution Volume])

The StdDev will calculate the standard deviation for each elf.

Ymax = [Avg]+ [StdDev]

The Ymax calculation adds the Avg measure value to the standard deviation for production. This produces the upper arm of the whisker.

Ymin = [Avg]-[StdDev]

The Ymin calculation is subtracts the standard deviation from the Avg measure value. This produces the lower arm of the whisker.

Once you have completed making all the measures you should have a Production table with the following fields:

Added Measures

Add the table visual by click on the Table visual in the Visualizations pane. Add the Fields which are shown below. Your table should look identical to this:

Production Table

Next, add the R Visual from the visualization Pane. When you click on this you will get a message stating “Enable Script Visuals” click Enable to proceed.

Note: If you have not installed R or enabled the preview features of R in Power BI you should follow this tutorial which helps you get everything set up. For this particular visual we are using ggplot2. This is a package for R and should be installed in the R environment. You can follow this tutorial on how to install ggplot2.

Add the following fields into the R visual:

Add Fields to R Visual

Next in the R Script Editor add the following code to generate the R Script.

3 Comments

I am using R and RStudio and following advice from a link you sent us to, Installed MS R too.

However, even though I have installed ggplot2 in R and for MS R too, although the result of that exercise was very confusing as it told me it had installed it and then told me
“Error in loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck = vI[[j]]) :
there is no package called ‘Rcpp’
Error: package or namespace load failed for ‘ggplot2’ "

Secondly, I went through the process of telling Power Bi Desktop that I am now using MS R and told it where it was stored … and proceeded to get your xlsx file … no joy. I did a local search for the file but no luck there either.

I appreciate this is free of charge but I would love to be able to conclude the exercise.

One thing that at least I needed to appreciate was the old Microsoft pitfall of thinking that “” here is the same as “” in any other MS product … I had to change them in R when I worked through this example there since it would not initially work in Power BI.

I got error messages including Rcpp package not installed and all sorts of other annoyances. After a while of worrying and research, I closed R and reopened it and some of these errors went away.

I have installed MS R Open v 3.3 but it is not as intuitive as RStudio but if Power BI needs it, Power BI’s got it now.

Finally, I worked through this in RStudio and got a dynamite chart BUT the vertical axis shows the sum of production for each Elf rather than the average. Here is my code and if you have time and energy can you show me the error of my ways? If not, never mind!

# Dynamite chart

# read in the first worksheet from the workbook xxx.xlsx
# first row contains variable names