Steven Wang - 汪上洲

Steven is a data scientist and a Microsoft Certified Solution Expert on Data Platform, MCSA on SQL server 2008/12 and MCITP on BI Developer, Database Developer and DBA.

He is very passionate about data insights, machine learning and MS SQL server technology and business intelligence and has designed and implemented a number of large scale enterprise BI solutions in a variety of industries.

Steven is an active SQL server community participant. He answers questions at MSDN SQL server forum and speaks at events like TechEd, CodeCamp, SQL User Group etc.

Steven is currently head of data science at QuintilesIMS at Sydney Office.

There are only 3 big players stay in the first group: Tableau, Qlik and Microsoft. Given that the MS did a fantastic job on the new Power BI platform, I'm not surprising MS is waken up and rising in this space. I was surprised that Microstrategy, IBM, SAP and Information Builders are fallen out the first group. Particularly, Oracle is dropped off the radar.

This is getting more and more interesting and look forward to how all these players doing this year.

rCharts is an
R package to create, customize and publish interactive javascript
visualizations from R using a familiar lattice style plotting interface.

All of these
three packages are created and maintained by Ramnath
Vaidyanathan of McGill University. Since all
these three packages are not on CRAN and needs to be installed fromgithubusing theHadley's devtools package.

library(devtools)

install_github('ramnathv/slidify')

install_github('ramnathv/slidifyLibraries')

install_github('ramnathv/rCharts')

As Ramnath mentioned in one of this
Slidify workshop, While the installation process fromgithubis
relatively painless for Mac/Linux/Ubuntu users, for the windows users it
is not that luck. I have to confess that when I used the above code to install
these packages it caused a chaos for me. Not only that the package was not
installed properly but also some other installed packages were broken!!!

Thiago Silva’s post Installing Slidify on a Windows
Machine has provided his advice on how he successfully made a workaround. I
will give a step by step guide on top of his blog, which I have successfully
installed these 3 packages in window 7 and windows 10 machines.

3.Unzip packages and you will notice that all
packages have a “-master” suffix in the folder name. “-master’ will need to be
removed as below:

4.Now we will use the build utility from
dvtools package to convert a package source directory into a single bundled
file. If set binary = FALSE, this creates a tar.gz package that can be
installed on any platform, provided they have a full development environment. Now
we will use the below code to build packages:

Quantiling process is very powerful when 2 dimensional of
sales data is available, like product sales and market sales. In business it is
very common to use quantile matrix to perform the account segmentation. An example
of segmentation definition can be based on a 2 X 2 deciling matrix like below:

Even this is a very common process, it is indeed not a straightforward
simple calculation to get quantile values in SQL server. There are many ways we
can do to calculate such, but I found that use window function in conjunction
with CEILING function is the easiest way to calculate quantile value. Let’s try. I
will use a deciling process to start with.

The first step to calculate decile value for a specific
object (Account, Product etc) is to calculate cumulative values and then
calculate the cumulative percentage.

This will generate the cumulative product sales value and its cumulative percentage. the top 10 records as below:

Now the question is how can we assign a decile value based on the cumulative percentage value? there are many ways to do it. But as I found that there is a very simple way to do it by using SQL CEILING function:

You can randomly check the results:

Based on this deciling calculation, we can generalize the quantiling calculation on any quantile number between 1 and 100.

I hope this tip is useful and please see the attached scripts if you want to have a try.

In the SQL Server 2012 reporting service server event log, you might have these 2 errors:

1.Report Server cannot load the SQLPDW Extension

2.Report Server cannot load the TERADATA Extension

While you are not using the TERADATA/SQLPDW functionalities in your report, basically you can ignore these errors. This is because by default TERADATA and SQLPDW data extension is turned on in the reporting server configuration file.

Even you are not using the TERADATA/SQLPDW functionalities, it is still good idea to suppress the error logged into the event log.

Open the reporting service configuration file rsreportserver.config under your SSRS installation folder. If you installed SSRS 2012 by default, the file will be under:

Sometimes, we need to script out the index definitions and its option settings. If you have a source control then you are lucky, as all index definitions should be kept in your source control system. If you want to just script out indexes for a table, you were also able to do so by selecting all indexes in the object explorer details. But if we have a query can easily script out the index definition and its settings then it will be much easier for testing or index creation automation.

In SQL server, the index definition and option settings are not stored in a single table, you have to join a few system tables or DMVs together to get all those information. Here I will share the script I used to script out all the indexes.

My consideration:

1. Only apply to the user defined indexes;

2. Clustered indexes created with Primary Key constraint is not included though it can be done.

3. Spatial Indexes and XML Indexes are not considered.

4. The Data compression setting for Partitioned indexes is simplified here as it is pretty complicated to be done just in a select statement.

5. The Drop_Existing and Sort_In_TempDB setting is hard-coded in the script, which depending on your situation you can change it.