To install PolyBase on SQL Server 2016 or 2917, there is a dependency on the Oracle Java Runtime Environment (JRE). Oracle made a breaking change to the JRE from version 8 to 9. I published a workaround that is available here.

While the Cosmos DB emulator is a great way to learn Cosmos DB, it’s better to work directly with the full Cosmos DB service in Microsoft Azure. Previously this was a barrier to learning for many people. Now you can get 12 months of free access to Cosmos DB, Azure SQL Database, Linux and Windows virtual machines, and certain types of storage. Other Azure services are now always free including IoT and AI and Cognitive Services such as speech and face APIs.

Many people are staying in shelters after being displaced by the flooding caused by hurricane Harvey. There were requests made for mental health professionals to go to shelters and talk to people who need help. People with professional credentials were willing to help. Floodwaters prevented some of the willing and qualified to be able to get to the shelters. This seems like a great use of telehealth technology.

I’ve had people ask me how to learn SQL without installing a database. This past Sunday afternoon I saw a packed class of current and aspiring data scientists learn SQL in a great class taught by Ted Petrou. You can indeed learn SQL for free without installing anything. Let me paraphrase Michael Pollan, well known for “Eat food. Not too much. Mostly plants.” Here’s my suggestion for anyone who works with data:

Learn SQL. Not too much. Mostly online.

Let me explain. If you work with data, you need to know SQL. Even if you have a bias against or don’t use relational database management systems (RDBMS) such as SQL Server, Oracle, and DB2, you still need to know SQL. SQL syntax or similar syntax is used to query all types of data sources, not just RDBMS – thus my advice to “learn SQL.”

There’s a lot to SQL syntax. Just look at the excellent books from Joe Celko and Itzik-Ben Gan. To get started with SQL, it’s important to keep it simple and focus on just the basics. You can learn simple syntax and accomplish a lot – thus my advice for “not too much” so that you don’t get overwhelmed. Ted Petrou recommended starting with SQL Queries for Mere Mortals after learning the basics online.

There are many great free resources for learning SQL. Here’s a site that you can browse to and start immediately. And here’s another. Neither site requires registration. If you are willing to register for free, data.world is a great site for learning SQL. It’s what Ted used in his class. The advantage of data.world is that it has many interesting datasets you can work with.

Why start learning online? Because you can start immediately and not be troubled with installing database software, database tools, and a database itself. Just navigate to the URL and start doing SQL – thus my advice for “mostly online.”

Once you’ve learned a bit of SQL online, you might want to consider installing your own RDBMS. You can download Oracle for free. You’ll probably want to download the free version of TOAD to access your Oracle database. Mac users may want to install MySQL or PostgreSQL. MariaDB is free and available for Mac, Windows, and Linux. For Windows users, I recommend installing SQL Server Express. It’s free and has great, free tools. That’s really important. Your day to day experiences with SQL are going be experienced through the tool you use to access the RDBMS. For SQL Server, the free SQL Server Management Studio (SSMS) is probably your best choice of tool to use. Unlike other free tools such as TOAD, SSMS is full featured. If you want to do something more advanced involving other languages in addition to SQL, you could use Microsoft’s free Visual Studio Community Edition or Visual Studio Code, both available for download from here. Keep in mind that SQL Server 2017 is available for both Windows and Linux. I think you’ll find SQL Server to be particularly user friendly and easier to work with than other products. You can install SQL Server 2017 on Docker running on Windows, Linux, or Mac. Read more about it here. You can also create a SQL Server database in Azure, which is Microsoft’s name for the cloud. When you create an Azure SQL Database, you can choose to have a sample database installed at the same time.

When you decide to have your own RDBMS to learn with, you’ll need a sample database. The classic Northwind and pubs databases are still available, but I think you should consider something a bit newer. The AdventureWorks database is newer, but it is being phased out. As of SQL Server 2016, Microsoft’s recommended sample database is Wide World Importers which you can obtain from GitHub. Of particular interest to data scientists is SQL Server 2017’s support for R and Python. These are advanced topics. To get started learning SQL or improving your data science skills, I’ll go back to what I said before.

How is your health? How is your job in information technology affecting your health? I want to know! Since I started blogging here almost a decade ago as a new Microsoft Most Valuable Professional, I’ve become a registered nurse and now I’m a full-time psychiatric nurse practitioner student.

There are articles and studies purporting that information technology is a low stress career field. Other articles and studies say that information technology is a high stress career field. I found articles that predated the emergence of the cloud claiming that IT is low stress. Has the cloud changed that? Are you stressed about the rate of change in IT? Or are you not stressed specifically because you have only a few years until retirement from your slowly moving company and you can be oblivious to the rate of change outside of your company? Do you find the rate of change invigorating and the perfect antidote to the monotony of endlessly churning out mundane code?

Remember the other definition of DBA – Doing Business All night. Do you check email after hours? Do you feel like you’re never done? When I work as a nurse, I thoroughly enjoy clocking out at the end of my shift because I’m done. That’s it. No encroachment of that job into my personal time.

How is your weight? Are you exercising? What are you eating? How much are you sleeping? Do you worry about job security? Do you yearn for more social interaction in your job or is it what you are comfortable with? Are you frustrated by your job or work environment? How do you deal with the frustrations?

In school I have to write papers about various topics. I’m wondering if it would be a good use of my time to investigate mental health among IT professionals. Please comment or contact me directly.

Recently Microsoft released a preview version of the SQL Server Diagnostics extension to SSMS. You can read more about it here. I downloaded it from here and you can see the results below. If you install the extension while SSMS is up and running, you’ll have to stop and restart SSMS in order to see the new menu option.

Figure 1. SQL Server Diagnostics extension to SSMS.

If you want to see how the tool works, you’ll need to either wait for a dump or forcibly create one. The instructions on creating a SQL Server dump from 2010 are still current and available here.

The easiest way to force the creation of a dump file is from within Task Manager, but I wanted to direct the output to a location of my choosing, so I used sqldumper.

Figure 2. Creating a dump file from Task Manager.

You’ll need to use an Administrative Command Prompt to run sqldumper. The exact path depends on which version of SQL Server you are running. I’m running SQL Server 2016, so the path is C:\Program Files\Microsoft SQL Server\130\Shared where 130 corresponds to compatibility level of SQL Server 2016. SQL Server 2014 is 120 and so on.

Figure 3. Output from sqldumper /? so you can see the command line options.

You need the pid of sqlservr.exe to run sqldumper. Go to the Details tab in Task Manager.

Figure 4. My pid was 3892. Yours will be something else.

Running sqldumper.exe resulted in the following output:

Figure 5. Output from creating a dump using sqldumper.

I selected the Analyze Dumps option. This is a cloud based service, so be prepared to wait as your large dump file is sent to Microsoft. Notice that you have to pick an Azure data center as your dump’s repository. Can you upload this outside the country where your server is? Do your corporate policies allow a dump file to be shared outside your company?

Figure 6. Uploading dump file for analysis.

After the upload finished, the analysis took several minutes. You have to consider corporate policies and government laws on submitting corporate data to the cloud. If it is allowable to upload your dump to an external cloud server, you might find this new service useful.

There are new HTML 5 enabled custom visualization for making interactive R visualizations in Power BI Desktop. Read the documentation and find the download links here. Because of the dependency on R packages that are loaded with the R library command. The end result is that your required R packages end up installed in your machine’s R library. Some R experts, the people who can help you, may have a tendency to be pedantic about exactly what is a library and what is a package. I point out the terminology in the interest of facilitating communication after reading various posts on this topic. Fortunately, Power BI executes the library command for you and simplifies the installation process. However, problems can occur. After the discussion on installation and configuration problems, this post shows you how to use the new forecasting visualization. A cautionary note: It can take several seconds before the R visualization is rendered. Patience is a virtue.

I originally had my Power BI Desktop point to the version of R installed with SQL Server 2016.

Figure 1. Navigation to Power BI Desktop Options.

I decided that I wanted the full version of Microsoft’s R distribution and that’s what I recommend you use. After all, I wanted to have the exact same R installation that you would have if you only have Power BI desktop and no installation of SQL Server 2016.

Figure 2. Verifying your R home directory.

I went to the official download site for Microsoft R Open. After installing R, I changed the home directory in Power BI Desktop to point to it instead of the version that came with SQL Server 2016. It may not have been necessary, but again, I wanted to have the exact same environment as a business user would have. Most business users will not have SQL Server installed. I did have errors as shown below. I’ve compared the contents of R that comes with SQL Server 2016 to the Microsoft R Open that I installed. It looks like either one should work, but that isn’t what I experienced. I’m going to do further testing. Keep in mind that the primary purpose of this blog post is to show you the capabilities of the Forecasting Visualization and how to use it.

Figure 3. R home directory set to the Microsoft R Open download.

Even with all of the Microsoft R Open packages installed, you will need more R packages installed in your library. As you import custom visuals, you may see notices about needing more R packages. All you need to do is click Install to have the R library command issued for you.

Figure 4. What you see when a custom visualization has dependencies on R package you don’t have.

Once Power BI finishes installing the required packages for you, it shows you a box telling you what was installed.

Figure 5. R packages installed.

If you are still lacking some packages, you’ll see an error message similar to the following when you attempt to use the custom visualization.

Once I imported all four custom interactive R visualizations, I was able to get all of them to work. I needed some time series data, so I downloaded gasoline sales data from the U.S. Census Bureau. Specifically, I downloaded this file. It wasn’t formatted and arranged like I needed, so I shaped the data and made it into a two column Excel file that I’ve shared with you as an attachment.

To import a custom visualization for Power BI desktop, follow the steps shown below.

Figure 7. Import a custom visual by clicking the three dots as shown.

Figure 8. Make sure that the author of the custom visualization is trustworthy!

Figure 9. Select your pbviz file.

Figure 10. Visualization is installed. Click it to use it.

Figure 11. Enable your script visuals.

Figure 12. Using the Time Series Forecasting Chart custom visualization.

Figure 13. Output from the interactive visualization.

Figure 14. Move your mouse cursor to see the interactivity.

Figure 15. From that same position, the left mouse button was clicked and held down while selecting a portion of the visualization.

Figure 16. Result of zooming in to the selected area.

You are able to configure your visualization by altering the parameters.

oHere is a list of links to get you started in understanding Cosmos DB, Microsoft’s new cloud based globally distributed multi-model database. Cosmos DB development started in 2010 as project Florence. When it was initially released to the public as part of Azure, it was called DocumentDB. Multi-model means it is more than just a document database as the screen capture shows:

Figure 1. Database models supported by Cosmos DB.

When you create a Cosmos database, you must specify what type of data model you want. In addition to the original DocumentDB, graph, MongoDB, and key-value pair models are supported. As you can see in the screen capture, choosing a data model is choosing an API.

Graph terminology in Cosmos DB refers to vertices and edges instead of how SQL Server 2017 refers to nodes and edges. The API for Cosmos DB graph is Gremlin. Gremlin is a language for traversing graphs that use a TinkerPop enabled provider such as Cosmos DB. TinkerPop is an open source graph computing framework.

To learn about Azure Cosmos DB, you might want to start with Rimma Nehme’s podcast that I previously blogged about here (direct link to the mp3 is here). At 46 minutes in length, this audio recording is good to listen to while you commute.

Next, there are some Channel 9 videos that you might want to view or download as mp3 and listen to while you commute or exercise.

If you watch the videos, you will notice a graph explorer you’ll need and want to visualize your graphs. Download it from GitHub. Even if you aren’t ready to download and run it, you should at least browse there and take a look at what it looks like and what it can do for you.

Finally, do not despair if you don’t have an Azure account or have exhausted all of your Azure credits. You can download and install the Cosmos DB emulator and start learning about Cosmos DB on your local machine. That’s what I recommend to keep your costs down as you learn. Be sure to watch Kirill Gavrylyuk’s video on that page.

Cosmos DB is Microsoft’s new Azure (i.e. Microsoft cloud) globally distributed multimodel database. There is a good podcast where Rimma Nehme explains it. I’ve long wanted to blog about how to configure USB flash drives to play mp3 files on automotive sound systems. You can downloads Rimma’s podcast as an mp3 available on this page.

To play mp3 files on an automotive sound system, you must have a USB flash drive compatible with your system. You should format your USB flash drive as FAT32. On my car, the maximum size allowed is 32 GB. Larger capacity drives simply will not work.

Figure 1. Format USB flash drive as FAT32.

What I discovered is that the mp3 files need to be placed in a top level folder. I created a folder called Data Platform. Copy your mp3 files into the folder.

Figure 2. Place your mp3 files in a top level folder on your FAT32 USB flash drive. I only have a single top level folder and have not tried having more than that.

Figure 3. Save the mp3 to a top level folder on your FAT32 USB flash drive.

Insert your USB flash device into your car and learn something while you commute! As you can see, it’s easy to download an mp3 file and play it through your vehicle’s sound system.

Figure 4. Listen and learn as you commute.

I’ve had mixed results in general getting the mp3 properties to appear on my car’s sound system. Sometimes the properties appear as can be seen in this example of podcasts used in my graduate studies.

Recent builds of SQL Server Management Studio (SSMS) have a feature called PresentOn that can be accessed from Quick Launch.

Figure 1. Quick Launch location in SSMS.

Enter Present in to Quick Launch and select it.

Figure 2. Select PresentOn.

Using PresentOn is helpful when you give a presentation. It doesn’t increase the size of the results, which I find is a limitation when actually giving presentations. Overall it is a good feature that you might want to use.

Figure 3. Object Explorer and the Query Pane are larger, but not the Results Pane.

To turn off PresentOn, select RestoreDefaultFonts.

Figure 4. Use RestoreDefaultFonts to turn off presenter mode.

If you don’t need to increase the font size of the Object Explorer, which is part of your environment, you can use the zoom feature in the Query Pane. This feature is available all of the time. You don’t need to switch to presenter mode. If the results are in text format instead of grid format, the results will also be magnified.

Figure 5. Using the magnification feature.

If you want more control over the formatting of SSMS, see my instructions found here.

The Northwind database implements the manager to employee hierarchy using an adjacency list, which is a type of graph. This post shows you how to migrate the adjacency list to a graph of one node and one edge. This is a continuation of what was started in this prior post.

In the Employees table, the employee’s manager has an EmployeeID which is stored in the employee’s ReportsTo column. A foreign key relationship is defined where the ReportsTo value stores the primary key of the row containing the manager’s data. For the highest level manager, the ReportsTo column is NULL. The relationship can be queried using a self-join.

Relational databases do not natively support many to many relationships although a junction table is an easy workaround. Graph databases natively support many-to-many relationships. One of the first sample databases Microsoft provided for SQL Server was Northwind. It has three many-to-many relationships as can be seen in the entity relationship diagram shown below. I previously published a list of references for understanding graphs here.

Figure 1. Northwind database ER diagram.

I’m going to focus on one junction table, the EmployeeTerritories table.

Figure 2. Many-to-many relationship implemented with the EmployeeTerritory junction table. The colors carry through to the code samples that follow.

I always tell people to start with obtaining an understanding the data. Let’s run some queries to find out more about the nature of the data.

SELECT * FROM dbo.Employees WHERE EmployeeID NOT IN (SELECT EmployeeID FROM dbo.EmployeeTerritories); SELECT* FROM dbo.Territories WHERE TerritoryID NOT IN (SELECTTerritoryID FROM dbo.EmployeeTerritories);

Not all territories have employees. All employees have territories. As I pointed out in this post, directionality of an edge matters. It’s something you have to pay attention to. The following query shows how to display all of the rows in the many-to-many relationship.

It’s necessary to create a query to retrieve node id pairs to populate the edge table. It’s a modification to the junction table query shown above. I suggest running just the SELECT portion of the following query so you can see what the edge table is being populated with.

The results are the same, although the order is not. There is no implicit order in a relational database. I didn’t use a SORT to make sure you understand that you should not expect the same order of results from different queries.

As you can see, the graph query has very simple code.

I’m pretty sure somebody is going to ask what if the order of the nodes in the MATCH is changed?

No rows are returned because of the directionality of the edge. We defined our graph has having employees belonging to territories and not territories belonging to employees. It’s the direction of the arrow that matters, not the order of the nodes in the MATCH. The following query returns all 49 rows.