Archive for: December 22nd, 2016

If you’re like most aspiring data scientists, you’ll try to learn this code by using the copy-and-paste method. You’ll take this code from a blog post like this, copy it into RStudio and run it.

Most aspiring data scientists do the exact same thing with online courses. They’ll watch a few videos, open the course’s sample code, and then copy-and-paste the code.

Watching videos, reading books, and copy-and-pasting code do help you learn, at least a little. If you watch a video about ggplot2, you’ll probably learn how it works pretty quickly. And if you copy-and-paste some ggplot2 code, you’ll probably learn a little bit about how the code works.

Here’s the problem: if you learn code like this, you’ll probably forget it within a day or two.

This is a thought-provoking article that applies to all disciplines, not just data science.

Running multiple UPDATE STATISTICS commands for different statistics on a single table concurrently has been available under global Trace Flag 7471 since SQL Server 2014 SP1 CU6 and SQL Server 2016 CU1. Microsoft have documented this trace flag here and here.

It sounds like, for the most part, you might not want this flag turned on, but read the whole post.

Don’t try to build job security into what you do. I know many who worry about giving up their knowledge to others. For some, having the sole “how to” knowledge gives them a sense of job security. While that might be true to a point, it also locks you into your current position. Many who hoard their knowledge never advance because they have made themselves indispensable in their current position. “We can’t move them because they are the only ones who know about such and such”. Why put yourself in that position? If you can’t ever be replaced, you also can’t move up.

As a lone DBA, I find this run book to be vital. It allows me to direct someone to the book and I can walk them through running anything I need them to in my absence. It allows me to take a vacation or a day off while giving others the tools to get things done.

Exactly. It’s easy to get caught in the trap that your value is in the specific details of some process that you know, and so the company can’t get rid of you because you’re the only person who knows this. One of the counter-intuitive results of IT culture is that reputation comes from sharing information rather than hoarding it.

With LLAP enabled, Spark reads from HDFS go directly through LLAP. Besides conferring all of the aforementioned benefits on Spark, LLAP is also a natural place to enforce fine-grained security policies. The only other capability required is a centralized authorization system. This need is met by Apache Ranger. Apache Ranger provides centralized authorization and audit services for many components that run on YARN or rely on data from HDFS. Ranger allows authoring of security policies for:

– HDFS
– YARN
– Hive (Spark with LLAP)
– HBase
– Kafka
– Storm
– Solr
– Atlas
– Knox

Each of the above services integrates with Ranger via a plugin that pulls the latest security policies, caches them, and then applies them at run time.
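The pull, cache, and apply pattern described for the plugins can be sketched roughly as follows. This is not Ranger's plugin API: the class name, the policy shape (a mapping from user and resource to allowed actions), and the `fetch_policies` callable are all hypothetical, chosen only to illustrate the idea.

```python
from typing import Callable, Dict, Set, Tuple

# Hypothetical policy shape: (user, resource) -> set of allowed actions.
Policies = Dict[Tuple[str, str], Set[str]]


class PolicyPlugin:
    """Illustrative stand-in for a service-side authorization plugin."""

    def __init__(self, fetch_policies: Callable[[], Policies]) -> None:
        self._fetch = fetch_policies    # assumed call to the central policy store
        self._cache: Policies = {}      # local copy used for every check

    def refresh(self) -> None:
        """Pull the latest policies; keep the old cache if the pull fails."""
        try:
            self._cache = dict(self._fetch())
        except ConnectionError:
            pass  # central store unreachable: keep enforcing cached policies

    def is_allowed(self, user: str, resource: str, action: str) -> bool:
        """Apply the cached policies at run time; anything not listed is denied."""
        return action in self._cache.get((user, resource), set())
```

In this sketch the service would call `refresh()` on a timer and `is_allowed()` on every request, so an authorization check never has to wait on the central store.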

In this module you will learn how to use the Linear Gauge Power BI Custom Visual. The Linear Gauge would often be used to visualize a KPI. It gives you the ability to compare an actual value against a target, as well as to show up to two trend lines.

This can be a very useful visual. The tricky part is that the bars aren’t scaled the same, so when your eyes want to compare bar lengths, it can get a little confusing.

The trickiest part of wiring a circuit like this is detecting a button press. On most boards, an input pin with nothing driving it floats, so it has no defined high or low level to read. That’s where pull-ups come in. Above, you can see we set one of the pins for the button to be a pull-up (or an input if we were using another board). That means a resistor holds the pin at a known level, and a press shows up as a change from that level. The other important thing is our debounce. With circuits, one button press can actually turn into lots, because the switch contacts bounce briefly as they complete (or interrupt) the circuit, and each bounce looks like a separate signal. A debounce is like a referee saying “only look for a signal for this long,” and it filters out the extra “presses” that linger right after a real one.
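To make the debounce idea concrete, here is a minimal software sketch that needs no hardware. It is not the code from the original post: the `Debouncer` class, the 50 ms window, and the timestamps are assumptions for illustration. A signal counts as a press only if enough time has passed since the last accepted press.

```python
from dataclasses import dataclass


@dataclass
class Debouncer:
    """Accept a signal only if the window has elapsed since the last accepted one."""

    window_ms: float = 50.0                    # hypothetical debounce window
    last_accepted_ms: float = float("-inf")    # no press accepted yet

    def accept(self, timestamp_ms: float) -> bool:
        if timestamp_ms - self.last_accepted_ms >= self.window_ms:
            self.last_accepted_ms = timestamp_ms
            return True
        return False   # too soon after the last press: treat it as bounce


def count_presses(signal_times_ms, window_ms=50.0):
    """Count how many raw signals survive debouncing."""
    debouncer = Debouncer(window_ms=window_ms)
    return sum(1 for t in signal_times_ms if debouncer.accept(t))
```

For example, raw signals at 0, 2, 5, and 9 ms collapse into a single press under this scheme, while a signal at 300 ms counts as a new one.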

Once we detect our button press, we’re calling the function below. All it does is read the current LED pin values, look to see which one is currently lit, and then light the next one.
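As a rough sketch of that logic (again, not the post's own function), the pin states can be modeled as a list of 0 and 1 values. The function name, the wrap-around from the last LED back to the first, and the choice to light the first LED when none is lit are all assumptions.

```python
def light_next_led(pin_values):
    """Return new pin states with the LED after the currently lit one turned on."""
    if not pin_values:
        return []
    try:
        lit = pin_values.index(1)     # first pin that currently reads high
    except ValueError:
        lit = -1                      # nothing lit yet, so start at the first LED
    nxt = (lit + 1) % len(pin_values)  # assumed wrap-around after the last LED
    return [1 if i == nxt else 0 for i in range(len(pin_values))]
```

On real hardware, the returned list would then be written back to the LED pins.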

Go from understanding general purpose input/output pins to calling SMO via a web service all in one post. If you’ve got an itch for a weekend project, have at it.

On-premises data gateway: Formerly called the enterprise version. Multiple users can share and reuse a gateway in this mode. This gateway can be used by Power BI, PowerApps, Microsoft Flow or Azure Logic Apps. For Power BI, this includes support for both scheduled refresh and DirectQuery. To add a data source such as SQL Server that can be used by the gateway, check out Manage your data source – SQL Server. To connect the gateway to your Power BI, you will sign in to Power BI after you install it (see On-premises data gateway in-depth).

The “trick” to making this work is to encapsulate the DBCC command as a string, and to call it with the EXECUTE () function. This is used as part of an INSERT INTO / EXECUTE statement, so that the results from DBCC PAGE are inserted into a table (in this case a temporary table is used, although a table variable or permanent table can also be used). There are three simple steps to this process:

Create a table (permanent / temporary) or table variable to hold the output.

Insert into this table the results of the DBCC PAGE statement by using INSERT INTO / EXECUTE.