A major drawback of modulo hashing is that the size of the cache pool needs to be stable over time. Changing the size of the cache pool will cause most cache keys to hash to a new server. Even though the values are still in the cache, if the key is distributed to a different server, the lookup will be a miss. That makes changing the size of the cache pool—to make it larger or for maintenance—an expensive and inefficient operation, as performance will suffer under tons of spurious cache misses.

For instance, if you have a pool of 4 hosts, a key that hashes to 500 will be stored on pool member 500 % 4 == 0, while a key that hashes to 1299 will be stored on pool member 1299 % 4 == 3. If you grow your cache by adding a fifth host, the cache pool calculated for each key may change. The key that hashed to 500 will still be found on pool member 500 % 5 == 0, but the key that hashed to 1299 be on pool member 1299 % 5 == 4. Until the new pool member is warmed up, your cache hit rate will suffer, as the cache data will suddenly be on the ‘wrong’ host. In some cases, pool changes can cause more than half of your cached data to be assigned to a different host, slashing the efficiency of the cache temporarily. In the case of going from 4 to 5 hosts, only 20% of cache keys will be on the same host as before!

The first thing to notice is that our values are not normalized. The number of visitors is a number and gets larger and larger. To normalize it, we simply divide it by 100, since all numbers are below 1. The same holds for the lag. Most of the lags are lower than 30. Therefore, I will divide the lag size by 30.

Notice that there are many more approaches for normalizing the data! This is just a quick normalization on the data, but feel free to use your own normalization method. My normalization process is closely related to the MinMaxScalar normalization which can be found in sklearn (scikit-learn).

With just a few lines of Python code we can create a Multi-Layer Perceptron (MLP):

The way it works is that you take the size of the database and then divide that number against the total size of all databases. You then use the replicate function with the | (pipe) character to generate the ‘graph’ so 8% will look like this ||||||||

You can use this for tables with most rows, a count per state etc etc. By looking at the output the graph column adds a nice visual effect to it IMHO

It does the job and doesn’t require you to go out to a different product, so it works pretty well for occasional administrative queries.

In the Chicago perf class last month, we had a student ask if partition level locks would ever escalate to a table level lock. I wrote up a demo and everything, but we ran out of time before I could go over it.

Not that I’m complaining — partitioning, and especially partition level locking, can be pretty confusing to look at.

If you really wanna learn about it, you should talk to Kendra — after all, this post is where I usually send folks who don’t believe me about the performance stuff.

MSDN defines Deterministic encryption as always generates the same encrypted value for any given plain text value. Which means that if you have a birthdate of 01/03/1958 it will always be encrypted with the same value each time such as ABCACBACB. This allows you to index it, use it in WHERE clauses, GROUP BY and JOINS.

Randomized encryption per MSDN- uses a method that encrypts data in a less predictable manner. This makes Randomized encryption more secure, because using the example above each encrypted value of 01/03/1958 will be different. It could be ABCACBACB, BBBCCAA, or CCCAAABBB. All three encrypted values are subsequently decrypted to the same value. Since the encrypted value is random you cannot perform search operations etc. as you can with Deterministic.

Part 1 is about building the certificates and keys needed to encrypt data.

Common advice here is to set the clock backwards. My problem with that is that you’re probably doing this on an unsupported unknown black-box flaming garbage can of a system set up by someone who wasn’t meant to do it – because otherwise they wouldn’t be using the evaluation edition. So what are the repercussions of setting the clock backwards? Perhaps their application spawning silently in the background and trashing this or other databases with bad date information? Perhaps you’ll lose your RDP connection and then be unable to connect back in because of the SSPI error generated by a clock mismatch?

What I like about this is that it can be easily dropped into a job scheduler (e.g.- Jenkins) and then you’ve got a way to routinely check (and correct) all the configuration settings of the SQL instances that you monitor.

Pester would not have been my first thought for configuration checking, but it does serve as another useful option.