Archive for: June 1st, 2016

Even though I’m taking out UPDLOCKS, the following race condition pattern can still occur

Session A takes out an UPDLOCK, sees that key 266 does not exist, releases its UPDLOCK, and prepares to insert

Session B takes out an UPDLOCK, sees that key 266 does not exist, releases its UPDLOCK, and prepares to insert

Session A runs its insert

Session B attempts to run its insert, but fails because of a duplicate key error

We need to hold that lock.

Understanding concurrency is one of the toughest parts of being a database developer, especially because it’s historically been difficult to test it. I like what Kendra’s done here, making the process easy to follow.

1. Work backwardsJot down when you were first notified of the breach and start to retrace the events that led to you being notified. This may mean investigating logs on databases, firewalls, routers and everything else in between. It’s a massive job to sift through all of the log data trying to find that one event that led you to receive that dreaded call. Fortunately, the data analytics field has been used in recent years to speed up the investigative work by pulling together multiple log files and analyzing it for anomalies.

While presenting at SQLDay in Wroclaw, Poland, on the Query Store, I was asked a pretty simple question, which takes precedence, the Query Store or a Plan Guide?

One of my favorite answers to questions is “I don’t know” because it gives me the opportunity to learn. Let’s figure this one out together.

I’ll post the code to recreate this experiment within AdventureWorks at the end of the article. I’m doing this because the code for forcing execution plans using Plan Guides can be pretty doggone long (you may need to generate your own XML from a plan on your own system, fair warning).

The answer is not quite as clear-cut as I would have expected, and I’ll be interested to see what others find.

The batch layer stores all the data with no constraint on the schema. The schema-on-read is built in the batch views in the serving layer. Creating schema-on-read views requires algorithms to parse the data from the batch layer and convert them in a readable way. This allows input data to freely evolve as there is no constraint on their structure. But then, the algorithm that builds the view is responsible to manage the structural change in order to still deliver the same view as expected.

This shows a coupling between the data and the algorithms used for serving the data. Focusing on data quality is therefore not enough and we may ask the question of the algorithm quality. As the system lives and evolves, the algorithms may become more and more complex. These algorithms must not be regarded as black boxes, but a clear understanding of what they are doing is important if we want to have a good data governance. Moreover, during the batch view creation, data quality transformations could be done so as to provide data of better quality to the consumer of the views.

Lambda is an interesting architectural concept, as it tries to solve the age-old “fast or accurate?” problem with “both.” Get your fast estimates streamed through a speed layer, but your accurate, slow calculations handled through the serving layer. Definitely check out this article.

Single-page app: The initial page loads very quickly and asynchronously fetches the list of tables, table statistics, data sample, and partition list. Subsequent navigation clicks will trigger only 1 or 2 calls to the server, instead of reloading all the page resources again. As an added bonus, the browser history now works on all the pages.

These are some nice changes. I still don’t think a web app replaces quality tooling (like Management Studio), but if a web app is what you have, it should at least be nice.

What is a good size? I usually try to get it to roll over around 10 MB. I use a monitoring tool and when the large error log alert is triggered, I have it run sp_cycle_errorlog for me so mine always stay a healthy size. You don’t need fancy tools to do this though. If you know about how fast your logs grow, you can set up a SQL Agent job to run it on a schedule to keep your logs healthy.

How many logs should I keep? This is completely up to you, but since I keep my logs so small, I try to keep 15 of them. Why so many? I do it so I can go back and see issues further back if needed. You can adjust the amount you keep by right clicking on SQL Server Logs in SSMS and selecting “Configure”

I’d personally prefer to keep more logs—at least 32-45 days worth—but that’s going to depend upon the environment.

1. Conditional Formatting on Tables

This one is awesome and has been a feature that we’ve all been waiting on for a while. We now have the ability to apply conditional formatting on a field based on a numeric value, as seen in the screenshot below.

Unfortunately, the mathematical approach has flaws. First of all, Power(10, 38) exceeds the range of any numerical data type in SQL Server. There is no way to store or work with this value in calculations. Secondly, once you try to raise 10 to a power greater than 30, you start seeing floating point calculations (the values are approximate). For example, select Power(Cast(10 as decimal(38, 0)), 31) — casting as decimal(38, 0) because it exceeds int or bigint — yields 9999999999999999600000000000000. That’s clearly an approximated value and is not going to work for calculations where we’re expecting a precise value. So, that leaves the hacky version I didn’t want to do because I just cannot count on the mathematical approach working. Here is the base query using the hacky calculation