March 26, 2009

Linchi Shea recently asked whether people were using database snapshots or not. The consensus seems to be that they’re used during testing and rarely in production.

I use them for testing as well. They’re very good when used this way. Being able to revert to an snapshot is what makes the feature useful. It’s not that hard to understand really. It’s very very much like a database transaction.

Here’s the template that I use, just cut and paste into management studio and type Ctrl+Shift+M in order to replace the template values. It’s not quite foolproof, you do need to understand snapshots and edit the script to specify data file names. But maybe you’ll find it useful.

March 12, 2009

SQL Server has its own share of jargon. Some cool sounding words, some not so much.

Here are some words that you will never hear me pronounce:

UPSERT a portmanteau of update and insert. There’s just something that seems wrong about the sound of it. So if pressed, I’d use the full phrase, “insert or update”. With SQL2008’s new MERGE command, hopefully I’ll have less of a need to.

SARGable a contraction of Search Argument Able. For example “The query is not sargable because the where clause has an expression in it that contains a function.” Again, I just don’t like the sound of it. Unfortunately, it’s hard to come up with a better word and so a phrase has to be substituted as in: “The query needs to perform a scan because the where clause contains a function”.

Here are some words that I am always looking for excuses to pronounce.

HOBT an acronym of Heap or B-Tree. Essentially it’s the name given to the physical structure of every index or table. Microsoft explains it better than I can. It shouldn’t be too hard to guess why I might like the word. I’m a fan of the book. In fact, fellow Canadian blogger Aaron Alton writes a (high quality) blog called the HOBT.

Q-BERT Okay, this one really isn’t an existing acronym for any SQL Server related term, but that doesn’t mean it couldn’t be in the future. Maybe one of the following could catch on:

March 3, 2009

Not too long ago, I wrote an article on the aggregate function CHECKSUM_AGGREGATE. This function works very well in conjunction with CHECKSUM or BINARY_CHECKSUM.

When evaluating the functions HASHBYTES, BINARY_CHECKSUM and CHECKSUM, I discovered a few limitations that were useful to know when deciding on a solution.

The most obvious limitation to these checksum functions is well documented in Books on-line. For example, the documentation for BINARY_CHECKSUM says that BINARY_CHECKSUM(*) return different values for most but not all changes to a row.

The documentation says that these functions don’t support non-comparable database types such as text, ntext, image, cursor, and xml. What’s interesting is that NVARCHAR(max) columns are supported, but I discovered some interesting limitations using a script similar to the following:

BINARY_CHECKSUM(@longString)
This works, but only up to 255 n-characters (0xff n-characters). This was the most surprising result. 255 seems awfully short. I’ll have to doublecheck my tests and then I plan to create a microsoft.connect suggestion that BINARY_CHECKSUM supports longer strings.

CHECKSUM(@longString)CHECKSUM also works but it truncates strings to 4000 n-characters.Update Mar. 6, 2008: As it turns out, CHECKSUM doesn’t truncate strings, it’s the REPLICATE function that truncates strings to 8000 bytes!

HASHBYTES(@longString)I tested this with ‘MD5’. HASHBYTES works as expected except that like CHECKSUM, it truncates characters to 4000 n-chars before applying the function.Update Mar. 6, 2008: HASHBYTES doesn’t truncate strings to 8000 bytes, but it does raise an error if the string exceeds that length. The error is String or binary data would be truncated.

It looks like for 2008, Books On-Line has added a remarks section to HASHBYTES. The remarks explain that the input is limited to 8000 bytes. It also turns out that there is a microsoft.connect issue (issue 273429) that suggests that HASHBYTES should support longer strings. This issue has a few workarounds that get around the limitations. It’s one of the few issues on Microsoft.Connect that actually contain useful information in the workaround section. It’s good to see.

So there you are. I’m confident that the existing behaviour is good enough for most uses. But it would be dangerous to recommend solutions based on these functions ithout understanding the limitations of the procedures.

Comments Off on Limitations on HASHBYTES, BINARY_CHECKSUM and CHECKSUM