Poking about with DBCC PAGE (Part 1 of ?)

(It's been a while since I last posted - summer fun's been in the way obviously and I've been busy picking up some new and exciting (for me) SQL skills, partially to help with four upcoming TechEds I'll be doing in China and Hong Kong. The posting frequency should increase towards the end of the week but in the meantime, this one's a special request.)

I've been getting lots of requests for more info about how to use DBCC PAGE since my Paul-tells-all post a while back and I had been planning to wait until I'd posted more about the internals of DBCC CHECKDB and moved on to various examples of corruptions before starting to use DBCC PAGE again in the blog. However, a friend made me an offer I couldn't refuse and so I've caved in and I'm starting the series now :-)

I'm going to start with cracking a record from scratch. Yes, I know that option 3 of DBCC PAGE will do it for you - but how does it do it? Sometimes you may not have table metadata and so DBCC PAGE won't be able to use option 3 - in that case you'll need to crack it yourself. Also, there are plenty of curious people who like to be immersed in the bits and bytes and besides, cracking records manually is fun (if you're twisted as well as curious like I am :-)

I've created an example database, dbccpagetest, on SQL Server 2005 which I've zipped up and attached to this post. This way you can all use the same data and pages as me without having to go through the tedium of working out where pages are using the PFS pages (as I explained in the initial DBCC PAGE post).

You won't be able to attach it to SQL Server 2000 but you can always download SQL Server 2005 Express Edition and play with it on that. Unzip the files and attached them using the CREATE DATABASE ... FOR ATTACH syntax:

variable number of bytes to store one bit per column in the record, regardless of whether the column is nullable or not

this allows an optimization when reading columns that are NULL (see here for more info)

variable-length column offset array

two bytes for the count of variable-length columns

two bytes per variable length column, giving the offset to the start of the column valu

versioning tag

this is in SQL Server 2005 only

this is a 14-byte structure that contains a timestamp plus a pointer into the version store in tempdb

So, let's go through the record from the dump fragment above and figure out what everything means. I'm not going to give all the excruciating detail on every possible value of every byte, but will give you enough to be able to crack records yourself (if you're twisted like me...)

Byte 0 is the TagA byte of the record metadata. Its 0x30, which corresponds to 0x10 (bit 4) and 0x20 (bit 5). Bit 4 means the record has a NULL bitmap and bit 5 means the record has variable length columns. If 0x40 (bit 6) was also set, that would indicate that the record has a versioning tag. If 0x80 (bit 7) was also set, that would indicate that byte 1 has a value in it.

Bits 1-3 of byte 0 give the record type. The possible values are:

0 = primary record

a data record in a heap that hasn't been forwarded or a data record at the leaf level of a clustered index.

1 = forwarded record

a data record in a heap that's been updated and was too large to fit in-place on its original page and so has been moved to another page. A forwarding record is left in its place and points to the new location of the record. This is done to avoid having to update any non-clustered index records that point back directly to the original physical location of the record.

2 = forwarding record

sometimes also called a forwarding stub.

3 = index record

an index record in the tree of a clustered index, or in any level of a non-clustered index.

4 = blob fragment

5 = ghost index record

6 = ghost data record

these two are self-explanatory

7 = ghost version record

a special 15-byte record containing a single byte record header plus a 14-byte versioning tag that is used in some circumstances (like ghosting a versioned blob record)

In our example, none of these bits are set which means the record is a primary record. If the record was an index record, byte 0 would have the value 0x36. Remember that the record type starts on bit 1, not bit 0, and so the record type value from the enumeration above needs to be shifted left a bit (multiplied by two) to get its value in the byte.

Byte 1 is the TagB byte of the record metadata. It can either be 0x00 or 0x01. If it is 0x01, that means the record type is ghost forwarded record. In this case its 0x00, which is what we expect given the TagA byte value.

Bytes 2 and 3 are the offset of the NULL bitmap in the record. This is 0x0008 (DBCC PAGE presents multi-byte values in hex dumps as least-significant byte first). This means that there's a 4-byte fixed length portion of the record starting at byte 4. We expect this because we know the table schema.

Bytes 4 to 7 are the fixed length portion. Again, because we know the table schema, we know to interpret these bytes as a 4-byte integer. Without that knowledge, you'd have to guess. The value therefore is 0x00000005, which is what we'd expect to see as the value of the duration column.

Bytes 8 and 9 are the count of columns in the record. This is 0x0003 which is correct. Given that there are only 3 columns, the NULL bitmap of one bit per column will fit in a single byte.

Byte 10 is the NULL bitmap. Hey - it's value is 0xF8 - what's up with that? Convert it to binary and what do you get? 11111000 This makes perfect sense - bits 0-2 represent columns 1-3 and they're all 0, meaning the columns aren't NULL. Bits 3-7 represent non-existent columns and they're set to 1 for clarity. So the 0xF8 value makes sense - phew.

Bytes 11 and 12 are the count of variable length columns in the record. That value is 0x0002, which we again know to be correct. This means there will be two two-byte entries in the variable length column offset array. These will be bytes 13-14 and 15-16, having values of 0x0016 and 0x0021 respectively.

Now, the explanation that the variable length column offset array stores the offsets to the start of the column value is over-simplified. The entries actually point to the start of the following column value - this is done so that we know how long each column is (they are variable length after all). Notice that the offset of the first variable length column value isn't stored - it doesn't need to be because by definition it must begin right after the last offset in the variable length column offset array.

So, the final offset is bytes 15 and 16, which means the offset of the start of the first variable length column must be byte 17 (or 0x11 in hex), which agrees with the DBCC PAGE dump. The offset of the second variable length column is 0x0016, so the first value is from byte 17 to byte 21 inclusive. This value is 0x42616E6666. We know from the table metadata that this is the first varchar column, destination. Checking our handy ASCII conversion table we find that this translates to 'Banff'.

Using similar logic, the second value is from byte 22 to byte 32 inclusive and has the value 'sightseeing'. Both of these match the data we're expecting.

And that's it. We've just cracked a record from scratch. Now you try it with the second row in the table. In future posts I'll start making things more complicated (maybe trace into the version store or add in some off-row LOB values).

Hopefully this will satisfy all of those who've been waiting for more info on using DBCC PAGE and now I'm looking forward to getting my side of the bargain from my friend...

I believe the first 00 in 00000909, and the 00 in the next line are the null bitmap, for 9 columns, 2 bytes. And no variable length columns here.

Is this right?

The second question is based on the analys to the first one, it appears that SQL 2000 will show each four bytes reversed.
Is DBCC PAGE showing the bytes in a different order on SQL 2000 and SQL 2005, or is the physical record stored different in these versions?

Based on the wornderful explaination that you have given. I have one question for you regarding index performance.

Consider there is a lookup table that holds state codes as fixed length char(2) e.g. WA, AL, NY etc. This column is going as a foreign key into one big table called Citizen table (has an entry per citizen with column StateCode coming from look up table)

1. Would it be fine to make StateCode as the primary key in lookup table and foreign key in Citizen table or

2. Should we make a surrogate key as Int in the lookup table and have that as the foreign key in the Citizen table.

Since the storage for the fixed datatype is same, would query performance be different if we choose char as the key instead of surrogate int ?