Seek and You Shall Find

I've written about the inner workings of SQL Server for many years. In some of my writing, I describe the methods by which I've discovered some of these internal operations. One of my favorite techniques is to use the sp_helptext stored procedure to examine the definitions of the system stored procedures. This technique lets me see how SQL Server gets the information it presents when you run a system procedure such as sp_help or sp_helpindex. Frequently, understanding how the system procedures work can let you write your own procedures that return a more-useful subset of system information.

As I started exploring the insides of SQL Server 2005, I assumed I'd be able to employ similar techniques. But I was in for some surprises as I tried to discover exactly how SQL Server 2005 stores and manipulates metadata. And, the product was a moving target—every time I thought I'd figured something out, the next build I got my hands on exhibited different behavior. I experimented with my first build of SQL Server 2005 back in January 2003. I discovered to my shock that this build offered no way of using T-SQL to extract system procedure definitions. I could use sp_helptext as usual to extract the definition of my own procedures in user databases, but not the system procedure definitions. The sp_helptext procedure returns the definition of procedures that exist in the local database, and I quickly discovered that the system procedure definitions aren't stored in any accessible database.

SQL Server 2005 uses a new, hidden system database called the resource database. You can see the physical files that this database uses—mssqlsystemresource.mdf and mssqlsystemresource.ldf—in SQL Server's data directory, which by default is \Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Data. However, when you connect to SQL Server and try to use a database called mssqlsystemresource, you receive a message saying it doesn't exist. When you use sp_helpdb to list all the databases in your instance, no database listed is using these files.

Microsoft made the resource database inaccessible to ensure quick, clean upgrades and to allow rollbacks of intermediate updates or bug fix releases. The development team realized that implementing the system catalog as a set of views rather than as a directly accessible base table gives Microsoft the flexibility to adjust the catalog schema in the future without affecting existing applications. The problem of customers directly updating the system tables has been a concern from the beginning. And because the system catalog in SQL Server 2005 has so much more coverage than previous releases, the potential for catastrophic error increased.

Allowing uncontrolled system catalog updates by a system administrator would make it impossible for Product Support Services (PSS) and the SQL Server team at Microsoft to accurately diagnose problems in the field. I can't say I disagree with their decision, but I sure wasn't happy about it. As frustrated as I was at not being able to see the system procedure definitions, I wasn't going to give up so easily. I eventually figured out that if I create a SQL Profiler trace to capture my stored procedure call and the statements ex-ecute within a stored procedure—including the system procedures (by using the SP:Starting and SP:StmtStarting events)—I can see the system stored procedure definitions and save them for later analysis.

However, my small success wasn't as rewarding as I'd anticipated. Months later, around the time of Beta 2, Microsoft provided a function called object_definition(), which returns the text of any programmable object, including the system stored procedures and views. For example, I can see the definition of the procedure sp_help, which returns information much like the information it gives in SQL Server 2000. To see the definition of sp_help, I execute the following code:

SELECT object_definition(object_id
('sp_help'))

I won't show you the result; you can run it for yourself when you install SQL Server 2005. I will point out that if you look in the FROM clauses in different procedure definitions, you'll see some brand-new system tables and views.

When I first saw some of the new system table names, I was amazed at how non-descriptive they were. I've been working with SQL Server for a long time, and most of the system-table names have seemed self-documenting—these new names weren't. Exploring the contents of these tables was my next step. But here I realized how shallow my victory was—when I tried to access these tables, I received a message that they didn't exist. For example, one of the new tables I discovered was called syspalvalues. When I ran

SELECT * FROM syspalvalues

I received error message 208: "Object not found." Did these tables, too, exist only in the hidden resource database? I guessed not, but I wasn't sure. I had more research to do.

According to SQL Server 2005 Books Online (BOL), metadata is available through a collection of catalog views, which replace the system tables. These catalog views have names similar to the old system table names but with one significant difference: the separation of users and schemas. In my May 2004 article, "Inside SQL Server 2005 Security" (InstantDoc ID 42031), I described the SQL Server 2005 concept of schemas and the separation of users from schemas. Objects are referenced through the schema that contains them, not through the user that owns them. Every database has a schema called sys, through which you can access all the catalog views.

The view that contains the names of all the objects in a database is called objects. (At least the view names are self-explanatory!) To see the data in this view, you could issue

SELECT * FROM sys.objects

Because views are based on underlying tables and running the above statement produces a different list of objects in every database, I assumed that each database has its own collection of system tables, which (like the resource database) are hidden. As with system procedures, I can see the definition of the catalog views by using the object_definition() function as follows:

SELECT object_definition(object_id
('sys.objects'))

This statement returns a query that selects from a table called sys.objects$. But like syspalvalues, when I try to select from that table, SQL Server tells me it doesn't exist.

In addition to a completely new suite of catalog views, SQL Server 2005 has a set of views for backward compatibility called compatibility views. SQL Server 2000 documents the fact that accessing system tables is permitted and supported, so some production applications include queries against the SQL Server 2000 system tables. So that these applications will work after an upgrade, Microsoft has created views of the same names as the old tables. So in addition to the catalog view called sys.objects, SQL Server 2005 has a compatibility view called sysobjects. Selecting from the sysobjects view shows data in the same form as in the SQL Server 2000 sysobjects system table, but the sys.objects catalog view returns different information. (Since the compatibility views can also be accessed through either the sys schema or the dbo schema, you could also select from sys.sysobjects or dbo.sysobjects. Both sys and dbo are default schemas.)

Besides sysobjects, many other system tables from earlier SQL Server releases appear in SQL Server 2005 as compatibility views. These views are for backward compatibility only; they don't expose any metadata related to features Microsoft introduced in SQL Server 2005. Therefore, when using new features, such as Service Broker or partitioning, you need to use the catalog views.

Another reason for using catalog views instead of compatibility views is that some compatibility view columns may return NULL or result in an arithmetic overflow. For example, this garbled information can result from accessing compatibility views that return user IDs and type IDs. The problem occurs because, in SQL Server 2005, you can create more than 32,767 users, groups, roles, and data types, which exceeds the capacity of the smallint data type. You can't display any SQL Server 2005 user ID greater than 32,767 by using the compatibility views.

Although the catalog views aren't system tables, they're at least as useful as those tables. The column names in these views are frequently more meaningful, and they're easily accessible. While exploring a new SQL Server 2005 feature called included columns, I realized that SQL Server 2005 has no supplied procedure or view to return a simple list of which columns in an index are included columns and which are regular key columns. (I mentioned included columns in my May 2005 article, "10 Things to Love About SQL Server 2005.")

A column in the sys.index_columns view contains a 0 or 1 depending on whether the column is an included column, but I wanted output that was easier to read. The code in Listing 1 creates the get_index_columns procedure. If you supply a table name as the only parameter in the current database, the procedure returns all keys for all indexes on that table, along with the word KEY or INCLUDED. Alternatively, you can supply an index name and get the columns in that index. For example, to see the columns in the IX_Address_City index of the Person schema's Address table, I'd run the following code:

You could expand Listing 1's procedure to do some error checking and to account for the XML indexes. But this procedure gave me exactly what I wanted and let me start exploring the use of the new SQL Server 2005 metadata, which is a benefit in itself.