Microsoft's beefy remake brings hefty new capabilities and a slew of new tools to master

To call SQL Server 2000 a 90-pound weakling because it lacks certain high-end features is a bit like calling Hercules a wimp because he never ran a marathon. Not every strongman can perform every feat, and not every company needs every heavyweight feature. Many enterprises, large and small, have been running their businesses on SQL Server for years.

Still, there’s no denying that SQL Server always fell short of Oracle Database when it came to true enterprise features, especially in the areas of high-availability and disaster recovery. Whereas Oracle met the demands of mission-critical, 24/7 operation with features such as online re-indexing, snapshot-based restores, and fail-over log shipping, SQL Server couldn’t provide true zero-percent data loss without the help of third-party solutions. For customers who didn’t need these protections, SQL Server offered a solid database with lower licensing fees. But maintenance-related downtime was a fact of life.

Until now. More than five years in the re-making, SQL Server 2005 not only addresses shortcomings in high-availability and disaster recovery, but brings sweeping improvements in almost every area. Management tasks have been streamlined, and monitoring and diagnostics have been beefed up. A default locked-down configuration, data encryption, and other new features strengthen security. Analysis Services, Reporting Services, and Data Mining are now fully integrated and mature. DTS (Data Transformation Services) has been rewritten from the ground up, emerging as the more powerful and flexible SQL SSIS (Server Integration Services). Integration with Visual Studio and the .Net/CLR opens SQL Server internals to .Net developers and .Net to SQL developers, meaning the power of .Net can now be tapped for stored procedures, data flows and transformations, BI applications, and more (see “Taking the database beyond SQL”).

SQL Server 2005 also promises better performance. I didn’t run performance benchmarks for this review, but in addition to allowing DBAs to partition the database to balance load, Microsoft has done a lot of work to improve memory management, indexing strategies, and query optimization. Users will see the difference.

Many of the changes in SQL Server 2005 will take current SQL developers and DBAs some time to learn. One key high-availability feature, database mirroring, is for practical purposes still “beta.” Microsoft still has work to do, especially in integrating and refining SQL Server’s many tools. Microsoft’s extreme makeover closes the gap between SQL Server and Oracle, however, and makes SQL Server a true contender even for very large databases and the most demanding enterprise environments.

A new way to move your data

SSIS, the revamped set of tools for moving and manipulating data, not only provides greatly enhanced performance, reliability, and functionality over DTS, but also offers many new prebuilt tasks that reduce the amount of coding needed to perform the most common operations, including backing up databases, re-indexing tables, and running integrity checks. More important, SSIS completely changes the modeling paradigm that DTS uses, logically dividing its building blocks into separate control-flow and data-flow components.

Control-flow containers are graphical representations of operations that would otherwise take developers dozens of lines of code and many hours of debugging to create. Some of the operations you can perform with the new containers (namely Sequence, For Loop, and Foreach) can’t even be done in DTS or would be too difficult to manage.

SSIS also extends the reach into different platforms, and not just different databases. For example, whereas DTS provided limited visibility into OLAP and practically none into data mining, these are natively supported in SSIS, meaning you can run data-mining queries, work with slowly changing dimensions, and send the data wherever you like, even to Reporting Services reports. In fact, SSIS serves as a bridge between SSAS and SSRS and any ODBC-compliant data source.

Package configurations are another huge enhancement. These allow you to define certain elements of the package -- file locations, database connections, special log-ins, and such -- to be read in from an external source (like an XML file) and applied at run time. This function lets you deploy packages very quickly because you don’t have to change them to move them from one system to another; you simply change the dynamic elements in the configuration and the package will run on the new server.

Last but not least, the new .Net integration releases SSIS from DTS’s dependence on the limits of VBscript and moves serious coders into the richness of VB.net. For this reason, file and string manipulations, math operations, and Win32 calls are much easier and faster.

As you would expect, the increased power and flexibility of SSIS come at the price of increased complexity. SSIS presents a learning curve that can seem daunting, but after you learn how to use it, SSIS will change the way your company thinks about its data. Systems that couldn’t communicate before are now perfectly integrated and have the full power of .Net behind them. Complex data load operations into warehouses and disparate systems will take a fraction of the time to build, execute, and support. Other complicated tasks can also be performed much faster, which means deadlines will be met more easily and projects won’t be held up because of the limited functionality of DTS and the way operations have to be taped together.

Maximum DBA

Microsoft has also made some dramatic changes to its management tools, replacing Enterprise Manager and Query Analyzer with the new SSMS (SQL Server Management Studio). SSMS serves as both the database command line, with which you perform tasks such as creating databases and managing user accounts, and the tool used to write database code. SSIS and Profiler were left out of this merger.

These changes won’t leave users completely satisfied. For starters, doing normal query tasks, such as viewing execution plans and even just running queries, is much slower. Screens can become quite unresponsive while queries run, and it takes much longer to pull up execution plans, sometimes causing the window to seize up. Microsoft also changed the layout of the execution plans, so larger ones are much harder to read. Your favorite keyboard shortcuts have been taken away, and right-clicking on a view -- which used to allow you to edit it -- now takes you directly to query builder. A lot of DBAs won’t like this.

But these annoyances are greatly outweighed by richer management capabilities overall. One place Microsoft has succeeded beyond expectations is in the management of SQL error logs. SSMS not only displays the SQL error logs, but combines them with the Windows NT event logs in the same pane, allowing DBAs to see all the relevant information in one place. While we’re on the topic of combined logs, this is a good place to mention the enhancements to Profiler, which now combines SQL trace logs with Perfmon logs. This makes it much easier to correlate performance counters with specific queries. In fact, it’s very easy to find exactly what you’re looking for because all you have to do is click on the performance spike in the Perfmon log and it automatically takes you to the relevant place in the Profiler log.

At the heart of SQL’s new management capabilities are the DMVs (Dynamic Management Views) and functions, which give admins much deeper insight into database operations. Microsoft used to hide the deeper aspects of the database, but meeting the demands of managing large databases means giving DBAs more control. DMVs provide greater visibility into memory, index usage, table stats, server stats, security, and on and on. So many things you couldn’t see at all before are now at your fingertips.

In addition, DBAs get ready-made reports on detailed database and server stats, including schema changes, memory and index usage, resource consumption, blocking, and many others. SSMS relies on SSRS for this functionality, and I find that it sometimes bogs down when bringing up a report on a busy server, or on a large database with a lot of objects. I’m also a little disappointed that Microsoft didn’t provide the RDL (Report Definition Language) code. As a result, you can’t change the layouts of the canned reports, nor can you add your own reports or set up subscriptions, as you normally can in SSRS. You can manually export the reports to Excel or PDF, however.

One key design goal for SQL Server 2005 was tighter security out of the box. To reduce the surface area for attacks, the most dangerous features are disabled until the DBA turns them on. This is a help, but not earth-shattering to experienced DBAs, who mostly have their lockdown scripts written and don’t find it much trouble to run them on install. The truly significant security improvements revolve around password management and rights assignment. Native SQL passwords can now be set to respect Windows password policies, which means that you can require your SQL passwords to have the same password strength, expiration, history, lockout threshold, lockout duration, and age policies as your Windows accounts. You can even require the user to change the password at the next log in. Other important security enhancements include more control over schema changes (discussed under coding enhancements below) and the capability to encrypt data at the column level.

SQL Server 2005 also brings some exciting new features for increasing availability and smoothing disaster recovery. In the area of HA (high availability), Microsoft has introduced database mirroring and made some important changes to replication. Now schema changes will be replicated automatically to the subscribers; no longer must the DBA run code by hand to do it. Snapshots no longer need to be restarted after failures, but will automatically resume after network errors or other glitches, allowing you to get your HA scenario up and running much faster. Also, logical record replication, which allows you to define parent and child data that will either be replicated together or not at all, ensures more consistent HA scenarios and eliminates logical errors on the subscriber side.

Mirroring allows you to specify a database on another server to be kept in synch with a primary database. Doing this allows you to automatically fail over to the mirror server should the primary fail, providing SQL Server’s first zero data loss HA scenario. This configuration can be expensive to applications and networks, so you’ll need to test it thoroughly, but the capability is finally here. Note, however, that mirroring doesn’t yet qualify as a production feature in this release. Microsoft is shipping SQL Server 2005 with mirroring turned off, and the feature is unsupported.

Mirroring performed solidly in my tests, handling fail-overs of a 100GB database without a hitch. The monitoring around mirroring, however, is still very primitive, consisting of only a small number of DMVs and performance counters, so it’s not especially easy to find out how much data has been replicated, how far behind the secondary system is from the primary, or the cost of the replication. The difficulty in gaining visibility into the process may fray some nerves.

As for disaster recovery, SQL Server 2005 slashes the downtime necessary when restoring a backup. The new online restore capability allows users to begin working in the database as objects become available. With SQL Server 2000, the entire database must be restored before it can be made available to users. Given the time it takes to completely restore a multi-terabyte database, online restores could mean the difference between operating productively and being crippled for hours.

Similarly, you can now fix database corruption with less impact on users. Page-level restore allows you to restore a single corrupted page without hindering the availability of the rest of the database.

Developing for a new age

Code enhancements can be divided into two basic categories: those for developers and those for DBAs. Apart from the .Net/CLR integration, CTEs (Common Table Expressions) are arguably the most significant for developers. Besides elegantly solving the problem of recursive queries, which I won’t get into, CTEs allow developers to simplify code by using inline query definitions in place of views or derived tables. CTEs not only streamline complex queries, but perform the same as derived tables, so there isn’t any extra overhead to implementing them.

Another important enhancement: The TOP clause, which allows you to define that only a certain number of a result set is returned (such as the top 10 salespeople by revenue in the United States), can now take a parameter as an argument. I know it doesn’t sound like much, but this is one of those cases where a relatively small change can make a huge difference in development effort. In short, instead of having to include a static value (TOP 10, for example) in the code, you can now include a variable, allowing users to report the TOP 10 or TOP 100 (or any number they like) on the fly.