Tips from the SQL Server MVPs - 10 May 2001

Editor’s Note: Welcome to SQL Server Magazine’s monthly, Web-only series of SQL Server tips brought to you by the Microsoft SQL Server Most Valuable Professionals (MVPs). Microsoft introduced the MVP program in the 1990s to recognize members of the general public who donate their time and considerable computing skills to help users in various Microsoft-hosted newsgroups. MVPs are nominated by Microsoft support engineers, team managers, and other MVPs who notice a participant's consistent and accurate technical answers in various electronic forums and other peer-to-peer venues. For more information about the MVP program, see http://support.microsoft.com/support/mvp/. The MVPs donate their SQL Server Magazine author fees for these tips to the World Food Programme. To donate free food to a hungry person today, visit http://www.thehungersite.org.

How do you copy permissions from a user to a database role or vice versa?

The CopyPermissions stored procedure in Listing 1 directly accesses the sysusers and syspermissions system tables. The sysusers table contains the details on all defined users for that database. The syspermissions table contains the permissions (e.g., SELECT, UPDATE) that users have on objects (e.g., tables) within that database. Note that Microsoft doesn't recommend directly accessing system tables because these tables can change between releases. But this stored procedure successfully copies permissions between users and roles. For example, to copy permissions from the abc user to the abcRole role, you execute the following statement:

EXEC CopyPermissions 'abc', 'abcRole'

Note that the sysusers system table contains both roles and users. Also, note that the CopyPermissions stored procedure doesn't directly modify the sysprotects virtual table—you don't update virtual tables; you update the underlying physical table (syspermissions). Sysprotects retrieves its data from the syspermissions system table, so sysprotects automatically reflects any changes to syspermissions. Previous versions of SQL Server used the sysprotects table; therefore, a lot of Microsoft and user-written stored procedures utilize sysprotects. When Microsoft decided to use a different table layout for permissions, the company created a new physical syspermissions table and created a virtual sysprotects table for backward compatibility.

How do you normalize repeating groups?

Listing 2 shows the code that creates a sample Marks table in which each row contains one student's ID and that student's grades in courses numbered 300, 301, and 302. To populate the Marks table with the IDs and grades of four students, for example, execute the following code:

The output should resemble the table in Figure 1. The Marks table doesn't adhere to the first normal form of database design, which doesn't allow repeating groups. The courses in Figure 1 are the repeating groups; NULLs indicate courses the student didn’t take; not all students take all courses.

Listing 3 shows the CREATE TABLE statement for the normalized table MarksNormalized. You can use the INSERT statement in Listing 4 to copy the data from the Marks table to the MarksNormalized table.

To create a derived table on the fly that contains three rows, each of which has one of the possible course IDs, you'd use SELECT statements to set up literals with the course IDs, then use UNION ALL between the SELECT statements. Let’s give this derived table the alias C. Then, you perform a cross-join between the Marks table and the C table. The result of the cross-join is a table in which each row from the Marks table appears three times—once for each course ID in the C table.

The cross-join also returns the studentid and courseid columns. A third column called mark holds the result of the CASE expression, which returns in each row only the grade corresponding to the courseid in the C table. The alias T denotes the derived table that wraps the cross-join result, so eliminating any rows with NULL in the mark column is easy. Then, you insert the output into the MarksNormalized table. If you issue the following query against the MarksNormalized table

As you’re building an index on a table, does SQL Server block access to that table?

SQL Server doesn't block read access to the table that you’re building an index on unless you’re building a clustered index. However, if someone tries to update the table’s data, SQL Server blocks the update—regardless of the type of index you're creating—because you can't get an Intent Exclusive (IX) lock at the table level. The IX lock conflicts with the Intent Shared (IS) lock that the CREATE INDEX statement holds.

When SQL Server finishes building the index, it must modify its own system tables to reflect the changes; the two most obvious system tables modified here are sysindexes and sysobjects. Because you don't use "normal" SQL to build the index, you can't trace SQL Server's actions.

In this last phase, the IX locks appear not on the table but on the system table entries that correspond to the table that SQL Server is indexing. Typically, the final phase is short because SQL Server has already sorted the data and extracted the row pointers (physical row locators or clustered index keys). The only remaining task is to modify the system tables, which is a quick process. If the IX locks on system tables are in place, you can't execute a query on your table because you can't read the required information from the sysindexes and sysobjects system tables.

Why does data disappear from the subscriber when you run a horizontally partitioned merge publication?

Magically disappearing data has been a problem since merge replication's release with SQL Server 7.0. Merge replication gives you the ability to process a transaction on a subscriber database and have the data propagate back to the publisher database. Such reverse propagation presents a unique problem when you deal with a horizontally partitioned publication.

When you apply a horizontal partition, you essentially add a WHERE clause to a query. This WHERE clause determines which table rows to send to a particular subscriber. Suppose you split a contact database for your mobile salespeople so that each mobile salesperson's database corresponds to the regions for which that salesperson is responsible. For example, if Joe Smith were responsible for the Eastern region, his database would contain only Eastern region contacts. The horizontal partition enforces this restriction.

Now, suppose that the regional responsibilities change and Joe becomes responsible for the Southeastern region. Joe's database contains data for the Eastern region, which conflicts with the horizontal-partitioning rules. At the next merge, Joe will receive all the Southeastern region contacts so that his database complies with the partitioning rules. However, now Joe has data from both the Eastern and Southeastern regions, creating a conflict.

Merge replication won't let you have data that violates your partitioning rules. Therefore, when the merge job sends the Southeastern region data to Joe, it also removes the Eastern region data from Joe's database. This resolution causes an apparent loss of data, and resolving the conflict creates a significant amount of network traffic whenever the partitioning rules change.

When data conflicts with the partitioning rules, SQL Server removes that data to comply with the partition. However, SQL Server doesn't restrict the data that enters the database. Joe can add a new contact outside his region. SQL Server saves this data in Joe's database, but the data won't remain there after the next merge. Thus, users appear to lose data from their databases, but SQL Server hasn't actually lost the data. SQL Server stores that data in the publisher database and eliminates any data from the subscriber database that doesn't match the horizontal filter.

In SQL Server 7.0, if you make any changes at the subscriber database, SQL Server simply dumps the changes and overwrites them. Therefore, you have to ensure that everything is in sync before changing the partitions or you can lose data. SQL Server 2000 includes a flag that forces the upload of all changes before realigning the data with the partition. You can accomplish this pre-alignment upload by specifying reinit with upload before you realign the data.

Is using the TOP N clause faster than using SET ROWCOUNT N to return a specific number of rows from a query?

With proper indexes, the TOP N clause and SET ROWCOUNT N are equally fast, but with unsorted input from a heap, TOP N is faster. With unsorted input, the TOP N operator uses a small internal sorted temporary table in which it replaces only the last row. So, if the input is <I>nearly sorted</I>, the TOP N engine must delete or insert the last row only a few times. Nearly sorted means you're dealing with a heap with ordered inserts for the initial population and without many updates, deletes, forwarding pointers, and so on afterward.

A nearly sorted heap is more efficient to sort than sorting a huge table. In a test that used TOP N to sort a table with the same number of rows but with unordered inserts, TOP N wasn't as efficient anymore. Usually, the I/O time is the same both with an index and without; however, without an index SQL Server must do a complete table scan. CPU time and elapsed time, which are lower for the TOP N case in the output when you run the code in Listing 5, show the efficiency of the nearly sorted heap. The I/O time is, of course, the same; SQL Server must read all the rows either way.

How do you get information about default constraints through Information_Schema views?

You can get information about all types of constraints through the Information_Schema views, with the exception of default constraints. So, you can find the missing views by using the code in Listing 6. However, maybe a better idea would be to not create the master database views with the Information_Schema as owner; then, you would update the system tables directly. Note that Microsoft doesn't support directly updating system tables.

From the Blogs

Many organizations today cannot use public cloud solutions because of security concerns, administrative challenges and functional limitations. However, they still need a centralized platform where end users can conduct self-service analytics in an IT-enabled environment....More

It is crucial to move away from data and analytics stored on individual desktop computers. Today’s solutions must promote holistic, collective intelligence. The strong, continued alliance between Microsoft and Pyramid Analytics helps make all this possible....More

To become a truly data-driven enterprise, many business leaders recognize that they must extend the capabilities of self-service business intelligence (BI) and analytics to more of their business users. Many BI tools tackle part of this need, but they don’t offer a complete enterprise solution....More