Thursday, December 4, 2008

When I first started using SQL Server as a novice, I was initially confused as to the differences between clustered and non-clustered indexes. As a developer, and new DBA, I took it upon myself to learn everything I could about these index types, and when they should be used. This article is a result of my learning and experience, and explains the differences between clustered and non-clustered index data structures for the DBA or developer new to SQL Server. If you are new to SQL Server, I hope you find this article useful.

As you read this article, if you choose, you can cut and paste the code I have provided in order to more fully understand and appreciate the differences between clustered and non-clustered indexes.

Part I: Non-Clustered Index

Creating a Table To better explain SQL Server non-clustered indexes; let’s start by creating a new table and populating it with some sample data using the following scripts. I assume you have a database you can use for this. If not, you will want to create one for these examples.

Create Table DummyTable1(EmpId Int,EmpName Varchar(8000))

When you first create a new table, there is no index created by default. In technical terms, a table without an index is called a “heap”. We can confirm the fact that this new table doesn’t have an index by taking a look at the sysindexes system table, which contains one for this table with an of indid = 0. The sysindexes table, which exists in every database, tracks table and index information. “Indid” refers to Index ID, and is used to identify indexes. An indid of 0 means that a table does not have an index, and is stored by SQL Server as a heap.

Now let’s add a few records in this table using this script:

Insert Into DummyTable1 Values (4, Replicate ('d',2000))GO

Insert Into DummyTable1 Values (6, Replicate ('f',2000))GO

Insert Into DummyTable1 Values (1, Replicate ('a',2000))GO

Insert Into DummyTable1 Values (3, Replicate ('c',2000))GO

Now, let’s view the contests of the table by executing the following command in Query Analyzer for our new table.

Select EmpID From DummyTable1GO

Empid

4613

As you would expect, the data we inserted earlier has been displayed. Note that the order of the results is in the same order that I inserted them in, which is in no order at all.

Now, let’s execute the following commands to display the actual page information for the table we created and is now stored in SQL Server.

DBCC page(dbid, fileno, pageno, option)Where:dbid = database id.Fileno = fileno of the page. Usually it will be 1, unless we use more than one file for a database.Pageno = we can take the output of the dbcc ind page no.Option = it can be 0, 1, 2, 3. I use 3 to get a display of the data. You can try yourself for the other options.Run this script to execute the command:DBCC TRACEON (3604)GODBCC page(@DBID, 1, 26408, 3)GOThe output will be page allocation details.DBCC TRACEON (3604) GOdbcc page(@DBID, 1, 26255, 3)GO

The data will be displayed in the order it was entered in the table. This is how SQL stores the data in pages. Actually, 26255 & 26409 both display the data page.

I have displayed the data page information for page 26255 only. This is how MS SQL stores the contents in data pages as such column name with its respective value.

Now, we see two more rows than before, which now contains index page details. Page 26408 displays the page allocation details, and pages 26255 and 26409 display the data page details, as before.

In regard to the new pages, page 26411 displays the page allocation details of an index page and page 26410 displays the index page details.

MS SQL generates a page (pagetype = 10) for an index and explains the page allocation details for an index. It shows the number of index page have been occupied for an index.

Let us see what would be the output for page 26411, that is page type = 10

IAM: Single Page Allocations @0x308A608E

-----------------------------------------Slot 0 = (1:26410)

Let us view page 26410 to see the index page details.

DBCC TRACEON (3604)GO

DBCC page(10, 1, 26410, 3)GO

SQL populates the index column data in order. The last column (?) is pointed to the row locator.

Here are the results, using two different methods:

Method I FileID PageID EMPID ?

1 26410 1 0x8F66000001000200

1 26410 3 0x2967000001000000

1 26410 4 0x8F66000001000000

1 26410 6 0x8F66000001000100

The row location display in one of two ways:

If the table does not have a clustered index, the row locator will be combination of fileno, pageno and the no of rows in a page. If the table does have clustered index, the row location will be clustered index key value.Non-clustered indexes are particularly handy when we want to return a single row from a table.

For example, to search for employee ID (empid = 3) in a table that has a non-clustered index on the empid column, SQL Server looks through the index to find an entry that lists the exact page and row in the table where the matching empid can be found, and then goes directly to that page and row. This greatly speeds up accessing the record in question.

Select EmpID, EmpName From DummyTable1 WHERE EMPID = 3

Now, let’s insert some more rows in our table and view the data page storage of our non-clustered index.

Triggers are special types of Stored Procedures that are defined to execute automatically in place of or after data modifications. They can be executed automatically on the INSERT, DELETE and UPDATE triggering actions.

There are two different types of triggers in Microsoft SQL Server 2000. They are INSTEAD OF triggers and AFTER triggers. These triggers differ from each other in terms of their purpose and when they are fired. In this article we shall discuss each type of trigger.

First of all, let's create a sample database with some tables and insert some sample data in those tables using the script below:

AFTER TriggersThe type of trigger that gets executed automatically after the statement that triggered it completes is called an AFTER trigger. An AFTER trigger is a trigger that gets executed automatically before the transaction is committed or rolled back.

Using the below script, first we shall create a trigger on the table USER_MASTER for the INSERT event of the table.

By looking at the output, we can conclude that before the transaction is rolled back or committed, the AFTER trigger gets executed automatically.

A table can have several AFTER triggers for each of the three triggering actions i.e., INSERT, DELETE and UPDATE. Using the below script, we shall create two triggers on the table User_Master for the INSERT triggering action.

From the output we can conclude that when the user tries to insert data in the table USER_MASTER, three triggers are executed automatically. That is, you can write several AFTER triggers on one table for each of the three triggering actions.

Similarly, we can write several AFTER triggers on DELETE and UPDATE triggering actions.

Note: If a table has multiple AFTER triggers, then you can specify which trigger should be executed first and which trigger should be executed last using the stored procedure sp_settriggerorder. All the other triggers are in an undefined order which you cannot control.

An AFTER trigger can be created only on tables, not on views.

Using the script below, first we shall create a simple view [vwUserMaster] which will fetch the Username and Password from the table USER_MASTER.

From the Output we can conclude that we cannot create an AFTER trigger on views.Like stored procedures and views, triggers can also be encrypted.

The trigger definition is then stored in an unreadable form. Once encrypted, the definition of the trigger cannot be decrypted and cannot be viewed by anyone, including the owner of the trigger or the system administrator.

Since the trigger trgEncrypted is created with the option WITH ENCRYPTION, the trigger definition is hidden and there is no way that one can easily decrypt the trigger code.

We all know that the DML statements change or modify data. Sometimes it becomes necessary for the triggers to have the access to the changes being caused by the DML statements. SQL Server 2000 provides four different ways to determine the affects of the DML statements. The INSERTED and DELETED tables, popularly known as MAGIC TABLES, and update () and columns_updated() functions can be used to determine the changes being caused by the DML statements.

The below table depicts the contents of the INSERTED and DELETED tables for three different table Events

EVENTINSERTEDDELETED

InsertContains the inserted rowsEmpty

DeleteEmptyContains the rows to be deleted

UpdateContains the rows after updateContains the rows before update

Note that the Magic Table does not contain the information about the columns of the data-type text, ntext, or image. Attempting to access these columns will cause an error.

The update() function is used to find whether a particular column has been updated or not. This function is generally used for data checks.

CREATE TRIGGER trgUddate ON User_DetailsFOR UPDATEASIf UPDATE(FName)BEGINPRINT('AFTER Trigger [trgUddate] - Executed - First Name has been updated')ROLLBACK TRANSACTIONENDelse If UPDATE(LName)

Depending upon the column updated, a message will be displayed. With this feature we can determine which column in the table has been updated, and then proceed with the business rules to be implemented further.

Columns_Update() function returns a varbinary data type representation of the columns updated. This function return a hexadecimal values from which we can determine which columns in the table have been updated.

INSTEAD OF Triggers

A trigger which gets executed automatically in place of triggering actions i.e., INSERT, DELETE and UPDATE is called an INSTEAD OF trigger.

INSTEAD OF triggers gets executed automatically before the Primary Key and the Foreign Key constraints are checked, whereas the traditional AFTER triggers gets executed automatically after these constraints are checked.

UserID 100 does not exist in the User_Master table, so the Foreign Key constraint has been checked and an error message is displayed. What we can conclude is: AFTER triggers gets executed automatically after the PK and FK constraints.

From the output, it is clear that you cannot create two INSTEAD OF triggers on the view/ table for the same event.

Note: An important point to be noted is that INSTEAD OF DELETE and INSTEAD OF UPDATE triggers cannot be defined on tables that have corresponding ON DELETE or ON UPDATE cascading referential integrity defined.

At last, how would you know what are the triggers associated with the table and what type of the trigger it is? Whether AFTER or INSTEAD OF?

The solution for this question is sp_helptrigger. This stored procedure gives all the information about the triggers such as Event on which the trigger gets executed, the type of the trigger, etc.

Triggers can be used in the following scenarios, such as if the database is de-normalized and requires an automated way to update redundant data contained in multiple tables, or if customized messages and complex error handling are required, or if a value in one table must be validated against a non-identical value in another table.

Triggers are a powerful tool that can be used to enforce the business rules automatically when the data is modified. Triggers can also be used to maintain the data integrity. But they are not to maintain data integrity. Triggers should be used to maintain the data integrity only if you are unable to enforce the data integrity using CONSTRAINTS, RULES and DEFAULTS. Triggers cannot be created on the temporary tables.

If the current user does not have the admin permission.Please refer to the following suggestion . Open Microsoft SQL Server 2005 -> Management Studio Express Edition , right click Management Studio Express Edition and choose Run as Administrator to connect SQL Server with your administrator,select Security->Logins ,then double-click the user account you want to fix , select Server Roles section and check sysadmin check box, click OK to save it. You can also open Microsoft SQL Server 2005-> Configuration Tools-> SQL Server Surface Area Configuration with your current user, click Add New Administrator ,double-click the Member of SQL Server SysAdmin role on yourinstantname in Available priviledges , click OK to save it. Hope this helps.

Monday, December 1, 2008

When a new version is released or when existing software is upgraded there is always a learning curve and getting used to new ways of doing things. This article demonstrates how to use SQL Server 2005's catalog views in comparison with using system tables in SQL Server 2000.

Catalog views are a storehouse for static metadata. They contain data about the server, database, objects, logins, permissions, etc.

Catalog views are the general interface to the catalog metadata, and we should start using them rather than accessing the system tables directly.

All of the SQL Statements that are used in SQL Server 2000 can still be used in SQL Server 2005. It is advisable to start using the catalog views since the underlying system tables might change in future releases and catalog views are the general interface to the catalog metadata.

It contains system catalogs that keep information about disk space, file allocations,usage, system wide configuration settings, login accounts, the existence of otherdatabase, and the existence of other SQL Servers (for distributed operations).

ModelIt is a simply a template database. Every time you create a new database, SQL Servermakes a copy of model to form the basis of the new database.

TempdbTemporary database, tempdb, is a workspace. SQL Server’s tempdb database is uniqueamong all other databases because it is recreated not recovered every time SQL Serveris startedPubsThis is a sample database used extensively by much of SQL Server documentation.It’s available to everyone in the SQL Server community

NorthwindThis is a sample database that was originally developed for the use of MicrosoftAccessMsdbThis database is used by the SQL Server Agent Service, which performs scheduledactivities such as backups and replication tasks.

Database Files: A database file is nothing more than an operating systemfile. SQL Server 2000 allows the following three types of database files:

System Tables-------------IntroductionWhen a SQL Server object is created, its properties are called metadata.The metadata is stored in special System Tables. For example, in SQL 2000, when a new column was created, the column name and data type could be found in an internal System Table called syscolumns.All SQL objects produce metadata. Every time SQL 2000 Enterprise Manager or SQL 2005 SQL Server Management Studio is browsed, the information displayed about database, tables, and all objects, comes from this metadata.There are many uses for this metadata, including gathering performance statistics, discovering table and column similarities and differences during a database upgrade, and obtaining lock information. In previous versions of SQL Server, these System Tables were exposed and could be queried like any standard table.However, starting with SQL 2005, System Tables are hidden and they cannot be directly queried. Even with full DBA rights, System Tables are restricted. Although not directly accessible, there are built in views and procedures for extracting metadata. Some of these are new in SQL 2005; others were carried forward from pervious versions.Most have the advantage of being more readable and self-describing than querying System Tables. If you have legacy scripts directly referencing System Tables, there are many new System Views that will directly take their place.System ViewsSystem Views are predefined Microsoft created views for extracting SQL Server metadata. There are over 230 various System Views. To display all the views in SQL 2005, launch the SQL Management Studio; expand Databases, System Databases, and select master, Views, System Views.These System Views will be automatically inserted into any user created database. The System Views are grouped into several different schemas. In SQL 2005, schemas are used as security containers. There can be several different schemas inside a single database. This is a better ANSI implementation of schemas compared to their use in SQL 2000. See Marcin Policht's excellent article; SQL Server 2005 Security, at http://www.databasejournal.com/features/mssql/article.php/3481751 for a detailed explanation of SQL 2005 schemas and security.

Information SchemaThe first group of System Views belongs to the Information Schema set. Information Schema is an ANSI specification for obtaining metadata. There are twenty different views for displaying most physical aspects of a database, such as table, column, and view information.Information Schema views were available in SQL 2000 and should continue to appear in future versions of SQL. They are a few ANSI terms that translate differently in SQL. An ANSI "Catalog" is a SQL "Database"; an ANSI "RowVersion" is a SQL "Timestamp"; and an ANSI "Timestamp" is a SQL "DateTime." Aside from this, Information Schema views are easy to implement.For an example, we will create a small table with a few columns.CREATE DATABASE TestGOUSE TestGOCREATE TABLE MyTable(Col1 int,Col2 varchar(10),Col3 datetime)GOTo use Information Schema views, select them like any standard view.

The following TSQL will display column and table information on the new database;SELECT *FROM INFORMATION_SCHEMA.TABLESSELECT *FROM INFORMATION_SCHEMA.COLUMNSWHERE TABLE_NAME = 'MyTable'

Most of the Information Schema view names are self-explanatory. INFORMATION_SCHEMA.TABLES returns a row for each table. INFORMATION_SCHEMA.COLUMNS returns a row for each column. A few though, refer to ANSI names. INFORMATION_SCHEMA.COLUMN_DOMAIN_USAGE contains a row for each column created with a user-defined type, and INFORMATION_SCHEMA.DOMAIN lists a row for each user-defined type. INFORMATION_SCHEMA.ROUTINES shows a record for each stored procedure or function. A benefit to Information Schema views is that because they are an ANSI standard, you will find them in many other database packages.

Catalog ViewsNew in SQL 2005 are Catalog Views. Microsoft recommends them as the most general interface to the catalog metadata. They are efficient and all user available catalog metadata is exposed. The amount of views is impressive. Best of all, many of the columns returned by Catalog Views are self-describing.Documentation organizes Catalog Views into several different groups:

The views we need to gather table and column information, like the previous example, are grouped under "Objects Catalog Views". This group includes views on tables, columns, indexes, constraints, and triggers to name a few. Our example requires two views, "sys.tables" and "sys.columns." The columns view will need to be joined on the table view as shown below.SELECT *FROM sys.tablesSELECT *FROM sys.columns INNER JOIN sys.tables ONsys.tables.object_id = sys.columns.object_idWHERE sys.tables.name = 'MyTable'

Sys AllThere are four views in a Sys_All group. These views contain information about the System Views as well as user created objects. The views are sys.all_columns, sys.all_objects, sys.all_parameters, and sys.all_views.

Dynamic Management ViewsThe last groups of views are called Dynamic Management views, or DM. They are used to gather statistics stored in memory but not persistent on disk such as thread information, memory usage, and connection details. These offer administrators a fast and reliable method for obtaining performance numbers. For example, to show the statistics for cached queries, execute this DM statement:

SELECT *FROM sys.dm_exec_query_stats

System Stored ProceduresIn addition to the System Views, there are many System Stored Procedures that can be used for administrative purposes. These pre-made procedures return results similar to System Views. They are located under each database, Programmability, Stored Procedures, and System Stored Procedures. They belong to sys schema.

To obtain column information using a System Stored Procedure, execute sp_columns with the following script:

EXEC sys.sp_columns 'MyTable'

ConclusionFor obtaining SQL Sever metadata information, SQL 2005 offers a large variety of pre-made views and procedures. They are easy and fast to implement and usually return information that is far less cryptic than the tools provided in previous versions.

This set of tables maintained by SQL Server is sometimes called the system catalog. We can identify a system table by its name or the object_id or type column in sysobjects.SELECT name from sysobjects where type=’V’ and name like ‘sys%’