Tuesday, September 30, 2008

Administrator & Monitoring Change Data Capture in SQL Server 2008

SQL Server 2008 introduces a new feature called Change Data Capture (CDC). CDC Captures DDL and DML activities on a SQL Server Table and places the changes in a separate SQL Server relational table. In this article we will see how DBA can administer and monitor CDC in SQL Server 2008. The CDC feature is disabled by default at the database level. A member of the sysadmin server role must enable a database for change data capture. Once the database is enabled for CDC any member of the dbo fixed database role can enable a table within the database for change data capture.

Overview of Change Data CaptureOnce CDC is enabled at the database level, the next step is to enable CDC for a specific table for which the change needs to be captured. The CDC feature gathers the changed data from the database transaction log file and inserts the change information in an associated change table which is created during the setup and configuration process of CDC. There is a one to one relationship that exists between the source table and the change capture table. You can have a maximum of two change tables for a single source table. As the CDC feature needs to continuously read the transaction log file it’s obvious that for CDC to work SQL Server Agent should be running. As this feature in entrenched in the transaction log architecture and thus a lot of the metadata in CDC is related around the concept of a Log Sequence Number (LSN)

Definition of Log Sequence Number (LSN)Every record in the Microsoft SQL Server transaction log is uniquely identified by a log sequence number (LSN). LSN's are ordered such that if LSN2 is greater than LSN1, the change described by the log record referred to by LSN2 occurs after the change described by the log record LSN.

How to Enable CDC for a SQL Server 2008 Database1. Connect to a SQL Server 2008 instance using SQL Server Management Studio2. In the query window, type the following TSQL Query to create a ChangeDataCapture Database:

Use MasterGo

IF EXISTS (SELECT name FROM sys.databases WHERE name = N'ChangeDataCapture')DROP DATABASE ChangeDataCaptureGO

USE [master]GOCreate Database ChangeDataCaptureGo

3. Once the database is successfully created you need to enable the change data capture feature for the database, this can be done by executing the following TSQL Query:

Use ChangeDataCaptureGoEXEC sys.sp_cdc_enable_dbGO

4. Execute the query below to check whether the database is enabled for CDC:

Select [name] as DBName, is_cdc_enabled from sys.databases

The value of 1 for the is_cdc_enabled column means that the database is enabled for CDC and value of 0 means that the database is not enabled for CDC.

5. Once the database is enabled for CDC, you will see a new cdc schema, cdc user and a few metadata tables and other system objects created in the ChangeDataCapture database. The most important things which a DBA needs to keep in mind when enabling CDC for a database is to make sure that there are not a cdc schema or cdc user existing in the database before configuring CDC. If there was a cdc schema or cdc user existing in the database then the configuration of CDC will fail, so a DBA needs to remove or rename any previously existing cdc schema or user from the database before configuring CDC.

How to Enable CDC for a SQL Server 2008 Database Table1. Now let’s create a Currency Table in the ChangeDataCapture Database by executing the following TSQL Query:

4. Execute the query below to check whether the table is enabled for CDC:

Use ChangeDataCaptureGoSelect [name], is_tracked_by_cdc from sys.tablesGO

The value of 1 for the is_tracked_by_cdc column means that the CDC is enabled for the table and the value of 0 for is_tracked_by_cdc column means that the CDC is disabled.

5. Once you have enabled CDC for the Currency table, another table is created for keeping changed data and the information about the changes in the source table. The new table created will have the cdc.dbo_Currency_CT name as highlighted in the above snippet:

How to disable CDC for a SQL Server 2008 Database TableDatabase Administrators can run the TSQL Query below to disable CDC on a table:

How to disable CDC for a SQL Server 2008 DatabaseDatabase Administrators can run the TSQL Query below to disable CDC on a database. The SQL Server Agent should be running when DBA a decides to disable CDC for the database. Once the TSQL Query below has been executed successfully you could see that the SQL Server Agent Jobs which monitor the changes happening to table by reading the transaction log are deleted and also all the system tables related to CDC are also dropped.

Use ChangeDataCaptureGoEXEC sys.sp_cdc_disable_dbGo

Growth of Change Data Capture System TablesOnce in every three days there is an automatic cleanup process that occurs. For more intense environments you can leverage the manual method using the system stored procedure: sys.sp_cdc_cleanup_change_table. When you execute this system procedure you specify the low LSN and any change records occurring before this point are removed and the start_lsn is set to the low LSN you specified.

Change Data Capture System TablesOnce the DBA has enabled CDC for the Currency table, SQL Server creates the tables below along with a schema named CDC. CDC enables change tracking on tables so that Data Manipulation Language (DML) and Data Definition Language (DDL) changes made to the tables can be captured. The system tables that store information used by change data capture operations are mentioned below:

Table Name

Description

cdc._CT

It returns one row for each change made to a captured column in the CDC enabled source table

cdc.captured_columns

It returns one row for each column tracked in a capture instance

cdc.change_tables

It returns one row for each change table in the database

cdc.ddl_history

All the Data Definition Language (DDL) change made to tables that are enabled for CDC are captured

cdc.lsn_time_mapping

It returns one row for each transaction having rows in a change table. This table is used to map between log sequence number (LSN) commit values and the time the transaction committed

cdc.index_columns

It has information about all the indexes which are associated to a CDC enabled table

cdc.cdc_jobs

This table is created in MSDB database and has the configuration parameters for CDC agent jobs

dbo.systranschemas

This table is used to track schema changes in articles published in transactional and snapshot publications. This table is stored in both publication and subscription databases

Change Data Capture Dynamic Management ViewsDatabase Administrators can monitor the change data capture process to determine if changes made are being written correctly and with a reasonable latency to the change tables. The following below mentioned DMV can be used to identify if there are any errors:1. sys.dm_cdc_log_scan_sessions2. sys.dm_cdc_errors

Change Data Capture System FunctionsChange data capture records DDL & DML activity applied to SQL Server 2008 tables, supplying the details of the changes in an easily consumed relational format. Column information that mirrors the column structure of a tracked source table is captured for the modified rows, along with the metadata needed to apply the changes to a target environment. The following functions are used to return information about the changes:1. cdc.fn_cdc_get_all_changes_2. cdc.fn_cdc_get_net_changes_3. sys.fn_cdc_decrement_lsn4. sys.fn_cdc_get_column_ordinal ( 'capture_instance' , 'column_name' )5. sys.fn_cdc_get_max_lsn6. sys.fn_cdc_get_min_lsn7. sys.fn_cdc_has_column_changed8. sys.fn_cdc_increment_lsn9. sys.fn_cdc_is_bit_set10. sys.fn_cdc_map_lsn_to_time11. sys.fn_cdc_map_time_to_lsn

Change Data Capture SQL Server Agent JobsThere are two SQL Server Agent Jobs which are created when CDC is enabled for a particular table.

ConclusionDatabase Administrators can use this feature to monitor DDL and DML changes which are happening to tables.