Data, Design, and SQL Serverhttps://blogs.msdn.microsoft.com/dfurman
Fri, 30 Sep 2016 14:56:48 +0000en-UShourly1Database restore failure when restoring from URLhttps://blogs.msdn.microsoft.com/dfurman/2015/10/23/database-restore-failure-when-restoring-from-url/
https://blogs.msdn.microsoft.com/dfurman/2015/10/23/database-restore-failure-when-restoring-from-url/#respondFri, 23 Oct 2015 07:12:51 +0000https://blogs.msdn.microsoft.com/dfurman/2015/10/23/database-restore-failure-when-restoring-from-url/SQL Server 2014 and later includes a feature named Managed Backup, which automates scheduled database backup tasks. Managed Backup backs up databases to an Azure storage account using Backup to URL. As a part of its functionality, Managed Backup attempts to automatically resolve operational errors such as broken backup chains. Specifically, if an out of band backup is created by something other than Managed Backup itself, it will try to copy that backup to the storage account, in order to avoid breaking the backup chain.

Recently I have encountered a scenario where this Managed Backup behavior caused an apparent failure to restore a database from backup. A simple workaround exists, and this blog is to describe the issue and the workaround.

I was adding a database to an AlwaysOn availability group using the Add Database wizard. As a part of the wizard, a full backup of the database is taken, and stored on a local disk or share. When a backup is created on a local disk, its block size by default matches the sector size of the disk media. In my case, that was 4096 bytes. Once that backup was created by the wizard, Managed Backup detected that it was an out of band backup, and copied it to the storage account.

Later, I had to restore the database from this backup. However, I received an error:

Msg 3268, Level 16, State 1, Line 46

Cannot use the backup file ‘https://<>.blob.core.windows.net/5777ec70fa184f75a1cb84c660bd97b8/Test1_139d0c9ccecb453d8ab6a4e3fcf6440c_6a40325f71614c949532603886de9270_20151021122734.bak’ because it was originally formatted with sector size 4096 and is now on a device with sector size 65536.

Msg 3013, Level 16, State 1, Line 46

RESTORE HEADERONLY is terminating abnormally.

When backing up to URL, SQL Server uses the 64K block size by default because the sector size presented by Azure blob storage is 64K. Therefore, regular scheduled backups to URL created by Managed Backup have the same block size as the media sector size, and this issue does not occur. But when a backup created on a local disk with 4K sector size is copied to the storage account, and then an attempt is made to restore from it, SQL Server blocks it with the above error message.

The workaround in this case is to explicitly specify the original block size for the backup (4K in this case) using the BLOCKSIZE parameter of the RESTORE statement:

Incidentally, the same problem also causes the restore UI in SSMS to fail when enumerating backupsets within such a backup. It uses the SMO method that executes RESTORE HEADERONLY on the backup blob, which does not include the BLOCKSIZE parameter, and fails with the same error.

I would like to thank my colleague Nacho Alonso Portillo (blog) for his help in debugging this problem.

]]>https://blogs.msdn.microsoft.com/dfurman/2015/10/23/database-restore-failure-when-restoring-from-url/feed/0How to view query plans from Query Store in Management Studiohttps://blogs.msdn.microsoft.com/dfurman/2015/08/28/how-to-view-query-plans-from-query-store-in-management-studio/
https://blogs.msdn.microsoft.com/dfurman/2015/08/28/how-to-view-query-plans-from-query-store-in-management-studio/#respondFri, 28 Aug 2015 14:27:27 +0000https://blogs.msdn.microsoft.com/dfurman/2015/08/28/how-to-view-query-plans-from-query-store-in-management-studio/Query Store is a great new feature in SQL Server 2016 and SQL Azure Database. Among other things, it captures and stores query plans for queries that execute on a SQL Server instance, so that they can be reviewed later, i.e. for performance troubleshooting. The plans are exposed in the sys.query_store_plan view, in the query_plan column. Even though the plan is stored in its XML representation, the data type of this column is nvarchar(max). This avoids some problems with using the xml data type. So far, so good.

But when I tried to copy the plan from the SSMS grid output and paste the XML into a new window, I quickly found that for larger plans, the XML text becomes truncated. This is an SSMS limitation. Luckily, there is a simple workaround that, somewhat ironically, uses XML support in the SQL Server query engine. It is based on a trick that I learned from SQL Server MVP Adam Machanic (blog).

This returns a single row, single column result set that contains an XML value that is clickable in the SSMS output grid. When clicked, this opens a new window, which contains the full query plan XML text wrapped as an XML tag, looking like this:

<?query_plan<ShowPlanXML … </ShowPlanXML>?>

At this point, you can remove <?query_planand ?>, and save the resulting text in a file with .sqlplan extension. Close and re-open this file to view the graphical query plan.

]]>https://blogs.msdn.microsoft.com/dfurman/2015/08/28/how-to-view-query-plans-from-query-store-in-management-studio/feed/0Collecting performance counter values from a SQL Azure databasehttps://blogs.msdn.microsoft.com/dfurman/2015/04/02/collecting-performance-counter-values-from-a-sql-azure-database/
https://blogs.msdn.microsoft.com/dfurman/2015/04/02/collecting-performance-counter-values-from-a-sql-azure-database/#commentsThu, 02 Apr 2015 12:32:00 +0000https://blogs.msdn.microsoft.com/dfurman/2015/04/02/collecting-performance-counter-values-from-a-sql-azure-database/Performance Monitor is a well-known tool for SQL Server diagnostics and troubleshooting. However, when working with SQL Azure, it is obviously not available. That said, in the latest SQL Azure update (V12), the sys.dm_os_performance_counters DMV is now available. This DMV provides the values for all SQL Server (but not Windows) performance counters, and can replace PerfMon in many scenarios.

Below is a script that collects performance counter values for a SQL Azure database and stores them in a table in the same database. I selected a set of counters that are in my view most useful for SQL Azure diagnostics, however the script can be easily modified to collect any other counters, as needed.

I am thankful to Jason Strate (blog) who previously posted a description of different counter types available in sys.dm_os_performance_counters. It was very helpful in writing the script below.

/*Dimitri Furman (dfurman@microsoft.com)2015-03-30

DESCRIPTION:This script provides SQL performance counter values based on data in sys.dm_os_performance_counters DMV.It is intended to be used for SQL Azure performance monitoring and diagnostic data collection,where the traditional methods of collecting PerfMon data are not available. Azure SQL V12 is required.

The script will execute for the specified number of iterations, which can be set to a large number for a quasi-indefinite capture similar to default PerfMon behavior. At the beginning of each iteration, asnapshot of cumulative counters is taken, followed by a wait interval, and then by a second snapshot.At that time, counter values are recorded in the dbo.perf_counter_log table, and a new iteration starts.

This script is partially based on http://www.jasonstrate.com/2012/11/querying-performance-counters-in-sql-server/,which provides an overview of different counter types available in sys.dm_os_performance_counters,and a description of the approach to collect and interpret performance counter data provided by that DMV.

USAGE:1. Use Results To Grid option in SSMS.2. Set the @SnapshotIntervalSeconds and @MaxIterationCount parameters. Optionally, edit the script to define the counters to be collected.3. Run the script.4. Collected data will be found in dbo.perf_counter_log table in the current database.

*/

-- External parametersDECLARE @SnapshotIntervalSeconds int = 10; -- The interval between first and second snapshot during each iterationDECLARE @MaxIterationCount int = 100000; -- The number of iterations. Use a large number for indefinite capture duration.DECLARE @SourceCounter table (object_name nvarchar(128) not null,counter_name nvarchar(128) not null,instance_name nvarchar(128) not null,base_counter_name nvarchar(128) null,PRIMARY KEY (object_name, counter_name, instance_name) ); -- The set of collected counters, to be defined below

]]>https://blogs.msdn.microsoft.com/dfurman/2015/04/02/collecting-performance-counter-values-from-a-sql-azure-database/feed/4Database migration to SQL Azure using a bacpac – a blocker and a workaroundhttps://blogs.msdn.microsoft.com/dfurman/2015/03/25/database-migration-to-sql-azure-using-a-bacpac-a-blocker-and-a-workaround/
https://blogs.msdn.microsoft.com/dfurman/2015/03/25/database-migration-to-sql-azure-using-a-bacpac-a-blocker-and-a-workaround/#commentsWed, 25 Mar 2015 16:04:20 +0000https://blogs.msdn.microsoft.com/dfurman/2015/03/25/database-migration-to-sql-azure-using-a-bacpac-a-blocker-and-a-workaround/Update 2016-09-30: With recent versions of SqlPackage, the workaround for non-PRIMARY filegroups described below is no longer needed. When validating database during export, SqlPackage will no longer complain about non-PRIMARY filegroups. When importing a bacpac with such filegroups into Azure SQL Database, all references to them will be automatically replaced with references to the PRIMARY filegroup.

There may be other scenarios where validation blocks bacpac export, yet using dacpac+data allows a database to be imported. If you encounter any such scenarios, please post a comment on this blog.

One of the ways to migrate an on-premises database to SQL Azure today is by creating a bacpac file, and then importing it into a SQL Azure server. This is described in various documentation, e.g. in the Migration Cookbook for SQL Azure.

I was doing this recently for a customer when I ran into a problem. In the source database, a filegroup other than PRIMARY was present, and all tables were created on that filegroup following a well-known best practice. When creating the bacpac for this database, SqlPackage failed with this error:

*** Error exporting database:One or more unsupported elements were found in the schema used as part of a data package.

Indeed, only the PRIMARY filegroup can exist in SQL Azure databases. In this case, even though it would be perfectly fine to have all tables on the PRIMARY filegroup and not on the original FG_NB filegroup once the database was migrated to SQL Azure, SqlPackage did not allow me to create the bacpac in the first place.

Luckily, there is a workaround that Steven Green from the SSDT team has pointed out. The workaround is as follows:

Instead of creating a bacpac file, create a dacpac+data file. A dacpac+data is very similar to a bacpac, but there are a few minor differences: SQL Azure V12 bacpacs have a higher model schema version, and there is a flag that indicates whether the file is a bacpac. The important difference between them, in this case, is that when SqlPackage creates a bacpac, it validates whether the exported database fits within the feature set of Azure SQL DB, but that check is not done for a dacpac+data. Here is a sample command to create a dacpac+data:

The last two parameters are optional, but may be needed if the database contains any users for Windows authentication logins (Windows authentication is not supported in SQL Azure today), or any other server scoped elements that are not supported in SQL Azure. The SqlPackage.exe executable is found in the DacFx framework directory, e.g. in C:\Program Files (x86)\Microsoft SQL Server\120\DAC\bin. The latest DacFx framework should be used.

Create a bacpac, but from a schema-only copy of the database that does not have the custom filegroup. There are several ways to do this. One is to reverse-engineer the source database into an SSDT project, remove all references to the custom filegroup, and publish the project to a new empty database. Now, when I said that this copy of the database has to be schema-only, it wasn’t entirely correct. At least one row of data must be present in any table (it doesn’t matter which table, and it doesn’t matter what this data is). Once this empty and nearly-schema-only database exists, create the bacpac. Here is a sample command:

Now that both the dacpac+data and the bacpac files are created, it is time for a bit of surgery. As it happens, both dacpac and bacpac files are actually ZIP archives with a different extension. Changing the extension to .zip lets you open the archives in Windows Explorer (or any archive utility that supports ZIP archives, in which case it may not be necessary to change the extension at all). Open the bacpac archive, and copy the two files named model.xml and Origin.xml to a temporary location. Then, open the dacpac+data file, and replace these two files with the ones from the bacpac. Finally, change the extension of the dacpac file to .bacpac. This produces a bacpac file that can be successfully imported into a SQL Azure server.

]]>https://blogs.msdn.microsoft.com/dfurman/2015/03/25/database-migration-to-sql-azure-using-a-bacpac-a-blocker-and-a-workaround/feed/2Using Storage Spaces on an Azure VM cluster for SQL Server storagehttps://blogs.msdn.microsoft.com/dfurman/2014/04/27/using-storage-spaces-on-an-azure-vm-cluster-for-sql-server-storage/
https://blogs.msdn.microsoft.com/dfurman/2014/04/27/using-storage-spaces-on-an-azure-vm-cluster-for-sql-server-storage/#commentsSun, 27 Apr 2014 12:10:26 +0000https://blogs.msdn.microsoft.com/dfurman/2014/04/27/using-storage-spaces-on-an-azure-vm-cluster-for-sql-server-storage/Storage Spaces is a feature of Windows Server that lets you create a virtual disk (also known as a storage space) out of a number of underlying physical disks. The Performance Guidance for SQL Server in Azure VMs whitepaper mentions (on pages 15 and 26) that in Azure VMs, Storage Spaces can be used to improve storage performance for SQL Server workloads. This is achieved by creating a virtual disk from a striped set of physical disks. In Azure VMs, physical disks use Azure Blob storage, which has certain performance limitations. However, creating a storage space on top of a striped set of such physical disks lets you work around these limitations to some extent. With larger VMs that support many (up to 16) physical disks, a very significant increase in storage performance is possible. This can enable Azure as a viable platform for many SQL workloads that would be too constrained by the I/O performance of a single Azure disk otherwise.

In some scenarios, i.e. when creating a new database, creating a storage space may not be necessary. In that case, it will be easier to design the database with multiple data files, and place each file (or a group of files, to enable future expansion) on a separate physical disk, which will achieve a similar result of improving I/O performance. There are, however, some scenarios where this is not feasible or not possible, e.g. when an existing database that uses one large data file needs to be migrated to Azure as-is.

On a standalone Azure VM, creating a storage space is straightforward. Server Manager provides a simple UI to accomplish this task. However, doing this on a failover cluster that uses Azure VMs as nodes presents certain complications. This is relevant for SQL Server because AlwaysOn Availability Groups, which is probably the most commonly used HA solution for SQL Server in Azure VMs, requires an underlying failover cluster. Each SQL Server instance hosting an availability replica needs local storage, which in Azure may need to use a Storage Spaces virtual disk to improve I/O performance, while at the same time running on a node of a failover cluster. This is where the complications arise. When a virtual disk is created on a cluster, Storage Spaces makes the disk and the underlying storage pool accessible to every node in the cluster. It also creates cluster resources (a pool and a disk) to let the cluster manage storage spaces, i.e. to allow one node read-write access while other nodes have read-only access. This cluster awareness is needed in particular to support Clustered Shared Volumes (CSVs), which is arguably the primary scenario for using Storage Spaces on a failover cluster.

However, as it happens, this scenario is not supported in Azure VMs. Disks based on Azure Blob storage do not support persistent reservations, therefore when Storage Spaces creates a new storage pool and a virtual disk on top of that pool, and then creates corresponding cluster resources, these resources fail to come online.

Nevertheless, using Storage Spaces to create better performing local storage on cluster nodes is possible, simply by managing storage pools and virtual disks manually instead of letting the cluster do that. The following steps describe the procedure of adding a Storage Spaces virtual disk to a node of a two node failover cluster. This cluster can then be used to host a deployment of SQL Server Availability Groups with two availability replicas, with the database(s) residing on Storage Spaces virtual disks.

Create a failover cluster using Azure VMs. The detailed procedure is not described here – there are many tutorials, walkthroughs, and scripts available online. The cluster in this example is named CL01, and the two nodes are named CL01N1 and CL01N2. A cluster with more than two nodes can be used as well.

In Azure management portal, attach three empty disks to each node. In this example, three disks are used for simplicity only. The actual number of disks to use depends on the performance requirements of the SQL Server workload, and is limited by the size of the VM. When attaching disks, consider creating disk blobs in separate storage accounts, to avoid a shared performance cap of a single storage account. Using a non-default storage account to create a new empty disk is currently not supported in the Azure management portal, but the Add-AzureDataDisk PowerShell cmdlet can be used to do this. There is no downside in using the maximum size for each disk. Storage in Azure is thin-provisioned and is billed based on actual space usage.

For each node, enable Remote Management. This can be done in Server Manager under the properties of local server. Without Remote Management enabled, server inventory in Server Manager will not complete, and storage pools and disks in the cluster will not be manageable from Server Manager.

In Server Manager on each node, go to File and Storage Services, Disks, and initialize the new disks just added. If disks do not appear, refresh the view first (this can take several minutes). You can also initialize disks from the Storage Management MMC console. Do not create any volumes once disks have been initialized.

Go to File and Storage Services, Volumes, Storage Pools. You should see two Primordial storage pools listed, one for each node. As you click on each Primordial pool, you will see the disks that belong to the pool listed under Physical Disks.

In the first screenshot, the Primordial pool on the CL01N1 node is selected. This is not obvious until you look at the second screenshot that shows the disks in the pool, and notice that the node name is listed in parentheses after each disk name. Note that in the first screenshot, both pools are managed by and available to CL01, which is the cluster name. This is an indication that with Clustered Storage Spaces, it is the cluster that will manage the pools and the virtual disks by default.

Create a new storage pool on one of the nodes (CL01N1 in this case). Right-click on one of the Primordial pools, taking note of the node to which it belongs by checking the list of physical disks, and select New Storage Pool. Click through the wizard, giving the new storage pool a name and selecting the disks that you want added to the pool. In this example, the pool name is StoragePool01, and it is using all three disks that have been added to the node.

Click Create, and close the wizard once all actions are complete. Note that the wizard will not let you create a storage pool based on less than three disks. If a two-disk pool is needed, then a workaround is to create a three-disk pool first, and then remove a physical disk from the pool before creating a virtual disk on top of that pool.

Back in Server Manager, you will see that a new storage pool named StoragePool01 now exists, however there is a warning sign next to it, blocking the creation of a virtual disk on top of this pool. The same warning sign is displayed next to each of the three disks in this pool.

There is no clear indication in the Server Manager why this warning is displayed. However, if you open Failover Cluster Manager and go to Storage, Pools, you will see that a new cluster resource named Cluster Pool 1 is now listed, and its status is Failed. As noted earlier, the cluster fails to bring the resource online because the disks in this pool do not support persistent reservations.

Delete the Cluster Pool 1 resource. Back in Server Manager, refresh the view. Once refresh completes (which could take a few minutes), note that the warning sign next to the storage pool is still displayed, however the warning signs next to each disk are gone. Notably, the storage pool now appears as managed by individual cluster nodes (CL01N1, CL01N2) instead of by the cluster as it was earlier. Note that the value in the Read-Write Server column is blank, meaning that neither node can write to the pool. Therefore, virtual disk creation is still blocked.

Right-click on the storage pool, and select Set Read-Write Access. Be sure to select the node where the disks in this storage pool are attached. This makes the pool writeable on that node, making it possible to create a virtual disk on this pool. Note that the warning sign next to the storage pool is now gone, and the node name (CL01N1) appears as the Read-Write Server.

Right-click on the storage pool just created, and select New Virtual Disk. Click through the wizard, giving the new virtual disk a name, selecting Simple as the layout, and using the maximum size for the pool. In this example, the name of the virtual disk is VirtualDisk01. The Simple layout will stripe all disks is the storage pool, which provides maximum performance but no redundancy, similar to RAID 0. This may sound like a bad idea for SQL Server storage, but keep in mind that in this case, disks are backed by Azure Blob storage, which already provides triple redundancy.

Click Create. The wizard will run through the tasks, however the Initialize Disk task will fail:

As it happens, the reason for the failure is again due to failing cluster resources. Open Failover Cluster Manager, and notice that there are now two new resources added. One is again a pool named Cluster Pool 1, and one is a disk named Cluster Virtual Disk (VirtualDisk01). The disk has the Failed status, and the pool is offline. Delete both resources, first the disk and then the pool.

Back in the Server Manager, refresh the view. You will see that both the storage pool and the new virtual disk have warning signs next to them. Set read-write access on the storage pool to the node where its disks are attached again, in the same way as in the previous step. This will remove the warning sign next to the storage pool, but the sign next to the virtual disk will still be there. This is because the virtual disk is not yet attached to the node that has read-write access on the underlying storage pool.

Right-click on the virtual disk and select Attach Virtual Disk. After confirming the prompt, you will see that the warning sign is gone. Now, if you go to File and Storage Services, Volumes, Disks in Service Manager, you will see the new virtual disk in the list of all disks.

The new disk is also listed in the Disk Management MMC console.

Initialize the new disk, and create a new simple volume. This can be done either in Server Manager, or in Disk Management console. This is no different from the regular process of creating a new volume on a new disk. Using Disk Management may be simpler because it does not depend on Remote Management being available.

At this point, it may look as if you are done, and the new disk is ready for use. Actually, a very important step still remains. Note that in a previous step, we had to attach the new virtual disk manually. By default with Clustered Storage Spaces, virtual disks are not attached automatically when they become available or when the server is started, because they are managed by the cluster. But we had to delete cluster resources, effectively taking the management of virtual disks upon ourselves. This means that we have to ensure that the virtual disk remains available upon server restart. Storage Spaces provides a virtual disk property named IsManualAttach. By default in Clustered Storage Spaces, this is set to True. For the disk to be automatically attached upon server restart, this property must be set to False.

Server Manager does not provide a UI to change the IsManualAttach property, although it provides a read-only view of all virtual disk properties, including this one. Therefore, we have to use PowerShell (incidentally, all previous tasks can be accomplished in PowerShell as well). Open an administrator PowerShell command prompt, and execute the Get-VirtualDisk cmdlet. This will output the properties of the disk that you just created. Note that the IsManualAttach property is set to True. Type and execute the following command: Set-VirtualDisk -FriendlyName VirtualDisk01 -IsManualAttach $False. Execute Get-VirtualDisk one more time to confirm that the property has been changed.

At this point, the disk is ready for use. In this example, we created a single virtual disk out of three physical disks, but other configurations are certainly possible. On larger Azure VMs, up to 16 physical disks can be used to create one or more storage pools and virtual disks.

Finally, repeat the steps on all other nodes of the cluster.

In closing, a caveat needs to be added. Microsoft Azure is a fast changing environment, and this blog is written based on its current state as of this writing (Spring 2014). Future developments may change or invalidate some of the information presented in this blog.

Update 2014-07-14: New guidance on Azure storage disk striping has been recently published as a part of Performance Best Practices for SQL Server in Azure Virtual Machines help topic. Please review this topic, specifically to the Disks and performance considerations section, before provisioning Azure storage to be used by SQL Server.

Update 2014-08-08: One clarification to make is that these steps apply to the scenario where a cluster has already been created, and local Storage Spaces now need to be added on every node. If a cluster is being built from scratch, it would be easier to add local Storage Spaces to each node before the cluster is formed, and then answer No to the prompt to add that storage to the cluster.

Update 2014-11-16: In the scenario where a new node is added to an existing cluster with the above Storage Spaces configuration, if the option to add available storage to the cluster is selected, then the cluster attempts to add the storage pools and virtual disks from the existing nodes, which then promptly become unavailable for local access. To resolve this, follow similar steps as above, deleting cluster objects, allowing write access on a storage pools, attaching the virtual disks (and setting IsManualAttach to False again), and bringing the disks online.

]]>https://blogs.msdn.microsoft.com/dfurman/2014/04/27/using-storage-spaces-on-an-azure-vm-cluster-for-sql-server-storage/feed/17Row count discrepancy between statement level and batch level eventshttps://blogs.msdn.microsoft.com/dfurman/2014/03/31/row-count-discrepancy-between-statement-level-and-batch-level-events/
https://blogs.msdn.microsoft.com/dfurman/2014/03/31/row-count-discrepancy-between-statement-level-and-batch-level-events/#commentsMon, 31 Mar 2014 13:28:00 +0000https://blogs.msdn.microsoft.com/dfurman/2014/03/31/row-count-discrepancy-between-statement-level-and-batch-level-events/Both XEvents and SQL Trace include events that fire at the completion of a statement and at the completion of a batch. For XEvents, they are sqlserver.sql_statement_completed and sqlserver.sql_batch_completed, and for SQL Trace they are SQL:StmtCompleted and SQL:BatchCompleted. When troubleshooting a DML query recently, I noticed that the row count reported in the sql_batch_completed event was greater than the row count reported in the sql_statement_completed event by one or two rows. Now, in general, this is not necessarily surprising. If a batch includes multiple statements, the row counts for all statements are added up and the sum is reported at the batch level. Sometimes, the fact that the batch contains multiple statements may be less than obvious. For example, if a DML statement targets a table with an AFTER trigger, the row counts from statements in the trigger will be added to the batch level row count. In my case, however, the batch contained a single UPDATE statement, and the target table did not have any triggers.

I began to suspect a bug in SQL Server, when my colleague Nacho Alonso Portillo (blog) pointed out that there are actually additional statements in the batch. These additional statements are SELECT queries executed internally by SQL Server when during query optimization, it finds that statistics are missing or out of date. These autostats queries look similar to the one below, and can be seen in either XEvents and SQL Trace sessions if the sqlserver.sp_statement_completed or SP:StmtCompleted events are included.

SELECT StatMan([SC0]) FROM (SELECT TOP 100 PERCENT [ColumnName] AS [SC0] FROM [dbo].[TableName] WITH (READUNCOMMITTED) ORDER BY [SC0] ) AS _MS_UPDSTATS_TBL

The number of rows produced by these queries will be added to the batch level row count.

While this behavior is by design, the fact that the batch level row count includes row counts from internal SQL Server queries may be unexpected. This does not seem to be documented or mentioned anywhere (please add a comment below otherwise). Hopefully this blog will help others who may be puzzled by the discrepancy.

]]>https://blogs.msdn.microsoft.com/dfurman/2014/03/31/row-count-discrepancy-between-statement-level-and-batch-level-events/feed/1Marked transactionshttps://blogs.msdn.microsoft.com/dfurman/2013/11/15/marked-transactions/
https://blogs.msdn.microsoft.com/dfurman/2013/11/15/marked-transactions/#respondFri, 15 Nov 2013 12:34:59 +0000https://blogs.msdn.microsoft.com/dfurman/2013/11/15/marked-transactions/A little known and perhaps underappreciated feature of SQL Server is the ability to assign a name and a description to a transaction. The name of the transaction is recorded in the transaction log, and the transaction is called a marked transaction. The syntax is BEGIN TRANSACTION TransactionName WITH MARK 'TransactionDescription'(the description is optional). The MSDN topic for BEGIN TRANSACTION has additional details. Marked transactions for all databases on the instance are also recorded in the dbo.logmarkhistory table in the MSDB database.

As the MSDN topic notes, one of the reasons to use marked transactions is to be able to restore a set of related databases, possibly residing on different SQL Server instances, to a logically consistent state. This could be achieved by using the STOPATMARK or STOPBEFOREMARK arguments of the RESTORE LOG statement. Another reason to use marked transactions, and perhaps a more common one, is to simplify recovery and minimize potential data loss when deploying database schema or data changes, i.e. during a release. The approach is to make all database changes in a marked transaction (the assumptions here are that all changes are transactional, and are sufficiently small to not cause concurrency or resource problems due to a large or long running transaction). If the change causes some unexpected results (also known as the “Oh no!” moment), it would be possible to RESTORE LOG … WITH STOPBEFOREMARK to restore the database to the point just before the erroneous transaction. This would be more accurate than restoring to a timestamp just before the change, which may not even be known.

None of this is really new or groundbreaking, and the main reason for this post is to describe some slightly unexpected results I noticed when testing a database change implemented with a marked transaction. The change script was completing successfully, however no marked transaction was recorded in the dbo.logmarkhistory table. This was puzzling, because I’ve used this approach before and did see the marks in that table. One difference in this case was that the target database was just created. The recovery model of the database was Full, but no full backup was taken yet, which means that the database was using the “pseudo-Simple” recovery model. Apparently, even though a marked transaction can be successfully committed against such a database (and the mark is actually recorded in the log, as seen in the output from fn_dblog()), it is not recorded in the dbo.logmarkhistory table. On reflection, this makes sense – if a database is still using the Simple recovery model, then it is not possible to restore the database to a point in time, or to a log mark, anyway, so recording a marked transaction in the dbo.logmarkhistory table would be misleading. As soon as I took a full backup of the database, marked transaction started appearing in the dbo.logmarkhistory table.

]]>https://blogs.msdn.microsoft.com/dfurman/2013/11/15/marked-transactions/feed/0SQL Server 2012 CDC for Oracle – a Review of One Implementationhttps://blogs.msdn.microsoft.com/dfurman/2013/04/15/sql-server-2012-cdc-for-oracle-a-review-of-one-implementation/
https://blogs.msdn.microsoft.com/dfurman/2013/04/15/sql-server-2012-cdc-for-oracle-a-review-of-one-implementation/#commentsMon, 15 Apr 2013 18:38:00 +0000https://blogs.msdn.microsoft.com/dfurman/2013/04/15/sql-server-2012-cdc-for-oracle-a-review-of-one-implementation/SQL Server 2012 shipped with a new feature named SQL Server 2012 Change Data Capture for Oracle By Attunity (shortened to SQL 2012 Oracle CDC in this blog). This feature allows using Change Data Capture in a SQL Server database, with an Oracle database as the source of data. One notable advantage of this feature is that very low data latency can be achieved; another is that knowledge of Oracle is not required for ETL development. Overall, this feature can significantly improve the quality of SQL Server data warehousing projects that source some or all of their data from Oracle. The feature was developed by Attunity and is shipping as a part of SQL Server 2012.

This blog is a review of SQL 2012 Oracle CDC implementation on a data warehousing project at one of Microsoft’s customers. I will concentrate here on implementation details that are not very obvious, are not described in existing documentation, and may take some time to figure out on your own. I will also mention a few design patterns we used that may be helpful to someone implementing this feature. This blog assumes that the reader is familiar with the basics of the “native” SQL Server CDC, as well as with the basics of SQL 2012 Oracle CDC. If you are new to this topic, then it is highly recommended to first review the MSDN SQL Server CDC documentation, and then a number of other sources of information on SQL 2012 Oracle CDC that are currently available:

SQL 2012 Oracle CDC documentation, available on the machine where the feature is installed. Look for two help files, named Attunity.SqlServer.XdbCdcSvcConfig.chm and Attunity.SqlServer.XdbCdcDesigner.chm, located by default in C:\Program Files\Change Data Capture for Oracle by Attunity

The implementation described in this blog is based on the requirements of a specific data warehousing project. Not everything mentioned here will necessarily be applicable or even optimal on other such projects. This blog is not a comprehensive step-by-step guide, and what I present here is not intended as the “official” implementation recommendations for SQL 2012 Oracle CDC; however, my hope is that most of this will be useful to others implementing this new feature of SQL Server 2012, particularly when dealing with advanced design and troubleshooting scenarios.

As of this writing, the most current version of SQL 2012 Oracle CDC is available as a part of SQL 2012 SP1 CU7. These bits contain several important fixes. Note that unlike other SQL Server components, SQL 2012 Oracle CDC ships as two standalone MSI files; therefore you only need these MSIs from a SQL Server SP or CU to fully install the product. For example, for this update, this means downloading the files named 2012_SP1_AOrcleSer_CU7_2894115_11_0_3393_x64 and 2012_SP1_AOrcleDes_CU7_2894115_11_0_3393_x64. Over time, updated versions of SQL 2012 Oracle CDC may be released in a similar fashion as part of regular SQL Server 2012 updates.

SQL 2012 Oracle CDC can be installed on a Windows server running on commodity hardware. This can be, but does not have to be, the ETL/SSIS server, or the data warehouse server. The server where SQL 2012 Oracle CDC is installed will run one or more instances of the CDC service (xdbcdcsvc.exe). The CDC service connects to an Oracle database, captures change data, and saves that data in a SQL Server CDC enabled database. That database, called the change data database in this blog, can be placed on any SQL Server 2012 instance, not necessarily on the same server where SQL 2012 Oracle CDC is installed. In our project, both the CDC services and the SQL Server instance hosting change data databases were installed on the same midrange server with 8 cores and 192 GB of RAM. I have not found the CDC service to be a big consumer of either CPU or RAM, so in all likelihood, a less powerful server would suffice as well.

Once the current product version is installed, two MMC consoles can be found under All Programs, Change Data Capture for Oracle by Attunity. One is the CDC Service console, typically used once at setup time to configure the CDC service(s). The other is the CDC Designer console, used to create and monitor CDC instances.

CDC Service Configuration

Setting up the CDC service is the first step to be done after installing the product. Documentation describes that process step by step, so here I will only point out a few additional details.

A current version of the Oracle client must be installed on the server where the CDC service will be running. The bitness of the Oracle client must match the bitness of the CDC service (although it’s ok to also install another Oracle client of different bitness if needed for other purposes). If installing on a cluster, Oracle client must be installed identically on all nodes.

Most CDC service configuration and status info is stored in a small database named MSXDBCDC, hosted on the same SQL Server instance where the change data database(s) will be hosted. This database is created when you use the Prepare SQL Server action from the CDC Service console. It is created with default settings of the model database – consider adjusting some of those (recovery model, owner, etc.) as required in your environment.

If necessary, multiple instances of the CDC service can be set up on the same server. In our project, we created two service instances, because we were sourcing data from two independent Oracle databases. This both improved scalability and avoided a single point of failure.

When installing under Windows Server 2008 R2 or newer, the CDC service by default does not use a service account. Instead, permissions are granted to the service identity using per-service SID. This is a good thing, as there is one less account to manage. Because of that, the default user name shown in the New Oracle CDC service dialog is NT Service\OracleCDCService1. Note that the Password box in this case is still enabled, which is a little misleading, since a password is not applicable when per-service SID is used. In this scenario, you may also run into a kind of a chicken and egg problem. The service identity (NT Service\OracleCDCService1) requires a login on the SQL Server instance where the change data database(s) will be hosted. However, that login cannot be created until the Windows service itself is created and the SID associated with NT Service\OracleCDCService1 becomes known to Windows. But, when the CDC Service console creates the service, it also tries to connect to the SQL Server instance, which fails because the login does not exist yet. The workaround is simple – after getting the Login failed error, manually create a login for NT Service\OracleCDCService1 (there is no need to grant it any additional privileges). At that time, the service has already been created in Windows, so its SID is known. Then, hit OK one more time in the New Oracle CDC service dialog. This will create the service successfully.

CDC services can be clustered as a Generic Service resource. In our project, we installed the CDC services on the same cluster where the SQL Server instance hosting change data databases was installed, and added the CDC services to the SQL cluster group for that instance, making them dependent on the SQL resource. This provided the same level of high availability for the CDC services as was there for SQL Server. When clustering the CDC service, you need to use a domain account to run the service (just like you need to use a domain account to run SQL Server on a cluster). As a side benefit, this avoids the above problem with service creation, as long as you create a login for the CDC service domain account before creating the service in the New Oracle CDC service dialog. When clustering the CDC service, it must be first installed in an identical fashion on all nodes of the cluster. After that, open the Failover Cluster Management console on the cluster node owning the SQL Server group, add the Generic Service resource(s) for the CDC service(s) to that group, and make them dependent on the SQL Server resource. This only needs to be done on one node of the cluster.

In the same New Oracle CDC service dialog, you have to enter the CDC service master password. This is the password that protects an asymmetric encryption key, created in each change data database. That key is used to encrypt the password of the Oracle log mining account. One caveat here is that the asymmetric key password is subject to password complexity rules in effect on the Windows server. However, the password is not validated against those rules by the CDC Service console when the CDC service is created. If the master password is not complex enough, the CDC service will be created successfully, but will later fail to create the asymmetric key in the change data database.

Preparing Oracle Database for Data Capture

Before a CDC instance can be configured, the source Oracle database must be prepared for data capture. As described in greater detail in product documentation, you have to enable and configure Oracle Log Miner, create the log mining account that will be used by the CDC service to connect and capture changes from Oracle redo logs, and grant a number of privileges to that account. These steps require holding sufficient privileges in the Oracle database; therefore, you may need to work with an Oracle DBA who can run corresponding scripts.

To enable Log Miner, the source Oracle database must be in ARCHIVELOG mode, with supplemental logging enabled at the database level. This is done by executing the following:

ALTER DATABASE ARCHIVELOG;ALTER DATABASE ADD SUPPLEMENTAL LOG DATA;

We used the following script to create the Oracle log mining account (named DW_ETL here) and grant it required privileges:

CREATE USER DW_ETL IDENTIFIED BY &Password DEFAULT TABLESPACE USERS TEMPORARY TABLESPACE TEMP PROFILE DEFAULT ACCOUNT UNLOCK;GRANT CREATE SESSION TO DW_ETL;GRANT SELECT ANY TRANSACTION TO DW_ETL;GRANT EXECUTE ON DBMS_LOGMNR TO DW_ETL;GRANT SELECT ON V_$LOGMNR_CONTENTS TO DW_ETL;GRANT SELECT ON V_$ARCHIVED_LOG TO DW_ETL;GRANT SELECT ON V_$LOGMNR_LOGS TO DW_ETL;GRANT SELECT ON V_$LOG TO DW_ETL;GRANT SELECT ON V_$LOGFILE TO DW_ETL;GRANT SELECT ON V_$DATABASE TO DW_ETL;GRANT SELECT ON V_$INSTANCE TO DW_ETL;GRANT SELECT ON V_$THREAD TO DW_ETL;GRANT SELECT ON V_$PARAMETER TO DW_ETL;GRANT SELECT ON DBA_REGISTRY TO DW_ETL;GRANT SELECT ON ALL_INDEXES TO DW_ETL;GRANT SELECT ON ALL_OBJECTS TO DW_ETL;GRANT SELECT ON DBA_OBJECTS TO DW_ETL;GRANT SELECT ON ALL_TABLES TO DW_ETL;

Depending on Oracle database configuration, some of these privileges may have to be granted on views starting with V$ rather than V_$, e.g. V$THREAD instead of V_$THREAD. The documentation mentions that, however there is an apparent typo where instead of V_$ it has V $.

Even though the log mining account does not query the source Oracle tables directly, it still requires the SELECT privilege on each source table, so a separate script granting that has to be executed. If SELECT privileges are missing, data capture will fail.

In addition to enabling supplemental logging at the database level, it has to be also enabled for each source Oracle table. The CDC Designer console will generate a script to do that (more on this below).

CDC Instance Design Workflow

Once the Oracle database is prepared for data capture, the CDC Designer console can be used to create a CDC instance. A CDC instance refers to a combination of a source Oracle database, a CDC service, and a SQL Server CDC enabled database that stores change data. Note that the same CDC service may be associated with more than one CDC instance (in our case we could have used a single CDC service to capture changes from both of our source Oracle databases). It is also possible to create multiple CDC instances for the same source Oracle database, and include a different set of source tables under each CDC instance.

To design our CDC instances, we used a specific workflow that resulted in a set of deployment scripts. We then used these scripts to quickly deploy CDC instances to various environments, instead of having to use the CDC Designer console interactively every time a new CDC instance needed to be created. The workflow steps follow. For each step, I will mention some details that are relevant in the context of the step.

Step 1

Start the New CDC Instance wizard and follow through the steps. The wizard does the following:

If the wizard is canceled, it will drop the change data database it has created.

The change data database created by the wizard uses the default database options of the model database, which may or may not be optimal for your purposes.

When specifying Oracle connection info, you can enter it in the format suggested on the wizard page (host:port/servicename); however, you can also enter a TNS name from TNSNAMES.ora file on the server where the CDC service is running. The latter option is particularly useful when connecting to a RAC cluster, since additional connection options such as multiple addresses of RAC nodes can be specified in the TNSNAMES.ora file.

It is not necessary to add all (or any) source tables to the CDC instance as a part of running the wizard. It is entirely possible to create an empty CDC instance and add tables later.

Similarly, it is also not necessary to run the Oracle supplemental logging script as a part of running the wizard. In fact, there is a reason to postpone this step until later (more on this below). Also, since that script must be executed with sufficient privileges, there is a good chance that it will have to be handed over to an Oracle DBA.

Step 2

If you have not added any source tables while running the wizard, do that now. For a given CDC instance, use the Properties command under Actions, click on the Tables tab in the Properties dialog, and then click Add Table. Note that the Table Selection dialog may appear a bit slow and unresponsive with large Oracle databases. When searching for a table, you can use % as a wildcard in the Table name pattern box. If you have many tables to add, you don’t have to add them all at the same time. You can add tables in batches – each batch will be saved with the CDC instance.

Step 3

While in the Properties dialog, click on the Advanced tab and add any advanced options you might need (available options are described in product documentation). In our implementation, we added two options:

cdc_stop_on_breaking_schema_changes. When set to 1, this will stop data capture with an error if a schema change occurs in the Oracle database such that accurate data capture can no longer be guaranteed. If this option is missing or is set to 0, then in the event of such schema change, the corresponding capture instance in the SQL Server change data database will be silently dropped, while data capture continues for the rest of the tables. In our project, capturing consistent data was more important than capturing the latest data, therefore we chose to stop the capture process in the event of a schema change in the Oracle database, rather than continue without some table(s). It is important to understand the mechanism that is used to detect schema changes – more on that below. Note that advanced option names are case sensitive, and this particular option is incorrectly spelled as CDC_stop_on_breaking_schema_changes in the documentation. The correct option name is all lower case; otherwise the option will not be enabled.

trace. By default, SQL 2012 Oracle CDC writes basic status information to the cdc.xdbcdc_trace table in the change data database. This includes status changes, errors, etc. More detailed tracing can be enabled by setting the trace option to 1. This can be done even while the CDC instance is running. An even more detailed trace can be created by setting this option to SOURCE. Note that in the latter case, a large amount of trace data will be generated quickly, so it is not recommended to keep this on for a long time. Trace data is very helpful in troubleshooting CDC instance configuration and operation problems. When tracing to the cdc.xdbcdc_trace is not possible, i.e. because the SQL Server instance is down, trace data will be written to the Application event log on the server where the CDC service is running. We added this option with the value of 0, to not trace detailed information by default, but to make the option visible in case tracing will be required later. Note that by default, the CDC service prunes the cdc.xdbcdc_trace table to contain only the last 24 hours of trace data.

Step 4

Once all tables and advanced options have been added to the CDC instance, click on CDC Instance Deployment Script under Actions. This generates a SQL Server script that will create the change data database and configure it for data capture. We used this script as a starting point to create our own CDC instance deployment scripts, described below. A few notes on the generated script:

The change data database is always created with the SQL_Latin1_General_CP1_CS_AS collation, which may not match the server collation and is not necessarily the optimal choice.

Even though only SQL Server 2012 is supported, the script sets the compatibility level of the change data database to 100. This should be 110.

The script adds an extended database property named Description. Even though the value of this property may be blank, it still must be present – removing this part of the deployment script will break CDC Designer console when it works against the deployed CDC instance.

The password of the Oracle log mining account is not scripted. This makes sense from security perspective, but it will cause problems if the CDC instance is deployed using the deployment script just as it was generated by the console (more on handling Oracle account password below).

Step 5

Split the generated deployment script into three parts:

Script 1

The first script creates the change data database, sets its options as required, and enables the database for CDC. Here we adjusted database file sizes and filegroup configuration, used an appropriate collation, set database recovery model as required, and changed the database owner to a specific server principal. Other parts of the generated deployment script remained unchanged.

Script 2

The second script creates mirror tables and capture instances in the change data database. This is a good point to take a step back and describe some relevant details of SQL 2012 Oracle CDC architecture.

Mirror tables are tables in the SQL Server change data database that correspond one-to-one to the source tables in the Oracle database. Each mirror table has exactly the same column names as the Oracle table, these columns are in the same order as in the Oracle table, and have data types that match the Oracle data types as closely as possible.

Mirror tables will remain empty at all times. In fact, to ensure that, the generated deployment script denies all DML permissions on each mirror table. Conceptually, this makes sense – the change data database will contain only changes, not the entire contents of source tables in the Oracle database. Mirror tables are used to create capture instances in the change data database by calling the sys.sp_cdc_enable_table stored procedure, but actual data changes occur in the corresponding Oracle tables, not in SQL Server. This is really the “secret sauce” of SQL 2012 Oracle CDC: The CDC service captures changes in the Oracle database and writes them to the SQL Server change data database, however it writes them not to the mirror tables, but to the corresponding system tables in the cdc schema (the ones with the _CT suffix), which are created when each mirror table is enabled for CDC. With this design, any consumer of change data uses the well-established API of SQL Server CDC (i.e. the fn_cdc_get_all_changes and fn_cdc_get_net_changes functions), even though the actual changes occur in an Oracle database. In other words, consumers of change data can be completely unaware that changes actually originate in an Oracle database – working with the change data database is no different from working with the regular SQL Server CDC enabled database.

One reason for having to keep all columns of the Oracle table in the mirror table is to enable detection of schema changes in Oracle. For every captured column in a mirror table, the column_id value in the cdc.captured_columns table must match the COLUMN_ID value of the corresponding column in the corresponding Oracle table (as it appears in the SYS.ALL_TAB_COLUMNS Oracle view). If the mirror table had fewer columns than the corresponding Oracle table, then its column_id values would not match the Oracle COLUMN_ID values, which would be interpreted as a schema change in the Oracle database. As noted earlier, this either causes the capture instance for the table to be dropped, or stops data capture altogether.

Now, just because the mirror tables must contain all columns of the corresponding Oracle table does not mean that the capture instance must actually capture changes in all of these columns. If you are only interested in a subset of columns in a table, you can modify the call to the sys.sp_cdc_enable_table stored procedure and change the @captured_column_list parameter to only include the columns of interest. This is what we did to avoid the overhead of capturing many unneeded changes.

Finally, we made another rather important change to the mirror table definitions. When a capture instance is created for a table in a CDC enabled database, the sys.sp_cdc_enable_table stored procedure will enable the net changes mode for the capture instance (i.e. will create the fn_cdc_get_net_changes function) only if the table has a primary key/unique constraint or a unique index. The net changes mode can simplify and improve performance of the ETL code that consumes changes in the change data database and applies them to the destination database. In the all changes mode, which is the only option when the source table does not have a unique constraint or unique index, each change must be applied to the destination database individually, one row at a time, which is both relatively complex and inefficient. In the net changes mode however, all changes processed in an incremental ETL run can be applied as a set (i.e. using the MERGE statement) – this is both faster and simpler. Now, if the source Oracle table has a primary key constraint defined, then the corresponding mirror table in the generated deployment script will have that primary key constraint as well, and the net changes mode will be enabled. However, in our source system, most tables did not have a declared primary key, but did have one or more unique indexes, typically on a sequence based not nullable numeric column. Therefore, in the corresponding mirror tables, we made the matching columns not nullable, and added primary key constraints on these columns. This allowed us to use the net changes mode for most of the source tables and greatly simplified our ETL code.

Script 3

The third script drops the CDC capture job, which is created automatically when a database is enabled for CDC but is not used with SQL 2012 Oracle CDC. We added a call to the sys.sp_cdc_change_job to increase the retention period for change data from the default of three days. Then, we modified the UPDATE statement that writes CDC instance configuration info to the cdc.xdbcdc_config table in the change data database to include the encrypted password for the Oracle account (if you recall, this is not included in the generated script). The password must be encrypted by the asymmetric key that the CDC service creates in the change data database, which is done by calling the ENCRYPTBYASYMKEY() function. The CDC service will use the same key to decrypt the password in order to use it for connecting to Oracle. However, the caveat here is that it may take some time after the change data database is created for the CDC service to create the key. So by the time this script runs, the key may or may not exist. Therefore, we added a WAITFOR loop to the script to wait until the key appears in the database, before attempting to write configuration info. The part of the third script that does all this looks like this:

On a couple of occasions, we noticed that the WHILE loop would run indefinitely, waiting for the key to be created. While we did not find out the root cause for the CDC service not creating the key for a new change data database, in all such cases restarting the CDC service caused the key to be created, and allowed this script to finish.

Step 6

At this point, we have the customized scripts needed to recreate the CDC instance with the desired configuration, so we can delete the existing “prototype” CDC instance in the CDC Designer console. Note that deleting the CDC instance also drops the associated change data database. Once deleted, we can go ahead and execute the three scripts described above in the same sequence, which should result in a new CDC instance, configured as desired. The instance status will be INITIAL (you may need to refresh the CDC Designer console to see it).

Step 7

Next step is to create the Oracle supplemental logging script using the corresponding action in the CDC Designer console. Note that we deliberately postponed this step until now. If we did this as a part of initial CDC instance design, the generated script would enable supplemental logging for all columns of every source table, causing unnecessary overhead on the Oracle side. However, at this point the supplemental logging script will only enable logging for those columns that are being captured in each table, i.e. the columns specified in the @captured_column_list parameter of the sys.sp_cdc_enable_table stored procedure. The Oracle supplemental logging script can now be handed over to an Oracle DBA to be executed. Note that when running this script, database activity against a table may prevent a log group for that table from being created. If this happens, temporarily suspend the Oracle database workload, if possible, or wait until a period of low activity in the Oracle database. Then, rerun the script. The script generated by the CDC Designer console is idempotent and can be executed against the same Oracle database as many times as needed while the CDC instance is in the INITIAL status. If you have to manually edit the Oracle supplemental logging script, be aware that each Log Miner supplemental log group for a table cannot contain more than 32 columns. If you are capturing more than 32 columns per table, create additional log groups for the same table. The CDC Designer console handles this correctly, so we chose to simply regenerate the script when a change was needed. Also note that if for some reason a column that is added to a capture instance is omitted from the Oracle supplemental logging script, everything will work without error, however the column will always be NULL in the change data.

Step 8

Once supplemental logging is enabled for each source table, you can start the CDC instance. It may take about a minute to start change capture – the status will be changing from INITIAL to IDLE to PROCESSING, and the monitoring counters on the console will start showing non-zero values. If you see LOGGER as the status, it could mean two things. One is that the CDC instance encountered a temporary delay mining Oracle logs, i.e. due to a spike in the amount of changes. If so, LOGGER should change to PROCESSING fairly soon, typically within seconds. However, if you see LOGGER for a relatively long time, i.e. for longer than a couple of minutes, it probably means that there is some problem with mining Oracle logs. If so, check the cdc.xdbcdc_trace table for errors. If necessary, set the trace parameter to either 1 or SOURCE to troubleshoot further.

Step 9

Once the CDC instance is processing changes, you can start the initial load from the source Oracle database into the destination SQL Server database, and once that completes, start running incremental loads. For both types of loads, we used the new CDC components in SSIS 2012 (CDC Control task, CDC Source and CDC Splitter transformations) to manage LSN ranges and to consume change data from the change data database. Since these components use the standard SQL Server CDC APIs, using them for Oracle-sourced data is no different from using them against a native SQL Server CDC enabled database. For SSIS implementation details, refer to the CDC Flow Components MSDN documentation, and also to Matt Masson’s TechEd 2012 presentation, which describes several design patterns for working with CDC components in SSIS 2012.

Here are some other operational details related to the CDC Designer console:

While the console is typically installed on the same server where the CDC service is running, it does not have to be. Similar to SQL Server client tools, it can be installed on any machine that has connectivity to the SQL Server instance where the MSXDBCDC database and the change data database(s) are hosted. When you start the CDC Designer console, it prompts you for a SQL Server name to connect to. If you deployed SQL 2012 Oracle CDC in multiple environments, then you could manage all environments from multiple instances of CDC Designer console on the same administrator machine (but see the next point).

The console does not display the name of SQL Server instance it is connected to. If you open multiple instances of the CDC Designer console to manage SQL 2012 Oracle CDC in multiple environments (i.e. QA and PROD), then you may not be able to easily tell which is which. Needless to say, you don’t want to make a change in the wrong environment.

Neither the CDC Designer console, nor the CDC Service console displays the product version. As of this writing, the only way to determine which version is installed is to check the properties of files on disk in C:\Program Files\Change Data Capture for Oracle by Attunity.

All configuration and status information that the CDC Designer console uses or displays is stored in the change data database(s), in several tables in the cdc schema prefixed with xdbcdc_ (note that these appear under System Tables in SSMS). This means that you can monitor each CDC instance by querying these tables. For us, low data latency was very important, therefore we set up a data latency monitor job that would run the following script every minute and raise a warning if data latency exceeded a predefined threshold:

Here I will mention several problems that we ran into, and what the solution was for each. This list is by no means comprehensive, but may provide some helpful troubleshooting ideas and techniques.

Character data types in the deployment script generated by the CDC Designer console were doubled. For example, for a VARCHAR(10) column in the Oracle table, the mirror table had nvarchar(20). This was a bug in the product, fixed in the latest version referenced at the top of this post.

Occasionally, data capture would become permanently suspended with the LOGGER status. From the trace, we could see that a call to Log Miner resulted in the “SCN not found” error (in Oracle, SCN stands for System Change Number, which is analogous to LSN in SQL Server). This had everyone stumped for a while, until we noticed that this was consistently happening in one particular environment only. The root cause turned out to be rather simple – that environment had the original RTM version of the product deployed (this is something that isn’t easy to notice unless you check file versions on disk). Installing the latest bits resolved the problem.

Change data values for a non-nullable column in an Oracle table were all NULL in the capture instance _CT table. This turned out to be because the column was mistakenly omitted from the Oracle supplemental logging script.

As mentioned earlier, we were sourcing data from two independent Oracle databases. Both systems were running the same version of the same off-the-shelf Oracle application, with each system being used by a different regional organization of our customer. Our initial assumption was that the database schemas of the two Oracle databases must be the same, too. Therefore, we used the CDC instance deployment script generated for the first Oracle database to deploy the CDC instance for the second database, too. However, we found that as soon as we started the second CDC instance, it would log this error to the trace: “ORACDC323I:The metadata validation for Oracle table cdc.<table name>_CT failed. Column <column> index was changed“, and then immediately drop the capture instance for the table (the cdc_stop_on_breaking_schema_changes option was not enabled at that time). It turned out that our initial assumption was incorrect – each database had slightly different schemas due to customizations supported by the Oracle application that each organization has made. Specifically, column order in some tables was different. Because the method used to detect schema changes is based on comparing column order between Oracle tables and mirror tables (see above), this resulted in the above error. Note that the error message may be slightly misleading on the first read – you may think it’s referring to an index on the referenced column. In fact, the word “index” refers to the column ordinal position in the table. The solution here was to maintain separate CDC instance deployment scripts for the two source databases.

Occasionally, data capture would become suspended with the LOGGER status. Unlike the previous similar problem, the error in the trace was different. We saw that a call to Log Miner resulted in the “missing logfile” error. There were three distinct causes for this. In one case, it was happening because the backup job on the Oracle database server would delete archived logs while they were still being mined by SQL 2012 Oracle CDC. The solution was to increase the retention period for archived logs. In another case, running out of disk space prevented Oracle from archiving an online log once it has rolled over to a new log file. This resulted in an inactive online log without a corresponding archived log. SQL 2012 Oracle CDC does not use inactive online logs, so when it requested a particular SCN range from Log Miner that happened to be in that inactive online log, that range was not found in any log file. This problem was resolved once Oracle DBAs freed some disk space and allowed the inactive online log to be archived. Finally, we have seen the same error in a non-production environment when Oracle DBAs had to reset the logs as a part of database maintenance work. In this case, the only way to resolve the problem was to stop capture and reset CDC using the Reset action in the CDC Designer console. This removes all change data. If your requirement is not to miss any changes, you will need to restart data capture and run the initial load again.

If you run into a problem and your initial troubleshooting is unsuccessful, you may need to open a support case. If so, it is highly recommended to collect a set of diagnostic information for the support engineers (actually this may help you in your troubleshooting efforts as well). The CDC Designer console provides a convenient Collect Diagnostics link, which will create a text file with diagnostics collected from both Oracle and SQL Server databases. Note that this includes the contents of cdc.xdbcdc_trace table. If you have set the trace option to SOURCE sometime prior to using Collect Diagnostics, then the trace table may be so large that Collect Diagnostics will fail because of a timeout retrieving all that data. Therefore, while it is recommended to have trace set to SOURCE while Collect Diagnostics is running, it’s best to do that just prior to clicking on the link.

In addition to the output from Collect Diagnostics, if the problem seems to be related to the Oracle database and Log Miner, then it is also helpful to include the output from the following Oracle queries, which shows the state of online and archived logs:

SELECT * FROM V$LOG;SELECT * FROM V$LOGFILE;SELECT * FROM V$ARCHIVED_LOG;SELECT * FROM V$DATABASE;

To conclude, SQL 2012 Oracle CDC is still a relatively new feature of SQL Server, as of this writing. The purpose of this blog is to make it easier for SQL Server developers and DBAs who are working on data warehousing projects to discover, implement, and support this feature. Please post your comments and questions below.

]]>https://blogs.msdn.microsoft.com/dfurman/2013/04/15/sql-server-2012-cdc-for-oracle-a-review-of-one-implementation/feed/32Getting out of single user modehttps://blogs.msdn.microsoft.com/dfurman/2012/01/20/getting-out-of-single-user-mode/
https://blogs.msdn.microsoft.com/dfurman/2012/01/20/getting-out-of-single-user-mode/#commentsFri, 20 Jan 2012 19:58:26 +0000https://blogs.msdn.microsoft.com/dfurman/2012/01/20/getting-out-of-single-user-mode/Sometimes, brute force is the best way to get something done.

Imagine a busy production server with about thirty databases. Somehow, during an administrative task, one of the databases ended up in single user mode. The application using the database is multithreaded and normally opens and closes multiple connections to the database at a high rate, so the only available connection to the database was constantly in use by sessions with different SPIDs. All we had to do to fix this was put the database back into multi user mode, but whenever we tried to execute ALTER DATABASE AppDB SET MULTI_USER, we would always get a deadlock between the ALTER DATABASE session and the application’s session already exclusively using the database. Invariably, the ALTER DATABASE session was chosen as deadlock victim.

We couldn’t disable the login used by the application, because the same login was used to access all databases on the server. We also couldn’t kill the session using the database because the SPID was changing randomly all the time, and we couldn’t risk killing the wrong process on a production server. Then I thought I had the solution: we would open a transaction, make data modifications to generate a lot of transaction log, and then run ALTER DATABASE AppDB SET MULTI_USER in the same transaction. The idea was to have the other session selected as the deadlock victim, since SQL Server generally resolves deadlocks by killing the session with the least amount of generated transaction log. This would let the ALTER DATABASE command complete successfully. I thought this was an elegant way to resolve this, before recalling that ALTER DATABASE cannot be executed in a transaction.

The solution that actually worked was much simpler and rather straightforward. We executed this from SSMS:

USE AppDB;

GO 1000

The number after the GO batch separator tells SSMS to execute the preceding batch that many times. After getting about 700 error messages saying that the database is in single user mode and can only be used by one user at a time, the USE command succeeded, we got the only available session to ourselves, and put the database back into multi user mode.

]]>https://blogs.msdn.microsoft.com/dfurman/2012/01/20/getting-out-of-single-user-mode/feed/11Yet another cause of Kerberos authentication failure connecting to SQL Serverhttps://blogs.msdn.microsoft.com/dfurman/2011/09/29/yet-another-cause-of-kerberos-authentication-failure-connecting-to-sql-server/
https://blogs.msdn.microsoft.com/dfurman/2011/09/29/yet-another-cause-of-kerberos-authentication-failure-connecting-to-sql-server/#commentsThu, 29 Sep 2011 16:03:00 +0000https://blogs.msdn.microsoft.com/dfurman/2011/09/29/yet-another-cause-of-kerberos-authentication-failure-connecting-to-sql-server/Yesterday I was troubleshooting a rather common problem. A query running on an instance of SQL Server (INST1) was using the OPENROWSET function to retrieve a rowset from another instance (INST2). This is known as “double-hop authentication”, and Kerberos authentication is required for this to work. When executed from a particular client machine, the query was failing with the infamous “Login failed for user ‘NT AUTHORITY\ANONYMOUS LOGON'” error.

The troubleshooting steps for this problem are well known and are described in multiple sources. My favorite is the My Kerberos Checklist post by Adam Saxton on the CSS SQL Server Engineers blog. I went through the list and confirmed that the basics were configured correctly: the proper SPNs were registered, delegation for the SQL instance service account was enabled, the client account was not marked as sensitive, etc. Then I ran the following query on INST1:

SELECT auth_scheme, client_net_addressFROM sys.dm_exec_connections;

I noticed that for some clients, auth_scheme was KERBEROS, yet for that particular client machine it was NTLM. This seemed to indicate that the instance was correctly configured for Kerberos, and the problem was likely client related.

I started looking at the all parts of the picture once again, including the service account of the INST1 instance. I noticed that for that account, the “Use DES encryption types for this account” option was enabled, yet that option was disabled for other SQL Server service accounts in the domain. Once I disabled that option and restarted the INST1 instance, the query immediately worked.

The root cause of this problem is explained in KB977321. In Windows Server 2008 R2 and Windows 7, DES encryption is disabled by default, and AES encryption is used. The client machine in this case did run Windows Server 2008 R2, unlike other clients connecting to INST1, which ran older versions of Windows. Since the INST1 service account only supported DES encryption, Kerberos authentication failed because a common encryption type between the client and the server did not exist. Another possible solution could be to enable the (weaker) DES encryption on the client, as described in the KB article.