I am working on implementing a data archiving and purging strategy for a client and I wanted to run something past you all to get feedback. The client has a VLDB (4+ TB and growing rapidly) with an OLTP workload that has over a decade of data. They want to end up with an OLTP system that has the last year of data but they want an identical system with complete history kept in synch with the OLTP. What that means is that all overlapping transactions are replicated to the "historical archive", with the exception of the records that are purged as part of the cleanup cycle.

What has been proposed so far (not by me) is to have an outage and establish a 1 year version of the database, then use SQL Server replication to keep the old environment synchronized. All UIs and queries would be pointed to the new 1 year OLTP. The purging job would be setup to replicate the stored procedure execution and it would have conditional logic that would prevent it from running on the historical archive. Reinitialization would not be possible, as it would destroy the historical data.

It's this last point on want to focus on. This solution is being proposed primarily because it is out of the box functionality and it's simple to understand. However, I am extremely concerned about this statement that we cannot reinitialize. In my experience working with replication, it is not fail proof. At some point, it can go down and the databases can get out of synch. If this happens, the client would need to synchronize the environments manually, which will require significant effort and cost a lot of time.

Where I need help from you is remembering the scenarios where replication can / will fail. For instance, I believe if the historical archive environment had a hardware failure, it would begin filling up the transaction logs and replication would have to be stopped once the logs were close to filling the disk. What other reasons would cause replication to get out of synch and require a re-initialization?

Agreed. You need to set up the archive database in such a way that the archived data is never replicated, and only the OLTP data is. That way if replication goes south you can reinitialize any time without impacting the archive.

You have to coordinate this with your table partitioning by purging from the OLTP system AND switching the replicated data into an archived partition at the same time. I can't say I've done anything like that but it shouldn't be too difficult. I would think a subscription could be arranged on a partitioned table that can be switched out. You can add filters to publications to limit the data that's replicated, these filters should have similar conditions as the partitioning functions used in the archive.

I'm not fully understanding how partitioning + replication could solve this problem, so bear with me a second. From the article, it doesn't appear that I can replicate on a specific partition, only a partitioned table. If that's the case, wouldn't a reinitialization blow away the historical data? Also, the article isn't even available for SQL 2005 so I'm not sure what the support is for partition replication in that edition.

On the topic of partitioning, we actually started out evaluating partitioning for purely performance reasons but our testing did not reveal a compelling case to implement it. Even under best case scenario (changing queries to use partition elimination), the performance results were essentially a wash.

Some of the reasons this may have happened: the environment is SQL Server 2005 (first version of partitioning), MAXDOP is 1 (best practice for an OLTP server but my research indicates that partitioning really needs to be parallelized to be fast), and there are a lot of applications that access the database that we don't have organizational bandwidth to change in order to take advantage of partition elimination.

To throw another wrench in the gears, we cannot select a purely date based partitioning column due to business requirements (records should be < 1 year old, but there are exceptions).

That said, our evaluation was done strictly within the context of performance, so I appreciate the suggestion to reassess it within the realm of replication. I am just unsure how that would work.

Addressing some specific points mentioned by Rob & Russell:

Rob, you hit the nail on the head: "You need to set up the archive database in such a way that the archived data is never replicated, and only the OLTP data is. That way if replication goes south you can reinitialize any time without impacting the archive." Now I just need to clarify how we can achieve this with partition + replication (or any other way).

There are restrictions in place about partition switching and replication as noted in the article above. One thing that caught my eye was this statement "You can also turn on allow_partition_switch without replicating the DDL. This is useful in the case where you want to roll old months out of the partition but keep the replicated partition in place for another year for backup purposes at the subscriber." Therefore, this might be feasible but I'd still need a way to reinitialize the main partition safely. Perhaps that's where filtering comes in?

Don't replicate delete statements - Not a valid option as there are legitimate deletes performed by the application. Need to identify purging deletes from application deletes. Perhaps this could be done with a filter, but that attribute would need to available and maintained on all entities to make that feasible.

Keep Existing Object Unchanged - I've tried researching this option but there isn't a lot out there on it. From what I've gathered, this means the object is not affected when reinitializing? Doesn't that mean the data will be out of synch?

Since there are business rules about what data can be archived and what data can't, let me run something past you since I haven't worked with partitioning much.

1.) I create a new column on all my tables called something generic like "PartitionNumber". Current data is initially set to PartitionNumber 2 but the default on the column is 1. 2.) A partitioning function and scheme are created using this field with three partitions: 1 (current data) , 2 (archived data), 3 (empty). 3.) I write a stored procedure to identify archivable transactions and store this metadata in a separate table.3a.) I have a one-time script that updates all non-archivable transactions to 1. I am not sure if these records automatically move to partition 1, but I assume they do. 3b.) (long term process) A separate stored procedure is created and scheduled via a Job to update newly identified "archivable" transactions to PartitionNumber 2 and update the meta data to indicate this was complete.4.) I create a new OLTP environment and import only data from Partition 1, establishing the baseline OLTP of 1 year's worth of data.5.) I setup transactional replication from the new OLTP to the old environment. Here's where I am confused. If I apply a filter where PartitionNumber = 1, updates that switch the transactions over to PartitionNumber = 2 (the archive partition) won't make it over to the subscriber (AFAIK). If I don't filter on the PartitionNumber field, when I reinitialize it will affect the archive partition on the subscriber which is not what I want to do. I'm still not sure how "keep existing object unchanged" will help, assuming that it keeps the structure (does not drop the object) but synchronizes the data. Maybe if I was allowed to have an extra partition on the subscriber, it would leave that partition untouched?

I could enable allow_partition_switch without replicating the DDL, but I still need a way to make sure the archived data makes it to the right spot on the subscriber without being affected by reinitializations.

quote:Now I just need to clarify how we can achieve this with partition + replication (or any other way)

I should clarify: You would replicate to a table in the archive (partitioned or not) but it would be SEPARATE from the archived table. This would be the OLTP data only. For queries that need to include historical and current data, you'd query a view that UNIONs the OLTP and archive. This can be a partitioned view, and with the proper constraints (and indexes) could provide improved performance and partition elimination.

It sounds redundant but it's not, because the partitioning on the subscriber table allows you to switch newly archived data into the archive tables, and therefore make it safe/impervious to any replication effects.

quote:Don't replicate delete statements - Not a valid option as there are legitimate deletes performed by the application. Need to identify purging deletes from application deletes. Perhaps this could be done with a filter, but that attribute would need to available and maintained on all entities to make that feasible.

The operations I just mentioned would have to be performed the same time you delete the data from the OLTP system, but before those deletes replicate and remove it from the OLTP tables in the archive. This would ensure data is safely archived. "Legitimate" deletes may complicate this a bit, but you could either modify the database layer to flag instead of delete, or post-process the delete to remove it from the archive if necessary.

In reading the article you posted, I almost think you could have the same structure in both systems, with archive tables and live tables. The difference is that the archive partitions/tables would never be replicated, and in the OLTP system, simply truncated after the data is switched into them. Replication should handle the partition switch in the archive system, but exclude the truncate operation. This is totally theoretical BTW, I have no idea if that's how it works (not a replication expert), and the article is not clear about whether both tables need to be published & subscribed in order for the partition switch to work.

Thanks Rob, you have been extremely helpful. There is, however, one requirement I haven't mentioned: the client wants the ability to query the archive server without modification to existing queries.

I think I could accommodate that by replicating to a different database name like "DatabaseName_repl". On the archive server, I'd have a database with the same name as the original, aka "DatabaseName", but all of the tables would be replaced with views that union (all) the OLTP tables from "DatabaseName_repl" with their corresponding archive tables. Schema should be identical. Thoughts on this approach?

Another option proposed was to have a separate instance and use linked servers, but my thoughts are that a single instance solution would perform significantly better.