Running RBS Maintainer

We have been getting questions about this on this blog and on our CodePlex site, and the subject deserves a detailed post, so here is some information on running RBS maintainer.

Connection Strings

RBS maintainer takes connection strings from a CLR config file (Microsoft.Data.SqlRemoteBlobs.Maintainer.exe.config) that is present in the same directory as the maintainer executable. You will need to add your connection strings to this file, one per database that you want to run RBS maintainer on. We recommend encrypting the connection strings if you are using SQL authentication, since the password is part of the connection string. If you are using Windows Authentication, you don't need to encrypt your connection strings.

The config file created by RBS installer contains the connection string in encrypted form. CLR config files have the limitation that all the connection strings need to be either encrypted or plaintext - you cannot have a combination of some connection strings encrypted and some plaintext. So, if you want to add more connection strings to the config file already created by RBS installer, you will need to either write a program to encrypt them or use a utility (aspnet_regiis.exe) to do it for you. See [1] for more details on how to do this. If you do not want to encrypt your connection strings, feel free to delete the encrypted connection string already present in the config file and add your connection strings in plaintext.
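As an illustration of the plaintext route, here is a minimal Python sketch that adds a connection string entry to a config file's connectionStrings section. It relies on the standard .NET layout of `<add name="..." connectionString="..." />` entries; the helper name, the sample database name and the sample connection string are made up for this example, and the sketch refuses to touch a section that is already encrypted, since encrypted and plaintext entries cannot be mixed.

```python
import xml.etree.ElementTree as ET


def add_connection_string(config_xml: str, name: str, connection_string: str) -> str:
    """Return config_xml with a plaintext <add> entry under <connectionStrings>.

    Hypothetical helper for illustration; it is not part of RBS.
    """
    root = ET.fromstring(config_xml)
    section = root.find("connectionStrings")
    if section is None:
        section = ET.SubElement(root, "connectionStrings")

    # An encrypted section carries a protection provider attribute and an
    # EncryptedData child; plaintext entries cannot be added next to it.
    is_encrypted = section.get("configProtectionProvider") is not None or any(
        child.tag.endswith("EncryptedData") for child in section
    )
    if is_encrypted:
        raise ValueError("connectionStrings is encrypted; encrypt the new entry or remove the encrypted one")

    if any(entry.get("name") == name for entry in section.findall("add")):
        raise ValueError(f"a connection string named {name!r} already exists")

    ET.SubElement(section, "add", {"name": name, "connectionString": connection_string})
    return ET.tostring(root, encoding="unicode")


# Sample input: an empty plaintext section (names and values are placeholders).
SAMPLE_CONFIG = "<configuration><connectionStrings /></configuration>"
updated = add_connection_string(
    SAMPLE_CONFIG,
    "ContentDb2Connection",
    "Data Source=myserver;Initial Catalog=ContentDb2;Integrated Security=True",
)
print(updated)
```

Note that this round trip through ElementTree drops XML comments and the XML declaration, so treat it as a sketch and keep a backup of the real config file before editing it.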

Command Line Parameters

Once you have all the connection strings in the config file, you can run the maintainer executable with command-line parameters telling it which database to run on and which tasks to perform. Here is a brief description of the command-line parameters, which you can also get by running maintainer without any parameters:

Usage (Available Options):

ConnectionStringName - This parameter is the name of the connection string.

This parameter has a default value of All, and can take the following blob store arguments: <BlobStoreName1 [BlobStoreName2 [...]] Default:All>

Takes the names of blob stores as arguments.

The different operations available in maintainer are:

1. GarbageCollection. See [2] for details on the different phases of garbage collection. Pool Slicing is a new feature added in RBS 2008 R2 that allows GC of pools to happen incrementally, i.e. one slice at a time instead of the whole pool in one go. Slices are based on the create timestamp of the blob. This makes it possible to make incremental progress on garbage collecting huge pools that contain hundreds of millions of blobs without requiring a long maintenance window. Slicing applies only to the Orphan Cleanup phase and is used only if the provider implements time-filtered enumeration (EnumerationOptimizationLevel is higher than Basic). The RBS Filestream provider implements this. Config keys that can be used to tune the Garbage Collection operation are: delete_scan_period, orphan_scan_period, garbage_collection_time_window. If you want to completely remove all deleted blobs immediately, set all three config items above to 'time 00:00:00' and run maintainer for GC phases RS and DP (rd). This is useful if you want to unregister a blob store or uninstall a provider.

The command line switch GarbageCollectionPhases is required for this operation. Examples (these examples assume that you have a connection string in your config file named RBSMaintainerConnection, which is the default string added by the RBS installer):

2. ConsistencyCheck. This does a consistency check on the RBS internal tables and can optionally try to repair any issues found. If you specify this operation, you also need to specify ConsistencyCheckMode. Optionally, you can also specify ConsistencyCheckExtent to choose whether to check metadata only or to check the validity of each BlobID (the default is metadata only). Config keys that can be used to tune it are: max_consistency_issues_found, max_consistency_issues_returned. Examples:

o This does a consistency check and attempts repair on RBS metadata as well as on each BlobID.

3. ConsistencyCheckForStores. This runs a consistency check on the blob stores associated with the RBS database. For this option to work, the provider needs to implement consistency checks (its ConsistencyCheckLevel must not be None). This is a new feature added in RBS 2008 R2 and has been implemented by the RBS Filestream Provider. Config keys that can be used to tune it are: . Examples:

o This runs a consistency check on the two specified blob stores (if their providers support it).

4. Maintenance. This does some maintenance on RBS internal tables, which mainly consists of reorganizing indexes. It is a good idea to run maintainer with this operation once in a while to ensure good performance. Examples:

5. ForceFinalize. RBS Maintainer is designed to make incremental progress on garbage collection operations. If a phase of GC is not completed before the allotted time is up, the progress is saved and picked up again the next time RBS Maintainer is run for that operation. This mechanism is explained in [2] as well. Sometimes you may want to “forget” this saved information about in-progress garbage collections and start fresh. The ForceFinalize operation serves that purpose: it removes saved information about one or more GC phases. The next time maintainer is run with those GC phases, they start from the beginning. The command line switch GarbageCollectionPhases is required for this operation. Examples:

o This does ForceFinalize for the RS and DP phases of garbage collection.

Options:

TimeLimit. Currently this is the only option available. If this is specified, RBS maintainer will try to stop after the specified duration even if all its tasks are not complete. If it is not specified, maintainer will continue to run until all the specified tasks (operations) are completed. This option allows scheduling RBS maintainer to run during regular maintenance windows (e.g. from midnight to 2 AM) and stop at a predictable time. If maintainer had to stop before the tasks were completed, they will be picked up the next time maintainer is run with those operations specified. [2] talks about how this option interacts with some config keys.

You can combine multiple operations from above in one command line. You can specify up to 3 operations at one time (the usage message pasted above says up to 4, but there is currently a bug that limits it to 3). Examples:

o This runs the 3 phases of garbage collection, a consistency check on RBS metadata only with an attempt to repair, and maintenance of RBS indexes, with a time limit of 2 hours for all of this processing. If the tasks don’t complete in approximately 2 hours, maintainer saves its progress and stops.
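The rules described above (at most 3 operations per run, GarbageCollectionPhases required for GarbageCollection and ForceFinalize, ConsistencyCheckMode required for ConsistencyCheck) can be captured in a small Python helper that assembles an argument list before you launch maintainer. This is only a sketch: the helper name is made up, and the leading dash on each switch as well as the `-Operation` switch name are assumptions, so check them against the usage message that maintainer prints when run without parameters.

```python
from typing import List, Optional, Sequence

KNOWN_OPERATIONS = {
    "GarbageCollection",
    "ConsistencyCheck",
    "ConsistencyCheckForStores",
    "Maintenance",
    "ForceFinalize",
}
MAX_OPERATIONS = 3  # the post reports a bug that limits one run to 3 operations
NEEDS_GC_PHASES = {"GarbageCollection", "ForceFinalize"}


def build_maintainer_args(
    connection_string_name: str,
    operations: Sequence[str],
    gc_phases: Optional[str] = None,
    consistency_check_mode: Optional[str] = None,
    consistency_check_extent: Optional[str] = None,
    time_limit: Optional[str] = None,
) -> List[str]:
    """Validate the combination of operations and return an argument list.

    Switch spelling (leading dash, "-Operation") is an assumption; values
    are passed through unchanged in whatever format maintainer expects.
    """
    if not operations:
        raise ValueError("at least one operation is required")
    unknown = set(operations) - KNOWN_OPERATIONS
    if unknown:
        raise ValueError(f"unknown operations: {sorted(unknown)}")
    if len(operations) > MAX_OPERATIONS:
        raise ValueError(f"at most {MAX_OPERATIONS} operations per run")
    if NEEDS_GC_PHASES & set(operations) and not gc_phases:
        raise ValueError("GarbageCollectionPhases is required for this operation")
    if "ConsistencyCheck" in operations and not consistency_check_mode:
        raise ValueError("ConsistencyCheckMode is required for ConsistencyCheck")

    args = ["-ConnectionStringName", connection_string_name, "-Operation", *operations]
    optional_switches = [
        ("-GarbageCollectionPhases", gc_phases),
        ("-ConsistencyCheckMode", consistency_check_mode),
        ("-ConsistencyCheckExtent", consistency_check_extent),
        ("-TimeLimit", time_limit),
    ]
    for switch, value in optional_switches:
        if value:
            args += [switch, value]
    return args


# Garbage collection (RS and DP phases) plus index maintenance in one run.
example_args = build_maintainer_args(
    "RBSMaintainerConnection", ["GarbageCollection", "Maintenance"], gc_phases="rd"
)
print(" ".join(example_args))
```

Validating before launching means a mistake such as a missing GarbageCollectionPhases value is caught in the scheduling script rather than at the start of a maintenance window.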

Monitoring Progress

The view mssqlrbs.rbs_blob_details shows a list of all the blobs that RBS thinks are currently in use by the application. Running the Reference Scan (r) phase of GC moves unreferenced blobs out of this view and marks them internally for deletion during the Delete Propagation (d) phase, which removes them after the backup/restore SLA time window (garbage_collection_time_window) has elapsed. The view mssqlrbs.rbs_consistency_issues shows the list of consistency issues found by the maintainer. For measuring performance, troubleshooting or just for curiosity, you can look at the views mssqlrbs.rbs_history and mssqlrbs.rbs_counters to see the progress of maintainer tasks.

Scheduling RBS Maintainer

RBS does not come with its own way of scheduling maintainer. RBS maintainer is a standalone executable and you will need to schedule it yourself. One way to do that is to use Windows Task Scheduler - this is what the RBS installer brings up if you select that option while installing RBS. [1] has some sample steps to schedule it using Windows Task Scheduler. Another way to schedule it is to have a job in SQL Server Agent. You are also free to use your own mechanisms to do the scheduling.
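If you roll your own scheduling, keep in mind that one run of maintainer covers one database, so a scheduled script has to invoke it once per connection string. Here is a minimal Python sketch of such a wrapper. The executable name is inferred from the config file name mentioned earlier, the `-ConnectionStringName` switch spelling is an assumption, and the function name and the second connection string name are made up for the example.

```python
import subprocess
from typing import Callable, Dict, List, Sequence

# Inferred from the config file name (Microsoft.Data.SqlRemoteBlobs.Maintainer.exe.config);
# use the full path of your installation in practice.
MAINTAINER_EXE = "Microsoft.Data.SqlRemoteBlobs.Maintainer.exe"


def run_for_each_database(
    connection_string_names: Sequence[str],
    extra_args: Sequence[str],
    run: Callable[[List[str]], int] = subprocess.call,
) -> Dict[str, int]:
    """Run maintainer once per connection string, one after another.

    Returns the exit code of each run keyed by connection string name.
    The `run` parameter exists so the launcher can be replaced in a dry run.
    """
    results: Dict[str, int] = {}
    for name in connection_string_names:
        # The switch spelling below is an assumption; verify it against the usage message.
        command = [MAINTAINER_EXE, "-ConnectionStringName", name, *extra_args]
        results[name] = run(command)
    return results


# Dry run: collect the commands instead of launching the executable.
planned_commands: List[List[str]] = []


def record_only(command: List[str]) -> int:
    planned_commands.append(command)
    return 0


dry_run_results = run_for_each_database(
    ["RBSMaintainerConnection", "SecondDatabaseConnection"],  # second name is a placeholder
    [],  # add the operation switches for your maintenance tasks here
    run=record_only,
)
for command in planned_commands:
    print(" ".join(command))
```

Running the databases one after another keeps the load predictable; if each run is given a time limit, the total window is roughly the sum of the per-database limits.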

References

[1] http://sqlrbs.codeplex.com/Thread/View.aspx?ThreadId=204627 - this thread has sample steps to add encrypted connection strings to the maintainer config file and to schedule the maintainer task using Windows Task Scheduler.

Note: running maintainer once will only perform the tasks on one database. So if you have multiple databases, you will need to run maintainer multiple times (once for each database).