Monday, October 6, 2008

MS CRM Data Import (Cleaning up the Mess)

Before importing data into CRM it, I think it is important to know how to get it back out, how long it will take to get it back out, or how to get back to a clean state.

Why is that?

1. Time Estimation! Creating a CRM record is FAST. Deleting a CRM record can be VERY slow! There is a reason for this. When you create a new Account, there is a new record added to a Base and Extension base table. That's pretty simple! When you delete an Account, CRM checks for all of the dependencies and potentially deletes a number of child records in other tables, verifies that there are no records that restrict the deletion of a parent and finally deletes the Account.

To drive this home, using my import framework, I recently imported about 34,000 contacts from an Access database into a CRM system running on a VPC image on my laptop. It took 24 minutes to import these contacts, while updating associated accounts with primary contacts where applicable. However it took about 8 hours just to delete these Contacts! Granted, a VPC image running off of a single core of a laptop is going to be very slow in comparison to most production systems, but many imports are MUCH larger than 34,000 contacts.

2. Some data doesn't want to be deleted! Some CRM entities end up in a state where the CRM web service can not delete them. Let's say that you import a company's Contracts with the expiration dates set. MS CRM being the smart system that it is will start changing the state of all contracts with a past expiration date to "Expired". This is a neat feature, but now you can not delete that entity. In fact you can not even set the state of that contract back to draft using web services so that you can delete it. There is a work around for this, but it is unsupported and requires using SQL.

3. Data must be removed in sequence! Because there are lots of relationships between entities in CRM, removing data typically means deleting data in the reverse order of your imports. So that the last things that you can typically delete are the Contacts and then the Accounts. You must delete Quotes, Orders, and Invoices before the related Opportunities, etc. You must be aware of the data's hierarchy.

4. The Deletion Service. When you delete an entity using the CRM web service, the entity is being marked with a flag to be deleted after the Deletion Service runs. With MS CRM 4.0 that happens once a day by default, however you can use the CRM 4 Scale Group Job Editor to adjust that. With CRM 3.0 the default was every two hours and it could be adjusted with a registry setting. Knowing when this service runs is important if you want to back up a clean database.

Best Practices - Data Import Cleanup

Strong Suggestion! Use a Test system or VPC image for data import testing. Some of the methods I am going to cover for data cleanup are unsupported by MS. It is much better than you do all of your import testing on something other than the production CRM system.

The Quick Reset to Ground Zero

When your base configuration and data are clean, and the Deletion service has finished running, back up the database pair MSCRM_CONFIG, and <BaseBUName>_MSCRM to a safe place.

When you need to clean out your imported data, export your Customizations to a safe area then restore both databases and import and publish your customizations.

This is a great way to do things as long as you can keep track of any configuration changes between resets. However sometimes you may have hours invested in getting everything imported to a certain state and you just need to remove just the last thing that you imported and it would take more time to reset your system and import everything up to that point.

Proceed at your own risk! To reset your database bringup SQL Server Management Studio under Management-> activityMonitor to find the processes blocking your CRM database. Stop all of the CRM Services and IIS, then refresh the activityMonitor to see what is left. If there is anything left, then kill that process and when the activityMonitor is clear of references to your databases you can restore like below.

This is when knowledge of the relationships between your entities and the requirements to put an entity into a state that allows deletion are important.

Example: Cleaning up Contracts

Below you can see the BackgroundWorker DoWork method called from my CRM Import Winform Application. Every Import has a "Delete All" button, and a progress bar to allow an easy check the progress by glancing over at it.