Normalize the data before the merge, e.g. pass the addresses through the USPS API so that variants such as Av, Ave and Avenue all become Ave.
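As a rough illustration of the idea, here is a minimal sketch that normalizes only the trailing street suffix with a local lookup table. It is a stand-in, not the USPS API: the `STREET_SUFFIXES` table and the `normalize_street_suffix` helper are hypothetical names made up for this example, and a real address service handles far more cases.

```ruby
# Hypothetical lookup table (an assumption for this sketch, not USPS data).
STREET_SUFFIXES = {
  "av"     => "Ave",
  "ave"    => "Ave",
  "avenue" => "Ave"
}.freeze

# Replaces the last word of the address with its canonical suffix, if known.
# Unknown suffixes are left untouched.
def normalize_street_suffix(address)
  words = address.strip.split(/\s+/)
  return "" if words.empty?

  key = words.last.downcase.delete(".")
  words[-1] = STREET_SUFFIXES.fetch(key, words.last)
  words.join(" ")
end

puts normalize_street_suffix("12 Main Av")
puts normalize_street_suffix("12 Main Avenue")
```

Running both sides of the merge through the same normalizer before comparing means the merge only has to match on the canonical form.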

Humans do this best: outsource it or put it on Mechanical Turk.

Interesting Things

The after_commit plugin lets you hook callbacks that run after the transaction commits. This is really useful when kicking off threads that expect the data to already be in the database. Note: using after_save instead can cause a race condition if the other thread tries to read the data before the original thread has had a chance to commit the transaction.
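The timing difference can be sketched with a toy model. This is plain Ruby, not ActiveRecord or the plugin itself: `ToyDatabase` and its `transaction` method are hypothetical, and `db.read` stands in for what another thread or process would see. The only assumption is that writes become visible to other readers at commit time.

```ruby
# Toy model (an assumption for illustration): pending writes are invisible
# to other readers until the transaction commits.
class ToyDatabase
  def initialize
    @committed = {}
  end

  # What a different thread or connection would see.
  def read(key)
    @committed[key]
  end

  # Yields a hash of pending writes, then runs the hooks around the commit.
  def transaction(after_save: nil, after_commit: nil)
    pending = {}
    yield pending
    after_save&.call              # still inside the open transaction
    @committed.merge!(pending)    # the commit
    after_commit&.call            # data is now visible to everyone
  end
end

db = ToyDatabase.new
seen = {}

db.transaction(
  after_save:   -> { seen[:after_save]   = db.read(:order) },
  after_commit: -> { seen[:after_commit] = db.read(:order) }
) { |pending| pending[:order] = "paid" }

p seen
```

In the toy model the after_save hook reads nothing because the write is not committed yet, while the after_commit hook sees the row. A background worker launched from each hook would be in the same position.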

If you are storing a marshaled object in the database, make that field a blob type. A blob stores the raw bytes as they are, whereas a text or varchar column can corrupt the binary data you put in it. If you don’t have a choice about field types, at least base64 encode the marshaled data before storing it; the encoded form is about a third larger, but it is plain ASCII and survives a text column.
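A minimal sketch of the base64 fallback, assuming the encoded string is what gets written to the text column. The `payload` hash is made-up sample data. The `"m0"` pack directive is core Ruby and produces the same output as `Base64.strict_encode64` from the standard library.

```ruby
# Made-up sample data for this sketch.
payload = { "id" => 42, "tags" => ["a", "b"] }

# Marshal.dump returns a binary (ASCII-8BIT) string, which is what
# text and varchar columns can mangle.
binary = Marshal.dump(payload)

# "m0" is strict base64 with no line breaks: ASCII only, safe for a text column.
encoded = [binary].pack("m0")

# Reading it back: decode first, then unmarshal.
restored = Marshal.load(encoded.unpack1("m0"))

puts encoded.ascii_only?
puts restored == payload
```

Only call `Marshal.load` on data your own application wrote, since loading untrusted marshaled data can execute arbitrary code.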

ctrl+z

It seems that NewRelic was not the cause of the problem, but it made the problem worse by holding the transaction open long enough to expose a race condition that still shows up when the system is put under enough load. To fix our problem we moved the trigger that launches the background process from an after_save to an after_commit (see the after_commit plugin). We also re-added NewRelic.