How Clean is Your Data?

By Bob Miller

August 30, 2004 -- This is not an idle question, as clean data is usually better-quality data, which helps make for better VDP documents. Having described at some length how data gets dirty, I think it’s time to talk about how clean data comes about. As always, this is a practical view, based on my own experience. I break the cleanup process into five categories.

Get It Right the First Time

What a concept! I noticed years ago that every time I called Lands’ End to place an order they wouldn’t let me get to first base until they had my customer number. Of course, the structure of their business at that time assured that they had correct billing and delivery information for me, at least as of my last order. However, establishing my unique customer number was essential, or else they might have wound up with multiple Bob Millers.

If you can pull it off, this is by far the best way to clean data. But you’ve got to build your business around it. And of course there’s not always a lot you as a printer can do to help your customers get there, although persistent education can help.

Intelligent Data Cleanup

Okay, let’s say Lands’ End didn’t do such a good job getting my customer number and wound up with multiple records for me. They could match the customer records with order history and find that the orders came from different addresses at different times, with some continuity in credit card numbers. It would be reasonable to think that they are really the same Bob Miller moving around. They could then remove all but the address from which the most recent orders were placed.

In general, what I’m talking about is using business-specific knowledge to correct errors. This almost always requires custom programming.
It can, however, be performed relatively independently of day-to-day operations, and could be offered as a service by digital printers.

Merge/Match/Purge

Now we’re getting into the world of standard offerings, as evidenced by the fact that MMP is a recognized acronym. This consists of combining different lists or databases by doing some standard data cleanup and standardization, matching on common keys, and removing duplicate records. A quick internet search will turn up a number of companies offering this service and/or software you can use to do it yourself. Many of the service bureaus offer custom programming, enabling them to be as intelligent as you or your customer can make them.

CASS

Presumably you’re all familiar with the USPS’s CASS (Coding Accuracy Support System) certification process. This weeds out addresses that are undeliverable and cleans up the rest so they are in a standard format. Again, software and service bureaus abound.

Be aware, however, that if you feed garbage data to a CASS process, what you will wind up with is deliverable garbage data. Your goal, after all, is to mail to people who are alive, living at that address, and really do have the characteristics that drive their personalized materials. If you haven’t got that right before CASS, then CASS will not get it right for you.

Go Back and Fix It

Quite simply, whatever you can do to clean up your customer’s customer data should somehow make it back to their source database. Data cleanup is expensive. It’s a shame to have to do it at all, but it’s a real crime to keep doing it over and over again. We are a very goal-oriented society and there is, in my experience, a universal tendency to get through the campaign and then go right on to the party (or wake, as the case may be).
I strongly urge you to push your customers to build a feedback loop that integrates their nice clean data back into the source data. It will save them money and make you look a lot less expensive for the next campaign.

Printers are in a great position to quantify the cost of data problems. For example, undeliverable documents raise the cost of production and waste money. Inaccurate or poorly targeted documents not only deliver a low response rate, they can actually have a negative impact on a customer's business by making the company look inept.

Providing solutions for such problems is certainly within the capabilities of most digital printers. If you already have database-savvy staff you can expand what they do to more detailed work with customer data. If you don't, you may want to add such employees or contract database work to firms with the expertise you need. You can also do both, outsourcing more complex work while keeping easier jobs in-house.

I think it’s a natural next step for digital printers to offer comprehensive database solutions to their customers, and one that can help differentiate your company from its competitors.
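
For readers who want to see what the merge/match/purge step described earlier looks like in practice, here is a minimal Python sketch. It is an illustration under stated assumptions, not how any particular vendor or service bureau does it: the field names (name, address, postal_code, last_order), the choice of name plus postal code as the match key, and the rule of keeping the record with the most recent order are all hypothetical. Real MMP products typically use much more forgiving matching than this.

```python
from datetime import date


def normalize(value):
    """Lowercase, trim, and collapse whitespace so trivial spelling
    differences do not prevent a match."""
    return " ".join(value.lower().split())


def merge_match_purge(*lists):
    """Merge several lists, match on (name, postal code), and purge
    duplicates by keeping the record with the most recent order.

    The match key and the "most recent wins" rule are assumptions
    made for this sketch."""
    survivors = {}
    for records in lists:
        for rec in records:
            key = (normalize(rec["name"]), normalize(rec["postal_code"]))
            best = survivors.get(key)
            if best is None or rec["last_order"] > best["last_order"]:
                survivors[key] = rec
    return list(survivors.values())


# Made-up sample data: the same customer appears in both lists.
house_list = [
    {"name": "Bob Miller", "address": "12 Oak St",
     "postal_code": "01234", "last_order": date(2003, 5, 1)},
]
rented_list = [
    {"name": "  bob  MILLER", "address": "12 Oak Street",
     "postal_code": "01234", "last_order": date(2004, 2, 9)},
    {"name": "Ann Jones", "address": "7 Elm Ave",
     "postal_code": "05678", "last_order": date(2004, 1, 15)},
]

cleaned = merge_match_purge(house_list, rented_list)
for rec in cleaned:
    print(rec["name"], "|", rec["address"])
```

Note that a fixed key like this would not catch the Bob Miller who moves between addresses in different postal codes. That case needs the business-specific knowledge discussed under Intelligent Data Cleanup, such as order history or credit card continuity.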