
Part 3: Know Your Data, Measure Your Data, Judge Your Data

In my previous articles, I discussed the richness and completeness of data; this final article covers the third key to good identity verification data: accuracy. Accuracy can be the most difficult to achieve, yet it is often the most coveted metric. To truly know your customers, a thorough identity assessment is needed at the time of the transaction, application, or inquiry, layering in and leveraging dynamic data. To succeed with real-time identity verification, the data needs to be not only rich and complete, but, most importantly, accurate.

Accuracy is what everybody is interested in. It is also the most difficult quality to measure, because measuring it always means comparing our data to the real world. The real world changes rapidly, and this comparison can range from "expensive to do well" to "nearly impossible to do at all." Ultimately, comparison to the real world is the only legitimate definition of accuracy, despite what you might hear and what other vendors may claim. We often hear anecdotes about providers claiming 97% accuracy; this frequently just means that if you put a record into their database, you would get the same record back about 97% of the time. That round-trip consistency tells you nothing about whether the data reflects the person's current phone number, address, associated people, and so on.

We do data differently in the Whitepages Identity Graph™ by sourcing, synthesizing, and applying our data science to provide the most up-to-date and accurate non-PII data. We are a data synthesis company, which makes the data better for layered identity assessment because it allows for corroboration. Corroboration across data sources is what provides confidence in greater accuracy. We continually test the accuracy of our data with algorithms, but more importantly, we also test the accuracy of our algorithms with a human element. We have real people place calls against the data set to determine whether the identity we have associated with a given number, name, or address is indeed who we think it is. This method lets us constantly learn and refine our algorithms so we can serve up accurate data. With a single-source data point, you might see only around 65% accuracy for a specific data attribute. However, if you use a data vendor who sources, synthesizes, and corroborates across a multitude of sources, accuracy rates can climb above 90%.
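The corroboration idea described above can be sketched as a simple vote across sources: the more independent sources that agree on an attribute, the higher the confidence in it. This is a minimal illustration, not Whitepages' actual method; the source names, records, and confidence formula are all hypothetical.

```python
from collections import Counter

def corroborate(values):
    """Majority-vote corroboration: return the value most sources agree on,
    plus a confidence score (the share of sources that reported it)."""
    counts = Counter(v for v in values if v is not None)
    if not counts:
        return None, 0.0
    value, votes = counts.most_common(1)[0]
    return value, votes / len(values)

# Hypothetical records for one identity from three independent sources.
sources = {
    "source_a": {"phone": "206-555-0142"},
    "source_b": {"phone": "206-555-0142"},
    "source_c": {"phone": "206-555-0199"},  # stale record
}

phone, confidence = corroborate([r.get("phone") for r in sources.values()])
print(phone, round(confidence, 2))  # 206-555-0142 0.67
```

A single stale source would have given the wrong number outright; with two of three sources agreeing, the corroborated value wins and the disagreement is surfaced as a lower confidence score rather than hidden.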

As a data company, we have to create meaningful, honest measures that accurately represent our data, and we ask the same of others. If you truly want rich, complete, and accurate data, then you either need to build your own database or use vendors that:

are methodical in their approach to sourcing, synthesizing and measuring the data

understand what the data is supposed to be representing in the real world

understand how well the data represents what it is supposed to, so it is useful for identity verification

As lenders, insurance carriers, retailers, and travel providers move to meet the demands of their mobile customers, they need to re-evaluate legacy identity assessment requirements and workflows to support these transactions and applications. Given the increase in breaches and compromises of personal data, simply adding more security doesn't always work. Analysts recommend a layered approach to identity assessment, with a focus on rich, complete, and accurate dynamic identity data.