You can’t be a successful data publisher unless you’re selling high-quality data. It’s not just that customers want value for their money; they increasingly depend on third-party data for their own business success, which makes choosing a data provider a high-stakes decision.

Even if your product has the slickest interface, the most granular and well-structured data and the best integration tools, there is nothing that matters more than the data itself. That’s why data quality is an integral part of your sales pitch. But quality is so easy to claim, and so hard to prove.

Anyone who has ever sold a data product knows that moment of dread when the prospect asks to check its quality by looking up his or her own information. Yes, it’s a way to test quality, but it’s not a good one. Taking a single record, one that the prospective customer happens to know more about than almost anyone else in the world, and projecting the results onto a database with thousands if not millions of records is inherently imprecise. But what’s a buyer of data to do?

Clearly, a third-party audit of data products would fill an important need. The major circulation audit agencies have taken a stab at it, but with what I would argue are weak methodologies and a lack of commitment to their offerings. Now there’s a brand-new initiative called the Data Quality Labeling Standards program. It’s being pushed by the Data & Marketing Association (formerly the Direct Marketing Association) and has a vision of providing a report card on different datasets akin to the FDA nutrition label on a food product.

While I wish this venture success, the difficulty of the undertaking shouldn’t be underestimated. It starts with the simple but profound question of “what’s a database?” When you look at the range of data-driven products on the market today, that’s a surprisingly difficult question. The discussion gets even more complicated as you look for consistent and comparable measures across wildly varying datasets. Most complex of all are the inherent value judgments that have to be addressed when you discover a particular dataset has, for example, really good revenue data but mediocre contact data. That’s when it becomes clear that a dataset’s quality is in many cases a function of how the data will be put to use.

It may be the biggest conundrum of the data business: quality is everything, yet quality is difficult to assess. Third-party assessments, much as I like them in concept, may just be too difficult to implement. The best answer remains the simplest: if you believe in the quality of your data, let prospective buyers put it to work on a test basis for a week or a month and let the results speak for themselves.