Data Is Not a Zero Sum Game


Back in ancient times, when print directories walked the earth, one of the most surprising things I learned was that people were willing to pay meaningful amounts of money for information that wasn’t very good. And this wasn’t reluctant willingness; these buyers were just short of cheerful. How many businesses exist where your customer tells you your product stinks, and in the same breath excuses you because what you are doing “is such hard work”? One more reason to love the data business!

But, you may be thinking, those days disappeared with print directories. I’m not so sure about that. What I am seeing is a fascinating bifurcation of the market. On the one hand, you have laser-focused data products with pristine datasets that command enormous prices. On the other hand, you have massive databases, often consisting of harvested data, that share many characteristics with the old print directories. The primary difference is one of scale.

Think about the number of new data products with one million, ten million or even 100 million records in them. At such a scale, they are almost certainly relying heavily, if not exclusively, on technology. And that means some records will be misclassified, others will be missing key fields, and still others will be a confusing jumble because the source content couldn’t be normalized properly. And let’s not forget that companies harvesting website data inherently know nothing about the estimated 30% of all companies with placeholder websites or no websites at all. Yet what you hear from paid subscribers to these databases is that familiar refrain, “it’s not great, but it’s good enough for what I’m doing.”

At the same time, we are seeing a number of much smaller, deeper and more precise data products entering the market as well. These products tend to offer analysis and workflow capabilities, and often feed high-stakes business decisions and high-ticket selling.

Are we poised for a shake-out? Will there be winners and losers? I think there will be room for everybody. Having the most data doesn’t make you an automatic winner. Having the deepest data doesn’t knock out all your competitors. It all comes down to your intended market, and how you bring your data to market.

There still seems to be a large and active value segment of the market, those who will be happy with “good enough” data in exchange for a reasonable price. At the same time, there are customers who will pay remarkably high prices for data they can depend on, because it’s driving some critical business activity. And to the extent you differentiate your data through your user interface and data manipulation tools, you can often define still another market that wants to powerfully interact with your data.

My take-away is that the data business is increasingly not about winners and losers. Multiple companies with largely similar data can exist and succeed by having differing price points, levels of coverage and degrees of accuracy. The front-end you provide to your data can be customized to appeal to specific market segments as well.

It’s hard to definitively assess your competitors in the infinitely malleable world of data, but at the same time it’s increasingly clear that this is not even close to being a winner-takes-all business. This does not imply that you can be sloppy about your business; indeed, it makes it all the more important that you deeply understand your customers, how they are using your data, and where you fit into the market.
