Data Is Not a Zero Sum Game

Back in ancient times, when print directories walked the earth, one of the most surprising things I learned was that people were willing to pay meaningful amounts of money for information that wasn’t very good. And this wasn’t reluctant willingness; these buyers were just short of cheerful. How many businesses exist where your customer tells you your product stinks, and in the same breath excuses you because what you are doing “is such hard work?” One more reason to love the data business!

But, you may be thinking, those days disappeared with print directories. I’m not so sure about that. What I am seeing is a fascinating bifurcation of the market. On the one hand, you have laser-focused data products with pristine datasets that command enormous prices. On the other hand, you have massive databases, often consisting of harvested data, that share many characteristics with the old print directories. The primary difference is one of scale.

Think about the number of new data products with one million, ten million or even 100 million records in them. At such a scale, they are almost certainly relying heavily if not exclusively on technology. And that means records will be misclassified, missing key fields, or a confusing jumble because the source content couldn’t be normalized properly. And let’s not forget that companies harvesting website data inherently know nothing about the estimated 30% of all companies with placeholder websites or no websites at all. Yet what you hear from paid subscribers to these databases is that familiar refrain, “it’s not great, but it’s good enough for what I’m doing.”

At the same time, we are seeing a number of much smaller, deeper and more precise data products entering the market as well. And these products tend to offer analysis and workflow capabilities, and often feed high stakes business decisions and high ticket selling.

Are we poised for a shake-out? Will there be winners and losers? I think there will be room for everybody. Having the most data doesn’t make you an automatic winner. Having the deepest data doesn’t knock out all your competitors. It all comes down to your intended market, and how you bring your data to market.

There still seems to be a large and active value segment of the market, those who will be happy with “good enough” data in exchange for a reasonable price. At the same time, there are customers who will pay remarkably high prices for data they can depend on, because it’s driving some critical business activity. And to the extent you differentiate your data through your user interface and data manipulation tools, you can often define still another market that wants to powerfully interact with your data.

My take-away is that the data business is increasingly not about winners and losers. Multiple companies with largely similar data can exist and succeed by having differing price points, levels of coverage and degrees of accuracy. The front-end you provide to your data can be customized to appeal to specific market segments as well.

It’s hard to definitively assess your competitors in the infinitely malleable world of data, but at the same time it’s increasingly clear that this is not even close to being a winner-takes-all business. This does not imply that you can be sloppy about your business; indeed it makes it all the more important you deeply understand your customers, how they are using your data, and where you fit into the market.

Monetizing Your Unfair Advantage

In the news today was the announcement that BusinessWire, a press release distribution company owned by Warren Buffett’s Berkshire Hathaway, had decided to stop offering direct access to its press releases to high frequency traders. This follows on the heels of a decision by Thomson Reuters not to sell advance access to market-moving economic data that it publishes. I find myself concerned about these decisions. That’s in part because what these two companies were doing was actually quite different. And as you dive into the details, you start to see issues that a broader range of data publishers may ultimately have to confront.

The Thomson Reuters situation involves two indexes: Consumer Confidence and the ISM Manufacturing Index. These are both major indexes that can and do influence the stock market broadly. In both cases, Thomson Reuters had licensed the rights to publish them. Nobody disputes that Thomson Reuters should have the right to monetize these indexes. But one particular aspect of this monetization raised concerns. Thomson Reuters openly offered to sell access to these indexes either a few seconds or a few minutes before they were released to the public. That’s more than enough time for computerized trading systems to analyze the news and place buy or sell orders accordingly. And by the way, it’s all legal, and Thomson Reuters wasn’t hiding any of these arrangements. But is it fair?

The BusinessWire case is even more innocuous. BusinessWire is in the business of pushing out press releases far and wide. To that end it offers direct electronic access to anyone who might benefit from it. Some smart traders figured out how to take that innocent feed, process it, and make buy and sell decisions on it very quickly. BusinessWire was just going about its business. Third parties figured out how to profit from its activities, with no help or encouragement from BusinessWire. And while press releases don’t sound that interesting, keep in mind that they are the way many public companies first announce big events such as acquisitions.

I’m not a lawyer, so there may be nuances to this I am missing, but I understand that public policy recognizes the value of a level playing field when it comes to the stock markets, in part to build confidence. And as an individual investor, I don’t think providing advance peeks to savvy stock traders feels right. But as an information professional, my view is why not? The entire B2B information industry largely exists to provide unfair advantage. In fact, I know data publishers who have seriously considered variants of “Your Unfair Advantage” as corporate tag lines.

Given the murkiness of the legal issues, I think it’s fair to conclude that both companies stopped these activities primarily for reputational reasons. And that’s important to think about. These two events are very different, but you’d never know that from a quick scan of the headlines they generate. Our products are complex, sophisticated and nuanced. Typically, they are used by a range of users in a range of ways. You can’t – and shouldn’t – police what users do with your data. But you should put some thought into how you position your data and its uses, especially if there is potential to use your data for stock trading. It’s too easy to get painted as the bad guy even if you’ve done nothing wrong.

The bottom line is that as data becomes more powerful and important, we’re all going to receive more scrutiny. And the complexity of our products works against us in the media. That’s why sensitivity to how we present our data products is going to become increasingly important. And if yours is one of the companies considering a tag line that includes the words “unfair advantage,” may I politely suggest a re-think?

The Billion Prices Project

Last week, I discussed how the Internet of Things creates all sorts of potential opportunities to create highly valuable, highly granular data. The Billion Prices Project, which is based at MIT, provides another route to the same result. Summarized very simply, two MIT professors, Alberto Cavallo and Roberto Rigobon, collect data from hundreds of online retailers all over the world to build a massive database of product-level pricing data, updated daily. It’s an analytical goldmine that can be applied to solve a broad range of problems.

One obvious example is the measurement of inflation. Currently, the U.S. Government develops its Consumer Price Index inflation data the old-fashioned way: mail, phone and field surveys. And inherently, this process is slow. Contrast that with the Billion Prices Project, which can measure inflation on a daily basis, and do so for a large number of countries.
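To make the idea of a daily inflation reading concrete, here is a minimal sketch in Python. It is not the Billion Prices Project's actual method, which this post does not describe. It assumes a simple unweighted approach, the Jevons index, which is the geometric mean of price ratios for products seen on both days. The function name `jevons_index` and the sample products are hypothetical.

```python
from math import exp, log


def jevons_index(prices_base, prices_today):
    """Return a price index (base day = 100) from two product->price dicts.

    Assumption: an unweighted geometric mean of price relatives (a Jevons
    index), computed only over products present with a positive price on
    both days. Real-world indexes typically add expenditure weights.
    """
    common = [
        product
        for product, base_price in prices_base.items()
        if base_price > 0 and prices_today.get(product, 0) > 0
    ]
    if not common:
        raise ValueError("no overlapping products to compare")
    # Average the log price ratios, then exponentiate: a geometric mean.
    mean_log_ratio = sum(
        log(prices_today[p] / prices_base[p]) for p in common
    ) / len(common)
    return 100.0 * exp(mean_log_ratio)


# Hypothetical scraped prices for two consecutive days.
yesterday = {"milk": 2.00, "bread": 4.00}
today = {"milk": 2.20, "bread": 4.00, "eggs": 3.00}  # eggs is new, so ignored

index = jevons_index(yesterday, today)
print(f"Index today (yesterday = 100): {index:.2f}")
```

Products that appear on only one of the two days are skipped here. That is a simplification, and it is exactly the kind of incompleteness a scraped dataset has to tolerate.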

But measuring inflation is just the beginning. The Billion Prices Project is exploring a range of intriguing questions, such as the premiums that are charged for organic foods and the impact of exchange rates on pricing. You’re really only limited by your specific business information needs – and your imagination.

The Billion Prices Project also offers some useful insights for data publishers. First, the underlying data is scraped from websites. The Billion Prices Project didn’t ask for it or pay for it. That means you can build huge datasets quickly and economically. Second, the dataset is significantly incomplete. For example, it entirely ignores the huge service sector of the economy. But it’s better than the existing dataset in many ways, and that’s what really matters.

When considering building a database, new web extraction technology gives you the ability to build massive, useful and high quality datasets quickly and economically. And as we have seen time after time, the old aphorism, “don’t let the perfect be the enemy of the good” still holds true. If you can do better than what’s currently available, you generally have an opportunity. Don’t focus on what you can’t get. Instead, focus on whether what you can get meaningfully advances the ball.

It's an Internet Thing

The Internet of Things (IoT) is, as buzzwords go, pretty easy to understand: it describes the concept of connecting things (other than computers) to the Internet. You may have heard one popular example: in the not-too-distant future, your Internet-connected refrigerator determines your orange juice is running low, and automatically places an online order to have more delivered to your doorstep. We’re a bit away from this scenario, but inching closer every day. The automobile companies in particular have been actively exploring ways for your car to alert you via email or text when it needs service or other attention. This is a clear, obvious and powerful example that you’ll soon see in dealer showrooms.

But is IoT strictly a consumer phenomenon? I think not. There are potentially huge opportunities to bring the concept of IoT to the world of business. And I think the data that can be collected by these devices will in many cases be organized and sold by data publishers.

As I have said repeatedly, data publishers are natural organizers of data for vertical markets because they’re neutral, trusted players in their markets and, importantly, they’re already doing it. Moving from tracking a company and its people to tracking the location of a company’s equipment really isn’t that big a stretch. Consider Lloyd’s, which tracks the exact position of all cargo ships at sea, and Drilling Information, which tracks the location of drilling rigs. There’s a lot of value in knowing where things are if someone needs fast access to them. Extend that thinking a little bit, and you can start to see the opportunities. And some data is even more valuable when it is centralized and organized. That’s the traditional role and strength of data publishers.

Another tantalizing example can be found in the 2010 Model of Excellence company Spiceworks. This company offers software that helps companies manage their computer networks – and everything connected to them. Spiceworks not only knows the make and model of every printer owned by hundreds of thousands of companies, it even knows when they’re running low on toner, and all in real time. Think of how many different ways you could monetize data like this! And as just one more example, there’s a big push in agriculture right now to use sensors that monitor moisture and other conditions in farmers’ fields. We’ve moved rapidly from first collecting information about farms, to collecting information about the crops produced at these farms, to collecting information about the soil that produces the crops at these farms. It’s about as granular as you can get, and best of all, it’s collected by devices and sensors, meaning low cost and high accuracy.

Of course, not every shipping company or farmer will want to have the intimate details of their businesses tracked and reported to others. But here again, a central repository can return valuable data to those who contribute, including performance benchmarks or other useful trend data. Indeed, that’s the big goal driving the push for electronic health records – the ability to tap into large pools of data to find patterns that will make healthcare providers smarter and more productive.

The Internet of Things really is as big as our imaginations and it’s happening now. And like so many things on the Internet, the biggest opportunities go to those who move fast and early. That too is an Internet thing.
