Finding Business in Discovery

I have spent a lot of time recently with the website Artsy.com. That’s in part because I find it fascinating, and in part because it is trying to accomplish so many things at once. At one level, it’s a fine art discovery service. Artsy has partnered with art galleries nationwide to allow them to post images and descriptions of artwork for sale. That’s a great convenience for someone looking to buy art, but it’s not a new idea. A number of sites already offer something similar.

Where Artsy separates itself from the pack is with its “Art Genome Project.” Essentially, Artsy is categorizing artwork against a taxonomy much as Pandora has done for music. And that’s where the magic happens. If you find a piece of art you like, you can easily explore other artwork with similar characteristics. That’s no small feat when you consider that Artsy already catalogs over 125,000 pieces of art from over 25,000 artists – and it’s still only scratching the surface of what’s available.

Artsy is what might be characterized as a next generation discovery tool. Certainly, there’s value in aggregating artwork from galleries all around the country. But the big breakthrough here is being able to point to a single piece of art and say “I’d like to be able to see art in a similar style by different artists.” That’s a powerful step forward on discovery that benefits both the art buyer and the artist. If you’ve got a market where there are huge numbers of manufacturers, little standardization and prodigious output (music, art and wine are great examples), there’s a next generation discovery opportunity waiting for you.

But what about Amazon and Netflix, you may ask? These companies too have done a lot to improve discovery, but in a very different way. They don’t look so much at the product itself as at who bought the product and in what combinations. This is a powerful and effective approach. What Pandora and Artsy did, by contrast, was build taxonomies based on inherent product characteristics and then commit to manually classifying products against them. That is a significant exercise in metadata creation, but one that yields powerful, proprietary results.
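To make the taxonomy approach concrete, here is a minimal sketch of discovery based on product characteristics rather than purchase history. It is not Artsy’s or Pandora’s actual method, which I have not seen. The catalog, the tags, and the function names are all hypothetical, and the Jaccard overlap score is simply one common way to compare two sets of tags.

```python
# Hypothetical sketch: rank artworks by how many taxonomy tags they share.
# The catalog, tags, and scoring choice are illustrative assumptions only.

def jaccard(a, b):
    """Share of tags two works have in common (0.0 to 1.0)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similar_works(target_id, catalog, top_n=3):
    """Return ids of works in a similar style by *different* artists."""
    target = catalog[target_id]
    scored = [
        (jaccard(target["tags"], work["tags"]), work_id)
        for work_id, work in catalog.items()
        if work_id != target_id and work["artist"] != target["artist"]
    ]
    # Highest overlap first; ties are broken by id so results are stable.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [work_id for score, work_id in scored[:top_n] if score > 0]


# Made-up catalog entries, manually tagged against a made-up taxonomy.
CATALOG = {
    "w1": {"artist": "A", "tags": {"abstract", "oil", "bold color"}},
    "w2": {"artist": "B", "tags": {"abstract", "oil", "geometric"}},
    "w3": {"artist": "C", "tags": {"portrait", "photography"}},
    "w4": {"artist": "A", "tags": {"abstract", "oil", "bold color"}},
    "w5": {"artist": "D", "tags": {"abstract", "bold color", "acrylic"}},
}

print(similar_works("w1", CATALOG))
```

Notice that nothing here depends on who bought what. A brand new piece with no sales history can be recommended the moment it is tagged, which is exactly why the manual classification work is so valuable.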

The other interesting aspect of Artsy is one that I initially viewed as an overreach. Artsy includes not only artwork for sale, but artwork in museums and private collections. Certainly this makes Artsy more attractive to art lovers, art students and others, but it seemed to muddy the art-buying experience somewhat. This is largely explained by the fact that Artsy has a mission-driven aspect to it, but there may be a huge business opportunity here as well.

If Artsy is doing all the work of classifying and posting images of artwork held in museums and private collections, why not go one step further and become an international registry of artwork? Much of the value of art is tied to its provenance – its history of ownership. Artsy could become a central registry, collecting a small fee to re-register a piece of art every time it changed hands. It could then sell subscription-based access to this database to auction houses and galleries. Lots of details to be worked out to be sure, but this notion is only a small jump from Artsy’s current ambitions.

That’s what makes the data business so fascinating and lucrative: there are infinite opportunities to make money. All it takes is some creativity, and in this case, re-framing an existing business model.


Data Is Not a Zero Sum Game

Back in ancient times, when print directories walked the earth, one of the most surprising things I learned was that people were willing to pay meaningful amounts of money for information that wasn’t very good. And this wasn’t reluctant willingness; these buyers were just short of cheerful. How many businesses exist where your customer tells you your product stinks, and in the same breath excuses you because what you are doing “is such hard work?” One more reason to love the data business! But, you may be thinking, those days disappeared with print directories. I’m not so sure about that. What I am seeing is a fascinating bifurcation of the market. On the one hand, you have laser-focused data products with pristine datasets that command enormous prices. On the other hand, you have massive databases, often consisting of harvested data, that share a lot of characteristics with the old print directories. The primary difference is one of scale.

Think about the number of new data products with one million, ten million or even 100 million records in them. At such a scale, they are almost certainly relying heavily if not exclusively on technology. And that means records will be misclassified, missing key fields, or a confusing jumble because the source content couldn’t be normalized properly. And let’s not forget that companies harvesting website data inherently know nothing about the estimated 30% of all companies with placeholder websites or no websites at all. Yet what you hear from paid subscribers to these databases is that familiar refrain, “it’s not great, but it’s good enough for what I’m doing.”

At the same time, we are seeing a number of much smaller, deeper and more precise data products entering the market as well. And these products tend to offer analysis and workflow capabilities, and often feed high stakes business decisions and high ticket selling.

Are we poised for a shake-out? Will there be winners and losers? I think there will be room for everybody. Having the most data doesn’t make you an automatic winner. Having the deepest data doesn’t knock out all your competitors. It all comes down to your intended market, and how you bring your data to market.

There still seems to be a large and active value segment of the market, those who will be happy with “good enough” data in exchange for a reasonable price. At the same time, there are customers who will pay remarkably high prices for data they can depend on, because it’s driving some critical business activity. And to the extent you differentiate your data through your user interface and data manipulation tools, you can often define still another market that wants to powerfully interact with your data.

My take-away is that the data business is increasingly not about winners and losers. Multiple companies with largely similar data can exist and succeed by having differing price points, levels of coverage and degrees of accuracy. The front-end you provide to your data can be customized to appeal to specific market segments as well.

It’s hard to definitively assess your competitors in the infinitely malleable world of data, but at the same time it’s increasingly clear that this is not even close to being a winner-takes-all business. This does not imply that you can be sloppy about your business; indeed it makes it all the more important you deeply understand your customers, how they are using your data, and where you fit into the market.


Monetizing Your Unfair Advantage

In the news today was the announcement that BusinessWire, a press release distribution company owned by Warren Buffett’s Berkshire Hathaway, had decided to stop offering direct access to its press releases to high frequency traders. This follows on the heels of a decision by Thomson Reuters not to sell advance access to market-moving economic data that it publishes. I find myself concerned about these decisions. That’s in part because what these two companies were doing was actually quite different. And as you dive into the details, you start to see issues that a broader range of data publishers may ultimately have to confront.

The Thomson Reuters situation involves two indexes: Consumer Confidence and the ISM Manufacturing Index. These are both major indexes that can and do influence the stock market broadly. In both cases, Thomson Reuters had licensed the rights to publish them. Nobody disputes that Thomson Reuters has the right to monetize these indexes. But one particular aspect of this monetization raised concerns. Thomson Reuters openly offered to sell access to these indexes either a few seconds or a few minutes before they were released to the public. That’s more than enough time for computerized trading systems to analyze the news and place buy or sell orders accordingly. And by the way, it’s all legal, and Thomson Reuters wasn’t hiding any of these arrangements. But is it fair?

The BusinessWire case is even more innocuous. BusinessWire is in the business of pushing out press releases far and wide. To that end it offers direct electronic access to anyone who might benefit from it. Some smart traders figured out how to take that innocent feed, process it, and make buy and sell decisions on it very quickly. BusinessWire was just going about its business. Third parties figured out how to profit from its activities, with no help or encouragement from BusinessWire. And while press releases don’t sound that interesting, keep in mind that they are how many public companies first announce big events such as acquisitions.

I’m not a lawyer, so there may be nuances to this I am missing, but I understand that public policy recognizes the value of a level playing field when it comes to the stock markets, in part to build confidence. And as an individual investor, I don’t feel right about savvy stock traders getting advance peeks. But as an information professional, my view is why not? The entire B2B information industry largely exists to provide unfair advantage. In fact, I know data publishers who have seriously considered variants of “Your Unfair Advantage” as corporate tag lines.

Given the murkiness of the legal issues, I think it’s fair to conclude that both companies stopped these activities primarily for reputational reasons. And that’s important to think about. These two events are very different, but you’d never know that from a quick scan of the headlines they generate. Our products are complex, sophisticated and nuanced. Typically, they are used by a range of users in a range of ways. You can’t – and shouldn’t – police what users do with your data. But you should put some thought into how you position your data and its uses, especially if there is potential to use your data for stock trading. It’s too easy to get painted as the bad guy even if you’ve done nothing wrong.

The bottom line is that as data becomes more powerful and important, we’re all going to receive more scrutiny. And the complexity of our products works against us in the media. That’s why sensitivity to how we present our data products is going to become increasingly important. And if yours is one of the companies considering a tag line that includes the words “unfair advantage,” may I politely suggest a re-think?


The Billion Prices Project

Last week, I discussed how the Internet of Things creates all sorts of potential opportunities to create highly valuable, highly granular data. The Billion Prices Project, which is based at MIT, provides another route to the same result. Summarized very simply, two MIT professors, Alberto Cavallo and Roberto Rigobon, collect data from hundreds of online retailers all over the world to build a massive database of product-level pricing data, updated daily. It’s an analytical goldmine that can be applied to solve a broad range of problems.

One obvious example is the measurement of inflation. Currently, the U.S. Government develops its Consumer Price Index inflation data the old-fashioned way: mail, phone and field surveys. And inherently, this process is slow. Contrast that with the Billion Prices Project, which can measure inflation on a daily basis, and do so for a large number of countries.
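For a sense of how product-level prices can turn into a daily inflation reading, here is a minimal sketch. It is not the Billion Prices Project’s actual methodology, which I am not describing here. It assumes an unweighted geometric mean of price changes (often called a Jevons index) over products seen on both days, and the products, prices, and function name are made up for illustration.

```python
import math

# Hypothetical sketch: a simple daily price index from scraped prices.
# Assumes an unweighted geometric mean of price relatives (Jevons index);
# this is an illustrative choice, not the project's published method.

def jevons_index(base_prices, current_prices):
    """Index of current prices against base prices, where base = 100."""
    # Only products observed on both days can be compared.
    common = [p for p in base_prices if p in current_prices]
    if not common:
        raise ValueError("no products in common between the two days")
    log_sum = sum(
        math.log(current_prices[p] / base_prices[p]) for p in common
    )
    return 100.0 * math.exp(log_sum / len(common))


# Made-up scraped prices for two days.
day_0 = {"milk": 2.00, "bread": 4.00, "eggs": 3.00}
day_1 = {"milk": 2.20, "bread": 4.00, "coffee": 9.00}

print(round(jevons_index(day_0, day_1), 2))
```

Products that appear on only one day, such as the eggs and coffee above, are simply left out of the comparison. That is a small-scale version of working with an incomplete dataset and still getting a useful answer.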

But measuring inflation is just the beginning. The Billion Prices Project is exploring a range of intriguing questions, such as the premiums that are charged for organic foods and the impact of exchange rates on pricing. You’re really only limited by your specific business information needs – and your imagination.

The Billion Prices Project also offers some useful insights for data publishers. First, the underlying data is scraped from websites. The Billion Prices Project didn’t ask for it or pay for it. That means you can build huge datasets quickly and economically. Second, the dataset is significantly incomplete. For example, it entirely ignores the huge service sector of the economy. But it’s better than the existing dataset in many ways, and that’s what really matters.

When considering building a database, new web extraction technology gives you the ability to build massive, useful and high quality datasets quickly and economically. And as we have seen time after time, the old aphorism, “don’t let the perfect be the enemy of the good” still holds true. If you can do better than what’s currently available, you generally have an opportunity. Don’t focus on what you can’t get. Instead, focus on whether what you can get meaningfully advances the ball.


It's an Internet Thing

The Internet of Things (IoT) is, as buzzwords go, pretty easy to understand: it describes the concept of connecting things (other than computers) to the Internet. You may have heard one popular example of IoT in the not-too-distant future, when your Internet-connected refrigerator determines your orange juice is running low, and automatically places an online order to have more delivered to your doorstep. We’re a bit away from this scenario, but inching closer every day. The automobile companies in particular have been actively exploring ways for your car to alert you via email or text when it needs service or other attention. This is a clear, obvious and powerful example that you’ll soon see in dealer showrooms.

But is IoT strictly a consumer phenomenon? I think not. There are potentially huge opportunities to bring the concept of IoT to the world of business. And I think the data that can be collected by these devices will in many cases be organized and sold by data publishers.

As I have said repeatedly, data publishers are natural organizers of data for vertical markets because they’re neutral, trusted players in their markets and, importantly, they’re already doing it. Moving from tracking a company and its people to tracking the location of a company’s equipment really isn’t that big a stretch. Consider Lloyd’s, which tracks the exact position of all cargo ships at sea, and Drilling Information, which tracks the location of drilling rigs. There’s a lot of value in knowing where things are if someone needs fast access to them. Extend that thinking a little bit, and you can start to see the opportunities. And some data is even more valuable when it is centralized and organized. That’s the traditional role and strength of data publishers.

Another tantalizing example can be found in the 2010 Model of Excellence company Spiceworks. This company offers software that helps companies manage their computer networks – and everything connected to them. Spiceworks not only knows the make and model of every printer owned by hundreds of thousands of companies, it even knows when they’re running low on toner, and all in real-time. Think of how many different ways you could monetize data like this! And as just one more example, there’s a big push in agriculture right now to use sensors that monitor moisture and other conditions in farmers’ fields. We’ve moved rapidly from first collecting information about farms, to collecting information about the crops produced at these farms, to collecting information about the soil that produces the crops at these farms. It’s about as granular as you can get, and best of all, it’s collected by devices and sensors meaning low cost and high accuracy.

Of course, not every shipping company or farmer will want to have the intimate details of their businesses tracked and reported to others. But here again, a central repository can return valuable data to those who contribute, including performance benchmarks or other useful trend data. Indeed, that’s the big goal driving the push for electronic health records – the ability to tap into large pools of data to find patterns that will make healthcare providers smarter and more productive.

The Internet of Things really is as big as our imaginations and it’s happening now. And like so many things on the Internet, the biggest opportunities go to those who move fast and early. That too is an Internet thing.
