Data Disconnect: Don't Let This Happen to You


I'm just back from the SIIA Information Industry summit, and it was refreshing to see so much enthusiasm in this industry again. Online advertising is back with a vengeance, and many in the room were predicting 20% to 80% growth in online ad revenues this year. Among the subscription-based publishers in the room, there seemed to be growing comfort with their products and with how to market them effectively to ever more demanding customers. This wasn't just my impression: more than a few speakers and attendees noted that "it feels like 1998 again."

The only disturbing note was an apparent appetite to address the growing cost and complexity of data compilation by simply skipping the hard stuff. By this I mean that there is a lot of activity around the idea of essentially automating the editorial function. Halsey Minor created some buzz when, during his luncheon talk, he mused about the possibility of building services such as CNET without editors. We heard from companies whose entire businesses were based on re-packaging data gathered on the Web. In several conversations with publishers, it seemed that all of them were seeking opportunities for products that could be built largely, if not entirely, from Web-based data gathered on an automated basis.

This thinking stands in stark contrast to one of the main themes I heard hammered home by speaker after speaker: success depends on adding value to your content and building content products that are not only useful, but unavailable elsewhere.

Certainly, software is more powerful than ever, and there are examples of products built largely on an automated basis that offer real value. But when it comes to building database and directory products, I believe a lesson I learned early on still holds: if the data you need for your product is easy to collect, your new product is probably a lot less valuable than you think. Re-formatting readily available data or adding a few additional data elements rarely yields "must have" data products, particularly in today's demanding environment. Just as important to remember: if you can get your hands on the raw data easily, so can your competitors. And the software you developed to create your automated product? Every single day, application development tools are becoming cheaper and more powerful, meaning that your "proprietary software" offers little competitive protection either.

While you can light up a data publisher's eyes with the thought of eliminating phone calls, faxes and mail, and possibly even eliminating human editors altogether, what we're really seeing is a re-emergence of the perpetual motion machine fallacy of the late 1980s, when a number of half-baked schemes were launched in which the database was supposed to somehow maintain itself, the product would be shipped automatically, and the publisher's primary responsibility became checking his daily bank balance from the beach. If only!

I am very excited by the potential of data mining tools and user self-updating, and all the wonderful things that can be done by applying software to the wealth of data available on the Web. But I'm concerned by our blind rush towards the world envisioned by computer industry visionary Bill Joy, in which "the future does not need us." Let's not be too eager to disconnect data quality from human effort just yet. Instead, let's recognize that the human editorial function, which by the way allows us to address the sizable base of businesses that still have no Web presence, is fundamental to creating the value-added products we need in order to succeed and thrive in the years ahead.
