Few would argue that over the last five years the major search engines have made enormous strides in improving their coverage of the open Web. They now find new sites more quickly, re-index them more often and even provide searchable access to non-textual content. It's all very impressive and very much for the good. However, as we all well know, too much information can be as much a curse as a blessing. That's why so much effort has been invested in trying to improve the precision of search, an effort often referred to as "improving relevancy." Providing a list of Web pages that contain a keyword or phrase is no longer considered innovative or even particularly valuable. Value is now embodied in identifying the most relevant Web pages for searchers. Every major search engine has its own secret sauce of techniques, processes and algorithms to divine relevance, and they seem to be getting better every day. But this isn't where the battle for search primacy ends, not by a long shot.
The next phase in this competitive battle is to get more paid content under index. Opening shots have already been fired by both Google and Yahoo. LexisNexis used to run an ad campaign in the early days of the Internet touting that it held far more data than the entire Internet. As a competitive response to the Web, it missed the whole point, but it does underscore another one: even at this late date, some of the most important, powerful and useful content is still not to be found through search engines. This may be the simple explanation for why content aggregators continue to do well despite the long shadows cast by the big search engines: their content is valuable and not available elsewhere.
To its credit, the content aggregation industry realized years ago that this distinction would not provide protection forever, which is why it has upped the ante. Aggregators have moved beyond delivery of raw data, and even beyond the search precision issue, to focus on the biggest value-add of all: making data truly useful. OneSource built a nice business by taking on the hard work of integrating disparate databases to create highly comparable company profiles. Alacra has a nice niche providing customized data feeds to clients for use in their internal systems. Factiva continues to develop increasingly elaborate and powerful taxonomies that can even be extended to the internal data of its clients. LexisNexis builds virtual company profiles drawing on its vast data warehouses.
Interestingly, publishers are jumping on this bandwagon as well. Gale is now out with a product that assembles content from across its range of databases to present deep and comparable profiles. infoUSA is also jumping into the fray, having become a recent convert to the power of data mining to deepen its databases.
Where's this all heading? I think once the gee-whiz factor of all the new content assembly technology wears off, it will become evident that the marketplace has moved beyond giant, one-size-fits-all databases. No matter how big, deep and accessible a database is, the fact remains that engineers, purchasing agents and analysts need different data in different ways, and it's unlikely that anyone will cook up a single product that will keep them all equally happy. And just as there is growing user sophistication in terms of data elements and search interfaces, so too is there growing sophistication in terms of the overall dataset. Users are going to value 98% coverage of what really matters to them over 80% coverage of everything in the world. All this suggests to me that the future of search looks vertical. Business success will be a function of limited coverage, tailored to certain specific types of users, and executed very, very well.
Winners and losers in this scenario? Most data publishers already have a vertical orientation, and those that quickly figure out how to deliver data as well as they compile it will be very nicely positioned. Aggregators should have a solid, continuing role serving the distinct market that will continue to need convenient access to broad swaths of content.
It's the search engines that seem to be the ones not invited to this party. They are simply too wedded to serving up the most stuff to the most people. That will still be a great business for them, but it's a different business. And as the data content business gets comfortable with its distinctive place in the market, the industry will see greater stability and a much clearer path to profits.