Search is Easy; Data is Hard

The New York Times magazine has just published a fascinating article about Google, discussing whether Google has become an aggressive monopolist in the area of search, and if so, whether or not it needs to be broken up under anti-trust law. The article, which is well worth a read, cites case after case where Google ostensibly derailed other companies that had seemingly developed better search tools than Google.

Better search tools than Google? Is that even possible? That’s where I take some slight exception to the article. Possibly in order to make this topic more accessible to a mass audience, it labels all these competitive search providers as “vertical search” companies.

Those of us with some history in the business remember back to around 2006 when “vertical search” was a thing, a thing that has long since faded. At the time, the concept of vertical search was a full-text search engine, much like Google, but one that was focused on a single vertical market or specific topic area. The thinking was that if publishers curated the content that was being indexed, the search results would be stronger, more accurate, more contextual and more precise. A prime example at the time was the word “pumps.” As a search term, it’s a tough one – the user could be looking for a device that moves fluids … or shoes. A vertical search engine, which would be oriented towards either equipment or fashion, would reliably return more relevant responses. Vertical search failed as a business not because it was a bad idea, but because (and let’s be honest here) most of the publishers rushing to get into it were too lazy to do the up-front curation work. And without quality, up-front curation, vertical search quickly becomes just plain bad search.

Vertical search as used in the article really refers to vertical databases. The difference is important because the article also states that parametric search is hard. That statement is simply more proof that vertical search as used in this article means databases. Parametric search is not hard: collecting and normalizing data so it can be searched parametrically is hard. Put another way, searching a database is easy, providing there is a database to search.
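To make the distinction concrete, here is a minimal Python sketch of why the search step is trivial once the data work is done. Everything in it is hypothetical and for illustration only: the pump listings, the field names, the helper functions `normalize` and `parametric_search`, and the assumption that every price is in US dollars. The point is simply that nearly all of the effort goes into normalization, while the parametric query itself is a one-line filter.

```python
import re

# Hypothetical raw listings, as they might arrive from different suppliers.
# Note the inconsistent units and price formats.
RAW_LISTINGS = [
    {"name": "Centrifugal pump A", "flow": "120 GPM", "price": "$1,250.00"},
    {"name": "Centrifugal pump B", "flow": "300 l/min", "price": "USD 980"},
    {"name": "Diaphragm pump C", "flow": "45 gpm", "price": "2,100"},
]

LITERS_PER_US_GALLON = 3.785411784


def normalize(listing):
    """The hard part: turn free-form strings into comparable numbers."""
    match = re.match(r"\s*([\d.,]+)\s*(gpm|l/min)\s*$", listing["flow"], re.IGNORECASE)
    if match is None:
        raise ValueError(f"Unrecognized flow value: {listing['flow']!r}")
    flow = float(match.group(1).replace(",", ""))
    if match.group(2).lower() == "gpm":
        flow *= LITERS_PER_US_GALLON  # convert gallons/minute to liters/minute
    # Assumption: all prices are US dollars, so only digits and the decimal point matter.
    price = float(re.sub(r"[^\d.]", "", listing["price"]))
    return {"name": listing["name"], "flow_lpm": flow, "price_usd": price}


def parametric_search(records, min_flow_lpm, max_price_usd):
    """The easy part: filter normalized records by their parameters."""
    return [
        r for r in records
        if r["flow_lpm"] >= min_flow_lpm and r["price_usd"] <= max_price_usd
    ]


records = [normalize(listing) for listing in RAW_LISTINGS]
for hit in parametric_search(records, min_flow_lpm=250, max_price_usd=1500):
    print(hit["name"])
```

A real data product would face far messier inputs than these three rows, which is exactly the argument: the filter never gets harder, but the normalization does.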

Google never wanted to do the work of building databases. It sometimes bought them (example: a $700 million acquisition of airline data company ITA) or “borrowed” them (pulled results from third-party databases into its own search results, effectively depriving the database owner of much of its traffic – think Yelp). What Google did instead was devote unfathomable resources to develop software code to try to make unstructured data as searchable as structured data. While it made some impressive strides in this area, overall Google failed.

With this context, you can clearly see why data products are so important and valuable. Data collection is hard. Data normalization is hard. But there’s still no substitute for either, something Google has learned the hard way. It may be disheartening to see survey after survey where we learn that users turn to Google first for information. But this is the result of habituation, not superior results. For those who need to search precisely, and for those who really depend on the information they get back from a search, data products win almost every time … provided that users can find them. Read this article and judge for yourself just how evil Google may be…

Is It App Time?

In a move that further signals the remarkable growth and increasing importance of mobile devices, Google has announced changes to its search algorithms to prioritize apps.

Starting April 21, for all searches on mobile devices, Google will present search results that identify and prioritize mobile-friendly sites. That means mobile-friendly sites should rank higher in the results of searches conducted from mobile devices. Further, if you wish, Google will also begin to index content in mobile apps that may not appear on your website. And to close the loop, Google will allow app developers to use mobile search results to guide users to either the website or the app (provided the app is installed on the mobile device, something Google will check), wherever you, the content owner, think they’ll have the best experience.

These are cool if not world-changing new features from Google, but they indicate clearly the rapid evolution of the mobile ecosystem, one where the quality of the displayed information is becoming nearly as important as the information itself. This means that mobile-friendly websites are important, but the future lies with apps that will become an increasingly seamless part of the search experience. Think about it. Google will now check (on Android devices) what apps you have installed, index the content of those apps, present this content in regular Google search results, and allow you to seamlessly view that content in the installed app.

If your website isn’t already mobile-friendly (and it should be, just as a best practice), Google’s giving you a big incentive to make it so by pushing mobile-friendly sites up in mobile search results.

And if you’re wondering about the value of apps for your products, consider how quickly they are moving from handy appendages to the mobile experience to becoming central to that experience. If your data makes sense on a mobile device (and it doesn’t always), it’s probably time to stop thinking and start coding!


Google: Free Here; Paid There

Google may be a lot of things, but it's certainly not boring. Just this week in fact, it did several interesting revenue model back-flips, changing one product to free and making another one paid.

Let's start with Zagat. Zagat sells its content, in print and online. Not a revenue model Google knows anything about, but that didn't stop Google from snatching up Zagat for around $150 million in 2011. I predicted at the time that Google would make Zagat content free and dump it into Google Places (home of its user-generated business reviews). What happened? Google announced this week that it will make Zagat content free and dump it into Google Places. Google Places, in turn, will be dumped into Google Plus, as part of an initiative to shore up Google's faltering response to Facebook.

Google hasn't thrown away all of Zagat's revenue, at least not yet. You'll still be able to buy the print Zagat guides. Google will still charge for the Zagat iPad app. And my suspicion is that Zagat's real source of profit, gift copies of the guides imprinted with corporate logos, will continue. Make sense? If so, click here.

The biggest question for me is what happens when you mix Zagat's edited, witty, curated reviews with a much larger grab-bag of user generated reviews? Will Zagat reviews shine, or get lost in the sauce? Will people continue to submit reviews to Zagat when they can get immediate gratification (and reach the same audience) with a user-generated review? Sure, the Zagat brand is strong, but Google is sailing into uncharted waters, and I am not sensing a strong hand on the tiller.

This very same week, Google decided to rebrand its Google Product Search service as Google Shopping. And with the new name, Google decided a revenue model might be cool too. So the new Google Shopping service will be paid inclusion. Yes, Google Shopping is now a buying guide.

Charging for inclusion in the product directory (Google daintily calls this "a commercial relationship with merchants") is apparently the first time a Google-created service has gone from free to paid. Also, as you read Google's rationale for this shift, you realize that it has spent a lot of time and money to learn some basic truths about data publishing, for example:

  • Even companies that do make the effort to submit product information in structured format are lousy about keeping their information current
  • A smaller database of highly accurate data is more attractive to most users than a larger database of moderately accurate data
  • Structured data permits far more powerful and precise searching of product information

So while I have historically been at a loss to figure out what Google is doing, it's getting easier these days as Google moves ever-closer to doing everything, all at once. Just don't try this strategy at home!



Search Engines: From Indexers to Distributors?

A New York Times article this week, entitled "From Search, to Fetch," describes moves by both Google and Bing to get you to an answer faster. Called the "Knowledge Graph" by Google and "Snapshot" by Bing, you'll find that searches for certain types of information will now bring you a highly summarized presentation of key facts without needing to click on any of the links shown in the search results.

As the article concludes:

Both Microsoft and Google stress that these developments are but the first timid steps into a beautiful future - a future where search pages know what you mean, display exactly the information you want with one click, and even perform tasks for you. These companies are no longer happy serving only as the card catalog for the Web; now they even want to bring you the book.

More interesting to me, however, is that only in a small percentage of cases will Google (courtesy of Google Books) truly bring you the book. In the majority of cases, what Google will bring you is data. And where do these data come from? Third-party databases.

This is just one more example of search engines tacitly acknowledging the value of structured and semi-structured content. Just as importantly, Google is also acknowledging that some content sources are more dependable and trustworthy than others. Yes, Google is now featuring content that hasn't been selected by algorithms, but rather by humans basing their decisions in large part on the brand reputation of the content provider. Bing is presumably operating the same way.

Google so far is limiting itself to free third-party data sources such as Freebase, the CIA World Factbook and Wikipedia, among others. The data sources used by Bing aren't disclosed, but Snapshot reportedly is a bit more commercially oriented, providing summarized data on hotels, restaurants, bands, events, etc. I think it is quite likely Bing is already licensing some of this content from third parties.

The potentially great outcome is that with the arms race mentality of Bing and Google, one or both may start licensing more content in an attempt to offer the most compelling search experience. That's good for those publishers willing to be paid a large fee to make some or all of their content broadly available for free (and what a great ride that was for many publishers during the dot-com boom). The losers in this scenario are those data products with commoditized content. For those publishers with expensive, specialized and proprietary content, it's a mixed scenario. Some may experience neither benefit nor harm. Others may find that offering a taste of their data for free can yield tremendous levels of exposure that can drive new sales.

The way I see it, the search engines continue to evolve from information indexers to information distributors. And this could be a very fine evolution indeed.