Good Databases Are More Than Just Good Data

We can look to the UK for a case study of how a government agency, after several tries, couldn’t build a user-friendly data product, creating a giant opportunity for a for-profit data company.

The story begins with a regulatory agency called the Financial Conduct Authority (FCA) that, among other duties, registers and regulates financial advisers and advisory firms. The FCA has a searchable database on its website, but like so many government websites, it is optimized for one purpose: checking the registration status of a known individual or firm. As a tool to assist you in identifying an adviser to help you with your investments, it’s pretty useless.

In recognition of this shortcoming, the FCA called on a quasi-governmental organization called the Money Advice Service (MAS) to help build a better adviser database, and MAS accepted the challenge. I took a look at this website when it first launched, and though I saw some design issues, it had potential.

But even though MAS nominally had the freedom to build a creative database with almost any business model behind it, its need to avoid controversy ultimately resulted in a very limited and timid product. And when, unsurprisingly, there wasn’t a lot of revenue to be had with such a product, MAS buried the database three levels down on its website and moved on to greener pastures.

With two free databases of financial advisers out there, you would think there wouldn’t be much opportunity left for anyone. However, a company called Unbiased saw things differently, and said there was indeed an opportunity … for the right product.

Unbiased has been a big hit in the marketplace, and the way it differentiated itself from the free government services with the same basic listing data holds lessons for us all:

  • Greater visibility – Unbiased wants to be found because its business model depends on driving lots of traffic to its participating advisers
  • Deeper data – ratings, discount offers and detailed profiles
  • Strong user interface – clean, inviting design and both parametric search and a custom matching service         

If you have ever wondered how you could compete against a free, government online database, Unbiased provides the answer: data presentation can be as valuable as the underlying data itself, particularly if you are serving a consumer market. And aggressive promotion of your online database will let you run circles around government agency databases, which are generally hard to find in addition to being hard to use.

Inferring Intent

Today’s Gartner blogpost points to some interesting limitations and opportunities surrounding intent data. Let’s start at the beginning by defining what it is.

Simply put, intent data is an indication that an individual or organization is actively interested in purchasing a specific product or service. You may already be familiar with sales triggers. One classic sales trigger is so-called “new move” data. It’s valuable to know when a company moves offices because it is highly likely that the company will make lots of new purchases such as office furniture and the like. Think of intent data as a more sophisticated cousin of the sales trigger.

Media companies are in a great position to generate sales intent data, because much intent data is generated by watching what a person reads and does online. If a reader looks at five articles on 3-D printers in a short period of time, those actions can be viewed as indicating an intention to purchase a 3-D printer. Intent data can get a lot more sophisticated than that, but this gives you the general idea.
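To make the 3-D printer example concrete, here is a minimal sketch of how a publisher might flag that kind of reading pattern. Everything in it is an assumption for illustration: the `find_intent_signals` function, the shape of the view records, and the thresholds (five views inside seven days) are hypothetical, and real intent scoring would weigh many more signals.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def find_intent_signals(views, min_views=5, window=timedelta(days=7)):
    """Return the (reader, topic) pairs with at least min_views article
    views falling inside any single time window.

    views is an iterable of (reader_id, topic, viewed_at) tuples.
    The thresholds are illustrative assumptions, not industry standards.
    """
    times_by_key = defaultdict(list)
    for reader, topic, viewed_at in views:
        times_by_key[(reader, topic)].append(viewed_at)

    signals = set()
    for key, times in times_by_key.items():
        times.sort()
        # Compare each view with the one (min_views - 1) positions earlier;
        # if they are close enough together, the whole run fits the window.
        for i in range(min_views - 1, len(times)):
            if times[i] - times[i - min_views + 1] <= window:
                signals.add(key)
                break
    return signals


# Hypothetical data: reader-1 reads five articles in five days,
# reader-2 reads five articles spread over several months.
start = datetime(2024, 1, 1)
views = [("reader-1", "3-D printers", start + timedelta(days=d)) for d in range(5)]
views += [("reader-2", "3-D printers", start + timedelta(days=30 * d)) for d in range(5)]

print(find_intent_signals(views))
```

Only the first reader is flagged. The second has read just as many articles, but not in a short period of time, which is the distinction that separates an intent signal from ordinary browsing.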

You might think that if a sales organization has intent data available to it, that’s probably all the data it needs. After all, intent data is like mind-reading: it’s identifying people who are likely to be purchasing a product before they purchase it. What could be better?

Well, as the Gartner blogpost points out, many companies are filtering sales leads based on intent data with something called “fit analysis.” This is an automated attempt to evaluate if the company is a likely buyer. If your company typically sells to larger, multi-office organizations, a fit analysis will filter out smaller, single location companies because they represent lower grade prospects.
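A fit analysis of this kind can be as simple as a rule applied to each lead. The sketch below assumes a seller that targets larger, multi-office organizations; the `Lead` fields, the `fits_profile` function, the cutoffs and the company names are all made up for illustration and do not reflect any particular vendor's method.

```python
from dataclasses import dataclass


@dataclass
class Lead:
    company: str
    employees: int
    office_count: int
    has_intent_signal: bool


def fits_profile(lead, min_employees=500, min_offices=2):
    """Return True when the lead looks like the seller's typical buyer.

    The cutoffs are hypothetical; a real fit model would be tuned to
    the seller's own customer base.
    """
    return lead.employees >= min_employees and lead.office_count >= min_offices


# Hypothetical leads, all of which already show purchase intent.
leads = [
    Lead("Acme Holdings", employees=2400, office_count=12, has_intent_signal=True),
    Lead("Corner Print Shop", employees=8, office_count=1, has_intent_signal=True),
]

# Intent alone is not enough: keep only the leads that also fit the profile.
qualified = [
    lead.company for lead in leads if lead.has_intent_signal and fits_profile(lead)
]
print(qualified)
```

Both companies show intent, but only the larger, multi-office one survives the filter.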

Further, the Gartner blogpost notes that companies selling highly specialized products or brand-new technologies often can’t get enough intent-based sales leads or they get leads that are weak because the intent indicators aren’t sufficiently granular. Finally, some sales departments don’t like intent-based sales leads because they identify prospects too early in the sales process. As you can see, sales leads based on intention are still fairly rudimentary, and there is lots of opportunity to refine them.

But what’s most worthy of note is that Gartner believes that most intent-based sales lead data is focused on the technology industry. There is no reason it should be. Technology sellers just happen to be free-spending early adopters. I have long preached the virtues of what I call “inferential data,” a term that includes both intent and sales trigger data. I firmly believe that many data publishers have opportunities in this area, and if they happen to be part of larger media companies, those opportunities are even greater. In fact, data publishers are natural providers of fit analytics as well. If you look at your data creatively and read between the lines, you can make some very lucrative connections.

Can You Over-Monetize?

To avoid accusations of commercial blasphemy, I am going to pose this as a question, not a statement: can you over-monetize a data product?

Consider the online real estate listings databases. There are lots of them, all engaged in a fierce battle to the death. They make their money selling listing upgrades to real estate agents, a hotly competitive and demanding group. The product they are selling is homes that can easily cost $1 million and more, with very sizable commission dollars at stake. In such a high-ticket and fiercely competitive market, would you want to junk up the user experience with irrelevant advertising, and annoy your real estate agent customers by distracting users from the listings they are paying to enhance? The answer appears to be yes.

Several of these sites have now been designed to display programmatic ads. With all it takes to attract a live buyer to your site, do you really want to risk that buyer clicking on an ad for a local car dealer and leaving your site entirely? Do you want to intersperse listings of homes with ads for mortgages when your primary source of revenue is real estate agents who badly want your site visitors to look at their listings?

You know the saying: real estate is all about “location, location, location.” Does it make sense then that when a potential buyer clicks on a map icon to see where a home is located, she is presented with a map cluttered with logos indicating the location of nearby State Farm insurance offices? Does anyone buy a house based on proximity to an insurance agent? Doubtless someone thought this was a clever marketing gambit, but it distracts, confuses and possibly annoys the potential buyer.

The photo slideshows that are the critical core of each home listing are now increasingly cluttered with advertising. If I were a real estate agent paying to upgrade a listing only to find it was chock full of ads, I’d be furious. I want prospects looking at pictures of the home I am selling, not distracted or annoyed by irrelevant advertising from third parties.

A lot of this comes down to the degradation of the user experience. But in some cases, it’s an even bigger issue: it’s a problem of the data publisher forgetting who they are serving and in some cases, why they are even in business. A little bit of incremental revenue can sometimes have a very high cost attached. And the guiding rule of all things online remains the same: just because you can, doesn’t mean you should.

Get it Right ... Or Else!

Why is data getting so much attention these days? Why is it such a good business? Why is it so profitable? Well, there are numerous reasons, but the one I’d like to highlight today is that increasingly, data matters.

What do I mean by that? Simply that data, to a degree you don’t see with other forms of content, gets relied on to make serious decisions, some of which have significant business, economic and personal impact. Some people (many of them rich data publishers) have understood this for a long time. For others, this insight is a new one. And one consequence of data’s growing importance is that it is increasingly the focus of lawmakers. Consider just a few examples:

In a true “only in Hollywood” moment, the state of California now has a law that says data providers cannot publish the ages of people in the entertainment industry. Yes, actors have long been skittish about putting their ages out there, but in the old days, they simply lied about their ages. Now, they have the force of law behind them. The ostensible purpose of this law is to help prevent age discrimination. However, the law also specifically includes everyone in the videogame industry, so go figure.

Across the pond, UK financial regulators have taken Morningstar, the mutual funds data company, to court. Its offense? A number of the funds to which it gave high ratings ended up under-performing relative to their benchmarks. Apparently your predictions are now required to always be accurate. Of course, if Morningstar could identify top-performing funds with 100% accuracy, my strong recommendation to Morningstar would be to get out of the data business and into the investing business, pronto.

We also have the example of health insurance company physician directories. Every health plan publishes a directory of participating physicians, and in many cases, these directories are woefully inaccurate. Examples abound of plan directories with physicians who have left the plan, moved offices (sometimes hundreds of miles away), retired and even died. This would be just another everyday annoyance except for the fact that many people select their health plans, and spend thousands of dollars, based on the network of physicians a health plan claims to offer. The federal government has stepped in on this one, and not to be outdone, California (surprise!) has its own legislation covering physician directories.

These examples are just the tip of the iceberg. Consider all the various laws around credit data, for example.

Back to my original point, all these rules and laws simply illustrate that data at its core is all about helping people to identify, select, assess and decide. And as databases proliferate, so does their influence and impact. There is power in data, which is why, increasingly, data producers are being held to higher standards of quality and accuracy. While painful for some, in the aggregate, good data is good for all of us.

Not All Datasets Are Good Datasets

As someone who has been a long-time proponent of data, I am intrigued to see the number of new start-ups that have revenue models based partially – sometimes entirely – on the sale of data, even though they are not data publishers in the conventional sense. Rather, they are seeking to monetize data they are collecting incidentally in the course of other activities.

A fashion website or app, for example, might realize that by tracking which new fashions its users view the most, it is collecting valuable intelligence that could be sold to fashion manufacturers. The early players in this area usually did have valuable and readily saleable data collections, and they had in fact identified an important new revenue stream.

But now “data” is transforming into a buzz-term, up there with “the cloud” and “social.” Purported data opportunities are being used to mask weak business models because everyone these days knows “it’s all about the data.” Just as start-ups these days feel compelled to be in the cloud and have a strong social component, so too do they now need a data opportunity.

Not every new business can create value from the incidental data it generates. Those that can are the exception, not the rule. Here are a few reasons why these data opportunities may not be as strong as the entrepreneurs behind them would like to believe:

1. You generate too little data. While everyone talks about quality data, there is still a quantity aspect as well. Even for things as valuable as sales leads, most companies will turn up their noses at them if you can’t deliver a certain volume of leads regularly and dependably. Depending on the data itch you’re trying to scratch, 100,000 or even a million users may not cut it.

2. You generate too much data. Having the most data about something can be as much a burden as an opportunity. Think Twitter. Everyone “knows” that the huge collective stream of consciousness that its users generate is enormously valuable, but extracting that value is very complex and expensive, and much of the final output still represents conjecture and surmise.

3. You don’t really know much about the data you’ve got. I’ve been in numerous meetings where the issue on the table was, “we’ve got tons of data, but we’re not sure how to monetize it.” This situation naturally calls for advanced TAPITS (There’s A Pony in There Somewhere) analysis to assess value. More times than not, the chosen solution is simply to sell the raw data and hope that the buyer can find value. Of course, when you sell data by the ton, you have to charge for it by the ton too. It’s just not that valuable if the buyer needs to do all the thinking and all the work.

4. A sample of none. Online businesses want lots of traffic and lots of users, the more the merrier. This is good for business generally, but not necessarily great from a data perspective. If your user base is too disparate, the aggregate insights from the data they generate may not be all that valuable. And if your user base is largely anonymous, good luck with that.

5. Buy me a drink first. Many times, an online company is in possession of extremely detailed and valuable data. Unfortunately, this typically means that these data can only be had by violating the trust if not the privacy of the user. It’s even more complicated if the company built its business with a strong privacy policy that prohibits it from ever selling all this valuable information.

6. Exclusive insights. These days, if you say you have “near-real-time insight into bus station storage locker utilization rates,” it will be automatically assumed that you’ve tapped a huge data opportunity. Every bus station certainly needs this information, bus lines probably have a use for it, there’s probably a government market, some hedge funds will want it and there might even be a consumer opportunity as well – think of an app that shows you available storage lockers nationwide! But in reality, not every market is a viable data market. The market might be too small, marginally profitable, too localized or too consolidated. It is absolutely possible to have data that nobody cares about or that too few people care about to create a meaningful revenue stream.

7. Competition. Your data may indeed be valuable, but chances are, you don’t have the full picture. This means your data is less valuable than that of a company that can supply the full picture. It also means the only market for your data may be the one company that knows more about the market than you do. Yes, there’s revenue to be had in this case, but you won’t get rich.

8. Raw data follies. Typically, companies trying to sell the data they collect incidentally want to sell the data, get the money, and get back to their core business activities. But if you don’t clean and organize your data, you’re leaving lots of money on the table. And if you decide to get serious about your data, you’re moving into a different business, one you probably don’t understand very well.

I could keep going, but hopefully you get the point: the chances that the incidental data you generate from some other business activity is valuable are pretty low. And even if you have valuable data, getting maximum value from it generally demands getting a lot more serious about your data, which starts to move you into a totally different business.