It’s not news that fraud is rampant in online advertising. One of the biggest reasons is that the buyers and sellers of online advertising, in large part, do not deal directly. They transact through third-party brokers and marketplaces. Increasingly, it’s now computers ordering through third-party brokers and marketplaces – the wonderful world we call programmatic. With no humans watching, much less policing the buying process, it is not surprising that crooks and thieves have rushed in.

One of the easiest types of fraud is simply to misrepresent yourself online. You can tell an online marketplace that you represent the CNN website, collect the revenue, then run the ads you sold on some other website, often one that gets lots of bot traffic and other fake clicks in order to show performance.

To fight this type of misrepresentation, the Interactive Advertising Bureau (IAB) created a new standard called ADS.TXT. It’s a small file in a standardized format that a website owner creates and places on the website, listing all of the website’s authorized sellers. If you’re familiar with ROBOTS.TXT, the idea is closely analogous.
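For readers who haven’t seen one, here is a rough sketch of what reading an ADS.TXT file involves. Each record line in the published format is comma-separated: the ad system’s domain, the seller’s account ID within that system, the relationship (DIRECT or RESELLER), and an optional certification authority ID. The domains, account IDs and function names below are made up for illustration, and this is a simplified reader, not a full implementation of the specification.

```python
from typing import List, NamedTuple, Optional


class SellerRecord(NamedTuple):
    ad_system_domain: str              # domain of the exchange or other ad system
    seller_account_id: str             # the publisher's account ID in that system
    relationship: str                  # "DIRECT" or "RESELLER"
    cert_authority_id: Optional[str]   # optional fourth field


def parse_ads_txt(text: str) -> List[SellerRecord]:
    """Return the seller records found in the text of an ads.txt file."""
    records = []
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        fields = [field.strip() for field in line.split(",")]
        if len(fields) < 3:
            continue  # skips variable lines such as "contact=..." and malformed lines
        relationship = fields[2].upper()
        if relationship not in ("DIRECT", "RESELLER"):
            continue
        cert = fields[3] if len(fields) > 3 and fields[3] else None
        records.append(SellerRecord(fields[0].lower(), fields[1], relationship, cert))
    return records


def is_authorized(records: List[SellerRecord], ad_system_domain: str,
                  seller_account_id: str) -> bool:
    """True if the given seller account appears in the site's list."""
    return any(
        r.ad_system_domain == ad_system_domain.lower()
        and r.seller_account_id == seller_account_id
        for r in records
    )


# Hypothetical file contents, for illustration only.
SAMPLE = """\
# ads.txt for a hypothetical publisher
exampleexchange.com, 12345, DIRECT, abc123
examplereseller.com, 67890, RESELLER
contact=adops@example.com
"""

records = parse_ads_txt(SAMPLE)
print(is_authorized(records, "exampleexchange.com", "12345"))
```

A buyer’s system would fetch the file from the website it thinks it is buying, then run a check like `is_authorized` before bidding.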

The idea is that programmatic advertising buyers can easily and confidently check a website’s list of authorized resellers. It’s a full, workable solution to a significant problem, but it comes with one big catch: the ADS.TXT file is necessarily open to everyone who wants to view it. And a lot of publishers and other website owners aren’t thrilled about exposing what they consider proprietary information.

The solution? In my view, it’s a central database, operated by an independent third party. The same information can be placed in the database, but access can be easily restricted to those who “need to know” the information. I’ve always liked opportunities where an industry needs to share information but at the same time doesn’t want to make that information public. A neutral data provider is most times the perfect answer, as I think it is in this case.

Moreover, a central database can add additional value, because it can track what is happening. It can automatically nag website owners who don’t update their reseller lists regularly. It can check which advertising marketplaces are using the service. In these and many other ways, it can actively work to keep all players engaged and honest.
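To make the idea a little more concrete, here is a minimal sketch of what such a database might do: restrict lookups to approved buyers, record who looked up what, and flag website owners whose lists have gone stale. Everything in it is an assumption on my part, including the class name, the method names and the 90-day staleness threshold; it is not a description of any existing service.

```python
from datetime import datetime, timedelta


class SellerRegistry:
    """Hypothetical central registry of each site's authorized sellers."""

    def __init__(self, approved_buyers):
        self._approved_buyers = set(approved_buyers)
        self._sellers = {}      # site -> set of authorized seller names
        self._updated_at = {}   # site -> datetime of the last update
        self.access_log = []    # (buyer, site) for every lookup attempt

    def publish(self, site, sellers, now):
        """A website owner submits or refreshes its seller list."""
        self._sellers[site] = set(sellers)
        self._updated_at[site] = now

    def is_authorized(self, buyer, site, seller):
        """Answer a buyer's question without exposing the full list."""
        self.access_log.append((buyer, site))
        if buyer not in self._approved_buyers:
            raise PermissionError(f"{buyer} is not an approved buyer")
        return seller in self._sellers.get(site, set())

    def stale_sites(self, now, max_age=timedelta(days=90)):
        """Sites whose lists are older than max_age and due for a nag."""
        return sorted(
            site for site, updated in self._updated_at.items()
            if now - updated > max_age
        )


# Illustrative data only.
now = datetime(2018, 1, 1)
registry = SellerRegistry(approved_buyers=["buyer-a"])
registry.publish("news.example", ["exchange-1"], now - timedelta(days=200))
registry.publish("blog.example", ["exchange-2"], now - timedelta(days=5))
print(registry.stale_sites(now))
```

Note that the buyer never sees the whole list. It asks a yes-or-no question about one seller, which is the “need to know” restriction in practice.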

And of course, data being data, there’s an easy opportunity to aggregate this reseller data to look for sales trends and market share. This information can be given or sold back to the industry without any privacy concerns.

ADS.TXT is just one example of a good idea that could be a much better idea if there were a trusted data provider in the middle, protecting privacy while mediating and recording access to ensure compliance and data accuracy. I’d like to see ADS.TXT as what you might call ADS.DATA. You’d be wise to look for analogous opportunities in your own market.




Top Level Domains/Low-Level Trustmarks

If you’re not immediately familiar with the term top level domain (TLD), think of “.com” and “.net” and “.edu” – they are all top-level domains, along with hundreds of others, and by the way, they are not limited to three characters anymore.

In the early days of the Internet, domain names were free for the asking, and I stocked up on quite a few for no other reason than a gut feeling they had some value. I did ultimately sell a lot of them, including to several Fortune 500 companies that bought their corporate names back from me. By the time I realized there might be a bigger opportunity here, the rules of the game had changed, and big companies that had previously shown up with checkbooks now showed up with lawyers. Ah, well!

But for all my domain name hoarding, I couldn’t ever get domain names with the “.edu” TLD because they were reserved for schools. Similarly, “.net” was reserved for Internet Service Providers back then, and “.org” was reserved for non-profits. These distinctions were widely understood back then, and even today, I hear people telling me some organization “must” be a non-profit because it has a “.org” domain name. Old naming conventions die hard. More importantly, people are hungry for trustmarks.

But TLDs were never great trustmarks, for two reasons. First, validating an organization’s credentials before handing out a domain name is hard and expensive work. Second, domain names don’t sell for a lot, so you can only make money with volume. The pickier you are, the less money you make.

Despite this, the non-profit sector is now pushing the “.ngo” TLD. Think of it as a do-over of the “.org” TLD, because the operator of the domain is trying to limit sales to non-profit entities with the explicit hope that the TLD will become a trustmark over time. Similarly, the AICPA, the big association of certified public accountants, is in a fierce battle to control the forthcoming “.cpa” TLD, again with the hope it can restrict its use to certified public accountants and build it into a trustmark.

My view is that TLDs make for poor trustmarks. The economics make it hard to enforce standards, and there are too many sleazy operators in the business that drag down the credibility of TLDs across the board. The need for online trustmarks remains high. Who better than data companies to seize the opportunity?




Survey Says ... It Depends

The data for data products can come from a wide array of sources. Traditionally, datasets were compiled through primary research, usually via questionnaires or by phone. There is also secondary research, where staff gathers data using online sources. There are also public domain databases that can be leveraged. We have also seen a rise in technologically-driven data gathering, such as web harvesting. And a growing number of data publishers license third-party data to augment their data gathering. Almost anything goes these days, and the savviest data publishers are mixing and matching their collection techniques for maximum effectiveness (a topic that will be addressed at the Business Information and Media Summit in November).

This brings me to a question I have been asked more than a few times: can survey data be turned into a data product? When I talk about surveys, I mean the types of surveys most of us do routinely: you ask, say, 20,000 restaurant owners to answer questions about their businesses and the market generally, and if you’re lucky, you’ll get 1,000 responses. My take? While a survey does in fact generate data, I don’t think a survey automatically qualifies as a commercial data product. The reason is subtle, but important.

Much of the value of a data product is in its granularity and specificity. Typically, a data product focuses on organizations, individuals or products and attempts to collect as much detail as possible on each unit of coverage, as comprehensively as possible. Most surveys, by contrast, are anonymous by nature and hit-and-miss in coverage. Using our earlier example, a survey of restaurants might well be useful and valuable even if it didn’t get any responses from Taco Bell operators. A restaurant database without any listings of Taco Bell locations, on the other hand, would have no credibility. Since most surveys promise anonymity to increase survey participation rates, only aggregate reporting is possible. From my perspective, surveys of this type are useless as data products.

But not all surveys are the same. Some surveys ask respondents to list the vendors they use, or which of a specified set of companies they like the most and the least. Surveys where you ask the anonymous respondent to list or opine on specific companies or products actually can yield a very compelling type of commercial data product. That’s because the companies or products that come out of the survey effort are not anonymous. If the owner of the Blue Duck restaurant tells you that she likes National Restaurant Supply, you’re developing lots of valuable data about National Restaurant Supply that you can publish, even while keeping Blue Duck restaurant anonymous. Your survey data can report on attitudes or adoption or market share of specific products or firms and compare them and rank and rate them. That’s very valuable because the data are highly proprietary, difficult to collect and actionable.
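Here is a small sketch of how that works mechanically: the respondents stay anonymous, and what gets published is a scorecard keyed by the named vendor. The field names, the 1-to-5 rating scale and the second vendor are my own assumptions for illustration.

```python
from collections import defaultdict


def vendor_scorecard(responses):
    """Roll anonymous survey responses up into per-vendor data.

    Each response is a dict with a "vendor_ratings" mapping of
    vendor name -> rating. Respondent identity is never copied
    into the output.
    """
    ratings_by_vendor = defaultdict(list)
    for response in responses:
        for vendor, rating in response["vendor_ratings"].items():
            ratings_by_vendor[vendor].append(rating)

    total_respondents = len(responses)
    scorecard = []
    for vendor, ratings in ratings_by_vendor.items():
        scorecard.append({
            "vendor": vendor,
            "mentions": len(ratings),
            "share_of_respondents": len(ratings) / total_respondents,
            "avg_rating": sum(ratings) / len(ratings),
        })
    # Most-mentioned vendors first, ties broken alphabetically.
    scorecard.sort(key=lambda row: (-row["mentions"], row["vendor"]))
    return scorecard


# Illustrative responses; ratings and the second vendor are invented.
responses = [
    {"respondent": "Blue Duck",
     "vendor_ratings": {"National Restaurant Supply": 5}},
    {"respondent": "another restaurant",
     "vendor_ratings": {"National Restaurant Supply": 4,
                        "Acme Foodservice": 2}},
]

for row in vendor_scorecard(responses):
    print(row)
```

The output rows describe vendors only, which is why they can be ranked, compared and sold without breaking the promise of anonymity made to respondents.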

My bottom line on surveys is that “traditional format” surveys with anonymous submissions and aggregate reporting are truly surveys, not data products. But if your survey asks respondents to tell you how much they use or like specific companies or products – you’ve got yourself the makings of a data product!


Inexhaustible Data Opportunities

A new product from LexisNexis Risk Solutions monitors newly listed homes for sale on behalf of home insurance companies to alert them when a customer is preparing to move. The insurer can use this advance notice to contact these customers to help retain their business. 

This is a great idea. For a long time now, data companies have offered so-called “new mover” databases, identifying people who have recently moved into a new home. These are prime prospects because they’re in the market for all sorts of things, sometimes urgently, meaning the first offer they get stands a strong chance of being accepted.

This LexisNexis product shows how to combine databases to up your game. What could be a better prospect than a new mover? How about a pre-mover! While LexisNexis is focused on insurance companies, there are all sorts of companies that would be very interested to have at-risk current customers identified for them so that they can focus their customer retention efforts.

What makes this big leap in sales targeting possible isn’t cutting-edge technology in this case. It’s having the insight to see that data produced by one type of organization (in this case, real estate agents) is valuable to another type of organization (in this case, insurance companies). Add in some additional value by matching the database of one organization to the database of another, and you almost assuredly have a nice business opportunity for the taking.
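The matching step itself can be quite simple. Below is a minimal sketch, assuming each database is a list of records with an address field and that a crude text normalization is enough to line them up. The record layout and function names are hypothetical, and I have no knowledge of how LexisNexis actually does its matching; a production system would need far more careful address standardization.

```python
def normalize_address(address):
    """Crude normalization: lowercase, drop commas and periods, collapse spaces."""
    cleaned = address.lower().replace(",", " ").replace(".", " ")
    return " ".join(cleaned.split())


def find_pre_movers(customers, listings):
    """Return customers whose address appears among newly listed homes."""
    listed = {normalize_address(listing["address"]) for listing in listings}
    return [c for c in customers if normalize_address(c["address"]) in listed]


# Illustrative records only.
customers = [
    {"policy": "P-001", "address": "12 Oak St., Springfield"},
    {"policy": "P-002", "address": "98 Elm Ave, Springfield"},
]
listings = [
    {"address": "12 oak st springfield"},
    {"address": "7 Pine Rd, Springfield"},
]

for customer in find_pre_movers(customers, listings):
    print(customer["policy"])
```

Neither database is new. The value comes entirely from the join between them.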

That’s what is so exciting and fun about the data business today: with so many new databases coming together, opportunity is everywhere. The key is to look at every new database you see and ask, “who else could use these data, and what could I do to these data to make them even more valuable to others?”

The people who create databases are almost always trying to solve a specific, single problem or need. Flip, spin, match or sometimes simply re-sort these databases, and you can often solve someone else’s problem or need. Am I talking about what’s known as data exhaust? To some extent yes, but some of the biggest and most interesting opportunities are right in front of us in plain sight – far less complex and challenging than most of the data exhaust opportunities I have seen.



Bigger Is Not Always Better

One key dynamic of the data business is that the strongest businesses serve single, tightly-defined markets, typically a single vertical market. The result is that the market opportunity tends to be smaller, but it is much easier to stay close to and defend.

The problem for data publishers attempting to build products with horizontal coverage across multiple markets, or who want to play in large consumer markets, comes down to a very simple reality: it’s hard to be everything to everybody.

It’s instructive to look at some of the reasons why it’s so hard to achieve long-term success with broad-based data products:

Lowest common denominator: In order to operate efficiently, broad-based data publishers typically have to collect fairly standardized and fairly shallow data across multiple vertical markets. This creates an opportunity for other data publishers to “slice and dice” these publishers, peeling off the largest and most profitable vertical sub-markets, and serving the same need with deeper and more tailored data.

Greater incentive for competitors: If you achieve any level of success with a horizontal, broad-based data product, you’ve not only identified a big market need, you’ve identified a big market opportunity as well. That means it may well be worth it for a competitor to invest significantly to steal market share or push you out entirely. Contrast this with successful vertical market data publishers, where the small scale of the market is one of their best protections. Competitors typically can’t financially justify trying to push their way into small vertical markets.

Turning an ocean liner: In addition to being a juicy competitive target, an established broad-based data publisher typically succeeds because it has built an operation that over time becomes very difficult to change for technical and business reasons. That means it will be at the mercy of such forces as new technology, shifts in user preferences and new business models, and just a few competitive successes can break the momentum and market dominance of the incumbent data provider. Moreover, the incumbent data provider is only able to react slowly, if it can react at all.

Too cool for school: While some broad-based data publishers become exposed because they can’t react quickly, others expose themselves by innovating so aggressively they get ahead of their markets and their customers. In a relentless quest to stay relevant and ahead of the competition, these publishers roll out features and functionality that their customers often don’t understand or even want, adding complexity to the user experience while muddying the core value proposition.

Platform envy: Perhaps encouraged by the spectacular success of Amazon, it’s easy to take the view that your data product can become a data platform, a way to distribute all kinds of data, products, whatever. That’s a big leap technologically, and while platforms are enticing to publishers, they almost inherently mean diffused focus, thus opening opportunities for competitors to enter the market with more focused products.

The most successful data publishers and products I see these days tend to serve one market and serve it extremely well. As long as these businesses stay close to their customers, evolve their products regularly and prudently, and offer good customer support and fair pricing, they can be enormously profitable while remaining largely immune to competition. That’s why in the data business at least, bigger isn’t always better.