infocomm 5/3/13

Enigma: Disrupting Public Data

Can you actually disrupt public data, which by definition is public, and by extension is typically free or close to free? Well, in a way, you can. A new start-up called Enigma, which can be thought of as “the Google of public data,” has assembled over 100,000 public data sources – some of them not even fully or easily accessible online. Think all kinds of public records from land ownership, public company data, customs filings, private plan registrations, all sorts of data, and all in one place.

But there’s more. Enigma doesn’t just aggregate, it integrates. That means it has expended tremendous effort to both normalize and link these disparate datasets, making information easier to find, and data easier to analyze.

The potentially disruptive aspect to a database that contains so much public data is that there are quite a few data publishers with very successful businesses built in whole or in part on public datasets.

But beyond the potential for disruption, Enigma has other significant potential (I’ve requested a trial, so at this point I am working with limited information). First, Enigma isn’t (at least for now) trying to create a specific product, e.g. a company profile database. Rather, it’s providing raw data. That will make it less interesting to many buyers of existing data products who want a fast answer with minimum effort. But it also means that Enigma could be a leveraged way for many data publishers to access public data to integrate into their own products, especially since Enigma touts a powerful API.

The other consideration with a product like this is that even with 100,000 datasets, it is inherently broad-based and scattershot in its coverage. That makes it far less threatening to vertical market data publishers.

Finally, Enigma has adopted a paid subscription model, so it’s not going to accelerate the commoditization of data by offering itself free to everyone and adopting an ad-supported model.

So from a number of angles, this is a company to watch. I’m eagerly waiting for my trial subscription; I urge you to dig in deep on Enigma as well.

infocomm 4/19/13

A Plug and Play Publishing Platform?

Dun & Bradstreet Credibility Corporation is an independent company with such an extensive relationship with Dun & Bradstreet that it was even granted use of the vaunted D&B name. It has been targeting smaller businesses not only with traditional D&B credit products, but also with a beta offering of what might be called a “next generation credit rating”: a so-called credibility score that examines the company from a number of different non-financial perspectives, yielding a letter grade and presumably an online trust mark that companies can use to build confidence with both suppliers and customers. It’s a clever and ambitious concept. And there are some serious resources behind this venture: Boston-based private equity firm Great Hill Partners is backing it with in excess of $100 million. In an apparently related development, D&B Credibility recently announced the launch of the “Credibility Review Business Marketplace,” an innovative move to partner with publishers to extend the reach of its credibility ratings by turning B2B data publishers into a sales channel. D&B Credibility indicates a number of publishers have already signed onto this program.

I’m still waiting to get full details on this program from the company, leaving me free to speculate wildly, a favorite pastime. Here’s what I picture:

D&B Credibility has licensed access to the full D&B business database, and this provides a content backbone to the initiative. When it emerges from beta, D&B Credibility will presumably move to aggressively sell credibility scores to smaller businesses. Each sale yields a richly detailed business profile (part of the score involves “transparency,” so participating companies are obliged to supply all sorts of useful information – smart!) that the participating company is highly motivated to keep current (yielding high leverage user-generated content). These enhanced listings are added to the basic listings in the content backbone.

To accelerate adoption of the credibility scores, D&B Credibility will partner with publishers on an intriguing offer: a self-maintaining database offering a growing number of credibility scores, that the publisher can access for free in exchange for selling credibility scores (and anything else it wants) to companies in its vertical market.

As I envision it, publishers would simply flag the companies they want to appear in their vertical market buying guides, getting in effect a customized view of the larger database. The publisher codes each company against its own vertical market taxonomy, and presto-whammo, it’s got a high quality database that costs almost nothing to build or maintain. All it has to do is sell the credibility scores and other advertising to companies that it has flagged. For trade magazine publishers in particular, selling ads is a true core competency, whereas database development and maintenance are not.
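For readers who think in code, here is a minimal sketch of how such a flagged "view" might work. Since I am speculating about the program itself, everything here is hypothetical: the `Listing` fields, the `BACKBONE` sample data, and the function names are illustrative assumptions, not anything D&B Credibility has described.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Listing:
    """One company record in the shared backbone database (hypothetical shape)."""
    company_id: str
    name: str
    credibility_grade: Optional[str] = None  # None until a score has been sold


# Made-up sample data standing in for the shared backbone database.
BACKBONE: Dict[str, Listing] = {
    "c1": Listing("c1", "Acme Valves", "A"),
    "c2": Listing("c2", "Best Pumps"),
    "c3": Listing("c3", "City Bakery", "B"),
}


def vertical_view(
    backbone: Dict[str, Listing], flags: Dict[str, str]
) -> List[Tuple[str, Listing]]:
    """Return (taxonomy_code, listing) pairs for the companies a publisher flagged.

    `flags` maps company_id to the publisher's own taxonomy code.
    Flags pointing at unknown companies are skipped.
    """
    view = [
        (code, backbone[company_id])
        for company_id, code in flags.items()
        if company_id in backbone
    ]
    # Sort by taxonomy code, then company name, as a buying guide would.
    return sorted(view, key=lambda pair: (pair[0], pair[1].name))


def sales_prospects(view: List[Tuple[str, Listing]]) -> List[Listing]:
    """Flagged companies with no credibility score yet: the publisher's sales targets."""
    return [listing for _, listing in view if listing.credibility_grade is None]
```

The publisher stores only its flags and taxonomy codes. The listings themselves stay in the shared database, which is why the publisher's own maintenance cost would be so low.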

What’s in this for D&B Credibility? It gets a revenue cut from every credibility score a publisher sells. It gets all the company information being collected (everything goes into its backbone database), and it gets valuable help in building momentum and acceptance for its scores.

Is this a good deal for publishers? When it comes to vertical market buying guides, the majority of publishers have unevenly maintained databases with limited company information. This approach not only goes a long way to solving the twin issues of data quality and data depth, it also provides the ability to sell a new and useful offering – a B2B trust mark.

Fascinating stuff, and well worth watching as the product rolls out from beta.

infocomm 3/21/13

Walking Around Money

A young company called Placed is deep into Big Data analytics, but with a twist: it marries customer data with its own proprietary data to yield insights into customer behavior. Essentially, Placed wants to provide context around how customers use the mobile applications of its clients, for example, when do they use the app and where do they use it?

The “where” part of the analysis is what’s interesting. Placed could simply spit back to its clients that its customers are in certain ZIP codes or other dry demographics – interesting, like so many analytics reports are, but not particularly useful.

Instead, Placed marries customer location with its own proprietary database of places – named stores, major buildings, points of interest. By connecting the two, Placed can tell a client exactly where mobile use of the client’s app is occurring. For example, if a client’s customers utilize its mobile app in a competitor’s store, it might suggest competitive price comparisons. Knowing its customers frequent Starbucks and nightclubs might influence the client’s marketing strategy or advertising campaign design. Knowing that the app is used most often when someone is walking (yes, Placed can tell you that) can be important for user interface design – you get the idea.
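To make the idea concrete, here is a rough sketch of joining raw location pings to a database of named places. It is not Placed's actual method, which I have no visibility into. The place list, the 50-meter matching radius, and the function names are all assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical place database: (name, category, latitude, longitude).
PLACES = [
    ("Coffee Shop A", "cafe", 47.6097, -122.3331),
    ("Competitor Store B", "retail", 47.6205, -122.3493),
]


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    earth_radius_m = 6371000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * earth_radius_m * math.asin(math.sqrt(a))


def match_place(lat, lon, places=PLACES, max_distance_m=50.0):
    """Return (name, category) of the nearest place within the radius, else None."""
    best, best_distance = None, max_distance_m
    for name, category, place_lat, place_lon in places:
        distance = haversine_m(lat, lon, place_lat, place_lon)
        if distance <= best_distance:
            best, best_distance = (name, category), distance
    return best


def usage_by_category(pings):
    """Count app-usage pings by the category of the place where each occurred."""
    counts = Counter()
    for lat, lon in pings:
        place = match_place(lat, lon)
        counts[place[1] if place else "unmatched"] += 1
    return counts
```

Calling `usage_by_category` on a list of `(latitude, longitude)` pairs gives a tally such as how many app sessions happened in cafes versus retail stores. That is the kind of answer a ZIP code report cannot provide.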

And therein lies an important insight. There is an endless number of companies offering Big Data analytics capabilities. But almost all of them expect their customers to bring both the problem and the data. That’s a sure recipe for commoditization, and as analytics software evolves, it’s also certain that the companies with the biggest analytics needs will decide to do the work themselves.

Solution? Big Data analytics players should bring proprietary data to the party. Placed is a perfect case study. It differentiates itself by providing answers others can’t. It adds value to its analytics by integrating proprietary and licensed data with customer data and its own optimized analytical tools. As I discussed in my presentation at DataContent 2012, there are lots of ways publishers can profit from the Big Data revolution – even if they don’t have big data themselves.

In a market where companies like Placed can make money by tracking people walking around, it behooves data publishers to walk around to some of these Big Data analytics players and suggest data partnerships that will help them stand out from the crowd.

infocomm 3/5/13

Education Data: Lessons Learned

A recent Reuters story described a new national database of student information. Reportedly built at a cost of $100 million, and backed by prestigious non-profits such as the Bill and Melinda Gates Foundation and the Carnegie Corporation, the project aims to build a standardized database of information on all students in the country, grades K-12. No, this is not aggregate data. This is detailed, specific information on every student that can include grades, learning disabilities, hobbies and interests. Surely this database doesn’t include student names and other identifiers, you say. But in fact it does. And that’s the point. It’s also why this database is so exciting to so many companies in the education market. The goal is to jump-start technology-driven individualized learning for students.

According to the article, school administrators have long (and legally) maintained all sorts of data on students for educational purposes. And, as you would suspect, every school did things a little differently. They collected different data elements and held them in different formats in different locations. So if you were marketing educational technology that tried to personalize the learning experience, you faced a painful data interface challenge with every new school you sold to. Seeing a real impediment to growth for cutting-edge educational technology, several big foundations jumped in. And rather than just developing a data standard, which would take decades to gain widespread adoption, they invested to actually build a single database. Participation by schools is voluntary and (currently) free, but lots of incentives have been created to spur participation.

We can draw a few fascinating lessons and trends from this initiative.

First, we see a wonderful acknowledgement of what I modestly call Perkins’ Law: no organization will voluntarily build and maintain a database if doing so is outside its core competencies and there is a viable alternative. The commercial data publishing business is really built around this law: data publishers succeed because people want the data, but don’t want to collect or maintain it themselves.

Second, we see another great example of a “data pipe,” where one organization provides data that developers can tap into via APIs to build applications driven by that data. The data provider seeks to become an information utility, while dozens or even hundreds of different developers can identify and mine niche opportunities faster and better than any single data publisher. This is a relatively young model, but it’s quickly gaining a following.

Third, valuable data is more often than not sensitive data as well. As this database hits the radar of parents and civil liberties advocates, the inevitable questions around privacy and security are being asked. And the answers to date, according to the article, do not seem particularly robust or reassuring. The non-profit managing the database makes all the appropriate noises about protecting the data, while at the same time the database exists in large part to benefit commercial entities. While the goal of the database is laudable, we have a classic example of a database that will likely succeed only with strong governance and privacy policies. This is something that commercial data publishers will need to become attentive to in years to come.

It’s a fascinating initiative, and one where we can all learn by example.

infocomm 5/11/12

Got Klout?

Imagine a business based on a mash-up of social media, analytics and ratings. That's exactly where a company called Klout plays.

Klout exists to assess your social media importance. Using advanced algorithms, it looks at how active you are in social media, how big your audience is, how influential the people in your audience are, and the impact of your social media activity. All this gets rolled up in a Klout score - a number from 1 to 100.

If this sounds like nothing more than an interesting academic research exercise, you might be surprised. Klout reportedly has over 5,000 large companies tapping into its database to determine who really matters online. Uses are varied and fascinating. PR companies use Klout to assess whether or not to personally engage with someone who has made a negative online comment about a client. Marketers are creating customized pitches to those with the highest Klout scores in the hopes of engaging with them and getting them to talk to their audiences about their products. And this is just the tip of the iceberg in terms of potential applications. Consider, for example, that Klout has already built a connector to Salesforce.com.

In terms of potential applications, some are cutting edge, but not all are necessarily positive. There are numerous reports floating around of people applying for jobs and being rejected due to low Klout scores. Some hotels reportedly will look up your Klout score at check-in, and provide free upgrades to those with high scores, presumably in the hopes of favorable online mentions. Similarly, Cathay Pacific airlines will make its San Francisco frequent flier lounge available to anyone with a high Klout score - regardless of what airline they are flying. The objective again is favorable mentions.

Implications? What we may be seeing is a devolution in advertising where marketers move to a bottom-up approach to distributing their messages, with the hope that they can achieve powerful and cost-effective reach by having a small group of individuals amplify their brands and their messages for them. This could have a serious impact on those that make money today by aggregating fixed audiences.

Of course, as the rewards for having social influence grow, so too will the number of people gaming the system to improve their scores to reap all these upgrades, free samples and attention. As these activities accelerate, social media measurement could end up getting so polluted and undependable that it becomes too difficult to isolate true influencers, likely a fatal blow to this innovative new marketing approach. Alternatively, Klout, like Google, could try to keep the game going by regularly tweaking its algorithms to maintain its value. But as we add the wisdom of algorithms to the wisdom of crowds, are we really getting any smarter?