Knowing More Than You Can Tell

Most of you have some familiarity with Gerson Lehrman Group (GLG), the phenomenal success story that pioneered the idea of connecting experts on a wide variety of topics with those who needed fast, trustworthy and unbiased insights into a market, a company, a technology … whatever.

Not surprisingly, GLG found most of its clients in the financial sector, from hedge funds to private equity firms and others that needed expert insight fast to inform the often significant investment decisions they were making. These clients paid fat fees, and the experts were well paid for small chunks of their time, and it all went swimmingly for many years.

Things got awkward when some investors wanted more than background information: they wanted confidential information. GLG was very aggressive about policing this, understanding that it could damage its business. However, some GLG competitors didn’t have the same ethics, and differentiated themselves by playing on the often-murky line between public information and inside information. The potential to misuse the raft of expert services that now exist continues to cast a pall over an otherwise strong business model.

Enter a new start-up called Emissary. It’s an expert service, but rather than focusing on connecting experts to investors, it seeks to connect experts to salespeople. Want to know how to tailor your pitch to a particular company? Emissary can find someone who knows. Similarly, salespeople often find themselves wondering if they are dealing with a decision-maker or not at a particular company. Say hello to Emissary, whose experts may well have worked at the company in question.

Visit the Emissary website, and you’ll see a carefully crafted message: we’re just people helping other people. At one level, this is certainly true. And connecting a sales team to a recent former employee of the prospect company doesn’t seem to be rife with the same legal and ethical issues that exist for investors, but I suspect Emissary’s long-term success will depend on it also establishing an ethical line in the sand and policing it closely.

What also makes Emissary interesting is that its model can be moved not only across verticals, but across functional areas as well.

Not All Datasets Are Good Datasets

As a long-time proponent of data, I find it intriguing to see the number of new start-ups that have revenue models based partially – sometimes entirely – on the sale of data, even though they are not data publishers in the conventional sense. Rather, they are seeking to monetize data they collect incidentally in the course of other activities.

A fashion website or app, for example, might realize that by tracking which new fashions its users viewed the most, it was collecting valuable intelligence that could be sold to fashion manufacturers. The early players in this area usually did have valuable and readily saleable data collections, and they had indeed identified an important new revenue stream.

But now “data” is transforming into a buzz-term, up there with “the cloud” and “social.” Purported data opportunities are being used to mask weak business models because everyone these days knows “it’s all about the data.” Just as start-ups these days feel compelled to be in the cloud and have a strong social component, so too do they now need a data opportunity.

Not every new business can create value from the incidental data it generates. Those that do represent the exception, not the rule.  Here are a few reasons why these data opportunities may not be as strong as the entrepreneurs behind them would like to believe:

1. You generate too little data. While everyone talks about quality data, there is still a quantity aspect as well. Even for things as valuable as sales leads, most companies will turn up their noses at them if you can’t deliver a certain volume of leads regularly and dependably. Depending on the data itch you’re trying to scratch, 100,000 or even a million users may not cut it.

2. You generate too much data. Having the most data about something can be as much a burden as an opportunity. Think Twitter. Everyone “knows” that the huge collective stream of consciousness that its  users generate is enormously valuable, but extracting that value is very complex and expensive, and much of the final output still represents conjecture and surmise.

3. You don’t really know much about the data you’ve got. I’ve been in numerous meetings where the issue on the table was, “we’ve got tons of data, but we’re not sure how to monetize it.” This situation naturally calls for advanced TAPITS (There’s A Pony In There Somewhere) analysis to assess value. More often than not, the chosen solution is simply to sell the raw data and hope that the buyer can find value. Of course, when you sell data by the ton, you have to charge for it by the ton too. It’s just not that valuable if the buyer needs to do all the thinking and all the work.

4. A sample of none. Online businesses want lots of traffic and lots of users – the more the merrier. This is good for business generally, but not necessarily great from a data perspective. If your user base is too disparate, the aggregate insights from the data it generates may not be all that valuable. And if your user base is largely anonymous, good luck with that.

5. Buy me a drink first. Many times, an online company is in possession of extremely detailed and valuable data. Unfortunately, this typically means that these data can only be had by violating the trust if not the privacy of the user. It’s even more complicated if the company built its business with a strong privacy policy that prohibits it from ever selling all this valuable information.

6. Exclusive insights. These days, if you say you have “near-real-time insight into bus station storage locker utilization rates,” it will automatically be assumed that you’ve tapped a huge data opportunity. Every bus station certainly needs this information, bus lines probably have a use for it, there’s probably a government market, some hedge funds will want it and there might even be a consumer opportunity as well – think of an app that shows you available storage lockers nationwide! But in reality, not every market is a viable data market. The market might be too small, marginally profitable, too localized or too consolidated. It is absolutely possible to have data that nobody cares about, or that too few people care about to support a meaningful revenue stream.

7. Competition. Your data may indeed be valuable, but chances are, you don’t have the full picture. This means your data is less valuable than that of a company that can supply the full picture. It also means the market for your data may be the one company that knows more about the market than you do. Yes, there’s revenue to be had in this case, but you won’t get rich.

8. Raw data follies. Typically, companies trying to sell the data they collect incidentally want to sell the data, get the money, and get back to their core business activities. But if you don’t clean and organize your data, you’re leaving lots of money on the table. And if you decide to get serious about your data, you’re moving into a different business, one you probably don’t understand very well.

I could keep going, but hopefully you get the point: the chances that the incidental data you generate from other business activities are valuable are pretty low. And even if you do have valuable data, getting maximum value from it generally demands getting a lot more serious about your data, which starts to move you into a totally different business.

Data Insights from Bitsight

A Boston-area start-up called Bitsight is pulling in investor money so quickly ($95 million in total) that it doesn’t know what to do with it all … yet.

And what does Bitsight do, to justify this level of investment? It examines company websites, evaluates them for the quality of their website security, and assigns them a rating, much like a credit score.

How does Bitsight do it? There’s a bit of proprietary secret sauce in how the company evaluates the security of a website, but what’s particularly interesting is that it does it all with publicly available information. And that raises another fascinating aspect of the business: the companies that Bitsight rates are not its clients. Bitsight is not an online security consultant with an automated assessment tool. Indeed, it has evaluated over 60,000 websites to date, and ultimately may evaluate hundreds of thousands of websites.

Why would anyone want this information? The uses for this data are surprisingly numerous. You can sell it in the form of benchmark products to the companies you have rated. What IT manager wouldn’t want to know how their company stacks up against its peers? A better opportunity is to help insurance companies properly price data breach insurance policies.

But perhaps the best opportunity is to help big companies evaluate and manage risk with their vendors – a huge issue, as a number of recent headline-grabbing data breaches resulted from a company’s network being penetrated via one of its connected vendors.

While Bitsight may look like a cutting-edge analytics company, what’s significant is that so much of its business model is drawn from very basic approaches used by many other data publishers. It aggregates publicly available data into a database. It normalizes this information, then applies an algorithm to assess it and produce comparable company ratings. It sells this data product for internal benchmarking, risk management and due diligence applications.
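To make that pipeline concrete, here is a minimal sketch in Python of how publicly observable signals might be normalized and rolled up into a comparable, credit-score-style rating. The signal names, weights and scoring scale are entirely hypothetical; they are not Bitsight’s actual inputs or algorithm.

```python
# A minimal, hypothetical sketch of a "collect, normalize, score" pipeline.
# The signals, weights and scoring scale are illustrative only; they are
# not Bitsight's actual inputs or methodology.

WEIGHTS = {
    "tls_config": 0.40,     # quality of the site's TLS/SSL configuration
    "patch_cadence": 0.35,  # how quickly observed software appears to get updated
    "open_services": 0.25,  # exposure of unnecessary public-facing services
}

def normalize(raw_signals):
    """Scale each raw 0-100 observation to a common 0-1 range so signals are comparable."""
    return {k: min(max(v / 100.0, 0.0), 1.0) for k, v in raw_signals.items()}

def rate(raw_signals):
    """Combine normalized signals into a single 250-900 score, credit-score style."""
    norm = normalize(raw_signals)
    composite = sum(WEIGHTS[k] * norm.get(k, 0.0) for k in WEIGHTS)
    return int(250 + composite * 650)

# Example: rate one company from (hypothetical) publicly observed signals.
print(rate({"tls_config": 80, "patch_cadence": 55, "open_services": 90}))  # -> 729
```

The design point is the one Bitsight exploits: once every company is scored on the same normalized scale, the ratings become directly comparable, which is what makes benchmarking and vendor risk applications possible.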

In short, despite its high-tech trimmings, Bitsight very much has data publishing DNA. It is also a great example of how data products don’t have to be perfect right out of the gate. Because it relies only on public information, Bitsight can’t possibly know everything about the security of a company’s website. But that same reliance on public data lets it quickly build a large database of comparable company ratings using a credible methodology, and serve market needs that require a certain scale of coverage. If you’re the first data provider serving a serious market need, you can launch with good-enough data and improve it over time. Trying to perfect your data prior to launch can mean missing the opportunity entirely.

Do You Rate?

An article in today’s New York Times discusses the proliferation of college rankings as the focus shifts toward evaluating colleges based on their economic value.

Traditionally, rankings of colleges have tended to focus on their selectivity/exclusivity, but now the focus has shifted to what are politely called “outcomes,” in particular, how many graduates of a particular college get jobs in their chosen fields, and how well they are paid. Interestingly, many of the existing college rankings, such as the well-known one produced by U.S. News, have been slow to adapt to this new area of interest, creating opportunities for new entrants. For example, PayScale (an InfoCommerce Model of Excellence winner) has produced earnings-driven college rankings since 2008. Much more recently, both the Economist and the Wall Street Journal have entered the fray with outcomes-driven college rankings. And let’s not forget still another college ranking system, this one from the U.S. Department of Education.

At first blush, the tendency is to say, “enough is enough.” Indeed, one professor quoted in the Times article somewhat humorously noted that there are so many college rankings that, “We’ll soon be ranking the rankings.”

However, there is almost always room for another useful ranking. The key is utility. Every ranking system is inherently an alchemic blend of input data and weightings. What data are used and how they are weighted depend on what the ratings service thinks is important. For some, it is exclusivity. For others it is value. There are even the well-known (though somewhat tongue-in-cheek) rankings of top college party schools.

And since concepts like “quality” and “value” are in the eye of the beholder, and results are often a function of the available data, two rating systems can produce wildly varying results. That’s why, when multiple rating systems exist, most experts suggest consulting several of them to get the most rounded picture and the most informative result.
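To illustrate how much the weightings matter, here is a minimal sketch in Python that ranks the same (invented) input data under two different weighting schemes, one favoring selectivity and one favoring graduate earnings. The school names, metrics and weights are hypothetical, but the two schemes produce opposite orderings from identical data.

```python
# Hypothetical illustration: the same input data ranked under two weighting schemes.
# School names, metrics and weights are invented for this example.

schools = {
    "Alpha College":   {"selectivity": 0.95, "median_salary": 0.60},
    "Beta University": {"selectivity": 0.70, "median_salary": 0.90},
}

def rank(weights):
    """Order schools by their weighted composite score, best first."""
    score = lambda metrics: sum(weights[k] * metrics[k] for k in weights)
    return sorted(schools, key=lambda name: score(schools[name]), reverse=True)

# An exclusivity-driven ranking vs. an outcomes-driven ranking of identical data.
print(rank({"selectivity": 0.8, "median_salary": 0.2}))  # ['Alpha College', 'Beta University']
print(rank({"selectivity": 0.2, "median_salary": 0.8}))  # ['Beta University', 'Alpha College']
```

Neither ordering is “wrong”; each simply encodes a different view of what matters, which is exactly why consulting multiple rankings gives a more rounded picture.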

It’s this lack of a single right way to create a perfect ranking that means that in almost every market, multiple competing rating systems can exist and thrive. Having a strong brand that can credential your results always helps, but in many cases, you can be competitive just with a strong and transparent methodology. It helps too when your rankings aren’t too far out of whack with general expectations. Totally unintuitive ranking results are great for a few days of publicity and buzz, but longer term they struggle with credibility issues.

A take-away for publishers: even if you weren’t first to market with rankings for your industry, there may still be a solid opportunity for you, if you have better data, a better methodology and solid credibility as a neutral information provider.

Data's Brave New World

The ACLU has just released a report highlighting the growing relationship between law enforcement agencies and a Chicago-based company called Geofeedia. In a nutshell, Geofeedia is apparently marketing to law enforcement agencies a crowd surveillance tool that mixes geolocation with social media sentiment analysis.

This illustrates the gray area we operate in as data providers, especially those of us dealing with consumer data. Things that are perfectly legal may be seen by others as unethical and inappropriate. And, perhaps ironically, the power and pervasiveness of social media means that reputational risk becomes an outsized area of concern for those of us who deal in data.

On the one hand, Geofeedia is simply aggregating and analyzing information that individuals have voluntarily and publicly posted on various social media platforms. On the other hand, this particular application of these data can be seen as chilling lawful speech, dissent and free assembly. And as noted earlier, the law lags far behind these new technologies, and thus provides little guidance.

Facebook reacted to the ACLU report by quickly severing ties with Geofeedia. It understands that anything that creates even the slightest hesitancy to use its platform is detrimental to its own business. Instagram suspended Geofeedia as well. Even Twitter, which we have previously noted seems content to be a datastream for others to monetize, has suspended Geofeedia from commercial access to its data.

As we have noted, it’s difficult to come down on one side or the other of this issue. As a data producer, I think that aggregating and analyzing publicly available data is generally a beneficial activity. Indeed, what Geofeedia is doing is conceptually not all that different from what the many social sentiment analysis companies do when selling aggregated insights to hedge funds seeking early warning on news and emerging trends. Yet at the same time, even if Geofeedia was working with the best of intentions, the optics of its product offering should have received greater attention. And that’s the lesson here for data publishers: just because you can do something doesn’t always mean you should do it. Perception has become as important as reality. Don’t let ignorance or arrogance crater your products or your entire business. Keep firmly in mind at all times that, especially when it comes to data, optics do matter.