AI in Action

Two well-known and highly successful data producers, Morningstar and Spiceworks, have both just announced new capabilities built on artificial intelligence (AI) technology. 

Artificial Intelligence is a much-abused umbrella term for a number of distinctive technologies. Speaking very generally, the power of AI initially came from sheer computer processing power. Consider how early AI was applied to the game of chess. The “AI advantage” came from the ability to quickly assess every possible combination of moves and likely responses, as well as having access to a library of all the best moves of the world’s best chess players. It was a brute force approach, and it worked.

Machine learning is a more nuanced approach to AI where the system is fed both large amounts of raw data and examples of desirable outcomes. The software actually learns from these examples and is able to generate successful outcomes of its own using the raw data it is supplied. 

There’s more, much more, to AI, but the power and potential is clear.

So how are data producers using AI? In the case of Morningstar, it has partnered with a company called Mercer to create a huge pool of quantitative and qualitative data, to help investment advisors make smarter decisions for their clients. The application of AI here is to create what is essentially a next generation search engine that moves far beyond keyword searching to make powerful connections between disparate collections of data to identify not only the most relevant results, but to pull meaning out of those search results as well.

 At Spiceworks (a 2010 Model of Excellence), AI is powering two uses. The first is also a supercharged search function, designed to make it easier for IT buyers to more quickly access relevant buying information, something that is particularly important in an industry with so much volatility and change.

Spiceworks is also using AI to power a sell-side application that ingests the billions of data signals created on the Spiceworks platform each day to help marketers better target in-market buyers of specific products and services.

As the data business has evolved from offering fast access to the most data to fast access to the most relevant data, AI looks to play an increasingly important and central role. These two industry innovators, both past Models of Excellence m are blazing the trail for the rest of us, and they are well worth watching to see how their integration of AI into their businesses evolves over time.

For reference:

Spiceworks Model of Excellence profile
Morningstar Model of Excellence Profile





Form Follows Function

Numerous online marketing trade associations have announced their latest initiative to bring structure and transparency to an industry that can only be called the Wild, Wild West of the data world: online audience data. Their approach offers some useful lessons to data publishers.

At their brand-new one-page website ( this industry coalition is introducing its “Data Transparency Label.” In an attempt to be hip and clever, the coalition has modeled its data record on the familiar nutrition labels found on most food packaging today. It’s undeniably cute, but it’s a classic case of form not following function. Having decided on this approach, the designers of this label immediately boxed themselves in as to what kind and how much data they could present to buyers. I see this all the time with new data products: so much emphasis is placed on how the data looks, its visual presentation, that important data elements often end up getting minimized, hidden or even discarded. Pleasing visual presentation is desirable, but it shouldn’t come at the expense of our data.

The other constraint you immediately see is that this label format works great if an audience is derived from a single source by a single data company. But the real world is far messier than that. What if the audience is aggregated from multiple sources? What if its value derives from complex signal data that may be sourced from multiple third parties? What about resellers? Life is complicated. This label pretends it is simple. Having spent many years involved with data cards for mailing lists, during which time I became deeply frustrated by the lost opportunities caused by a simple approach used to describe increasingly sophisticated products, I see history about to repeat itself.

My biggest objection to this new label is that its focus seems to be 100% on transparency, with little attention being paid to equally valuable uses such as sourcing and comparison. The designers of this label allude to a taxonomy that will be used for classification purposes, but it’s only mentioned in passing and doesn’t feel like a priority focus at all. Perhaps most importantly, there’s no hint of whether or not these labels will be offered as a searchable database or not. There’s a potentially powerful audience sourcing tool here, and if anyone is considering that, they aren’t talking about it.

 Take-aways to consider:

·     When designing a new data product, don’t allow yourself to get boxed in by design

·     The real world is messy, with lots of exceptions. If you don’t provide for these exceptions, you’ll have a product that will never reach its full potential

·     Always remember that a good data product is much more than a filing cabinet that is used to look up specific facts. A thoughtful, well-organized dataset can deliver a lot more value to users and often to multiple groups of users. Don’t limit yourself to a single use case for your product – you’ll just be limiting your opportunity.


Regulating by the Numbers

While so many large financial institutions were teetering during the Great Recession, regulators trying to bring stability to the global financial system quickly learned a startling, shocking fact: there was really no way to net out how much money one financial institution owed to another.

The reason for this is that the complex financial trades that banks were engaged in weren’t straightforward bank-to-bank deals. JP Morgan didn’t just do trades with Citibank, for example. Rather, they were done through a web of subsidiaries, many of them set up specifically to be opaque and obscure. And that’s just the banks. Add in hedge funds and other investors, and their offshore companies and subsidiaries that also were designed to be opaque, and you quickly get to mind-numbing complexity. 

 With an eye to better regulation and better information during a future financial crisis, an idea was proposed during a 2011 meeting of the G-20 countries to create a numbering system called the Legal Entity Identifier (LEI). The simple idea was that if every legal entity engaged in financial transaction had a unique number, and the record of that legal entity also contained the number for its parent company, it would be easy to roll up these records to see the total financial exposure of any institution.

While you may never have heard of it, the LEI system actually exists, and most financial institutions now have LEI numbers. There is a push in some countries (in the United States, the Treasury Department is leading the charge) to require all companies to obtain a LEI number, it’s been slow going so far.

If this discussion has you wondering about the DUNS number from D&B, not to worry: it’s alive and well. It’s also far more evolved and comprehensive than the LEI system. However, as a privately maintained identifier system, D&B not unreasonably wants to be paid for its use. This rankles some government agencies that are paying substantial sums to D&B for access to the DUNS system, and more than a few are pushing for broad expansion of the LEI system as a replacement for the DUNS system. Suffice to say there is a lot going on behind the scenes.

There are a number of free lookup services for LEI records, and the information is in the public domain. Some data publishers may find immediate uses for LEI data, but its fundamental weakness at this point is that it’s hit and miss as to what companies have registered. Still, it’s a database to know about and watch, particularly if you have an interest in company relationships. Over time, its likely its coverage and importance will grow.

Just in Time Data

Databases are tricky beasts because their content is both fluid and volatile. There are likely no databases that are 100% comprehensive and 100% accurate at the same time. This problem has only been exacerbated by increasingly ambitious data products that continue to push the envelope in terms of both the breadth and depth of their coverage.

Data publishers have long had to deal with this issue. The most widely adopted approach has been what might be called “data triage.” This is when the publisher quickly updates a data record in response to a subscriber request.

I first encountered this approach with long-time data pioneer D&B. If you requested a background report on a company for which D&B had either a skeleton record or out of date information, D&B adroitly turned this potential problem into a show of commitment to its data quality. The D&B approach was to provide you with whatever stale or skimpy information it had on file in order to provide some data the subscriber might find useful. But D&B would also  indicate in bold type words to the effect of, “this background report contains information that may be outdated. To maintain our data quality standards, a D&B investigator will update this report and an updated report will be sent to you within 48 hours.”

Data triage would begin immediately. D&B would have one of its more experienced researchers call the company and extract as much information as possible. The record was updated, the new information was sent to the subscriber, and anyone else requesting that background report would benefit from the updated information as well.

A variation on this approach is to offer not updates to existing records, but rather to create entirely new records on request. Not in our database? Just let us know, and we’ll do the needed research for you pronto. Boardroom Insiders, a company that sells in-depth profiles of C-suite executives, does this very successfully, as does The Red Flag Group

 The key to succeeding with data triage? First, you have to set yourself up to respond quickly. Your customers will appreciate the custom work you are doing from them, but they still want the information quickly. Secondly, use this technique to supplement your database, not substitute for it. If you are not satisfying most of your subscribers most of the time with the data you have already collected, you’re really not a data publisher, you’re a custom research shop, and that’s a far less attractive business. Finally, learn from these research requests. Why didn’t you already have the company or individual in question in your database? Are the information needs of your subscribers shifting? Are there new segments of the market you need to cover? There’s a lot you can learn from custom requests especially if you can find patterns in these requests. 

Data triage is a smart tactic that many data publishers can use. But always remember, no matter how impressive the service, the subscriber still has to wait for data. Ultimately, this nice courtesy becomes a real inconvenience if the subscriber encounters it too often. What you need to do is both satisfy your customers most of the time, and be there for them when you fall short.

LinkedIn: A D&B For People?

I joined LinkedIn in 2004. I didn’t discover LinkedIn on my own; like many of you, I received an invitation to connect with someone already on LinkedIn, and this required me to create a profile. I did, and became part of what I still believe is one of the most remarkable contributory databases ever created.

Those of you who remember LinkedIn in its early days (it was one of our Models of Excellence in 2004), remember its original premise: making connections – the concept of “six degrees of separation” brought to life. With LinkedIn, you would be able to contact anyone by leveraging “friend of a friend” connections.

It was an original idea, and a nifty piece of programming, but it proved hard to monetize. The key problem is that the people most interested in the idea of contacting someone three hops removed from them were salespeople. People proved remarkably resistant to helping strangers access their friends to make sales pitches. LinkedIn tried all sorts of clever tweaks, but there clearly wasn’t a business opportunity in this approach.

What saved LinkedIn in this early phase was a pivot to selling database access to recruiters. A database this big, deep and current was an obvious winner and it generated significant revenue. But there are ultimately only so many recruiters and large employers to sell to, and that was a problem for LinkedIn, whose ambitions had always been huge.

Where things got off the tracks for LinkedIn was the rise of Facebook, Twitter and the other social networks. Superficially, LinkedIn looked like a B2B social network, and LinkedIn was under tremendous pressure to accept this characterization, because it did wonders for both its profile and its valuation. LinkedIn created a Twitter-like newsfeed (albeit one without character limits), and invested massive resources to promote it. Did it work? My sense is that it didn’t. I never go into LinkedIn with the goal of reading my news feed, and I have the same complaint about it as I have about Twitter: it’s a massive, relentless steam of unorganized content, very little of which is original, and very little of which is useful. 

Today, LinkedIn to me is an endless stream of connection requests from strangers who want to sell me something. LinkedIn today is regular emails reminding me of birthdays of people I barely know because I, like everyone else, have been remarkably undisciplined about accepting new connection requests over the years. LinkedIn is also just one more content dump that I barely glance at, and it’s less and less useful as a database as both its data and search tools are increasingly restricted in order to incent me to become a paid subscriber.

Am I predicting the demise of LinkedIn? Absolutely not! What LinkedIn needs now is another pivot, back to its database roots. It needs to back away from its social media framing, and think of itself more like a Dun & Bradstreet for people. LinkedIn has to use its proven creativity and the resources of its parent to embed itself so deeply into the fabric of business that one’s career is dependent on a current LinkedIn profile. LinkedIn should create tools for HR departments to access and leverage all the structured content in the LinkedIn database so that they will in turn insist on a LinkedIn profile from all candidates and employees. Resurrect the idea of serving as the internal company directory for companies (and deeply integrate it into Microsoft network management tools). Most exciting of all to me is the opportunity to leverage LinkedIn data within Outlook for filtering and prioritizing email – big opportunities that go far beyond the baby steps we’ve seen so far.

I think LinkedIn’s future is bright indeed, but it depends on management focusing on its remarkable data trove, rather than being a Facebook for business.