Regulating by the Numbers

While so many large financial institutions were teetering during the Great Recession, regulators trying to bring stability to the global financial system quickly learned a startling fact: there was really no way to net out how much money one financial institution owed to another.

The reason is that the complex financial trades banks were engaged in weren’t straightforward bank-to-bank deals. JP Morgan didn’t just do trades with Citibank, for example. Rather, the trades were done through a web of subsidiaries, many of them set up specifically to be opaque and obscure. And that’s just the banks. Add in hedge funds and other investors, along with their offshore companies and subsidiaries that were also designed to be opaque, and you quickly get to mind-numbing complexity.

With an eye to better regulation and better information during a future financial crisis, an idea was proposed during a 2011 meeting of the G-20 countries to create a numbering system called the Legal Entity Identifier (LEI). The simple idea was that if every legal entity engaged in financial transactions had a unique number, and the record of that legal entity also contained the number for its parent company, it would be easy to roll up these records to see the total financial exposure of any institution.
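To make the roll-up idea concrete, here is a minimal sketch in Python. It assumes each record carries an entity identifier, the identifier of its parent (or nothing for a top-level company), and an exposure figure. The identifiers, amounts, and function names are all hypothetical and do not reflect the actual LEI record format.

```python
from collections import defaultdict

# Hypothetical records: (entity id, parent id or None, exposure in $ millions).
# These identifiers are made up and are not real LEI numbers.
ENTITIES = [
    ("LEI-A", None, 100.0),     # top-level institution
    ("LEI-B", "LEI-A", 40.0),   # subsidiary of A
    ("LEI-C", "LEI-B", 10.0),   # subsidiary of a subsidiary
    ("LEI-D", None, 25.0),      # unrelated top-level institution
]


def ultimate_parent(entity_id, parent_of):
    """Follow parent links upward until an entity with no parent is reached."""
    seen = set()
    while parent_of.get(entity_id) is not None:
        if entity_id in seen:
            raise ValueError(f"ownership cycle detected at {entity_id}")
        seen.add(entity_id)
        entity_id = parent_of[entity_id]
    return entity_id


def roll_up(entities):
    """Sum each entity's exposure into the total for its ultimate parent."""
    parent_of = {entity_id: parent for entity_id, parent, _ in entities}
    totals = defaultdict(float)
    for entity_id, _, exposure in entities:
        totals[ultimate_parent(entity_id, parent_of)] += exposure
    return dict(totals)


if __name__ == "__main__":
    for parent, total in roll_up(ENTITIES).items():
        print(parent, total)
```

The point of the sketch is that the parent link does all the work: once every record names its parent, the exposure of an entire corporate family can be totaled without knowing anything else about how the subsidiaries are structured.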

While you may never have heard of it, the LEI system actually exists, and most financial institutions now have LEI numbers. There is a push in some countries (in the United States, the Treasury Department is leading the charge) to require all companies to obtain an LEI number, but it’s been slow going so far.

If this discussion has you wondering about the DUNS number from D&B, not to worry: it’s alive and well. It’s also far more evolved and comprehensive than the LEI system. However, the DUNS system is privately maintained, and D&B not unreasonably wants to be paid for its use. This rankles some government agencies that are paying substantial sums to D&B for access to the DUNS system, and more than a few are pushing for broad expansion of the LEI system as a replacement. Suffice it to say there is a lot going on behind the scenes.

There are a number of free lookup services for LEI records, and the information is in the public domain. Some data publishers may find immediate uses for LEI data, but its fundamental weakness at this point is that it’s hit and miss as to which companies have registered. Still, it’s a database to know about and watch, particularly if you have an interest in company relationships. Over time, it’s likely that its coverage and importance will grow.

Proposed Bill Puts the OPEN in Government Data

Should federal government data be open to the public? Perhaps a better way to frame the question is whether or not the federal government should make public data publicly available. After all, databases compiled by the government are, with few exceptions, already open to the public, if you can track them down in the first place. And this problem of discovering government datasets has long been the rub.

The federal government collects data for many reasons, but generally data gathering is for regulatory, compliance or statistical reasons. When this data gathering relates to business entities, there’s usually a business opportunity to be found. That’s because government agencies usually collect data for one specific purpose only. For example, the Federal Aviation Administration maintains a database of all airplanes that are licensed for operation in the United States. It collects a lot of data about both the plane and its owner, but its overall objective is simply to keep a record of whether or not a given plane is licensed to operate. Even if it puts this database online for public access, your ability to search the database is limited to looking up specific airplanes by tail number or owner. This is the compliance focus of government manifesting itself. But that’s great news for commercial data publishers who can get the underlying database and add tremendous value simply by making the data parametrically searchable. Online government databases are almost always designed to help the user find information on a single, known entity. Parametric search creates a powerful sales prospecting tool. Suddenly, the database can be searched by make and model and age of the plane, with the ability to limit search results to specific geographies.
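The difference between a single-entity lookup and a parametric search can be sketched in a few lines of Python. This is an illustration only: the records, field names, and the `search` function are hypothetical and are not drawn from the actual FAA registry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Aircraft:
    tail_number: str
    make: str
    model: str
    year_built: int
    state: str  # state of the registered owner


# Made-up sample records, not real registrations.
REGISTRY = [
    Aircraft("N100AA", "Cessna", "172", 1998, "NY"),
    Aircraft("N200BB", "Cessna", "182", 2010, "NJ"),
    Aircraft("N300CC", "Piper", "PA-28", 1985, "NY"),
    Aircraft("N400DD", "Cessna", "172", 2015, "NY"),
]


def lookup(registry, tail_number):
    """Compliance-style lookup: find one known plane by tail number."""
    for plane in registry:
        if plane.tail_number == tail_number:
            return plane
    return None


def search(registry, make=None, model=None, built_before=None, states=None):
    """Parametric search: every supplied criterion must match."""
    results = []
    for plane in registry:
        if make is not None and plane.make != make:
            continue
        if model is not None and plane.model != model:
            continue
        if built_before is not None and plane.year_built >= built_before:
            continue
        if states is not None and plane.state not in states:
            continue
        results.append(plane)
    return results


if __name__ == "__main__":
    # Prospecting list: older Cessnas owned in New York.
    for plane in search(REGISTRY, make="Cessna", built_before=2000, states={"NY"}):
        print(plane.tail_number, plane.model, plane.year_built)
```

The underlying data is identical in both functions. The added value comes entirely from letting the user start with criteria rather than with a known tail number.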

Needless to say, federal government databases can offer huge business opportunities because the government has done all the compilation work, at its own expense, and even keeps the database updated for you. But again, the challenge is finding and accessing these databases in the first place. Government agencies have no incentive to merchandise their internal databases, and many continue to resist opening their datasets to the public, usually out of bureaucratic fear or inertia.

Yes, there is Data.gov, a much-heralded federal government initiative to not only move more data online, but to put it all in a central place. But the datasets of interest to commercial data publishers will rarely be found there. However, if you’re interested in data on migratory butterflies in Oklahoma, Data.gov is a great place to go.

That’s why I am excited by the OPEN Government Data Act (OPEN Data Act, S. 2852, H.R. 5051) that will mandate that all federal government agencies make all of their datasets immediately available for public use, subject only to a handful of exceptions. This is a bill worth watching and supporting. Fortunes have already been made by commercial data publishers with the savvy and persistence to navigate the federal labyrinth. The OPEN Data Act will level the playing field and open even more opportunities to leverage government data for commercial applications. What’s not to like?




Enigma: Disrupting Public Data

Can you actually disrupt public data, which by definition is public, and by extension is typically free or close to free? Well, in a way, you can. A new start-up called Enigma, which can be thought of as “the Google of public data,” has assembled over 100,000 public data sources, some of them not even fully or easily accessible online. Think all kinds of public records: land ownership, public company data, customs filings, private plane registrations, all sorts of data, and all in one place.

But there’s more. Enigma doesn’t just aggregate, it integrates. That means it has expended tremendous effort to both normalize and link these disparate datasets, making information easier to find, and data easier to analyze.

The potentially disruptive aspect to a database that contains so much public data is that there are quite a few data publishers with very successful businesses built in whole or in part on public datasets.

But beyond the potential for disruption, there is other significant potential here (I’ve requested a trial, so at this point I am working with limited information). First, Enigma isn’t (at least for now) trying to create a specific product, e.g. a company profile database. Rather, it’s providing raw data. That will make it less interesting to many buyers of existing data products who want a fast answer with minimum effort. But it also means that Enigma could be a leveraged way for many data publishers to access public data to integrate into their own products, especially since Enigma touts a powerful API.

The other consideration with a product like this is that even with 100,000 datasets, it is inherently broad-based and scattershot in its coverage. That makes it far less threatening to vertical market data publishers.

Finally, Enigma has adopted a paid subscription model, so it’s not going to accelerate the commoditization of data by offering itself free to everyone and adopting an ad-supported model.

So from a number of angles, this is a company to watch. I’m eagerly waiting for my trial subscription; I urge you to dig in deep on Enigma as well.



Cleaning Up by Cleaning Up




Meet Equilar, a $20+ million data publisher that sells publicly available SEC data. Yup, get it free from the SEC, or buy it through Equilar. How does that work?

Well, as data publishers well know, an approach like this usually doesn't work, unless you find a way to add value. And Equilar does this, in spades. You see, Equilar deals in executive compensation benchmarking data, where making it comparable and getting the data right is the basis for an incredible business, and getting it wrong is the basis for going out of business. That’s the challenge and opportunity that exists in many public datasets today, and there is plenty of opportunity still to be mined by savvy companies such as Equilar that look for highly focused data needs and meet them well.

Top executives at publicly-traded companies need to justify their compensation to a number of different constituencies. The best way to do this is to benchmark their compensation against peer companies. But with the complexity of executive compensation plans these days, that’s easier said than done. Equilar saw the need and set out to create a flexible, normalized database of executive compensation data points for benchmarking purposes.

Equilar has done such a good job meeting this need in the marketplace that it faces a problem many of us think we’d like to have – its flagship product has essentially captured all of its core market, and now needs to look elsewhere to find continued growth.

So how does a company that has executed so brilliantly come at a challenge like this? How does it look at opportunities? What does it see as the challenges? Take the opportunity to hear the answers directly from Equilar CEO David Chun, when he provides a company case study at our upcoming Subscription Site Summit this May 8-9 in New York City. There’s limited seating, so sign up today to meet David, other CEOs from subscription content companies, and industry experts who will have you filling notebook after notebook with actionable insights you can use to clean up in your market.