
Good Databases Are More Than Just Good Data

We can look to the UK for a case study of how a government agency, after several tries, couldn’t build a user-friendly data product, creating a giant opportunity for a for-profit data company.

The story begins with a regulatory agency called the Financial Conduct Authority (FCA) that, among other duties, registers and regulates financial advisers and advisory firms. The FCA has a searchable database on its website, but like so many government websites, it is optimized for one purpose: checking the registration status of a known individual or firm. As a tool to assist you in identifying an adviser to help you with your investments, it’s pretty useless.

In recognition of this shortcoming, the FCA called on a quasi-governmental organization called the Money Advice Service (MAS) to help build a better adviser database, and MAS accepted the challenge. I took a look at this website when it first launched, and though I saw some design issues, it had potential.

But even though MAS nominally had the freedom to build a creative database with almost any business model behind it, its need to avoid controversy ultimately resulted in a very limited and timid product. And when, unsurprisingly, there wasn’t a lot of revenue to be had with such a product, MAS buried the database three levels down on its website and moved on to greener pastures.

With two free databases of financial advisers out there, you would think there wouldn’t be much opportunity left for anyone. However, a company called Unbiased saw things differently, and said there was indeed an opportunity … for the right product.

Unbiased has been a big hit in the marketplace, and the way it differentiated itself from the free government services with the same basic listing data holds lessons for us all:

  • Greater visibility – Unbiased wants to be found because its business model depends on driving lots of traffic to its participating advisers
  • Deeper data – ratings, discount offers and detailed profiles
  • Strong user interface – clean, inviting design and both parametric search and a custom matching service         

If you have ever wondered how you could compete against a free, government online database, Unbiased provides the answer: data presentation can be as valuable as the underlying data itself, particularly if you are serving a consumer market. And aggressive promotion of your online database will let you run circles around government agency databases, which are generally hard to find in addition to being hard to use.


The Past is Prologue … to Profit

With the arrival of a new administration in Washington, government websites have been in the news, as many groups have been closely watching them as a way to read the political tea leaves. And the new administration has not been shy about making changes. Within days of the Inauguration, there were reports of substantial changes to the White House website, with whole categories of content suddenly disappearing. Similarly, controversy erupted over removal of content from the Environmental Protection Agency website.

Leaving aside the highly-charged politics driving these actions, there is an important point here for data publishers: online data doesn’t last forever. And that’s a big area of opportunity for publishers.

Company websites are a useful source of current information on companies. Companies generally do a great job of keeping information on their leadership teams, office locations, products and the like current and accurate. But while current data is what most people want, those who really want to understand a company also want to know what came before. Even more importantly, if you have enough history, you can start to see trends. But as I just noted, websites only tell you about the present, and they tell you about the present in a way designed to put the company in the best possible light.

For example, knowing the name of a company’s current CEO has some value. But there is often as much or greater value in knowing the name of the company’s prior CEO – perhaps she is a recruiting target, or perhaps her biography can provide insight into the changing focus and strategy of the company. And if you keep track of all prior CEOs and how long each served, you can, among other things, offer high-value insight into the stability of the company.
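To make the CEO example concrete, here is a minimal sketch of what retaining leadership history might look like. The names, dates and the two helper functions are all hypothetical, and average tenure is only one rough proxy for stability that a publisher might choose to offer.

```python
from datetime import date

# Hypothetical leadership history for one company: (name, start, end).
# An end of None means the person currently holds the role.
ceo_history = [
    ("A. Founder", date(2001, 1, 1), date(2011, 1, 1)),
    ("B. Successor", date(2011, 1, 1), date(2014, 1, 1)),
    ("C. Current", date(2014, 1, 1), None),
]

def average_tenure_years(history, as_of):
    """Average time in role, in years, counting the current CEO up to as_of."""
    total_days = sum(((end or as_of) - start).days for _, start, end in history)
    return total_days / len(history) / 365.25

def former_ceos(history):
    """Names of everyone who has left the role, most recent departure first."""
    past = [(end, name) for name, _, end in history if end is not None]
    return [name for _, name in sorted(past, reverse=True)]
```

A website that lists only the current CEO supplies the last row at best; the earlier rows exist only if someone captured them at the time.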

It’s the same idea with product information. Companies generally announce new products with great fanfare on their websites – usually a press release and often much more. But when new products fail and are discontinued, most companies scrub their websites to remove all traces of these products. There are lots of use cases where knowing what a company is no longer doing is at least as valuable as knowing what it is currently doing. But this kind of information disappears quite quickly online, except in cases where a savvy publisher held onto it.

Perhaps the most intriguing example of preserving online data is The Internet Archive, which takes periodic snapshots of millions of websites. This non-profit project has become a goldmine for researchers, lawyers, investigators, historians, analysts and even savvy salespeople looking to understand how companies have grown and evolved over time.

While it’s easy to conclude that “everything is online now,” the fact is that a lot of information, particularly company information, disappears fairly quickly from the web both by accident and design. Smart publishers are the ones who understand this and set themselves up to capture and preserve this information as a way to enhance the value of their own online data products.

 


Time to Get a New Address?

I’ve long been fascinated by unique identifier systems, because while often hard to implement, they can provide enormous value and constitute a great business opportunity. We’re all familiar with the D&B DUNS system, but there are far more identifier systems in use in vertical markets than you might expect. Don’t, for example, try to publish a book without an ISBN number. Similarly, don’t try to get into the advertising specialties business without an ASI number.

Identifier systems are not just for companies. They exist for people too. Physicians in the U.S. have government-issued unique identifiers. LexisNexis has implemented a similar private sector solution for lawyers called the International Standard Lawyer Number (ISLN). And we’re all of course familiar with Social Security numbers. For geographic locations, think about such identifiers as ZIP codes and their value in identifying specific geographic areas.

The power of unique identifiers is that they serve as a sort of numeric lingua franca. Everyone agrees that a specific company, person or location is identified by a single permanent identifier. This removes ambiguity. It makes all sorts of transactions easier and more efficient. It allows for better and more precise record-keeping. And in this data-centric age, it makes matching of datasets easier and more precise. If everyone can agree on a unique identifier system, all sorts of things happen more easily and smoothly. Needless to say, the operator of the identifier system is in a powerful and lucrative position.
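The dataset-matching point can be illustrated with a short sketch. The two datasets, the "id" field and its values are invented for illustration and do not follow any real identifier scheme; the point is only that a shared key makes the match exact where name matching would be guesswork.

```python
# Two hypothetical datasets that spell the same company differently.
# "id" stands in for any shared identifier; these values are made up.
sales_leads = [
    {"id": "100001", "name": "Acme Corp.", "contact": "J. Smith"},
    {"id": "100002", "name": "Globex", "contact": "M. Jones"},
]
credit_file = [
    {"id": "100001", "name": "ACME Corporation", "rating": "A"},
    {"id": "100003", "name": "Initech LLC", "rating": "B"},
]

def match_on_id(left, right):
    """Merge records that share an identifier, ignoring how names are spelled."""
    right_by_id = {row["id"]: row for row in right}
    merged = []
    for row in left:
        other = right_by_id.get(row["id"])
        if other is not None:
            # Fields from the left record win when both sides have the same key.
            merged.append({**other, **row})
    return merged

matched = match_on_id(sales_leads, credit_file)
```

Here "Acme Corp." and "ACME Corporation" are joined without any fuzzy name matching, which is exactly the ambiguity a shared identifier removes.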

But how ambitious can you get with a non-governmental unique identifier system? After all, if you can’t mandate adoption of your identifiers, you’ve got to build voluntary participation. That’s tough in a narrow, vertical market. Imagine trying to build participation on a broad-based, global basis.

That’s why we were intrigued to run across perhaps the most ambitious attempt at a unique identifier system we have seen. It’s operated by a company called What3Words. Its goal is to assign a unique identifier to every spot on the planet, in 3-meter-square blocks. Further, much like the Internet’s Domain Name System, What3Words assigns each block a three-word name instead of numbers, believing the system will be easier to use with words rather than hard-to-remember random numbers or latitude and longitude coordinates.

You may be saying, “cool, but who needs this?” Well, start with obvious examples of aid agencies trying to serve areas of rural Africa, where no neat systems like ZIP codes exist. Indeed, the founders of What3Words are quick to note that 75% of the population of the earth essentially don’t exist because they have no physical address. Similarly, hikers and travelers will benefit from being able both to find and describe remote areas. And with much talk of delivery by drones in the near future, a uniform global geo-identifier could be very useful. A consistent system also benefits government administration, development of consistent and comparable statistics, and much more. Those of us who regularly deal with international addresses know they are an inconsistent mess, and these are addresses in advanced, developed countries. There are vast swaths of the planet that still lack addressing systems at all.

It’s a big project, but there’s a big need. And hopefully this brief overview inspires some big thinking about the potential of unique identifiers to make all kinds of activities take place more smoothly and efficiently, with some of those productivity savings accruing to the operator of the identifier system.

 

 

 

How Do You Rate?

Morningstar, the financial information giant, today announced that it will be licensing a ratings system from Sustainalytics, a Dutch company that assesses and rates public companies along three dimensions: environmental responsibility, social responsibility and governance. Morningstar will adapt this methodology and apply it to mutual funds.

Why the rush by Morningstar to add still more ratings to its data platform? And why license a ratings system when Morningstar already has demonstrated expertise in this area? Indeed, Morningstar has been rating mutual funds on their stewardship (akin to governance) for a number of years now.

The answer, in a word, is that ratings systems are hot. While they don’t look like much on the surface, they offer users what they most want today: fast answers. You could even go so far as to say that another reason ratings systems are so popular is that they do the research – if not the thinking – for you.

Most importantly of all from a data perspective, a ratings system provides a consistent, normalized and sortable data point. This is especially valuable in the investment world, which is in the business of finding needles in haystacks. Ratings systems and other filters significantly streamline this process.

Imagine if someone asked you to identify the ten best restaurants in Dallas. Without Yelp and Zagat and the other existing restaurant rating services, this would be a nearly impossible task, particularly if you were looking for a comprehensive and objective answer. But these services in effect conduct mass-scale surveys, asking people to condense their opinions of restaurants into a predefined ratings scale. This user-generated approach to ratings has all sorts of imperfections, but most people believe that with enough people participating, the truth will present itself.

A step up from these open surveys are the professionally administered ratings systems. These distinguish themselves by identifying and rating companies against a fixed set of criteria. The goal of the exercise is to be as objective as possible. That’s why data are used in place of opinions whenever possible. The more rigorous the system, the more valuable it tends to be. That’s because in addition to being normalized and consistent, these ratings systems allow you to make dependable comparisons. Companies rated “A,” for example, are all rated that way because they met a certain specified set of criteria. That means you can place more trust in the ratings system.
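As a rough sketch of why a criteria-based rating is both comparable and sortable, consider the toy example below. The three criteria, the grade thresholds and the company records are all hypothetical; real ratings methodologies are far more detailed, and this is not the method of any provider named here.

```python
# Hypothetical criteria and thresholds, for illustration only.
CRITERIA = ("emissions_disclosed", "independent_board", "audited_accounts")

def grade(company):
    """Map the number of criteria met to a letter grade."""
    met = sum(1 for c in CRITERIA if company.get(c, False))
    return {3: "A", 2: "B", 1: "C"}.get(met, "D")

companies = [
    {"name": "Alpha", "emissions_disclosed": True,
     "independent_board": True, "audited_accounts": True},
    {"name": "Beta", "emissions_disclosed": False,
     "independent_board": True, "audited_accounts": False},
    {"name": "Gamma", "emissions_disclosed": True,
     "independent_board": True, "audited_accounts": False},
]

# Because every company is graded the same way, the grade is a sortable field.
ranked = sorted(companies, key=lambda c: (grade(c), c["name"]))
```

Every company with the same grade met the same number of criteria, which is what makes the comparison dependable.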

Interestingly, most ratings systems happily publish their underlying criteria and ratings methodologies. While this might seem to be their highly proprietary “secret sauce,” the reality is that nobody wants to undertake the same laborious ratings work if somebody else has done it, and publicizing the underlying methodology builds credibility and trust. In fact, the underlying methodology of most professional rating systems is central to their marketing efforts.

Rating systems reflect the fundamental shift we are seeing as data publishers move from selling vast piles of raw data to selling high-value, more analytical datasets. The next opportunity is to actually do the analysis for customers.

You can learn more about how publishers are using their data to produce a wide range of high value products at this year's Business Information and Media Summit. Hope to see you there!

Making Music

I’ve been impressed and entranced by the music service Pandora since I first ran across it several online lifetimes ago in 2007.

Two things particularly impressed me about Pandora. First, unlike services such as Spotify that allow you to access music you already know about, Pandora was the first large scale attempt to offer music discovery. Enter the artist or tracks you like most, and Pandora would find more music that was similar. Normally you would expect to learn that Pandora is powered by cutting-edge algorithms.

In fact, Pandora is powered by humans. Music school graduates. Many dozens of them, all methodically classifying individual songs against a master taxonomy of over 400 characteristics. It’s an expensive approach, but it’s organized and returns consistently high quality results. And while Pandora continues to struggle from a profitability standpoint, nobody argues with the quality of its service.

But what if you could create a Pandora-like service without the high labor costs? That’s what a company called 8Tracks set out to do.

Rather than having a paid staff categorize music, 8Tracks went the social media route. Everyone was invited in essence to become a DJ, and upload their own song lists to the 8Tracks site. These playlists were organized via tags, so users could discover music based on mood or musical style, for example. If users like particular playlists, they can follow the people who uploaded them in order to see all their new playlists right away.

8Tracks is unquestionably providing a music discovery service, just like Pandora. But it’s a fundamentally different experience. Pandora is dependable, seamless and efficient. 8Tracks is hit-and-miss, time-consuming and requires lots of user interaction.

There’s room for both services in the vast music market and indeed, both services have many enthusiastic adherents. Yet by looking at both services side-by-side, you can see the strengths and weaknesses of user-generated content very clearly.

Music is entertainment. There’s no risk or consequence if you don’t discover a certain song by a certain artist. But when you move into the realm of business information, that dynamic changes. Suddenly, getting the right answer starts to matter a lot. That’s where user-generated content can come up short. Users generate whatever content they want, whenever they want, for as long as they want. You have little control. User-generated content works best where there is a massive volume of content (think Yelp or TripAdvisor) and the correct answers will win out, or in situations where there is no alternative information source, making your content the best that is available. But when the quality of your content matters, social approaches to content creation can yield decidedly off-key results.