Credit Scores: Not Just for Credit Anymore

A credit score, like it or not, is something that exists for all of us. Pioneered by a company called Fair Isaac (now known simply as FICO), the credit score provided powerful advantages to credit granters in two key ways. First, using massive samples of consumer payment data, FICO analysts were able to tease out which characteristics were predictive of an individual’s willingness to repay their debts. With this knowledge, the company built sophisticated algorithms to automatically assess and score consumers. This approach was obviously more efficient than manual credit reviews by humans, but it offered consistency and dependability as well. Second, FICO reduced your credit history to a single number in a fixed range. The higher the number, the better your credit. This innovation made it possible for banks and others to write software to offer instant credit decisions, online credit approvals and more. Moreover, a consistent national scoring system made it easy for banks to both manage and benchmark their credit portfolios, as well as watch for early signs of credit erosion.

There’s little doubt that credit scoring was a brilliant innovation, but is it so specialized it can’t be replicated elsewhere? Well, it appears that creative data types are seeing scoring opportunities everywhere these days.

Consider just one example: computer network security scores. There are several companies (and FICO just acquired one of them) that use a variety of publicly available inputs to score the computer networks of companies to assess their vulnerability to hackers. Is this even possible to do? A lot of smart people in the field say it is, and pretty much everyone agrees the need is so great that even if these scores aren’t perfect, they’re better than nothing.

You may also be asking whether there is a business opportunity here, and indeed there is. Companies buy their own scores to assess how they are doing and to benchmark themselves against their peers. Insurance companies writing policies to cover data hacks and other cybercrimes are desperate for these objective assessments. And increasingly, companies are asking potential vendors to provide their scores, to make sure all their vendors are taking cybersecurity seriously.

While scoring started with credit, it certainly doesn’t end there. Are there scoring opportunities in your own market? Put on your thinking cap and get creative!

eBay Revamps by Adding Structure

eBay, the giant online marketplace/flea market, is reacting to lackluster growth in an interesting way: with a new focus on structured data. The goal, simply put, is to make it easier for users to find merchandise on its site.

Currently, eBay merchants upload free-text descriptions of the products they are offering for sale. This works reasonably well, but as we all know, searching on unstructured text is ultimately a hit-or-miss proposition. And with over one million merchants on eBay doing their own data entry with very few rules and little data validation, you can imagine the number of errors that result, ranging from typos to inconsistent terminology to missing data elements. The consequence is that buyers can’t efficiently and confidently discover all items available for sale, and sellers can’t sell their products because they are not being seen.

It may seem odd that after several decades in business, eBay is just getting around to this. But in fact it hasn’t been standing still. Rather, it’s been investing its resources in perfecting its search software, trying to use algorithms to overcome weaknesses in the descriptive product data. And while eBay has made great strides, this shift to structured data is really an admission that there are limits to free text searching.

Granular, precise search results can’t be better or more accurate than the underlying data. If you want to be able to distinguish between copper and aluminum fasteners in your search results, you need your merchants to specify copper or aluminum, spell the words correctly and consistently, and agree on how to handle exceptions such as copper-plated aluminum. Ideally, you also want your merchants to tag the metal used in the fastener so that you don’t have to hunt for the information in a block of text, with the associated chance of an erroneous result.
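
To make the fastener example concrete, here is a minimal sketch in Python. The listings, the field names and the misspellings are all invented for illustration, and nothing here reflects how eBay actually stores or searches its data. It simply contrasts a naive free-text match with a lookup on a seller-supplied tag.

```python
# Hypothetical listings: every field name and value here is invented.
listings = [
    {"id": 1, "text": "Copper fasteners, box of 100", "metal": "copper"},
    {"id": 2, "text": "Aluminium fastners - not copper!", "metal": "aluminum"},
    {"id": 3, "text": "Coper fasteners, bulk lot", "metal": "copper"},
]


def free_text_search(items, term):
    """Naive substring match on the seller's free-text description."""
    term = term.lower()
    return [item["id"] for item in items if term in item["text"].lower()]


def structured_search(items, metal):
    """Exact match on a validated, seller-tagged field."""
    return [item["id"] for item in items if item["metal"] == metal]


# The free-text search picks up listing 2 (which only mentions copper)
# and misses listing 3 (where copper is misspelled).
free_text_hits = free_text_search(listings, "copper")

# The tagged field returns exactly the copper listings.
structured_hits = structured_search(listings, "copper")

print(free_text_hits, structured_hits)
```

Smarter search software could catch the typo, but it would still have to guess what the seller of listing 2 meant. The tag removes the guesswork.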

While we’ve come to believe there are no limits to full-text search wizardry, remember the best software in the world breaks down when the data is wrong or doesn’t exist. Google spent many years and millions of dollars trying to build online company directories, before finally admitting that even it couldn’t overcome missing and incorrect data.

Databases and data products are all about structure. Cleaning up and organizing data is slow, expensive and not a lot of fun, but it is a huge value-add. Indeed, one of the biggest complaints of those working in the Big Data arena is that the data they want to analyze is simply too inconsistent and undependable to use.

These days, anyone can aggregate giant pots of data. But increasingly, value is being created by making these pots of data more accessible by adding more structure. This is the essence of data publishing, and something successful data publishers fully appreciate and never forget.  

Time to Get a New Address?

I’ve long been fascinated by unique identifier systems, because while often hard to implement, they can provide enormous value and constitute a great business opportunity. We’re all familiar with the D&B DUNS system, but there are far more identifier systems in use in vertical markets than you might expect. Don’t, for example, try to publish a book without an ISBN number. Similarly, don’t try to get into the advertising specialties business without an ASI number.

Identifier systems are not just for companies. They exist for people too. Physicians in the U.S. have government-issued unique identifiers. LexisNexis has implemented a similar private sector solution for lawyers called the International Standard Lawyer Number (ISLN). And we’re all of course familiar with Social Security numbers. For geographic locations, think about such identifiers as ZIP codes and their value in identifying specific geographic areas.

The power of unique identifiers is that they serve as a sort of numeric lingua franca. Everyone agrees that a specific company, person or location is identified by a single permanent identifier. This removes ambiguity. It makes all sorts of transactions easier and more efficient. It allows for better and more precise record-keeping. And in this data-centric age, it makes matching of datasets easier and more precise. If everyone can agree on a unique identifier system, all sorts of things happen more easily and smoothly. Needless to say, the operator of the identifier system is in a powerful and lucrative position.
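
The dataset-matching point is easy to illustrate. The sketch below, in Python, uses two made-up datasets that describe the same two companies; the company names, the `company_id` field and its values are hypothetical and do not correspond to any real identifier system.

```python
# Hypothetical records: names, identifiers and values are all invented.
sales = [
    {"company_id": "100001", "name": "Acme Corp.", "revenue": 1200},
    {"company_id": "100002", "name": "Globex Inc", "revenue": 800},
]
ratings = [
    {"company_id": "100001", "name": "ACME Corporation", "rating": "A"},
    {"company_id": "100002", "name": "Globex, Incorporated", "rating": "B"},
]


def match_on(left, right, key):
    """Pair records from two datasets that share the same value for `key`."""
    index = {record[key]: record for record in right}
    return [
        (record[key], index[record[key]]["rating"])
        for record in left
        if record[key] in index
    ]


# Matching on names fails because each source spells them differently.
name_matches = match_on(sales, ratings, "name")

# Matching on the shared identifier pairs every record.
id_matches = match_on(sales, ratings, "company_id")

print(name_matches, id_matches)
```

Without the identifier, you are left with fuzzy name matching and all of its false hits and misses. With it, the match is a simple lookup.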

But how ambitious can you get with a non-governmental unique identifier system? After all, if you can’t mandate adoption of your identifiers, you’ve got to build voluntary participation. That’s tough in a narrow, vertical market. Imagine trying to build participation on a broad-based, global basis.

That’s why we were intrigued to run across perhaps the most ambitious attempt at a unique identifier system we have seen. It’s operated by a company called What3Words. Its goal is to assign a unique identifier to every inch of the planet, in 3-meter-square blocks. Further, much like the Internet’s Domain Name System, What3Words assigns each block a three-word name instead of numbers, believing the system will be easier to use with words than with hard-to-remember random numbers or latitude and longitude coordinates.

You may be saying, “cool, but who needs this?” Well, start with the obvious example of aid agencies trying to serve areas of rural Africa, where no neat systems like ZIP codes exist. Indeed, the founders of What3Words are quick to note that 75% of the population of the earth essentially doesn’t exist because it has no physical address. Similarly, hikers and travelers will benefit from being able both to find and describe remote areas. And with much talk of delivery by drones in the near future, a uniform global geo-identifier could be very useful. A consistent system also benefits government administration, development of consistent and comparable statistics, and much more. Those of us who regularly deal with international addresses know they are an inconsistent mess, and these are addresses in advanced, developed countries. There are vast swaths of the planet that still lack addressing systems at all.

It’s a big project, but there’s a big need. And hopefully this brief overview inspires some big thinking about the potential of unique identifiers to make all kinds of activities take place more smoothly and efficiently, with some of those productivity savings accruing to the operator of the identifier system.

Proposed Bill Puts the OPEN in Government Data

Should federal government data be open to the public? Perhaps a better way to frame the question is whether or not the federal government should make public data publicly available. Because databases compiled by the government are, with few exceptions, already open to the public, if you can track them down in the first place. And this problem with discovering government datasets has long been the rub.

The federal government collects data for many reasons, but generally data gathering is for regulatory, compliance or statistical reasons. When this data gathering relates to business entities, there’s usually a business opportunity to be found. That’s because government agencies usually collect data for one specific purpose only. For example, the Federal Aviation Administration maintains a database of all airplanes that are licensed for operation in the United States. It collects a lot of data about both the plane and its owner, but its overall objective is simply to keep a record of whether or not a given plane is licensed to operate. Even if it puts this database online for public access, your ability to search the database is limited to looking up specific airplanes by tail number or owner. This is the compliance focus of government manifesting itself. But that’s great news for commercial data publishers who can get the underlying database and add tremendous value simply by making the data parametrically searchable. Online government databases are almost always designed to help the user find information on a single, known entity. Parametric search creates a powerful sales prospecting tool. Suddenly, the database can be searched by make and model and age of the plane, with the ability to limit search results to specific geographies.
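
To show the difference between a single-entity lookup and a parametric search, here is a minimal sketch in Python. The records, tail numbers and field names are all invented and are not taken from the actual FAA database; a real product would run against a proper database rather than a list in memory.

```python
from dataclasses import dataclass


@dataclass
class Aircraft:
    # Hypothetical fields; the real registry layout may differ.
    tail_number: str
    owner: str
    make: str
    model: str
    year_built: int
    state: str


# Invented sample records for illustration only.
REGISTRY = [
    Aircraft("N100AA", "Owner A", "Cessna", "172", 1998, "TX"),
    Aircraft("N200BB", "Owner B", "Cessna", "182", 2015, "TX"),
    Aircraft("N300CC", "Owner C", "Piper", "PA-28", 1985, "OK"),
    Aircraft("N400DD", "Owner D", "Cessna", "172", 1979, "OK"),
]


def lookup_by_tail_number(registry, tail_number):
    """Compliance-style lookup: find one known aircraft, or None."""
    for plane in registry:
        if plane.tail_number == tail_number:
            return plane
    return None


def parametric_search(registry, *, make=None, model=None,
                      min_age=None, state=None, as_of_year=2020):
    """Prospecting-style search: filter on any combination of attributes."""
    results = []
    for plane in registry:
        if make is not None and plane.make != make:
            continue
        if model is not None and plane.model != model:
            continue
        if state is not None and plane.state != state:
            continue
        if min_age is not None and as_of_year - plane.year_built < min_age:
            continue
        results.append(plane)
    return results


# A sales prospecting query: Cessnas at least 20 years old, then only in Oklahoma.
older_cessnas = parametric_search(REGISTRY, make="Cessna", min_age=20)
oklahoma_only = parametric_search(REGISTRY, make="Cessna", min_age=20, state="OK")

print([p.tail_number for p in older_cessnas])
print([p.tail_number for p in oklahoma_only])
```

The data is identical in both functions. The added value comes entirely from letting the user ask a different kind of question.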

Needless to say, federal government databases can offer huge business opportunities because the government has done all the compilation work, at its own expense, and even keeps the database updated for you. But again, the challenge is finding and accessing these databases in the first place. Government agencies have no incentive to merchandise their internal databases, and many continue to resist opening their datasets to the public, usually out of bureaucratic fear or inertia.

Yes, there is data.gov, a much-heralded federal government initiative to not only move more data online, but to put it all in a central place. But the datasets of interest to commercial data publishers will rarely be found there. However, if you’re interested in data on migratory butterflies in Oklahoma, data.gov is a great place to go.

That’s why I am excited by the OPEN Government Data Act (OPEN Data Act, S. 2852, H.R. 5051), which would mandate that all federal government agencies make all of their datasets immediately available for public use, subject only to a handful of exceptions. This is a bill worth watching and supporting. Fortunes have already been made by commercial data publishers with the savvy and persistence to navigate the federal labyrinth. The OPEN Data Act would level the playing field and open even more opportunities to leverage government data for commercial applications. What’s not to like?

Disruption without Destruction

In 2013, I wrote about a fascinating new app called Vivino that used image recognition technology in place of the traditional database search interface. Snap a picture of a wine label using the app, and Vivino would search its database to return information on the wine, including ratings and price.

Lest you think this was a specialized, one-off application of image recognition technology, we now learn that Vivino has licensed its technology to a new app company called Magnus that wants to apply the same concept to the world of art. Step up to any painting or other piece of flat artwork (it reports over 8 million pieces of art in its database already), snap a picture, and the app will match it to a database record that will tell you the artist, the year it was created, the medium, and most significantly, the price most recently commanded at auction or the price being asked by the gallery where the art is currently being offered for sale.

Content comes from auction data results. To crack the gallery market, Magnus turns to crowdsourcing, but with a demanding quality control process behind the scenes.

The app is currently free, and this has a double benefit for Magnus. First, it builds the size of its audience, some of whom will start to supply price data as well. Second, if Magnus gets to a critical mass of users, art galleries will feel compelled to supply price data to stay competitive, and that would really change the art market, which remains inordinately fond of supplying prices only “on request.”

And that’s truly what is most fascinating about Magnus: it is technically a disruptive data play in the art market, yet it’s not meant to displace galleries. The simple objective of Magnus is to get galleries to be more open about their pricing in the belief that this will make buying art less intimidating to the average consumer and grow art sales overall. There’s no evidence that Magnus is anything but sincere in its desire to help change gallery practices for the good of the galleries.

To date, disrupting a market has typically meant re-ordering an existing market to make a place for the disruptor, typically at the expense of some or all existing players in that market. Here, Magnus is simply trying to disrupt a single, hidebound industry practice for the greater benefit of the industry. Magnus creates a place for itself, but at nobody’s expense. This notion of “additive disruption” is intriguing, and worth further discussion. If there are opportunities to re-arrange or re-invigorate existing markets rather than blowing them up, the number of potential opportunities out there increases dramatically – a pretty picture indeed!