Looking for a New Product Idea? Just Ask.

(Part One - Continues Next Week)

Where do really good ideas for new data products come from? Not surprisingly, I am asked this question a lot. Perhaps surprisingly, the answer isn’t all that complicated.

The best ideas for new data products almost invariably come from personal need. History shows that the data products that succeed most readily tend to be highly specialized in terms of content and user base – and they were typically surfaced by people who would use such a data product themselves, if someone else produced it. The person who sees the opportunity knows just how useful and valuable the new product would be, that nothing else like it currently exists in the market, and that there are many other people in similar roles in other companies who would benefit from it. Right there, you have all the ingredients for a winning data product, and I have seen dozens of them over the years, in almost every case started by someone with no data publishing experience, but who did have a deep understanding of the need for the data. As just one example, a recent news article tells of a professor who, frustrated by the lack of information on sustainable building product manufacturers, decided to compile his own directory. Despite being published as a print directory, it's already in its second edition – the need was out there for this information.

Why did a professor of architectural technology and building science decide to become a publisher? Likely because he didn't feel he had any options. And that's not surprising. For despite the intense interest of B2B media companies in new data products, not one that I know of tries to reach out to its audience for new product ideas. That's a shame, because in my experience it's mid-level executives buried deep in large organizations who are the best source of these new opportunities. All you have to do is ask.



North Korea Sparks a Trip Down Memory Lane

The latest news from North Korea should make us grateful we are not in the business there. On word that several North Korean phone directories had been smuggled outside of the country, the country's leader, Kim Jong-un, ordered that ALL phone numbers in the country be changed … randomly and without warning!

Here's some nostalgia to put this in perspective. Here in the US, it began with something called the "fax machine." This was a device that scanned documents and then transmitted them via phone lines to a distant location. Faxes were the email of their day, but to get the real-time delivery benefits of faxing, you needed a separate phone line for your fax machine so that it was always available to send and receive. This created a huge jump in demand for new phone lines, and thus, new phone numbers.

As if fax machines weren't enough, we also had the advent of mobile phones, each of which demanded its own phone number. Phone companies ran out of available phone numbers in existing area codes, and began the seemingly endless process of introducing new area codes (73 in just the past ten years), creating endless amounts of new work for data publishers in the process.

Those of you in the trenches for all this fun may also recall that the phone companies initially favored the dreaded area code “splits,” where half the people in an existing area code would be assigned a new area code. After much complaining, particularly from businesses that had to change signage, stationery and more, the phone companies moved to “overlay” area codes, where all new phone number requests in an existing area code simply received numbers with the new area code.

That's another quaint aspect of area codes in the old days – they used to define specific geographies. But with the growth of toll-free numbers, VoIP phones and number portability, your phone number no longer necessarily ties you to any geographic area.

Of course, for all the angst and additional work these changes have caused, at least they were systematic.  And if you are looking for expansion opportunities in 2018, North Korea appears wide open. 



Where the Value Is in Visual Data

The New York Times recently reported on the results of a fascinating project conducted at Stanford University. Using over 50 million images drawn from Google Street View, along with ZIP code data, the researchers were able to associate automobile ownership preferences with voting patterns. For example, the researchers found that the type of vehicles most strongly associated with Republican voting districts are extended-cab pickup trucks.

While this particular finding may not surprise you, the underlying work represents a programmatic tour de force, because artificial intelligence software was used to identify and classify the vehicles found in these 50 million images. The researchers used automotive experts to identify specific makes and models of cars from the images, giving the software a basis for training itself to find and identify vehicles all by itself, regardless of the angle of the photo, shadows and a host of other factors that make this anything but an easy task.

This project is believed to represent the first time that images have been used on a large scale to develop data. And while this image identification is a technically impressive example of both artificial intelligence and Big Data, most of the really useful insights come from associating the findings with other datasets, what I like to refer to as Little Data.

Think about it. The artificial intelligence software is given as input an image, and the ZIP code associated with that image. The software identifies an automobile make and model from the image, and creates an output record with two elements: the ZIP code and a normalized make and model description of the automobile. With this, you can explore auto ownership patterns by geography. But with just a few more steps, you can go a lot further.

You can use “little data” government and private datasets to link ZIP code to voting districts and thus voting patterns. With this information, you can determine that people living in Republican districts prefer extended-cab pickup trucks.

You can also use the ZIP code in the record to link to “little data” Census demographic data summarized at ZIP level. With this, you can correlate car ownership patterns to such things as income, race, education and ethnicity. Indeed, the study found it could predict demographics and voting patterns based on auto ownership.

And you can go further. You can link your normalized automobile make and model data to "little data" datasets of automobile technical specifications, which is how the study determined, for example, that based on miles per gallon, Burlington, Vermont is the greenest city in the United States.
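To make the linking steps above concrete, here is a minimal sketch of how the enrichment might look in code. All of the records, lookup tables and field names below are invented for illustration; the actual Stanford pipeline and datasets are not public in this form.

```python
from collections import defaultdict

# Output of the image-recognition step (hypothetical records):
# one record per vehicle, with just a ZIP code and a normalized make/model.
vehicle_records = [
    {"zip": "05401", "make_model": "Toyota Prius"},
    {"zip": "79936", "make_model": "Ford F-150 SuperCab"},
    {"zip": "05401", "make_model": "Subaru Outback"},
]

# "Little data" lookup tables (all values invented for this sketch).
zip_to_district = {"05401": "District A", "79936": "District B"}
district_votes = {"District A": "Party X", "District B": "Party Y"}
model_mpg = {"Toyota Prius": 52, "Ford F-150 SuperCab": 20, "Subaru Outback": 29}

# Link each Big Data record to voting district, voting pattern and fuel economy.
enriched = []
for rec in vehicle_records:
    district = zip_to_district.get(rec["zip"])
    enriched.append({
        **rec,
        "district": district,
        "party": district_votes.get(district),
        "mpg": model_mpg.get(rec["make_model"]),
    })

# Aggregate: average MPG per ZIP code, the kind of summary a
# "greenest city" ranking could be built on.
mpg_by_zip = defaultdict(list)
for rec in enriched:
    mpg_by_zip[rec["zip"]].append(rec["mpg"])
avg_mpg = {z: sum(vals) / len(vals) for z, vals in mpg_by_zip.items()}
print(avg_mpg)  # average fuel economy by ZIP code
```

The point of the sketch is how little machinery the "Little Data" side requires: each link is just a keyed lookup, yet each one multiplies what the original two-field record can tell you.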

Using artificial intelligence on a Big Data image database to build a normalized text database is impressive. But all the real insights in this study could only be developed by linking Big Data to Little Data to allow for granular analysis.

While Big Data and artificial intelligence are getting all the breathless coverage, we should never forget that Little Data is what’s providing the real value behind the scenes.  


Workflow Elimination

The power of embedding one’s data product into a customer’s workflow is well understood by data publishers. Simply put, once a customer starts depending on your data and associated software functionality, it’s hard to cancel or switch away from you because the customer’s work has become designed around your product. It’s a great place to be, and it’s probably the primary reason that renewal rates for data products can sometimes verge on 100%.

But should workflow embedment be the ultimate objective of data publishers? This may depend on the industry served, because we are starting to see fascinating glimpses of a new type of market disruption that might be called “workflow elimination.”

Here’s a great example of this phenomenon in the insurance industry. A company called Metromile has rolled out an artificial intelligence system called Ava. What Ava does is stunning.

Auto insurers using Ava require their policyholders to attach a device called Metromile Pulse to their cars. As you may know, virtually all cars now have onboard computers that log tremendous amounts of data about the vehicle. In fact, when your local auto mechanic performs a computerized diagnosis of your car, this is where the diagnostic data comes from. Metromile Pulse plugs into this onboard computer. The device does two things for insurance companies. First, it allows them to charge for insurance by the mile, since the onboard computer records miles driven and the device transmits them wirelessly to the insurer. That's pretty cool and innovative. But here's what's mind-blowing: if a policyholder has an auto accident, he or she can file an online claim, and Ava can use the onboard data to confirm the accident, reconstruct it using artificial intelligence software, and automatically authorize payment on the claim if everything checks out, all within a few seconds. The traditional claims payment workflow hasn't just been collapsed, it's effectively been eliminated.

How does a data publisher embed in workflow if there’s no workflow? That’s a problem, but it’s also an opportunity, because data publishers are well positioned to provide the tools to eliminate workflow. If they do this, and do this first, they’ll be even more deeply embedded in the operations of their customers. And doubtless you’re already thinking about all the subsidiary opportunities that would flow out of being in the middle of so much highly granular data on automobile operation.

"Workflow elimination" won't impact every industry quickly, if at all. But it's an example of how important it is to stay ahead of the curve on new technology and to always seek to be the disrupter as opposed to the disruptee.


Sharing in Private

While there are many, many B2C ratings and review sites where consumers rate and otherwise report their experiences with businesses, there are relatively few B2B sites where businesses rate other businesses. There are multiple reasons for this, but prime among them is that while businesses tend to have a strong interest in using this kind of information, they typically don’t want to supply this kind of information. In short, they see competitive advantage in keeping their vendor experiences confidential.

One fascinating example of this in the legal market is a company called Courtroom Insight. Originally founded with the simple and reasonable idea of creating a website where lawyers could rate expert witnesses (experts hired by lawyers to testify in court), the company hit this exact wall: lawyers didn’t want to tell other lawyers about which experts they did and didn’t like.

Rather than close up shop, though, Courtroom Insight pivoted in an interesting way. It discovered that large law firms were very sloppy about keeping records of their own expert witnesses. So Courtroom Insight built a database of expert witnesses from public sources and licensed data. It then went to large law firms and offered them an expert witness management database. Not only could lawyers search for expert witnesses and verify their credentials, they could also flag those experts they had used, along with private notes that could be shared freely within the law firm, but not externally.

This pivot created a nice business for Courtroom Insight, but it wasn't done. Since all of its large law firm clients were sharing the same database while individually flagging the experts they were using, could Courtroom Insight convince them to share that information among themselves? Recently, the company offered this "who's using whom" data to its clients on a voluntary, opt-in basis. And it worked. While not every client opted in, enough did so that Courtroom Insight could make another level of valuable information available.

While this is just my personal prediction, I think Courtroom Insight will ultimately be able to offer the expert witness ratings that it originally set out to provide. How? By using the protected space of its system to let lawyers trade this high-value information with each other. It will probably start small: perhaps lawyers could click a simple "thumbs up/thumbs down" icon next to each expert that could be shared. But I also suspect that if Courtroom Insight can crack the initial resistance to sharing information, the floodgates will open, because lawyers will realize they are communicating only with other lawyers, and because the benefits of "give to get" information exchange are so compelling.

The Courtroom Insight story provides a fine example of the power of what we call the Closed Data Pool in our Business Information Framework. Sometimes data that nobody will share publicly can in fact be shared among a restricted group of participants, with, of course, a trusted, neutral data publisher making it all happen.