Transforming Data the Humin Way
Imagine launching a start-up that is touted as a pioneering “social operating system,” a key player in the burgeoning area of “contextual computing” and even a “digital butler.” Let’s go even further, and imagine the burden of having to live up to the goal of “organizing the world” and, most intriguing of all, building “a master contacts database for pretty much the entire world.” Well, if you can in fact imagine living up to expectations like this, you’ll probably want to apply for a job at a company called Humin.
On a more practical level, Humin (at least for now) is an app that grabs your contact list, calendar entries and social networks to build a master list. It then automatically contacts everyone on the list and asks them to confirm their details and provide additional information. Once all these data are confirmed and de-duplicated, you get a contact list that can be searched by location, by connections (who knows whom) and in a lot of other ways that go far beyond the typical address book.
To live up to its contextual computing hype, Humin wants to move into push mode. Fly into Cincinnati, for example, and it will present you with a list of your contacts there. Humin will of course get smarter as it begins to find deeper meaning in both the data itself and how you use it. Privacy concerns? Not to worry. Humin hangs onto only the minimum amount of data needed to do its magic – all the most valuable data stays right on your phone.
Those of you who are students of data may see shocking similarities to an earlier service called Plaxo. In its original incarnation, Plaxo grabbed your address book and would periodically query everyone in it via automated emails to confirm that their details were current. Even cooler, if you updated your own information, Plaxo pushed it out to all your contacts automatically. It was the original globally synchronized contact list. Ultimately, Plaxo went astray, jumping on the social media bandwagon in a failed attempt to challenge Facebook.
The lesson of Humin (beyond possible confirmation that all great data publishing ideas are derivative) is that while Humin may be loosely based on the Plaxo concept, it is moving aggressively to surround data with tools. Humin isn’t just organizing and tidying a giant pile of data and then asking the user to find value in it – it is innovating in multiple ways to do that thinking for the user, and to deliver the right data in the right format at the right time to offer maximum value. We at InfoCommerce Group call it “data that does stuff.” Surround good data with good tools, and you, too, can become master of the data publishing universe.
Deriving Data from Images
Google recently acquired a company called Skybox Imaging for $500 million. While admittedly a small deal by the standards of Google, the potential of this acquisition is so mind-boggling it is amazing it has not received greater attention. You see, Skybox makes satellites. And it’s just a year or so away from having six satellites orbiting the Earth that will be able to photograph, in high resolution, every spot on Earth and do it twice a day.
In many respects, the innovation here is speed of refresh, not resolution of the images. There is nothing particularly special about the optics used by Skybox. The excitement comes from the frequency of update. That’s because if you check on things frequently, you can see changes easily. And from those changes you can infer meaning.
Already, we know that satellite photos can be used to calculate the square footage of a building (based on the roof surface), a potential boon to roofing contractors, who can now do estimates from their desks. But you can start to see angles for data providers as well: the size of a company’s facility can let you infer a lot of valuable things about that company. And Skybox will potentially take this concept to the next level.
Overlay satellite photos with map data (something Google routinely does now), and you now know who owns the property you are looking at. Checking the number of cars in the parking lot twice a day could allow you to infer number of employees. Over time, you could infer if the company is growing or shrinking.
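The inference described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the cars_per_employee calibration factor and the sample counts are all hypothetical, and nothing here reflects how Skybox or Google actually process imagery. It takes a series of twice-daily car counts for one parking lot, estimates headcount from the busiest observation, and fits a simple least-squares trend line to suggest whether the company is growing or shrinking.

```python
from statistics import mean


def estimate_headcount(car_counts, cars_per_employee=0.8):
    """Estimate employees from the busiest observed parking-lot count.

    cars_per_employee is a hypothetical calibration factor (an assumption,
    not a figure from the article): the share of staff who drive to work.
    """
    if not car_counts:
        raise ValueError("need at least one observation")
    return round(max(car_counts) / cars_per_employee)


def trend_per_observation(car_counts):
    """Least-squares slope of the counts: cars gained or lost per observation."""
    n = len(car_counts)
    if n < 2:
        return 0.0
    x_mean = (n - 1) / 2
    y_mean = mean(car_counts)
    num = sum((i - x_mean) * (y - y_mean) for i, y in enumerate(car_counts))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den


# Made-up counts for illustration only.
counts = [80, 100, 90]
headcount = estimate_headcount(counts)
slope = trend_per_observation(counts)
```

A positive slope sustained over months would hint at growth and a negative one at contraction, though a real product would need to account for weekends, holidays, weather and shared lots.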
One hedge fund (allegedly) now uses satellite photography to check the number of cars in lots at big box retailers to infer sales. It’s suggested that Skybox can assess the quality and yield of crops while still in the ground, as well as the amount of oil being pumped around the world (by analyzing storage tanks). Consider construction data, where new home starts and completion rates could be accurately measured on a daily basis. Consider measuring the truck and rail traffic into manufacturing plants over time to assess financial conditions. Let your mind roam, because that’s what Skybox is all about.
And lest you think I am alone in this geeky view of things, consider this statement by Skybox co-founder Dan Berkenstock: “We think we are going to fundamentally change humanity's understanding of the economic landscape on a daily basis.”
The key to all this magic is software that is smart enough to interpret photographic images. This is where images get turned into data. And once that data is overlaid on maps, giving it both context and other data such as ownership, you quickly move to actionable data.
I’m focused on commercial applications for Skybox. For those considering the consumer implications, privacy concerns abound. For the moment, it seems, we have to rely on Google not to be evil. And in the interim, there’s still a lot of work to be done to get this infrastructure fully in place and to determine what can be measured as well as what is worth measuring. But as a potential new source of high-value business intelligence in structured form, Skybox is painting a very pretty picture of the future.
Buying Guides That Do Stuff
It’s been very interesting to watch the transition of buying guides from print to online. Print buying guides were a pretty good business, although in fact few of them were very good products. That’s because most buying guides were what I call shallow information products: they would typically list a product and the names and addresses of companies that (hopefully) made or sold the product. After that, users were on their own. This stripped-down format was in part practical, because even this limited information was hard to obtain. It was in part by design, because it encouraged companies to buy advertising next to their listings to provide additional information.
There’s no room on the web for shallow information products anymore. Search engines have gotten good enough that you can find at least a few manufacturers or sellers of just about anything with very little effort. And company websites now typically contain a wealth of product information, in part because it is so cheap and efficient to do so. Overall, this leaves little room for buying guides to add value, at least in their traditional format.
So is the buying guide model dead? If you are talking about the traditional shallow information model, the answer is yes (something that the big yellow page publishers, incredibly, have still not figured out). But what is emerging in its place are a number of exciting new products that mix and match such features as:
- User ratings and reviews (and some now validate users and even confirm that they have purchased the product they are reviewing)
- Links to third-party professional reviews
- Downloadable CAD drawings
- Photo portfolios showing product applications and/or the product in use
- Strong parametric search
- Side-by-side comparison of selected products
- Guided search where, instead of running traditional searches, users answer a questionnaire
- Shared online areas where users can post products for review by co-workers
- Ability to request product samples from the manufacturer
- Integrated ordering capabilities
- Warehousing and shipping of product on behalf of manufacturers
- Product specification data, warranty data, installation instructions, manuals
- Real-time inventory information
- Real-time pricing information
In short, the list is long. And what results is a true destination purchasing research site and, increasingly, a central marketplace. Find exactly what you need and order it. That’s been the holy grail of buying guides for decades, and it’s finally becoming a reality.
The other piece of the puzzle is advertising. Because publishers are now building these true destination sites, they can also develop substantial traffic simply because they are offering utility and value. And advertisers respect these highly qualified and often quite large audiences because they are truly “in the market,” and what advertiser doesn’t want visibility when the buying decision is being made? It is, as we like to say, “data that does stuff.”
So while the approach is different, what we see with buying guides is exactly the same as what we see with other forms of data, and exemplifies infocommerce: creating a high value proposition with better, deeper data and tools to act on it.
Lessons From the Data Brokers
Despite its name and author, the new report entitled “Data Brokers: A Call for Transparency and Accountability” from the Federal Trade Commission makes a fascinating read. First of all, what’s a data broker? I believe this label originated within the government to describe aggregators of information about consumers, excluding credit bureaus. As you might expect, this definition includes an eclectic mix of companies. For example, the nine randomly selected companies asked to provide background information for this report are: Acxiom, Corelogic, Datalogix, eBureau, ID Analytics, Intelius, PeekYou, Rapleaf and Recorded Future. The operating scale of these businesses is impressive: one reported that it maintains over 700 billion aggregated data elements, another holds information on 1.4 billion consumer transactions, and still another adds 3 billion new records monthly to its database. And the money is substantial too, with just these nine companies generating over $427 million annually from the sale of consumer data.
So are there lessons for data publishers in the activities of these large providers of personal data? Absolutely. Here’s what I see:
Let’s start with creativity. There seems little doubt these consumer data companies are a number of years ahead of most B2B companies in extracting maximum value from their data. They have developed market segmentation systems, ratings and scores, and powerful analytical tools. They understand the value of historical data to discern patterns and trends, and have rolled out lucrative new products based on data others might discard. Also, these companies truly understand and apply the concept of inferential data. Someone with a pickup truck and a fishing license can be categorized as an outdoor enthusiast, for example.
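To make the idea of inferential data concrete, here is a minimal sketch of rule-based segment inference. The rule table and function name are hypothetical, and the data brokers in the report almost certainly use far richer models; the only rule taken from the text is the pickup truck plus fishing license example.

```python
# Hypothetical rule table: a segment is inferred when every attribute
# in its required set has been observed for the consumer.
SEGMENT_RULES = {
    "outdoor enthusiast": {"pickup truck", "fishing license"},
}


def infer_segments(attributes, rules=SEGMENT_RULES):
    """Return the segments whose required attributes are all present."""
    observed = set(attributes)
    return sorted(
        name for name, required in rules.items() if required <= observed
    )


# A consumer record with made-up attributes.
segments = infer_segments(["pickup truck", "fishing license", "sedan"])
```

The point of the sketch is that the segment label is never collected directly; it is derived from facts that were collected for other reasons.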
Perhaps most powerfully, these companies are active in helping marketers to bridge the online-offline divide. They are routinely matching online website registration data to the deep offline data they collect, creating much richer audience profiles. These companies will even embed some of their data in tracking cookies for ad targeting purposes. And of course there is intense marketer interest in understanding online-offline buying behavior.
The segments of the business are interesting as well. Data brokers, in the view of the FTC, break into three types: those who primarily sell marketing data, those who sell risk management data and those who sell people search products (the “background check” and “find anyone online” products that are proliferating these days). Again, we see a progressive industry. It has morphed from the obvious marketing application for its data to risk management. Risk management in the B2C world largely means using data to pre-screen purchases. For example, a risk management product might alert an online vendor that the buyer has had merchandise shipped to an unusually large number of different addresses, a potential indication of fraud. So what’s largely the same dataset has been spun into an entirely new market, and very successfully we might add: for the nine companies in the FTC study, risk management revenues are rapidly approaching marketing revenues. Are there untapped B2B opportunities in risk management? I believe there are.
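The shipping-address example above amounts to a simple screening rule, sketched below. The function name, the input shape and the threshold of five addresses are assumptions chosen for illustration; the report does not describe how any vendor actually implements this check.

```python
from collections import defaultdict


def flag_risky_buyers(orders, max_addresses=5):
    """Flag buyers who have shipped to an unusually large number of addresses.

    orders: iterable of (buyer_id, shipping_address) pairs.
    max_addresses: hypothetical threshold, not a figure from the FTC report.
    """
    addresses = defaultdict(set)
    for buyer_id, address in orders:
        # Light normalization so trivially different spellings count once.
        addresses[buyer_id].add(address.strip().lower())
    return sorted(
        buyer for buyer, seen in addresses.items() if len(seen) > max_addresses
    )


# Made-up order history: buyer "a" ships to six addresses, "b" to one.
orders = [("a", f"{i} Main St") for i in range(6)]
orders += [("b", "1 Elm St"), ("b", "1 elm st ")]
flagged = flag_risky_buyers(orders)
```

A flag like this is an indication rather than proof of fraud, which is why such products typically alert the vendor instead of blocking the purchase outright.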
We also see that with the people search products, the industry has morphed from selling its data to businesses, to selling its data to consumers. This is a sweet pivot that many data publishers aspire to, because with little more than a different user interface, you’ve got a product of interest to a new and vast market.
Another intriguing insight from the report is that none of the companies surveyed obtained all their data from the original source. All were licensing substantial portions of their data from each other. And that makes sense because it allows these companies to move faster, and not have to develop the large staffs necessary to be expert in so many different data sources. Why build your own capability to pull in home ownership data from the source when you can license it from someone who gathers it as a primary business activity? And these licensing agreements, not surprisingly, have grown complex, with some even containing clauses preventing “reverse engineering” of licensed data elements as a way to keep other data brokers from getting too clever with licensed data.
What this report offers is an inside peek into the operations of some very savvy, large and successful consumer data providers, and there you’ll see the future of the business data industry as well.
The Power of Predictive Prospecting
Out of all data products, the single largest group is what we call "opportunity finders," databases used by customers to identify sales prospects. These databases, many of which originated as print directories, have followed the normal trajectory of data publishing: moving from being a mile wide and an inch deep to adding tremendous amounts of depth. As publishers add more information to each listing (e.g., revenue, number of employees, year founded, line of business) they enable their users to engage in much more sophisticated targeting of sales prospects. In those situations where a company is looking to sell into a very specific market segment and the data exists to isolate those prospects, it's pretty much mission accomplished for the data publisher. For example, if you sell a product that is only of interest to banks with more than ten branch offices, you can probably find a database that will quickly help you to identify a manageable list of qualified prospects for your product.
But there are an awful lot of situations that aren't so neat and tidy. For example, some companies have huge target markets such as "all companies with revenues under $5 million." Some companies literally target everybody. And an awful lot of companies are seeking highly defined target markets for which data doesn't exist (e.g., all private companies that are considering starting a 401(k) plan).
Until recently, what this meant is that companies were required to slog through a huge number of semi-qualified prospects. Using expensive telesales and field sales teams, they would eventually identify some good prospects, but the work to do so was expensive, slow and not a lot of fun. Could there be a better way?
What we're seeing now are remarkable advances in lead scoring and predictive sales software. The premise is simple: by bringing to bear a lot of information and a lot of smarts about what data points might identify a good prospect, we are getting better at separating strong prospects from weak prospects. Some of the companies leading the way in this area are Lattice Engines (a DataContent 2012 presenter), Context Relevant and Infer.
The potential opportunity for data publishers is to move more aggressively into lead scoring for their customers. Imagine (possibly in combination with one of these firms) allowing your customers to enter parameters about their sales targets, then letting them search your data to receive not only the raw information but also a predictive score indicating the quality of each prospect.
It's all part of the continued push for data publishers to surround their data with more powerful tools. And is there a tool more powerful that you can offer your customers than one that can help pinpoint where their next sales are most likely to come from?
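As a rough sketch of what such a feature might look like, the Python below scores each listing against customer-supplied criteria and ranks the results. It is a plain weighted-rule score, not the statistical or machine-learning models that vendors such as those named above would build, and every name, field and weight in it is hypothetical.

```python
def score_prospect(prospect, criteria):
    """Return a 0-100 score: the weighted share of criteria the prospect meets.

    criteria: list of (field, predicate, weight) tuples supplied by the
    customer. Fields, predicates and weights here are all illustrative.
    """
    total = sum(weight for _, _, weight in criteria)
    if total <= 0:
        raise ValueError("criteria weights must sum to a positive number")
    earned = sum(
        weight
        for field, predicate, weight in criteria
        if field in prospect and predicate(prospect[field])
    )
    return round(100 * earned / total)


def rank_prospects(prospects, criteria):
    """Return (score, prospect) pairs sorted from strongest to weakest."""
    scored = [(score_prospect(p, criteria), p) for p in prospects]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Made-up criteria: many branches matters three times as much as low revenue.
criteria = [
    ("branches", lambda n: n > 10, 3),
    ("revenue", lambda r: r < 5_000_000, 1),
]
listings = [
    {"name": "Bank A", "branches": 2, "revenue": 1_000_000},
    {"name": "Bank B", "branches": 12, "revenue": 9_000_000},
]
ranked = rank_prospects(listings, criteria)
```

A true predictive score would learn its weights from past sales outcomes rather than taking them from the user, but even a simple ranking like this moves a listing from raw information toward a prioritized call list.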