Transforming Data the Humin Way

Imagine launching a start-up that is touted as a pioneering “social operating system,” a key player in the burgeoning area of “contextual computing,” and even a “digital butler.” Let’s go even further, and imagine the burden of having to live up to the goal of “organizing the world” and, most intriguing of all, building “a master contacts database for pretty much the entire world.” Well, if you can in fact imagine living up to expectations like these, you’ll probably want to apply for a job at a company called Humin.

On a more practical level, Humin (at least for now) is an app that grabs your contact list, calendar entries and social networks to build a master list. It then automatically contacts everyone on that list and asks them to confirm their details and provide additional information. Once all these data are confirmed and deduplicated, you get a contact list that can be searched by location, by connections (who knows whom) and in a lot of other ways that go far beyond the typical address book.
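
To make that deduplication step concrete, here is a minimal sketch of merging contact records from multiple sources, keyed on a normalized email or phone number. This is purely illustrative; Humin has not published its matching logic, and the field names are invented:

    import re

    def normalize_phone(phone):
        # Keep digits only so "(513) 555-0100" and "5135550100" match.
        return re.sub(r"\D", "", phone or "")

    def merge_contacts(sources):
        """Fold contact records from many sources into one master list,
        keyed on normalized email (preferred) or phone number."""
        master = {}
        for record in sources:
            key = (record.get("email", "").lower().strip()
                   or normalize_phone(record.get("phone")))
            if not key:
                continue  # nothing reliable to match on
            merged = master.setdefault(key, {})
            for field, value in record.items():
                merged.setdefault(field, value)  # first source wins
        return list(master.values())

    contacts = merge_contacts([
        {"name": "Ann Lee", "email": "ann@example.com", "phone": "(513) 555-0100"},
        {"name": "Ann Lee", "email": "ANN@example.com", "company": "Acme"},
    ])  # one merged record with name, email, phone and company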

To live up to its contextual computing hype, Humin wants to move into push mode. Fly into Cincinnati, for example, and it will present you with a list of your contacts there. Humin will of course get smarter as it begins to find deeper meaning in both the data itself and how you use it. Privacy concerns? Not to worry. Humin hangs onto only the minimum amount of data needed to do its magic – all the most valuable data stays right on your phone.

Those of you who are students of data may see shocking similarities to an earlier service called Plaxo. In its original incarnation, Plaxo grabbed your address book and would periodically query everyone in it via automated emails to confirm that their details were current. Even cooler, if you updated your own information, Plaxo pushed it out to all your contacts automatically. It was the original globally synchronized contact list. Ultimately, Plaxo went astray, jumping on the social media bandwagon in a failed attempt to challenge Facebook.

The lesson of Humin (beyond possible confirmation that all great data publishing ideas are derivative) is that while Humin may be loosely based on the Plaxo concept, it is moving aggressively to surround data with tools. Humin isn’t just organizing and tidying a giant pile of data and then asking the user to find value in it – it is innovating in multiple ways to do that thinking for the user, and to deliver the right data in the right format at the right time to offer maximum value. We at InfoCommerce Group call it “data that does stuff.” Surround good data with good tools, and you, too, can become master of the data publishing universe.

Deriving Data from Images

Google recently acquired a company called Skybox Imaging for $500 million. While admittedly a small deal by the standards of Google, the potential of this acquisition is so mind-boggling it is amazing it has not received greater attention. You see, Skybox makes satellites. And it’s just a year or so away from having six satellites orbiting the earth that will be able to photograph, in high resolution, every spot on Earth, and do it twice a day.

In many respects, the innovation here is speed of refresh, not resolution of the images. There is nothing particularly special about the optics used by Skybox. The excitement comes from the frequency of update. That’s because if you check on things frequently, you can see changes easily. And from those changes you can infer meaning.

Already, we know that satellite photos can be used to calculate the square footage of a building (based on the roof surface), a potential boon to roofing contractors, who can now do estimates from their desks. But you can also start to see angles for data providers as well: the size of a company’s facility can let you infer a lot of valuable things about that company. And Skybox will potentially take this concept to the next level.
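
The arithmetic behind that square-footage trick is straightforward: count the pixels classified as roof and multiply by the ground area each pixel covers. A minimal sketch, with an assumed (not Skybox-specific) one-meter resolution:

    def roof_area_sqft(roof_pixel_count, gsd_meters):
        """Estimate roof area from a satellite image.

        roof_pixel_count: pixels classified as roof for one building
        gsd_meters: ground sample distance, i.e. meters of ground
                    covered by one pixel edge (assumed ~1 m here)
        """
        area_m2 = roof_pixel_count * gsd_meters ** 2
        return area_m2 * 10.7639  # square meters -> square feet

    # A 40 m x 25 m roof at 1 m resolution is ~1,000 roof pixels:
    print(round(roof_area_sqft(1000, 1.0)))  # ~10,764 sq ft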

Overlay satellite photos with map data (something Google routinely does now), and you now know who owns the property you are looking at. Checking the number of cars in the parking lot twice a day could allow you to infer number of employees. Over time, you could infer if the company is growing or shrinking.

One hedge fund (allegedly) now uses satellite photography to check the number of cars in lots at big box retailers to infer sales. It’s suggested that Skybox can assess the quality and yield of crops while still in the ground, as well as the amount of oil being pumped around the world (by analyzing storage tanks). Consider construction data, where new home starts and completion rates could be accurately measured on a daily basis. Consider measuring the truck and rail traffic into manufacturing plants over time to assess financial conditions. Let your mind roam, because that’s what Skybox is all about.
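
Inferring growth or decline from those twice-daily observations is, at bottom, simple trend estimation. A toy sketch, assuming you already have a daily series of parking-lot car counts (the numbers are invented):

    def trend_per_day(daily_counts):
        """Least-squares slope of a daily series: cars gained (or lost)
        per day in the lot. Positive suggests growth, negative decline."""
        n = len(daily_counts)
        xs = range(n)
        mean_x = sum(xs) / n
        mean_y = sum(daily_counts) / n
        cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, daily_counts))
        var = sum((x - mean_x) ** 2 for x in xs)
        return cov / var

    counts = [112, 118, 115, 121, 126, 124, 131]  # one week of observations
    print(f"{trend_per_day(counts):+.1f} cars/day")  # prints +2.9 cars/day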

And lest you think I am alone in this geeky view of things, consider this statement by Skybox co-founder Dan Berkenstock: “We think we are going to fundamentally change humanity's understanding of the economic landscape on a daily basis.”

The key to all this magic is software that is smart enough to interpret photographic images. This is where images get turned into data. And once that data is overlaid on maps, giving it both context and other data such as ownership, you quickly move to actionable data.

I’m focused on commercial applications for Skybox. For those considering the consumer implications, privacy concerns abound. For the moment, it seems, we have to rely on Google not to be evil. And in the interim, there’s still a lot of work to be done to get this infrastructure fully in place and to determine what can be measured as well as what is worth measuring. But as a potential new source of high-value business intelligence in structured form, Skybox is painting a very pretty picture of the future.

Lessons From the Data Brokers

Despite its name and author, the new report entitled “Data Brokers: A Call for Transparency and Accountability” from the Federal Trade Commission makes a fascinating read. First of all, what’s a data broker? I believe this label originated within the government to describe aggregators of information about consumers, excluding credit bureaus. As you might expect, this definition includes an eclectic mix of companies. For example, the nine randomly selected companies asked to provide background information for this report are: Acxiom, Corelogic, Datalogix, eBureau, ID Analytics, Intelius, PeekYou, Rapleaf and Recorded Future. The operating scale of these businesses is impressive: one reported that it maintains over 700 billion aggregated data elements, another holds information on 1.4 billion consumer transactions, and still another adds 3 billion new records monthly to its database. And the money is substantial too, with just these nine companies generating over $427 million annually from the sale of consumer data.

So are there lessons for data publishers in the activities of these large providers of personal data? Absolutely. Here’s what I see:

Let’s start with creativity. There seems little doubt these consumer data companies are a number of years ahead of most B2B companies in extracting maximum value from their data. They have developed market segmentation systems, ratings and scores, and powerful analytical tools. They understand the value of historical data to discern patterns and trends, and have rolled out lucrative new products based on data others might discard. Also, these companies truly understand and apply the concept of inferential data. Someone with a pickup truck and a fishing license can be categorized as an outdoor enthusiast, for example.
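
The pickup-truck-and-fishing-license example translates naturally into rule-based code. A minimal sketch; the attribute and segment names are invented for illustration:

    def infer_segments(consumer):
        """Attach inferred marketing segments to a consumer record
        based on simple attribute rules."""
        attrs = set(consumer.get("attributes", []))
        segments = []
        if {"pickup_truck", "fishing_license"} <= attrs:
            segments.append("outdoor_enthusiast")
        if {"mortgage", "minivan"} <= attrs:
            segments.append("suburban_family")
        return segments

    print(infer_segments({"attributes": ["pickup_truck", "fishing_license"]}))
    # ['outdoor_enthusiast']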

Perhaps most powerfully, these companies are active in helping marketers to bridge the online-offline divide. They are routinely matching online website registration data to the deep offline data they collect, creating much richer audience profiles. These companies will even embed some of their data in tracking cookies for ad targeting purposes. And of course there is intense marketer interest in understanding online-offline buying behavior.
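
That online-offline matching typically comes down to record linkage on a shared key, with hashed email being one common industry choice. A hedged sketch of the idea (the report does not describe any one broker's actual method):

    import hashlib

    def email_key(email):
        # Hashing the normalized email lets two parties match records
        # without exchanging raw addresses.
        return hashlib.sha256(email.strip().lower().encode()).hexdigest()

    def enrich_registrations(online_regs, offline_profiles):
        """Join online registrations to offline profiles on hashed email."""
        offline_by_key = {email_key(p["email"]): p for p in offline_profiles}
        enriched = []
        for reg in online_regs:
            profile = offline_by_key.get(email_key(reg["email"]), {})
            enriched.append({**reg, **profile})  # offline fields fill in
        return enriched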

The segments of the business are interesting as well. Data brokers, in the view of the FTC, break into three types: those who primarily sell marketing data, those who sell risk management data and those who sell people search products (the “background check” and “find anyone online” products that are proliferating these days). Again, we see a progressive industry. It has morphed from the obvious marketing application for its data to risk management. Risk management in the B2C world largely means using data to pre-screen purchases. For example, a risk management product might alert an online vendor that the buyer has had merchandise shipped to an unusually large number of different addresses, a potential indication of fraud. So what’s largely the same dataset has been spun into an entirely new market, and very successfully we might add: for the nine companies in the FTC study, risk management revenues are rapidly approaching marketing revenues. Are there untapped B2B opportunities in risk management? I believe there are.
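
That shipping-address signal converts directly into a screening rule. A minimal sketch; the threshold of five addresses is arbitrary, chosen only for illustration:

    def flag_shipping_risk(orders, max_addresses=5):
        """Flag a buyer whose order history shows merchandise shipped
        to an unusually large number of distinct addresses."""
        addresses = {o["ship_to"] for o in orders}
        return len(addresses) > max_addresses

    orders = [{"ship_to": f"{n} Elm St"} for n in range(8)]
    print(flag_shipping_risk(orders))  # True: 8 distinct ship-to addresses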

We also see that with the people search products, the industry has morphed from selling its data to businesses, to selling its data to consumers. This is a sweet pivot that many data publishers aspire to, because with little more than a different user interface, you’ve got a product of interest to a new and vast market.

Another intriguing insight from the report is that none of the companies surveyed obtained all their data from the original source. All were licensing substantial portions of their data from each other. And that makes sense because it allows these companies to move faster, and not have to develop the large staffs necessary to be expert in so many different data sources. Why build your own capability to pull in home ownership data from the source when you can license it from someone who gathers it as a primary business activity? And these licensing agreements, not surprisingly, have grown complex, with some even containing clauses preventing “reverse engineering” of licensed data elements as a way to keep other data brokers from getting too clever with licensed data.

What this report offers is an inside peek into the operations of some very savvy, large and successful consumer data providers, and there you’ll see the future of the business data industry as well.

The Power of Predictive Prospecting

Out of all data products, the single largest group is what we call "opportunity finders," databases used by customers to identify sales prospects. These databases, many of which originated as print directories, have followed the normal trajectory of data publishing: moving from being a mile wide and an inch deep to adding tremendous amounts of depth. As publishers add more information to each listing (e.g., revenue, number of employees, year founded, line of business), they enable their users to engage in much more sophisticated targeting of sales prospects. In those situations where a company is looking to sell into a very specific market segment and the data exists to isolate those prospects, it's pretty much mission accomplished for the data publisher. For example, if you sell a product that is only of interest to banks with more than ten branch offices, you can probably find a database that will quickly help you identify a manageable list of qualified prospects for your product. But there are an awful lot of situations that aren't so neat and tidy. For example, some companies have huge target markets such as "all companies with revenues under $5 million." Some companies literally target everybody. And an awful lot of companies are seeking highly defined target markets for which data doesn't exist (e.g., all private companies that are considering starting a 401(k) plan).

Until recently, what this meant is that companies were required to slog through a huge number of semi-qualified prospects. Using expensive telesales and field sales teams, they would eventually identify some good prospects, but the work to do so was expensive, slow and not a lot of fun. Could there be a better way?

What we're seeing now are remarkable advances in lead scoring and predictive sales software. The premise is simple: by bringing to bear a lot of information and a lot of smarts about what data points might identify a good prospect, we are getting better at separating strong prospects from weak ones. Some of the companies leading the way in this area are Lattice Engines (a DataContent 2012 presenter), Context Relevant and Infer.

The potential opportunity for data publishers is to move more aggressively into lead scoring for your customers. Imagine allowing your customers (possibly in combination with one of these firms) to enter parameters about their sales targets, then letting them search your data to receive not only the raw information but also a predictive score indicating the quality of each prospect.
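
To make the idea concrete, here is a toy lead-scoring sketch in the spirit of logistic regression. The features and weights are invented, and real offerings from firms like Lattice Engines or Infer are far more sophisticated:

    import math

    # Hypothetical weights, as if learned from historical won/lost deals.
    WEIGHTS = {"employees_log": 0.8, "recent_funding": 1.5, "intercept": -3.0}

    def lead_score(prospect):
        """Return a 0-100 score: estimated likelihood this prospect
        converts, scaled for display alongside the raw listing data."""
        z = (WEIGHTS["intercept"]
             + WEIGHTS["employees_log"] * math.log1p(prospect["employees"])
             + WEIGHTS["recent_funding"] * prospect["recent_funding"])
        return round(100 / (1 + math.exp(-z)))  # logistic squash to 0-100

    print(lead_score({"employees": 250, "recent_funding": 1}))  # ~95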

It's all part of the continued push for data publishers to surround their data with more powerful tools. And is there a more powerful tool you can offer your customers than one that helps pinpoint where their next sales are most likely to come from?

Everything Has Its Price

An excerpt from a new book by former Time Inc. executive Walter Isaacson makes a point that is still not fully appreciated by everyone in the content business:

“At Time Inc., we initially planned to charge a small fee or subscription, but Madison Avenue ad buyers were so enthralled by the new medium that they flocked to our building offering to buy the banner ads we had developed for our sites. Thus we and other journalism enterprises decided that it was best to make our content free and garner as many eyeballs as we could for eager advertisers.”

Isaacson confirms an absolutely critical insight: it’s not that “information wants to be free.” The reality is that many of the largest content companies chose to make information free. And with no history to provide a guide, and a sense of a giant gold rush and land grab underway, other content producers followed suit. Soon enough, pretty much all content on the web was free, and guess what: users decided they liked things that way, so much so that any content producer brave enough to offer paid content experienced derision from other content producers and almost militant pushback from users.

All this led to the sorry state of affairs where advertisers have moved much of their advertising dollars elsewhere, and users have been fully conditioned to expect their content for free. Intriguingly, what saved most data publishers from this fate was the fact they typically had little in the way of advertising revenues. Thus, offering free online content was clearly nothing more than an express lane to bankruptcy, and this gave them the backbone to continue to charge for their content. And they are all better off for it.

Even today, it remains true that you can make more money faster selling advertising than selling subscriptions. And that’s why many media companies, with their executives steeped in advertising sales culture, still can’t get fully comfortable with the notion of paid content. Subscription-based businesses are desirable, durable and diversified in terms of the customer base, but these businesses build slowly. Indeed, almost all the characteristics that make subscription-based businesses attractive as businesses make them unattractive to those who grew up selling advertising. It’s truly a cultural issue.

All this leads me to think that the emergence of the freemium and metered models is critical to the future of many content publishers. More and more websites are sporting “plus” and “pro” versions that offer different and supplemental content on a paid basis. The publisher keeps a portion of its content for free, the better to aid discovery and get the user hooked. And a portion of the audience will pay to get even more of that content.
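
Mechanically, the metered model is little more than counting. A minimal sketch of the gating logic, with an illustrative limit of five free articles per month:

    FREE_ARTICLES_PER_MONTH = 5  # illustrative limit, not any publisher's actual policy

    def can_view(user, views_this_month):
        """Metered access: subscribers always get through; everyone
        else gets a monthly allowance of free articles."""
        if user.get("is_subscriber"):
            return True
        return views_this_month < FREE_ARTICLES_PER_MONTH

    print(can_view({"is_subscriber": False}, 5))  # False: paywall shown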

Just as we trained users to expect all content for free, we now must begin the slow but essential process of training them that going forward only some content will be free. You can also argue that this shift simultaneously weans both users and publishing executives off of free content. There are still plenty of eyeballs to sell while at the same time the publishers begin to diversify their revenue streams.

And for those data publishers that have always charged for their content online, I will say just two words: carry on.
