Smarter Data is Right Inside the Box
Most of us are at least somewhat familiar with the concept of the “sales trigger,” something I lump into a larger category I call “inferential data.” If you’re not familiar with the concept, what we are talking about is taking a fact, for example that a company has just moved, and drawing inferences from that fact. We can infer from a recent company move that the company in question is likely to imminently be in the market for a host of new vendors for a whole range of mundane but important office requirements. So if we learn about this company move right after it happens (or, ideally, right before it happens), we have an event that will trigger a number of sales opportunities, hence the name “sales trigger.” But as I noted above, sales triggers in my view are a subset of inferential data. I say that because sales triggers tend to be rather basic and obvious, while true inferential data can get extremely nuanced and powerful, especially when you start analyzing multiple facts and drawing conclusions from them. Tech-savvy folks refer to these multiple input streams as “signals.”
Let’s go back to our example above. The company has moved. That means they likely need a new coffee service and cleaning service, among others. That’s fine as far as it goes. But let’s go deeper. Let’s take the company’s old address and new address, and bounce them against a commercial property database. If the company is moving from $20/square foot space to $50/square foot space, chances are this company is doing well. At a minimum, this makes for a more interesting prospect for coffee service vendors. But it can also be the basis for assigning a company a “high growth” flag, making it interesting to a much broader range of vendors, many of whom will pay a premium to learn about such companies.
Or perhaps we know this company has changed addresses three times in five years. We could infer from this either extremely high growth or extreme financial distress. Since the relocation signal alone doesn’t give us enough clarity, we need to marry it with other signals, such as the number of employees over the same period, the cost of the space, or the amount of square footage leased. Of course, signals go far beyond real estate. If the company had a new product launch or acquisition during that period, these signals would suggest the address changes signify rapid growth.
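To make this concrete, here’s a minimal sketch of how a few of these signals might be rolled up into a “high growth” flag. The field names, thresholds and weights are my own illustrative assumptions, not anyone’s actual scoring model.

```python
# Illustrative only: field names, thresholds and weights are assumptions, not a real model.

def growth_flag(old_rent_psf, new_rent_psf, moves_in_5_years,
                employee_change_pct, had_launch_or_acquisition):
    """Combine a handful of signals into a rough 'high growth' verdict."""
    score = 0

    # A move to markedly more expensive space suggests the company is doing well.
    if new_rent_psf >= 2 * old_rent_psf:
        score += 2
    elif new_rent_psf > old_rent_psf:
        score += 1

    # Frequent relocation is ambiguous on its own: growth or distress.
    if moves_in_5_years >= 3:
        score += 1 if employee_change_pct > 0 else -1

    # Headcount growth and product/M&A activity tip the balance toward growth.
    if employee_change_pct >= 25:
        score += 2
    if had_launch_or_acquisition:
        score += 1

    return "high growth" if score >= 3 else "needs more signals"


# Example: $20/sq ft to $50/sq ft, three moves, headcount up 40%, recent launch.
print(growth_flag(20, 50, 3, 40, True))   # -> "high growth"
```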
You can see the potential power in inferential data, as well as the complexity. That’s because in the business of signals, the more the better. Pretty soon, you’re in the world of Big Data, and you’ll also need the analytical horsepower to make sense of all these data signals, and to test your assumptions. It’s not a small job to get it right.
That’s why I was excited to learn about a company called – what else – Infer. Infer collects and interprets signals to help score sales leads, and it sells this service to anyone who wants to integrate it with their existing applications. It’s essentially SaaS for lead scoring. Intriguingly, Infer licenses data from numerous data providers to get the critical signals it needs.
Inferential data makes any data it is added to smarter, which in turn makes that data more valuable. Many publishers have latent inferential data they can put to use; for everyone else, keep an eye out for “signals in a box” products from what I suspect will be a growing number of vendors in this space. It’s the smart thing to do.
Source Data’s True Worth
In my discussion of the Internet of Things (IoT) a few weeks back, I mentioned that there was a big push underway to put sensors in farm fields to collect and monitor soil conditions as a way to optimize fertilizer application, planting dates and the like. But who owns this information, which everyone in agriculture believes to be exceedingly valuable? Apparently, that is far from decided. An association of farmers, the Farm Bureau, recently testified before Congress that it believes farmers should have control over this data, and indeed should be paid for providing access to it.
We’ve heard this notion advanced in many different contexts over the past few years. Many consumer advocates maintain that consumers should be compensated by third parties who are accessing their data and generating revenue from it.
Generally, this push for compensation centers on the notion of fairness, but others have suggested it could have motivational value as well: if you offer to pay consumers to voluntarily supply data, more consumers will supply data.
The notion of paying for data certainly makes logical sense, but does it work in practice? Usually not.
The first problem with paying to collect data on any scale is that it is expensive. More often than not, it’s just not an economical approach for the data publisher. And while the aggregate cost is large, the amount any individual receives is somewhere between small and tiny, which largely removes its motivational value: pay even a dollar apiece to a hundred thousand contributors and you’ve spent $100,000 before processing costs, yet no single contributor feels meaningfully rewarded.
The other issue (and I’ve seen this first-hand) is the perception of value. Offer someone $1 for their data, and they immediately assume it is worth $10. True, the data is valuable, but only once aggregated. Individual data points in fact aren’t worth very much at all. But try arguing this nuance to the marketplace. It’s hard.
I still get postal mail surveys with the famous “guilt dollar” enclosed. This is a form of paying for data, but as the name suggests, it trades on guilt, which makes for undependable results. Further, these payments are made to ensure an adequate aggregate response: whether or not you in particular respond to the survey really doesn’t matter. It’s a different situation for, say, a data publisher trying to collect retail store sales data. Not having data from Wal-Mart really does matter.
Outside of the research world, I just haven’t seen many successful examples of data publishers paying to collect primary source data. When a data publisher does feel a need to provide an incentive, it’s almost always in the form of some limited access to the aggregated data. That makes sense because that’s when the data becomes most valuable: once aggregated. And supplying users with a taste of your valuable data often results in them purchasing more of it from you.
The Billion Prices Project
Last week, I discussed how the Internet of Things creates all sorts of potential opportunities to create highly valuable, highly granular data. The Billion Prices Project, which is based at MIT, provides another route to the same result. Summarized very simply, two MIT professors, Alberto Cavallo and Roberto Rigobon, collect data from hundreds of online retailers all over the world to build a massive database of product-level pricing data, updated daily. It’s an analytical goldmine that can be applied to solve a broad range of problems.
One obvious example is the measurement of inflation. Currently, the U.S. Government develops its Consumer Price Index the old-fashioned way: mail, phone and field surveys. Inherently, this process is slow. Contrast that with the Billion Prices Project, which can measure inflation on a daily basis, and do so for a large number of countries.
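To give a flavor of the mechanics (this is a toy illustration, not the project’s actual methodology), a daily index can be built by chaining day-over-day price changes across whatever products are observed on consecutive days:

```python
# A toy daily chained price index; not the Billion Prices Project's actual method.
import math

def daily_index(prices_by_day, base=100.0):
    """prices_by_day: list of {product_id: price} dicts, one per day, in order."""
    index = [base]
    for prev, curr in zip(prices_by_day, prices_by_day[1:]):
        common = set(prev) & set(curr)          # products observed on both days
        if not common:
            index.append(index[-1])
            continue
        # Geometric mean of day-over-day price relatives across common products.
        log_rel = sum(math.log(curr[p] / prev[p]) for p in common) / len(common)
        index.append(index[-1] * math.exp(log_rel))
    return index

# Example: two products tracked over three days.
days = [{"milk": 3.00, "bread": 2.00},
        {"milk": 3.03, "bread": 2.02},
        {"milk": 3.06, "bread": 2.02}]
print(daily_index(days))  # roughly [100.0, 101.0, 101.5]
```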
But measuring inflation is just the beginning. The Billion Prices Project is exploring a range of intriguing questions, such as the premiums that are charged for organic foods and the impact of exchange rates on pricing. You’re really only limited by your specific business information needs – and your imagination.
The Billion Prices Project also offers some useful insights for data publishers. First, the underlying data is scraped from websites; the project didn’t ask for it or pay for it. That means you can build huge datasets quickly and economically. Second, the dataset is significantly incomplete. For example, it entirely ignores the huge service sector of the economy. But it’s better than the existing dataset in many ways, and that’s what really matters.
If you’re considering building a database, new web extraction technology gives you the ability to build massive, useful, high-quality datasets quickly and economically. And as we have seen time after time, the old aphorism “don’t let the perfect be the enemy of the good” still holds true. If you can do better than what’s currently available, you generally have an opportunity. Don’t focus on what you can’t get; focus instead on whether what you can get meaningfully advances the ball.
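For the curious, here is roughly what web extraction looks like in practice. The URL and page structure below are hypothetical, and any real effort needs to respect robots.txt and each retailer’s terms of use; this is a sketch of the pattern, nothing more.

```python
# Hypothetical scraper sketch: the URL and CSS classes are invented for illustration.
# A real project must respect robots.txt and each site's terms of use.
import csv
from datetime import date

import requests
from bs4 import BeautifulSoup

def scrape_prices(url):
    """Pull (date, product, price) rows from a hypothetical retailer listing page."""
    html = requests.get(url, timeout=30).text
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for item in soup.select(".product"):            # assumed CSS class
        name = item.select_one(".name").get_text(strip=True)
        price = float(item.select_one(".price").get_text(strip=True).lstrip("$"))
        rows.append((date.today().isoformat(), name, price))
    return rows

if __name__ == "__main__":
    rows = scrape_prices("https://retailer.example.com/catalog")  # placeholder URL
    with open("prices.csv", "a", newline="") as f:
        csv.writer(f).writerows(rows)               # append today's observations
```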
A New Push to End Passwords
I hate passwords. But I don’t hate passwords as a concept. Certainly I understand the need, but password protection implemented poorly creates friction and often frustration, and that’s not good for business or for my own personal protection.
Now there’s a new initiative out of Silicon Valley called the “Petition Against Passwords.” It’s not proposing a specific alternative, but the basic premise is that we can do better. And the initiative seems to be getting some early traction. But I think that before we try to improve, we also need to address our failings.
In my view, because online security has become such a high profile concern, many companies have given their programmers carte blanche to “beef up security.” And beef they have, adding all sorts of onerous restrictions, cool new programming and faddish techniques that satisfy their intellectual curiosity, but put a big dent in the overall user experience.
Several years ago, I bought one of the most popular password management programs, RoboForm. It generates long, random passwords for every site where I have an account; once it was set up, I could access any site with a single click. Nirvana! I was fully protected, and friction was eliminated. This was a win for everyone. And it worked. For a while.
But I’ve watched as RoboForm has become less effective, as more sites institute cool new login processes that force you to do more, remember more, and defeat the popular password managers.
I have one site that insists I manually input my password into a virtual keypad on the screen. Way cool, but essentially pointless. I have another site with no fewer than ten challenge questions that it presents randomly, with responses that have to be entered perfectly, or you are locked out and forced to spend 20 minutes with their call center to get back in. Still another site wants a ten-character password that includes both a capital letter and two non-alphanumeric characters. And the latest cool approach is “two-factor authentication,” which sends a separate code to your cellphone every single time you want to log in. Honestly, can you picture yourself doing this several times (or more) a day? We want more user engagement, not less.
Where I come out is with this simple, three-point proposition:
- Login security should be proportionate to what you are protecting, a point of particular relevance to online content providers. Let’s be honest with ourselves: we’re not protecting nuclear launch codes.
- Don’t leave login protocols completely in the hands of your programmers. Logins are a critical component of the overall user experience and need to be assessed accordingly. If users aren’t logging in, they’re also not renewing.
- For most of us, time would be better spent improving our back-end system security, to reduce the chance of wholesale theft of user logins, credit card data and personal information (a quick sketch of what that can look like follows below). That’s where the big business risk resides, although the necessary programming is admittedly less glamorous than virtual keypads.
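To make that last point a bit more concrete, here is a minimal sketch of what better back-end hygiene can mean for stored passwords: salted, deliberately slow hashing rather than plain text. It’s illustrative only, using the Python standard library; a real security review covers much more ground.

```python
# Minimal sketch of salted, slow password hashing with the standard library.
# Illustrative only; a production system involves far more than this.
import hashlib
import hmac
import secrets

def hash_password(password):
    """Return (salt, digest) using scrypt, a deliberately slow key-derivation function."""
    salt = secrets.token_bytes(16)
    digest = hashlib.scrypt(password.encode(), salt=salt, n=2**14, r=8, p=1)
    return salt, digest

def verify_password(password, salt, stored):
    digest = hashlib.scrypt(password.encode(), salt=salt, n=2**14, r=8, p=1)
    return hmac.compare_digest(digest, stored)   # constant-time comparison

salt, stored = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, stored))  # True
print(verify_password("guess", salt, stored))                         # False
```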
So sure, let’s start talking about eliminating passwords. But first, let’s acknowledge that a lot of the problem is self-inflicted by the way in which we have implemented passwords.
The Gamification of Data
I attended the Insight Innovation Conference this week – a conference where marketing research professionals gather to think about the future of their industry. A number of the sessions dealt with the topic of gamification. Marketing research is really all about gathering data, and a lot of that data is gathered via surveys. And, not surprisingly, market researchers are finding it harder than ever to get people to participate in their surveys, finish the surveys even when they do participate, and supply trustworthy, high quality answers all the way through. It’s a vexing problem, and it is one that is central to the future of this industry.
That’s where gamification comes in. Some of the smartest minds in the research business think that by making surveys more fun and more engaging, they can not only improve response rates, but actually gather better quality data. And this has implications for all of us.
One particularly interesting presentation provided some fascinating “before and after” examples: boring “traditional” survey questions, and the same questions after they had been “gamified.” Just as significantly, the presenter showed encouraging evidence that gamified surveys do in fact deliver more and better data.
And while it’s relatively easy to see how a survey, once made more fun and engaging, would lead people to answer more questions, it’s less obvious how gamification leads to better data.
In one example, the survey panel was asked to list the names of toothpaste brands. In a standard survey, respondents would often get lazy, mentioning the top three brands and moving on to the next question. This didn’t provide researchers with the in-depth data they were seeking. When the question was redesigned to offer points for supplying more than three answers, and bonus points for identifying a brand that wasn’t in the top five, survey participants thought harder and supplied more complete and useful data.
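The scoring rule behind that example is simple enough to sketch. The point values and the “top five” list below are invented for illustration; the principle is simply to reward effort and less obvious answers.

```python
# Toy scoring rule for the toothpaste question; point values and the "top five"
# list are invented for illustration.
TOP_FIVE = {"colgate", "crest", "sensodyne", "aquafresh", "arm & hammer"}

def score_answers(brands):
    """Award points for listing more than three brands, plus a bonus for
    naming any brand outside the assumed top five."""
    brands = {b.strip().lower() for b in brands if b.strip()}
    points = 10 * max(0, len(brands) - 3)        # reward going past three answers
    points += 5 * len(brands - TOP_FIVE)         # bonus for less obvious brands
    return points

print(score_answers(["Crest", "Colgate", "Sensodyne"]))                               # 0
print(score_answers(["Crest", "Colgate", "Sensodyne", "Tom's of Maine", "Marvis"]))   # 30
```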
In another example, survey participants were given $20 at the start of the survey, and could earn more or lose money based on how their responses compared to the aggregate response. Participation was extremely high and data quality was top-notch.
Still other surveys provided feedback along the way, generally letting the survey participants know how their answers compared to the group.
Most intriguing to me is that gamification allowed for tremendous subtlety in questions. In a game format, it’s very easy to ask both “what do you think” and “what do you think others think,” but these are devilishly hard insights to get at in a traditional survey format.
Gamification already intersects with crowdsourcing and user generated content quite successfully. Foursquare is just one well-known example. But when the marketing research industry begins to embrace gamification in a big way, it’s a signal that this is a ready-for-prime-time technique that can be applied to almost any data gathering application. Maybe it’s time to think about adding some fun and games!