Building Databases R Perkins

Fresh Data Sold Here! 

While many successful data publishers obsess about continually adding new features and functionality to their data products, there are lots of good reasons to be regularly evaluating your data as well.

Don’t get me wrong: new features and functionality are critically important, particularly if you have a data product that offers a workflow solution.

But adding new, well-selected data elements can add significant value and appeal as well. Here are a few examples:

Morningstar just enhanced its suite of investment analysis tools by introducing a single new data element: a Carbon Risk Score. This score assesses how vulnerable a company is financially to the transition away from a fossil-fuel-based economy to a lower-carbon economy. Not only does the score hold significant value in its own right, but as an individual and consistently presented data element, it can be used for discovery and filtering by investment analysts. Moreover, as a proprietary piece of information, it gives Morningstar additional differentiation and strengthens its competitive edge.

Data-driven real estate listings sites such as Realtor.com, Zillow and Trulia have moved away from tussling over who has the most complete listings to trying to outdo each other with deeper datasets. Various combinations of these three sites now give detailed information and ratings on local schools, crime data, traffic data, neighborhood data, walkability data … even data on whether or not a particular home is likely to be a good candidate for solar panels! And in a move I particularly admire, they have gotten major cable companies to pay to indicate if a particular house is eligible for their services. In the hotly competitive world of real estate data sites, it’s a relentless battle at the data element level, all with the goal of providing the most attractive one-stop shop for prospective homebuyers.

Consider too the intensely competitive market of hotel booking databases. Think of services such as Expedia, TripAdvisor, Oyster and Hotels.com. Having exhausted themselves by all claiming to offer the lowest rates, they’re now seeking to differentiate themselves at the data element level. Using filters, site visitors can draw on specific data elements to locate hotels with free wi-fi, that accept pets, that have handicapped access, that are green or sustainable, that are LGBT-welcoming and even hotels that have a party atmosphere.

Features and functionality matter, but a single new and well-chosen data element can add tremendous value, while simultaneously providing competitive advantage and product differentiation. Keep your data fresh, of course, but always be on the lookout for fresh new data elements as well.

Building Databases R Perkins

Data Flipping

One of the best things about government databases is that even when the government agency makes the database available on its website for free, it isn’t very useful. That’s because government agencies put these databases online for regulatory or compliance reasons. They’re designed to search for known entities because the expectation is that you are checking the license status of a company, or perhaps its compliance history.

Occasionally, a government agency will get ambitious and permit geographic searches, but in these cases, there are real limitations. That’s because the underlying data were collected for regulatory, not marketing, purposes. So, for example, a manufacturer with 30 plants around the country may only appear in one ZIP code because the government agency wants filings only from headquarters locations.

Taking a regulatory database and changing it into, say, a marketing database, is something I call “flipping the file,” because while the underlying data remains the same, the way the database is accessed is different. Sometimes this is as simple as offering more search options; sometimes it involves normalizing or re-structuring the data to make it more useful and accessible. As just one example, a company called Labworks built a product called the RIA Database. It started with an investment advisor database that the SEC maintains for regulatory purposes, and then flipped the file to make the same database useful to companies that wanted to market to investment advisors. There are hundreds of data publishers doing this in different markets, and as you might expect, it’s a very attractive model since the underlying data can be obtained for free.
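
To make the idea of flipping a file concrete, here is a minimal Python sketch. The records, field names and the helpers lookup_firm and flip_by are all hypothetical, invented for illustration; they are not drawn from the SEC file or from any real product. The point is only that the data stays the same while the access path changes.

```python
from collections import defaultdict

# Hypothetical records, shaped like a regulatory file keyed by firm name.
filings = [
    {"firm": "Alpha Advisors", "state": "NY", "assets_musd": 120},
    {"firm": "Beacon Capital", "state": "CA", "assets_musd": 450},
    {"firm": "Cedar Wealth", "state": "NY", "assets_musd": 80},
]


def lookup_firm(records, name):
    """Regulatory-style access: find one known entity by its exact name."""
    for record in records:
        if record["firm"] == name:
            return record
    return None


def flip_by(records, field):
    """Marketing-style access: group the same records by any field."""
    index = defaultdict(list)
    for record in records:
        index[record[field]].append(record)
    return dict(index)


# Same data, different question: "who can I market to in New York?"
by_state = flip_by(filings, "state")
ny_prospects = [record["firm"] for record in by_state["NY"]]
```

A real flip would usually also involve cleaning and normalizing the fields first, which this sketch skips.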

In addition to simply flipping a file, you can also enhance a database. The shortcoming of many government databases is that they focus on companies, not people, so while there may be a wealth of information on the company, data buyers typically want to know the names of contacts at those companies. Companies such as D&B and ZoomInfo do a brisk business licensing their contact information to be appended onto government databases of company information.

This is one of the truly magical aspects of the data business. Databases built for one reason can often be re-purposed for an entirely different use. And re-purposing can involve something as little as a new user interface. This magic isn’t limited to government data of course. Another great place to look for flipping opportunities is so-called “data exhaust,” data created in the course of some other activity, and thus not considered valuable by the entity creating it. You can even license data from other data providers and re-purpose it. There are a number of mapping products, for example, that take licensed company data and essentially create a new user interface by displaying data in a map context.

Increasingly, identifying the data need is as important as identifying the data source. With data, it’s all in how you look at it. 

Thoughts and Predictions R Perkins

Searching for a Better Recommendation Engine

My first experience with recommendation engines was with Amazon in its early days. Then, when you bought a book, Amazon would tell you that people who bought the same book had also bought these other books. It was simple, brilliant, and most importantly, it worked. When Amazon later started selling CDs, the recommendation engine worked even better. I got to enjoy music I never knew existed, and Amazon sold more CDs. It’s a classic win-win, and you would think Amazon would put its substantial resources into making its recommendations even better. 
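
For readers curious about how simple that early approach can be, here is a rough Python sketch of "people who bought this also bought" logic. It is an illustration under invented assumptions (the order data, the function name also_bought and its parameters are all hypothetical), not a description of how Amazon actually built its engine.

```python
from collections import Counter


def also_bought(orders, item, top_n=3):
    """Rank the items most often bought by customers who also bought `item`."""
    counts = Counter()
    for basket in orders.values():
        if item in basket:
            # Count every other item in the same customer's purchases.
            counts.update(other for other in basket if other != item)
    # Most frequent first; ties broken alphabetically for a stable result.
    ranked = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
    return [name for name, _ in ranked[:top_n]]


# Hypothetical purchase histories, one set of titles per customer.
orders = {
    "alice": {"Book A", "Book B", "Book C"},
    "bob": {"Book A", "Book B"},
    "carol": {"Book A", "Book D"},
    "dave": {"Book C", "Book D"},
}

recommendations = also_bought(orders, "Book A")
```

Here "Book B" ranks first because two of the three buyers of "Book A" also bought it.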

But apparently not. After buying an introductory book on Photoshop a while back, the recommendation engine started showing me every Photoshop book ever written (there appear to be hundreds of them), and crowded out every other book recommendation for nearly a year. These were lazy recommendations, and disproportionate to the one book I bought – ever – on a specific topic. And Amazon recommendations have gotten even lazier since then.

You may also recall the Netflix Prize, announced with great fanfare back in 2006. A $1 million prize was offered to anyone who could improve the efficacy of the Netflix recommendation engine. It was an impressive commitment by Netflix, and it showed they deeply understood the importance and value of recommendations to their business. Fast forward to today. Having watched every single episode of Arrested Development on Netflix, how did I learn about the arrival of new episodes? I read about it in the newspaper. Has Netflix brought these new episodes to my attention? Not yet. Somewhere along the way, Netflix seems to have stopped caring about the quality of its customer recommendations.

Move over to the search engines – all of them. You may know that you can force a search engine to search for a specific phrase by putting quote marks around it. Typically, your first search results will be web pages containing that exact phrase. But then the search engines actually remove the quote marks and toss in results that have the requested search terms, but not necessarily together. Then they toss in pages that have some but not all of your search terms. Since I didn’t ask for these search results, I think it’s fair to consider them as recommendations. And they are (predictably) lousy. It’s as if the search engines assume I don’t know what I am doing, so they give me every possible type of result. Yes, more is better with search engines, but only if they are giving me more of what I want. 

Contrast this with the music service Pandora that I’ve been raving about since 2007. Despite a tough revenue model, Pandora has not forgotten that it lives and dies by the quality of its recommendations, and it has built itself to over $1 billion in annual revenue by staying focused. Hopefully it will maintain that focus as it continues to grow.

When companies get big, it’s very easy for them to get distracted and lose interest in what made them big in the first place. There are more voices now saying that Google search quality is in decline. And remember when Yahoo got bored with search and decided to outsource search while it chased bigger dreams? These distractions create opportunities for smaller players to do search better, and some are finding success.  


Being in the Middle of a New Data Product

I’ve written before about the application model called the “Closed Data Pool.” In this model, companies (and many times they are competitors) contribute proprietary data to a central, neutral data company. The data company aggregates the data and sells aggregate views of the data back to the very companies that contributed it. Madness, you say? Not really, because these companies get great benefit from those aggregated views (think market share, average pricing and other vital business metrics). It’s the neutral, trusted data provider in the middle who makes it possible.
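
As a rough illustration of the aggregate views a closed data pool might sell back, consider the Python sketch below. The company names, figures and the pool_report function are hypothetical. The design assumption is that each contributor sees market totals and its own position, never a competitor's raw numbers.

```python
def pool_report(submissions, member):
    """Build one contributor's view: market aggregates plus its own position."""
    market_units = sum(s["units"] for s in submissions.values())
    market_revenue = sum(s["revenue"] for s in submissions.values())
    own = submissions[member]
    return {
        "market_units": market_units,
        "market_avg_price": market_revenue / market_units,
        "your_share": own["units"] / market_units,
        "your_avg_price": own["revenue"] / own["units"],
    }


# Hypothetical confidential submissions from three competitors.
submissions = {
    "FirmA": {"units": 500, "revenue": 10000.0},
    "FirmB": {"units": 300, "revenue": 7500.0},
    "FirmC": {"units": 200, "revenue": 2500.0},
}

report_for_a = pool_report(submissions, "FirmA")
```

In this example FirmA learns that it holds half the market by units and prices exactly at the market average, without ever seeing what FirmB or FirmC reported. With only a handful of contributors a member could back out a rival's numbers, so a real pool would likely set a minimum number of participants before publishing any aggregate.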

But there is another twist on the closed data pool that represents an even more profitable business for the data provider in the middle. Consider a company called The Work Number.

The Work Number came into being because a lot of credit grantors need to be able to quickly verify employment status and income. At the same time, companies hated getting an endless stream of calls from creditors seeking to verify employment data. The Work Number came up with an ingenious solution. It went to big companies and said that they could outsource all these nuisance calls to The Work Number. All the company had to do was supply a feed of its payroll data. 

The Work Number then went to major credit grantors such as banks and said that instead of those painful verification calls they were making, credit grantors could just do a lookup on The Work Number website and instantaneously get the exact data they needed.

The best part? The Work Number was able to charge credit grantors for access to the database because of the big productivity gains it offered. But The Work Number was also able to charge the companies supplying the data because it increased their productivity as well by eliminating all these annoying verification calls. Yes, The Work Number charges both to collect the data and provide access to it!

If this sounds like an interesting but one-off opportunity to you, it’s not. Opportunities exist in vertical markets as well. Consider National Student Clearinghouse, which does the same thing as The Work Number, only with college transcripts.

Is there an opportunity in your market? Look for areas where relatively important or high-value information is being exchanged by phone or one-off emails or even by fax. If the information exchange constitutes a serious pain point or productivity drag for either or both parties, you’ve probably got a new data product. 


Standard Stuff Is Actually Cool

In the not-too-distant past, there was something close to an agreed-upon standard for the user interface for software applications. Promoted by Microsoft, it is the reason that so much software still adheres to conventions such as a “file” menu in the upper left corner of the screen.

The reason Microsoft promoted this open standard is that it saw clear benefit in bringing order out of chaos. If most software functioned in largely the same way, users could become comfortable with new software faster, meaning greater productivity, reduced training time and associated cost, and greater overall levels of satisfaction.

Back up a bit more and you can see that the World Wide Web itself represented a standard – it provides one path to access all websites that function in all critical respects in the same way. Before that, companies with online offerings had varying login conventions, different communications networks, and totally proprietary software that looked like nobody else’s software. Costs were high, learning curves were steep and user satisfaction was low.

There are clear benefits to adhering to high-level user interface standards, even ones that bubble up out of nowhere to become de facto standards. Consider the term “grayed out.” By virtue of this de facto user standard, users learned that website features and functions that were “grayed out” were inaccessible to them, either because the user hadn’t paid for them, or because they weren’t relevant to what the user was currently doing within the application. Having a common understanding of what “grayed out” meant was important to many data publishers because it was a key part of the upsell strategy.

That’s why I am so disappointed to see the erosion of these standards. On many websites and mobile apps, a “grayed out” tab now represents the active tab the user is working in, not an unavailable tab. And virtually all other standards have evaporated as designers have been allowed to favor “pretty” and “cool” over functional and intuitive. I could go on for days about software developers who similarly run amok, employing all kinds of functionality mostly because it is new and with absolutely no consideration for the user experience. What we are doing is reverting to the balkanized state of applications software before the World Wide Web.

And while I call out designers and developers, the fault really lies with the product managers who favor speed above all, or who themselves start to believe that “cutting edge” somehow confers prestige or competitive advantage. Who’s getting left out of the conversation? The end-user customer. What does the customer want? At a basic level the answer is simple: a clean, intuitive interface that allows them to access data and get answers as quickly and painlessly as possible. Standard stuff, and the best reason that being different for the sake of being different isn’t in your best interest.
