A Healthy New Year

We’re in the midst of a transformational shift in the healthcare industry. Likely you have experienced it yourself, and it’s probably already hit you in the pocketbook. It’s the shift to what is called consumer-directed healthcare.

While on the surface consumer-directed healthcare may seem like nothing more than an attempt by employers to shift some of their spiraling healthcare costs onto their employees, there is much more going on behind the scenes. There is a lot of public policy driving this shift. The general idea is that healthcare costs are out of control because those buying healthcare services traditionally haven’t been the ones paying for them. By shifting healthcare costs to the consumer, the reasoning goes, consumers will demand better value for their money by becoming smart healthcare shoppers, and healthcare costs will begin to decline.

It all makes sense on paper, but there is one huge stumbling block in making this approach work: it’s hard to be a smart shopper when none of the things you are buying have price tags on them.

Data entrepreneurs have already seen this opportunity. Companies like Healthcare Blue Book and ClearCost Health have made real strides, but it’s a big and enormously complicated problem to solve. In part, that’s because hospitals don’t like to disclose their prices and insurers are often contractually prohibited from sharing what they pay specific hospitals for specific procedures.   

Recognizing the issue, the federal government has mandated that as of January 1 of this year, hospitals must post their pricing for common procedures on their websites in an easily downloadable format.

 There’s a quick opportunity here to put your website scraping tools to work to gather all this pricing data in one place and normalize it. Certainly, there is an analytical product in there somewhere. But it’s less of an opportunity than it seems because what hospitals are generally posting are their list prices – and virtually nobody pays these prices. 

The challenge in hospital pricing is to find out what a specific insurance plan pays a specific hospital for, say, a hip replacement. This could be an ideal opportunity to turn to the crowd.

 One approach might be to aggregate all the pricing data that hospitals are now required to publish and use it as a data backbone – essentially a starting point. Then you could turn to consumers and ask them to anonymously submit their hospital bills and insurance statements. Take those images, use optical character recognition to get them into raw data format, then develop software to extract the valuable pricing data. When specific price data isn’t available, you could back off to list price data that would at least show if a hospital is relatively more or less expensive.
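The "back off to list price" idea can be sketched in a few lines of Python. This is only an illustration under assumed names: the `PriceQuote` type, the `lookup_price` function, and the sample hospitals, plans and dollar figures are all hypothetical, and a real product would need normalized hospital and procedure identifiers.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class PriceQuote:
    amount: float
    source: str  # "negotiated" (from submitted bills) or "list" (hospital-posted)


# Hypothetical sample data, keyed by (hospital, procedure).
LIST_PRICES: Dict[Tuple[str, str], float] = {
    ("General Hospital", "hip replacement"): 40000.0,
    ("City Medical", "hip replacement"): 52000.0,
}

# Hypothetical crowd-sourced data, keyed by (hospital, procedure, plan).
NEGOTIATED_PRICES: Dict[Tuple[str, str, str], float] = {
    ("General Hospital", "hip replacement", "Plan A"): 22000.0,
}


def lookup_price(hospital: str, procedure: str, plan: str) -> Optional[PriceQuote]:
    """Prefer a plan-specific price; otherwise fall back to the list price."""
    specific = NEGOTIATED_PRICES.get((hospital, procedure, plan))
    if specific is not None:
        return PriceQuote(specific, "negotiated")
    listed = LIST_PRICES.get((hospital, procedure))
    if listed is not None:
        return PriceQuote(listed, "list")
    return None  # nothing known about this hospital and procedure
```

Labeling each result with its source matters here: a list price is useful for ranking hospitals against each other, but it should not be presented as what a given plan actually pays.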

 Obviously it will take a long time to build a comprehensive database consisting of millions of price points, but there are a lot of consumer groups and other constituencies that would be very interested in your success and would work with you to increase the number of bills submitted. Hospitals won’t like this a bit, but as is so often the case, if one group doesn’t want the data out there, you have immediate confirmation that the data are valuable to some other group. Ironically, hospitals submit their price quotes for medical devices to a fascinating data company called MDBuyline to make sure they aren’t over-paying for their purchases.

 Sure, there is lots of complexity hiding under this simple framework. Also, it’s obvious that it will take a long time to build a comprehensive database. But the bromide “don’t let the perfect be the enemy of the good” nicely describes a key to success in the data business. As long as your database is the best available, it doesn’t have to be either complete or perfect. In almost every case, data is so important to decision-making that buyers will take what they can get, warts and all. This is not an invitation to be lazy or sloppy. Rather, it is recognition that you’ll have a marketable product long before you have a complete and perfect product. Just one more reason data is such a great business. Should hospital price data be on your New Year’s resolution list?

Relationship Scoring

No, this is not about online dating.  I am referring to the growing use of consumer scores to help companies determine how much time and energy to invest with individual customers.

We’re all familiar with credit scores that yield a single number meant to reflect how dependably you pay your bills. A high credit score can mean easy access to credit, often at lower interest rates that reflect your low re-payment risk. A poor credit score can mean limited access to credit and loans, in addition to higher interest rates.

The folks behind the credit scores have been relentless in their work to find new markets for their product. On the notion that a credit score also reflects someone’s level of personal responsibility, credit information is increasingly used in hiring decisions. You’ll also find credit scores used to determine pricing for such things as automobile insurance, the insurance companies having concluded that if you pay your bills on time, you likely drive carefully as well.

But credit scores are not the only consumer scores out there. In parallel with credit scores, a number of companies have been building out consumer scores based on Customer Lifetime Value (CLV). The CLV concept has been around forever. What’s changed recently is increasingly easy access to a wide variety of input datasets (a/k/a “signals”) that work to increase the precision of these scores, along with increasing computer power that makes it possible to access and act on these scores in real-time.

And how are these scores used? A recent Wall Street Journal article suggests that CLV scores are increasingly used by companies to determine how they will interact with their customers. A higher scoring customer may actually get faster and better customer service. Companies will offer bigger incentives and better deals to their best customers in order to retain them. CLV scores start with numeric calculations of the likely dollar value of a customer over the entirety of the projected relationship (and yes, your score typically declines as you get older because … less lifetime). More recently, these relatively simple calculations have been enhanced with demographic overlays and a wide array of lifestyle and even behavioral data points. For example, customers who complain too much or call customer service too often may have their scores reduced as a result.
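To make the basic calculation concrete, here is a minimal sketch of one common textbook simplification of CLV: expected yearly margin, weighted by the chance the customer is still around, summed over the projected relationship. It is not the method of any particular company or vendor, and the function name and all parameters are assumptions chosen for illustration.

```python
def customer_lifetime_value(
    annual_margin: float,
    retention_rate: float,
    years: int,
    discount_rate: float = 0.0,
) -> float:
    """Sum the expected, discounted margin over the projected relationship.

    annual_margin:  assumed profit from the customer in a year they stay
    retention_rate: assumed probability the customer stays from one year to the next
    years:          projected remaining length of the relationship
    discount_rate:  optional rate for discounting future dollars
    """
    total = 0.0
    for t in range(years):
        still_a_customer = retention_rate ** t  # chance they are still around in year t
        total += annual_margin * still_a_customer / (1 + discount_rate) ** t
    return total


# A shorter projected relationship lowers the value, all else being equal.
younger = customer_lifetime_value(annual_margin=100, retention_rate=0.9, years=10)
older = customer_lifetime_value(annual_margin=100, retention_rate=0.9, years=5)
```

The demographic and behavioral enhancements described above would, in a sketch like this, show up as adjustments to the inputs, for example a lower assumed margin for a customer who calls customer service unusually often.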

Currently, companies implement their own CLV scoring systems, sometimes with the help of third-party vendors. CLV scores as a data-driven way to make sure better customers are treated better sounds benign. Where it could take a more worrisome turn is if a third-party vendor tries to centralize all of this information to build a single CLV score for all consumers. This would be a fraught undertaking, especially since it would likely not be subject to any regulatory scrutiny and control. Such a scoring system would also look uncomfortably similar to the social credit system recently introduced by the Chinese government, the implications of which are not yet fully understood but are likely to be profound.

Models for Being the Best

There is endless innovation and variety in what I call the “best guides” segment of the market. These are guides, print and online, that help consumers find and select the best of something – from hotels to restaurants to contractors to consumer electronics.

It’s a huge market segment. Consider such vast scale businesses as Yelp and TripAdvisor that largely focus on restaurants and hotels respectively, though you can find on their platforms crowd-sourced reviews for just about anything. If you have huge ambitions, it’s pretty well established that the fastest way to develop massive amounts of content is via crowd-sourcing. If you build it, the crowd will comment on it and rate it for you. The downside to crowdsourcing has always been lack of control. Too many people posting ratings and reviews have malign motives. More fundamentally, two people can honestly have diametrically opposed opinions about a restaurant, for example, and it’s close to impossible to reflect this in an overall rating. Crowdsourcing depends on sheer volume for accurate reviews to drown out inaccurate and biased reviews. It works, more or less.
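A small example shows why a single overall rating struggles with honest disagreement. The ratings below are made up, and the `summarize` helper is hypothetical; the point is only that an average alone cannot tell a polarizing restaurant from a uniformly mediocre one.

```python
from statistics import mean, pstdev


def summarize(ratings):
    """Return the average rating plus a simple measure of disagreement."""
    return {"average": mean(ratings), "spread": pstdev(ratings)}


# Made-up ratings on a 1-5 scale.
polarized = [1, 1, 5, 5]  # half the reviewers loved it, half hated it
lukewarm = [3, 3, 3, 3]   # everyone found it middling

polarized_summary = summarize(polarized)
lukewarm_summary = summarize(lukewarm)
# Both averages are 3, but the spread is 2.0 for the first and 0.0 for the second.
```

Showing some measure of spread next to the average is one way a guide could hint at divided opinion, though it does not address biased or malicious reviews.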

Knowing the limitations of crowdsourcing, there have been many who have tried to refine the concept. Angie’s List was an early pioneer, aggregating reviews but only from its members and only for its members, what we call a “closed pool” model in our Business Information Framework. The concept worked, but was difficult to scale, in large part because members didn’t want to pay for ongoing memberships. Angie’s List has since shifted to a lead generation model. I also wrote in 2016 about a company called BestPickReports.com that’s building its business both online and with expensive print guides mailed to consumers. It appears to be a long-term play, and is backed, somewhat surprisingly, by EBSCO.

We all know about Consumer Reports. For decades, Consumer Reports was the first place to check before making a major consumer purchase such as a car or a dishwasher. Consumer Reports did all the testing and rating itself, and understood that its reputation was everything, so much so that it prohibited manufacturers from citing its reviews, and the owner was a non-profit organization. In the days of print, Consumer Reports did very well selling subscriptions to its print magazine. It wasn’t an ideal way to distribute information (how do you buy that new car in March when the new car reviews didn’t come out until the May issue?) but it worked for a long time because there weren’t a lot of options. That’s why Consumer Reports struggled when it moved online: consumers didn’t want an online subscription as much as they wanted to be able to buy just dishwasher reviews, and only when they were in the market for dishwashers. Consumer Reports continues to flourish, the result of momentum, its pristine reputation and quality reviews, but it’s quite possible its business model will come under increasing pressure for the same reason as Angie’s List: selling a continuous information service to consumers who don’t continuously need information is just plain hard.

Finally, let’s look at the original arbiters of what’s good, better and best: newspapers and magazines. 

Hearst Magazines has a review site called BestProducts.com. While the name might imply product testing, the site recommendations appear to be closer to the traditional “editor’s picks.” There is heavy use of the phrase “what we like,” and the site overall seems to be much more about informed personal preferences of the writer – more taste-making than research. Indeed, aside from a great (and arguably misleading) domain name, these are product recommendations that would not look out of place as print magazine articles from ten or twenty years ago. Online forced a change in business model, however. Hearst links to vendors of all the items it recommends, hoping to profit from online referral fees.

 The New York Times blends a few models together through its Wirecutter.com site, a business it acquired in 2016. Wirecutter offers much more than the personal opinion model of Hearst, but less than the rigorous product testing of Consumer Reports. It walks a middle ground, doing real product research, but not actual product testing. In terms of business model, Wirecutter follows Hearst, generating revenue from product referral fees.

Depending on product referral fees is a risky business because of “leakage.” Simply put, it’s too easy to take your recommendation but not click your link. When that happens, the business generates no revenue. The only real solution to the leakage problem is sheer traffic volume, something both Hearst and the New York Times already have and can easily leverage. The New York Times, for example, is increasingly citing Wirecutter in its own news stories, albeit with full disclosure of its ownership.

 There is no single best model, and here’s a rundown of the tradeoffs. Crowdsourcing works, and it is cheap, but the quality of the content is uneven. Closed pool crowdsourcing yields a huge step-up in quality, but it’s a tough model to execute. 

 You can generate your own reviews to guarantee the quality, but you have to fight the trend towards unbundling. Consumers will pay for reviews and recommendations, but only the ones they want when they want them. It’s tough to generate adequate revenue on that basis.

Online referral fees are an inherently dicey business because it’s too hard to mask the name of the manufacturer, and there are far too many sellers, all a click away. You can make it work if you have gobs of traffic, and it is an even better business if you can leverage your existing traffic and not start from scratch.

If I were trying to build a “best guide” site, I’d select Wirecutter as my starting point. It has the benefit of offering true product research without the huge testing costs incurred by Consumer Reports. It totally controls both the research process and resulting recommendations. It can leverage the brand and traffic of its parent, the New York Times. What would I change? First, I’d see if I could sell recommendations on an a la carte basis. Buying a dishwasher? Then buy our dishwasher reviews. I might also be able to generate some additional revenue from national retailers or manufacturers who could offer special deals along with the recommendations, though I would need to be careful to make it clear nobody had paid for a preferential rating. I’d ditch the referral fee model because it’s catnip for free-riders. Finally, I’d wrap the New York Times brand more aggressively around Wirecutter to reinforce the quality of the recommendations. I understand why the New York Times is moving cautiously here, but at some point, if you want to be in this business, you need to be in this business. You can’t hold it at arm’s length.

If you’re considering getting into the business, leverage your strengths, choose the right content and business model, and plan for the worst and hope for the best, because as of yet, there is no clear pathway to success in this huge and tantalizing area.

AI in Action

Two well-known and highly successful data producers, Morningstar and Spiceworks, have both just announced new capabilities built on artificial intelligence (AI) technology. 

Artificial Intelligence is a much-abused umbrella term for a number of distinctive technologies. Speaking very generally, the power of AI initially came from sheer computer processing power. Consider how early AI was applied to the game of chess. The “AI advantage” came from the ability to quickly assess every possible combination of moves and likely responses, as well as having access to a library of all the best moves of the world’s best chess players. It was a brute force approach, and it worked.

Machine learning is a more nuanced approach to AI where the system is fed both large amounts of raw data and examples of desirable outcomes. The software actually learns from these examples and is able to generate successful outcomes of its own using the raw data it is supplied. 

There’s more, much more, to AI, but the power and potential is clear.

So how are data producers using AI? In the case of Morningstar, it has partnered with a company called Mercer to create a huge pool of quantitative and qualitative data, to help investment advisors make smarter decisions for their clients. The application of AI here is to create what is essentially a next generation search engine that moves far beyond keyword searching to make powerful connections between disparate collections of data to identify not only the most relevant results, but to pull meaning out of those search results as well.

 At Spiceworks (a 2010 Model of Excellence), AI is powering two uses. The first is also a supercharged search function, designed to make it easier for IT buyers to more quickly access relevant buying information, something that is particularly important in an industry with so much volatility and change.

Spiceworks is also using AI to power a sell-side application that ingests the billions of data signals created on the Spiceworks platform each day to help marketers better target in-market buyers of specific products and services.

As the data business has evolved from offering fast access to the most data to fast access to the most relevant data, AI looks to play an increasingly important and central role. These two industry innovators, both past Models of Excellence, are blazing the trail for the rest of us, and they are well worth watching to see how their integration of AI into their businesses evolves over time.

For reference:

Spiceworks Model of Excellence profile
Morningstar Model of Excellence Profile

Form Follows Function

Numerous online marketing trade associations have announced their latest initiative to bring structure and transparency to an industry that can only be called the Wild, Wild West of the data world: online audience data. Their approach offers some useful lessons to data publishers.

At their brand-new one-page website (www.datalabel.org) this industry coalition is introducing its “Data Transparency Label.” In an attempt to be hip and clever, the coalition has modeled its data record on the familiar nutrition labels found on most food packaging today. It’s undeniably cute, but it’s a classic case of form not following function. Having decided on this approach, the designers of this label immediately boxed themselves in as to what kind and how much data they could present to buyers. I see this all the time with new data products: so much emphasis is placed on how the data looks, its visual presentation, that important data elements often end up getting minimized, hidden or even discarded. Pleasing visual presentation is desirable, but it shouldn’t come at the expense of our data.

The other constraint you immediately see is that this label format works great if an audience is derived from a single source by a single data company. But the real world is far messier than that. What if the audience is aggregated from multiple sources? What if its value derives from complex signal data that may be sourced from multiple third parties? What about resellers? Life is complicated. This label pretends it is simple. Having spent many years involved with data cards for mailing lists, during which time I became deeply frustrated by the lost opportunities caused by a simple approach used to describe increasingly sophisticated products, I see history about to repeat itself.

My biggest objection to this new label is that its focus seems to be 100% on transparency, with little attention being paid to equally valuable uses such as sourcing and comparison. The designers of this label allude to a taxonomy that will be used for classification purposes, but it’s only mentioned in passing and doesn’t feel like a priority focus at all. Perhaps most importantly, there’s no hint of whether these labels will be offered as a searchable database. There’s a potentially powerful audience sourcing tool here, and if anyone is considering that, they aren’t talking about it.

 Take-aways to consider:

·     When designing a new data product, don’t allow yourself to get boxed in by design

·     The real world is messy, with lots of exceptions. If you don’t provide for these exceptions, you’ll have a product that will never reach its full potential

·     Always remember that a good data product is much more than a filing cabinet that is used to look up specific facts. A thoughtful, well-organized dataset can deliver a lot more value to users and often to multiple groups of users. Don’t limit yourself to a single use case for your product – you’ll just be limiting your opportunity.