AI in Action

Two well-known and highly successful data producers, Morningstar and Spiceworks, have both just announced new capabilities built on artificial intelligence (AI) technology. 

Artificial Intelligence is a much-abused umbrella term for a number of distinct technologies. Speaking very generally, the power of AI initially came from sheer processing power. Consider how early AI was applied to the game of chess: the “AI advantage” came from the ability to quickly assess every possible combination of moves and likely responses, along with access to a library of the best moves of the world’s best chess players. It was a brute-force approach, and it worked.

Machine learning is a more nuanced approach to AI where the system is fed both large amounts of raw data and examples of desirable outcomes. The software actually learns from these examples and is able to generate successful outcomes of its own using the raw data it is supplied. 
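To make that distinction concrete, here is a minimal, self-contained sketch of learning from labeled examples: a simple nearest-neighbor classifier. The data points and labels are invented purely for illustration.

```python
from collections import Counter

def predict(examples, point, k=3):
    """Classify `point` by majority vote of its k nearest labeled examples."""
    by_distance = sorted(
        examples,
        key=lambda ex: sum((a - b) ** 2 for a, b in zip(ex[0], point)),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Labeled examples: (features, desirable outcome) — the "training" the text describes
examples = [
    ((1.0, 1.2), "spam"), ((0.9, 1.1), "spam"), ((1.1, 0.9), "spam"),
    ((5.0, 5.2), "ham"),  ((4.8, 5.1), "ham"),  ((5.2, 4.9), "ham"),
]

print(predict(examples, (1.05, 1.0)))  # a new point near the first cluster
```

The system is never told a rule; it generalizes from the examples it was fed, which is the essential difference from the brute-force chess approach above.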

There’s more, much more, to AI, but the power and potential is clear.

So how are data producers using AI? Morningstar has partnered with a company called Mercer to create a huge pool of quantitative and qualitative data to help investment advisors make smarter decisions for their clients. The application of AI here is to create what is essentially a next-generation search engine, one that moves far beyond keyword searching to make powerful connections between disparate collections of data, not only identifying the most relevant results but also pulling meaning out of them.

At Spiceworks (a 2010 Model of Excellence), AI is powering two applications. The first is also a supercharged search function, designed to help IT buyers access relevant buying information more quickly, something that is particularly important in an industry with so much volatility and change.

Spiceworks is also using AI to power a sell-side application that ingests the billions of data signals created on the Spiceworks platform each day to help marketers better target in-market buyers of specific products and services.

As the data business has evolved from offering fast access to the most data to fast access to the most relevant data, AI looks to play an increasingly important and central role. These two industry innovators, both past Models of Excellence, are blazing the trail for the rest of us, and they are well worth watching to see how their integration of AI into their businesses evolves over time.

For reference:

Spiceworks Model of Excellence profile
Morningstar Model of Excellence profile

Standard Stuff Is Actually Cool

In the not-too-distant past, there was something close to an agreed-upon standard for the user interface for software applications. Promoted by Microsoft, it is the reason that so much software still adheres to conventions such as a “file” menu in the upper left corner of the screen.

The reason Microsoft promoted this open standard is that it saw clear benefit in bringing order out of chaos. If most software functioned in largely the same way, users could become comfortable with new software faster, meaning greater productivity, reduced training time and associated cost, and greater overall levels of satisfaction.

Back up a bit more and you can see that the World Wide Web itself represented a standard – it provided one path to access all websites, which functioned in all critical respects in the same way. Before that, companies with online offerings had varying login conventions, different communications networks, and totally proprietary software that looked like nobody else’s software. Costs were high, learning curves were steep, and user satisfaction was low.

There are clear benefits to adhering to high-level user interface standards, even ones that bubble up out of nowhere to become de facto standards. Consider the term “grayed out.” By virtue of this de facto user standard, users learned that website features and functions that were “grayed out” were inaccessible to them, either because the user hadn’t paid for them, or because they weren’t relevant to what the user was currently doing within the application. Having a common understanding of what “grayed out” meant was important to many data publishers because it was a key part of the upsell strategy.
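The upsell mechanics behind “graying out” can be sketched in a few lines. This is a hypothetical illustration, not any particular publisher’s implementation; the plan names and feature names are invented.

```python
# Invented plan tiers and features, for illustration only
PLAN_FEATURES = {
    "basic":   {"search", "view_profile"},
    "premium": {"search", "view_profile", "export_csv", "api_access"},
}

ALL_FEATURES = ("search", "view_profile", "export_csv", "api_access")

def render_menu(plan):
    """Return (feature, state) pairs; features outside the plan are grayed out,
    visible but inaccessible — which is what makes them an upsell prompt."""
    enabled = PLAN_FEATURES[plan]
    return [(f, "active" if f in enabled else "grayed out") for f in ALL_FEATURES]

for feature, state in render_menu("basic"):
    print(f"{feature}: {state}")
```

The point of the convention is in that last loop: a basic-tier user still *sees* the premium features, and the shared understanding of gray-as-unavailable is what turns them into a sales prompt rather than a source of confusion.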

That’s why I am so disappointed to see the erosion of these standards. On many websites and mobile apps, a “grayed out” tab now represents the active tab the user is working in, not an unavailable one. And virtually all other standards have evaporated as designers have been allowed to favor “pretty” and “cool” over functional and intuitive. I could go on for days about software developers who similarly run amok, employing all kinds of functionality mostly because it is new, with absolutely no consideration for the user experience. What we are doing is reverting to the balkanized state of applications software before the World Wide Web.

And while I call out designers and developers, the fault really lies with the product managers who favor speed above all, or who themselves start to believe that “cutting edge” somehow confers prestige or competitive advantage. Who’s getting left out of the conversation? The end-user customer. What does the customer want? At a basic level the answer is simple: a clean, intuitive interface that lets them access data and get answers as quickly and painlessly as possible. Standard stuff, and the best reason that being different for the sake of being different isn’t in your best interest.

Search is Easy; Data is Hard

The New York Times Magazine has just published a fascinating article about Google, discussing whether Google has become an aggressive monopolist in the area of search and, if so, whether it needs to be broken up under antitrust law. The article, which is well worth a read, cites case after case where Google ostensibly derailed other companies that had seemingly developed better search tools than Google.

Better search tools than Google? Is that even possible? That’s where I take some slight exception to the article. Possibly in order to make this topic more accessible to a mass audience, it labels all these competitive search providers as “vertical search” companies.

Those of us with some history in the business remember back to around 2006, when “vertical search” was a thing, a thing that has long since faded. At the time, vertical search meant a full-text search engine, much like Google, but one focused on a single vertical market or specific topic area. The thinking was that if publishers curated the content being indexed, the search results would be stronger: more accurate, more contextual, and more precise. A prime example at the time was the word “pumps.” As a search term, it’s a tough one – the user could be looking for a device that moves fluids … or shoes. A vertical search engine, oriented towards either equipment or fashion, would reliably return more relevant results. Vertical search failed as a business not because it was a bad idea, but because (and let’s be honest here) most of the publishers rushing to get into it were too lazy to do the up-front curation work. And without quality up-front curation, vertical search quickly becomes just plain bad search.
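The “pumps” problem can be sketched in a few lines. The documents and vertical tags below are invented; the point is simply that the curation (the tagging) does the disambiguating work before the keyword match ever runs.

```python
# Invented corpus; each document has been curated into a vertical up front
DOCUMENTS = [
    {"vertical": "industrial", "title": "Centrifugal pumps for fluid transfer"},
    {"vertical": "industrial", "title": "Sizing a sump pump"},
    {"vertical": "fashion",    "title": "Patent leather pumps, fall lineup"},
    {"vertical": "fashion",    "title": "Heel heights: pumps vs. flats"},
]

def vertical_search(query, vertical):
    """Keyword match, but only within documents curated into one vertical."""
    stem = query.lower().rstrip("s")  # crude singular/plural handling
    return [
        d["title"] for d in DOCUMENTS
        if d["vertical"] == vertical and stem in d["title"].lower()
    ]

print(vertical_search("pumps", "industrial"))
print(vertical_search("pumps", "fashion"))
```

The same ambiguous query returns two entirely different, and entirely relevant, result sets, but only because someone did the up-front curation the paragraph above describes.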

Vertical search as used in the article really refers to vertical databases. The difference is important because the article also states that parametric search is hard. That statement is simply more proof that vertical search, as used in this article, means databases. Parametric search is not hard: collecting and normalizing data so it can be searched parametrically is hard. Put another way, searching a database is easy, provided there is a database to search.
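A small sketch makes the asymmetry plain. The product listings, units, and field names below are invented; notice that the parametric search itself is a trivial filter, while all the real work sits in the normalization step that makes the filter possible.

```python
# The hard part: mapping messy source values onto one consistent schema
UNIT_FACTORS = {"gpm": 1.0, "lpm": 0.2642}  # normalize flow rates to gal/min

raw_listings = [
    {"maker": "Acme", "flow": "120 gpm", "voltage": "230V"},
    {"maker": "Brio", "flow": "500 lpm", "voltage": "115 V"},
]

def normalize(listing):
    """Turn one messy source record into a clean, comparable record."""
    value, unit = listing["flow"].split()
    return {
        "maker": listing["maker"],
        "flow_gpm": float(value) * UNIT_FACTORS[unit.lower()],
        "voltage": int(listing["voltage"].lower().rstrip("v").strip()),
    }

catalog = [normalize(l) for l in raw_listings]

# The easy part: parametric search over the normalized records
def parametric_search(records, min_flow_gpm=0, voltage=None):
    return [
        r for r in records
        if r["flow_gpm"] >= min_flow_gpm
        and (voltage is None or r["voltage"] == voltage)
    ]
```

Once `catalog` exists, answering “pumps over 125 gpm” is one list comprehension; getting from `"500 lpm"` and `"115 V"` to `catalog` is the part Google never wanted to do.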

Google never wanted to do the work of building databases. It sometimes bought them (for example, the $700 million acquisition of airline data company ITA) or “borrowed” them (pulling results from third-party databases into its own search results, effectively depriving the database owner of much of its traffic – think Yelp). What Google did instead was devote unfathomable resources to developing software to try to make unstructured data as searchable as structured data. While it made some impressive strides in this area, overall Google failed.

With this context, you can clearly see why data products are so important and valuable. Data collection is hard. Data normalization is hard. But there’s still no substitute for it, something Google has learned the hard way. It may be disheartening to see survey after survey where we learn that users turn to Google first for information. But this is the result of habituation, not superior results. For those who need to search precisely, and for those who really depend on the information they get back from a search, data products win almost every time … provided that users can find them. Read this article and judge for yourself just how evil Google may be…

Blockchain: The Next Big Thing

We all lived through the heights of the social media craze, when every new product needed a social aspect in order to succeed (success being defined as getting funding). My personal favorite was the backyard grill thermometer that posted the temperatures of what you were cooking to Facebook and Twitter. (Okay, there was a little more to it than that, but not much.)

But as an Internet fad, social is starting to cycle down, meaning that another Internet fad needs to take its place. My nomination: blockchain.

You have doubtless heard of blockchain, although the odds are you don’t know exactly what it is or what it does. Most people don’t. My understanding of it is sketchy. But when it comes to the Internet, complexity is a benefit because everyone salutes when they hear about a new service using blockchain, without being able to ask any tough questions about how or why.

A great example of this is a restaurant review site called Munchee. Munchee plans to disrupt sites such as Yelp and Zagat in part by using blockchain technology. Think about that for a while. Or better yet, don’t think about it. You’ll get a headache.

Munchee has a few interesting twists to it. First, it’s meant to be more granular than sites like Yelp, by focusing on the individual dishes a restaurant serves, based on the belief that all dishes served by a particular restaurant are unlikely to be of equal quality. You might doubt the need, but it’s a plausible idea.

Munchee also wants to correct for sample bias in reviews. It’s well understood that people are more likely to post a review when they are dissatisfied. Munchee wants to get around this problem by rewarding all reviews with tokens that can be redeemed at restaurants or even sold to other Munchee participants for cash. If you are getting paid for every review, the reasoning goes, you’re as likely to write a positive review as a negative one. Again, an interesting idea.

To get even more accuracy, Munchee wants all reviews to be peer-reviewed by other Munchee users. Munchee intends to recruit peer reviewers by using (buzzword alert) machine learning to find the other Munchee users best qualified to pass judgment on the review. Still again, the notion of peer review is an interesting one.

So where exactly does blockchain come in? Does it, for example, somehow definitively tie the reviewer to the restaurant, in order to eliminate false reviews? Well, no. Instead, those award tokens that Munchee offers are actually crypto-tokens that are tied to the Ethereum blockchain. That’s it.

Munchee actually has some fresh approaches to review platforms, but it apparently couldn’t resist the temptation to bolt on a tenuous blockchain application to sound even cooler and more cutting-edge. Unfortunately, that works to obscure the more basic ideas it has that are likely to be where the real value is created. We all need to be careful not to fall into the trap of rushing to adopt new technologies just because they get a buzz around them. You’ll only end up confusing your customers … and yourself … about the true ways you offer value.

Data Marketplaces: Almost There

There has been much excitement about the recent launch of the Salesforce Data Studio, a new data-sharing platform within the Salesforce Marketing Cloud.

The idea of the Data Studio is simple: marketers can, on a fully automated basis, identify, order and integrate datasets that others are offering for sale. In its early implementation, the Data Studio seems mostly like a cool way for marketers to buy email lists. But the vision is much bigger and more interesting: to allow marketers to augment and overlay existing email lists with more data so that they become smarter about their lists, target their efforts more effectively, and get better results.

At the time of launch, Data Studio is heavy on audience data, mostly from larger publishers, but there’s no reason any data publisher couldn’t participate as well, especially if the Data Studio wants to exploit its full potential.

Interestingly, Salesforce is not the only big player with an interest in data marketplaces. Amazon Web Services sells software through its Marketplace – again, a totally automated buying experience – but it also offers a selection of public domain datasets for free. It’s a small jump, then, for Amazon to start selling databases on behalf of others.

As you can see, neither of these two marketplaces is quite ready for prime time as far as becoming a meaningful sales channel for data publishers, but they’re tantalizingly close. Keep an eye on these marketplaces: they could become very important to data publishers very quickly.