Data's Brave New World

The ACLU has just released a report highlighting the growing relationship between law enforcement agencies and a Chicago-based company called Geofeedia. In a nutshell, Geofeedia is apparently marketing to law enforcement agencies a crowd surveillance tool that mixes geolocation with social media sentiment analysis.

This illustrates the gray area we operate in as data providers, especially those of us dealing with consumer data. Things that are perfectly legal may be seen by others as unethical and inappropriate. And, perhaps ironically, the power and pervasiveness of social media means that reputational risk becomes an outsized area of concern for those of us who deal in data.

On the one hand, Geofeedia is simply aggregating and analyzing information that individuals have voluntarily and publicly posted on various social media platforms. On the other hand, its particular application for these data can be seen to be chilling to lawful speech, dissent and free assembly. And as noted earlier, the law lags far behind these new technologies, and thus provides little guidance.

Facebook reacted to the ACLU report by quickly severing ties with Geofeedia. It understands that anything that creates even the slightest hesitancy to use its platform is detrimental to its own business. Instagram suspended Geofeedia as well. Even Twitter, which we have previously noted seems content to be a datastream for others to monetize, has suspended Geofeedia from commercial access to its data.

As we have noted, it’s difficult to come down on one side or the other in this issue. As a data producer, I think that aggregating and analyzing publicly available data is generally a beneficial activity. Indeed, what Geofeedia is doing is conceptually not all that different from what the many social sentiment analysis companies do when they sell aggregated insights to hedge funds seeking early warning on news and emerging trends. Yet at the same time, even if Geofeedia was working with the best of intentions, the optics of its product offering should have received greater attention. And that’s the lesson here for data publishers: just because you can do something doesn’t always mean you should do it. Perception has become as important as reality. Don’t let ignorance or arrogance crater your products or your entire business. Keep firmly in mind at all times that, especially when it comes to data, optics do matter.

 

 

 

Should Governments Sell Data?

Under the broad label of “open data,” governments around the world are opening up increasing numbers of fascinating and often valuable datasets to public access, in many cases via API.
 
As a recent article in Network World notes, London makes nearly 500 datasets available, and even smaller cities in the UK like Leeds make hundreds of datasets available as well. Perhaps most interesting of all is the initiative by the city of Copenhagen, called City Data Exchange, which takes open data in two important new directions. First, it intends to charge for its data, and second, it is also offering relevant databases from for-profit data producers, also for a fee.

The US has not been a leader in the open data movement, though more government data comes online on almost a daily basis now. Typically, the model in the US is that government data made available to the public is made available for free. That makes sense, since it was gathered at taxpayer expense and should therefore be made available for free – keeping the “free” in Freedom of Information if you will.
 
But when you think about it, there may be some merit to governments charging reasonable fees to access public datasets. Simply put, it forces governments to treat their data and the people using their data with more professionalism and respect. I’ve been involved in several promising projects that were to be based on government databases that suddenly disappeared because funding was cut, or the person who was responsible for the initiative left the agency and wasn’t replaced. It’s great to have a business based on free government data – until it isn’t. You are at the mercy of an organization that collects data its own way, for its own purposes, and only for as long as it feels it needs to collect it. Putting a revenue stream behind a dataset starts to change that dynamic.
 
Also of interest is Copenhagen’s plan to be a reseller of private databases. On the one hand, I celebrate the innovation and progressive thinking in this move. On the other hand, it feels backwards to me. If there is a commercial database that complements a government-created database, I think it makes a lot more sense for the commercial database publisher to resell the government data alongside its own. After all, it has the larger financial incentive, it has the staff that really understands data, and it has the marketing and sales capability the government lacks. Government entities are not well positioned to sell their own data, much less someone else’s data, and the better they get at it, the more likely they will cross the line and start competing with private business.
 
Government is a great source of data, though historically it has been a somewhat undependable source of data. Perhaps putting some modest revenue around it could improve that situation. But moving into the business of selling commercial data products, however well intentioned, is a bridge too far. There are too many specialized skills involved that government entities don’t have and shouldn’t develop.

Meet DiscoverOrg at BIMS
 
Want to find out why DiscoverOrg won a 2016 Model of Excellence Award?
 
This year’s winners will be showcased at BIMS, November 14-16 in Ft. Lauderdale. It’s a peer-to-peer forum complete with exclusive tracks on Data and the unique opportunity to hear from the MOE founders firsthand.  Register now to attend!
 
Here’s just a taste of the brilliance behind DiscoverOrg – be sure to attend BIMS to get the full story.
 
DiscoverOrg is a leading global sales and marketing intelligence tool used by over 2,000 companies to accelerate growth. DiscoverOrg’s solutions provide a constant stream of accurate and actionable company, contact, and buying intelligence that can be used to find, connect with, and sell to target buyers more effectively. CMO Katie Ballard says, “We believe accurate data is the foundation to faster revenue growth. You can’t make good decisions without it. How are you going to grow if you don’t have accurate data to build your sales and marketing strategy on? DiscoverOrg offers the most accurate, actionable, and integrated sales and marketing intelligence (covering contact, company, org charts, buying triggers, and predictive purchase data) that allows our customers to generate more leads, set more meetings, and close more deals. One of the reasons that the data is so accurate is that we have a team of 150 in-house researchers that verifies every single piece of data in our platform. We work mostly with technology, staffing, marketing, and consulting firms. Our clients run the gamut from the biggest brands and companies down to startups, and about 80% fall into technology (including hardware, software, information security, etc.), 10% in staffing, and 10% in the other industries.”
 
Hear more at BIMS!
 

Don't Turn Strength Into Weakness

For some time now, the publishing world has been crying foul over the growing power of ad blocking software products. Several studies suggest that as many as 50% of all online users have some ad blocking software installed. Some see this as a death knell for the industry, which is already struggling to maintain viability living off so-called “digital dimes,” a term describing how much less lucrative online advertising is than traditional print advertising, which is itself in decline.

One of the more prominent ad blocking software tools, Adblock Plus, which is published by a German company called Eyeo GmbH, is somewhat less militant than some of its competitors, and has come up with a concept called “acceptable ads” that allows specific advertisements to be whitelisted. Some third-party research has concluded that nearly one-third of all U.S. Internet users may be using Adblock Plus.

Ad blocking software that allows some ads to appear? It may seem odd, but that’s what Adblock Plus does. And how does Eyeo decide what ads are acceptable? Well, that’s where things get really strange. You see, Eyeo will accept payment from “larger organizations” in exchange for whitelisting their advertising. Don’t ask about the specifics of these deals because they are not disclosed. Not surprisingly, some publishers refer to this as a “protection racket.”

If you’re starting to see that Eyeo is compromising its entire brand promise, hold onto your seat. That’s because Eyeo has just rolled out its own real time bidding platform for whitelisted ads. Yes, the company that built its business blocking ads is now in the business of selling ads!

Eyeo justifies all this by allowing users to click on any of the ads Eyeo serves to them in order to rate them. How users rate various ads will determine what ads they see in the future. This ostensible innovation is supposed to make this initiative palatable to Adblock Plus users.

You probably already see the issue. Having built a popular tool to block ads that may be used by as many as a third of all Internet users, Eyeo has a chokehold on almost every ad-supported website, giving it tremendous market power. And it exercised that power by accepting payments to allow ads to slip through its blocking software. It’s an approach that isn’t totally satisfactory to either Adblock Plus users or website owners. My experience has been that when you are not absolutely clear who your customer is, things end badly. It’s one thing to be a marketplace where you match buyers and sellers for a fee. It’s entirely another thing to try to get paid to match reluctant sellers to reluctant buyers. Indeed, it’s not even clear that what Eyeo has is even a marketplace at all.

The object lesson here is that having tremendous market power is always a two-edged sword and thus must be handled with extreme care. The more greedily and ruthlessly you wield your market power, the more likely you will ultimately lose it as you offend all the various constituents in your market. Through its actions, Eyeo may be sowing the seeds of its own demise. There’s a lesson here for data publishers. 

Make the Product, Not Just the Raw Material

Twitter exhausts me. Even though I feel I have been very selective in who I choose to follow, the volume is overwhelming. Every time I go to review my Twitter feed, I waste far too much time in an exercise to separate the wheat from the chaff to find useful nuggets of news or insight. Twitter ought to be incredibly valuable, but in its current design, users find that to overcome the sheer volume of tweets to get noticed, they have to pump out an increasing number of tweets themselves. It’s an endless game of volumetric one-upmanship that is ultimately self-defeating.

A recent article in the Wall Street Journal takes the view that Twitter is very good as a raw content creation platform, but a failure at making that content useful or even intelligible. We know that Twitter content has value: consider the number of companies looking for trends, breaking news and other signals to gain an edge and generate profits. But it is companies other than Twitter that are adding the value and making the money.

This got me to thinking. Many data publishers still focus on the quantity of the data they provide, not its value. And this inevitably leads to a mentality of selling data by the pound. These publishers deliver lots of data, and their customers figure out what to do with it. For a long time, this was a good business approach for publishers, but hardly an optimized one.

By wrapping their content in software, publishers have added value by allowing customers to act on their data more powerfully. But while data-software integration has been a boon for data publishers, there may still be entirely new products and even entirely new businesses hiding in your data. There are clues to this. Do you have lots of consultants buying your data year after year? Do they renew easily, rarely complaining about price increases? Chances are at least a few of them are productizing your data in some way. Get familiar with their specialties and their services, and you can often come away with new product ideas.

Have you ever changed your file layouts or stopped delivering a specific data field, only to get immediate panic calls from some of your customers? Chances are, they’ve built software around your content and are doing something very valuable with it. A few casual inquiries about how they’re using your data will often yield tremendous insights. Do you have whole categories of customers where you have no idea why they buy your data? Chances are, it will be worth your time to find out. It’s not unusual to find that markets you never considered are making valuable use of your data.

Data-software integration is great, but in the majority of cases, publishers are simply helping their customers better manipulate their data. There’s a whole additional level of value that can be created by turning your data into finished products. And while I am not arguing that you should try to run all your customers out of business, if some of them have found a way to make money by re-formatting, augmenting or manipulating your data to add value to it, I’d argue that such opportunities properly belong to the owner of that data. And your subscriber file is often the first and best place to look for clues to such opportunities.

eBay Revamps By Adding Structure

eBay, the giant online marketplace/flea market, is reacting to lackluster growth in an interesting way: with a new focus on structured data. The goal, simply put, is to make it easier for users to find merchandise on its site.

Currently, eBay merchants upload free-text descriptions of the products they are offering for sale. This works reasonably well, but as we all know, searching on unstructured text is ultimately a hit-or-miss proposition. And with over one million merchants on eBay doing their own data entry with very few rules and little data validation, you can imagine the number of errors that result, ranging from typos to inconsistent terminology to missing data elements. The consequence of this is that buyers can’t efficiently and confidently discover all items available for sale, and sellers can’t sell their products because they are not being seen.

It may seem odd that after several decades in business, eBay is just getting around to this. But in fact it hasn’t been standing still. Rather, it’s been investing its resources in perfecting its search software, trying to use algorithms to overcome weaknesses in the descriptive product data. And while eBay has made great strides, this shift to structured data is really an admission that there are limits to free text searching.

Granular, precise search results can’t be better or more accurate than the underlying data. If you want to be able to distinguish between copper and aluminum fasteners in your search results, you need your merchants to specify copper or aluminum, spell the words correctly and consistently, and have agreement on how to handle exceptions such as copper-plated aluminum. Ideally, you also want your merchants to tag the metal used in the fastener so that you don’t have to hunt for the information in a block of text, with the associated chance of an erroneous result.
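To make the fastener example concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the Listing record, the sample listings, and both search functions are illustrative assumptions and say nothing about how eBay actually stores or searches its data. It simply contrasts a naive substring match on free text with an exact match on a merchant-supplied material tag.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Listing:
    """Hypothetical listing record, invented for this illustration."""
    title: str                      # free text typed by the merchant
    material: Optional[str] = None  # structured tag; None if left blank


# Invented sample data covering the problem cases described above.
LISTINGS = [
    Listing("Copper fasteners, box of 100", material="copper"),
    Listing("Coper fastener set", material="copper"),  # typo in free text
    Listing("Aluminum fasteners with copper plating", material="aluminum"),
    Listing("Assorted metal fasteners"),  # merchant supplied no tag
]


def free_text_search(listings: List[Listing], term: str) -> List[Listing]:
    """Naive case-insensitive substring match on the title."""
    term = term.lower()
    return [item for item in listings if term in item.title.lower()]


def structured_search(listings: List[Listing], material: str) -> List[Listing]:
    """Exact match on the structured material field."""
    return [item for item in listings if item.material == material]


if __name__ == "__main__":
    print("Free text:", [i.title for i in free_text_search(LISTINGS, "copper")])
    print("Structured:", [i.title for i in structured_search(LISTINGS, "copper")])
```

In this toy data, the free-text search misses the misspelled copper listing and wrongly returns the copper-plated aluminum one, while the structured search returns both copper listings. Neither search can classify the untagged listing, which is the point about missing data: no amount of search logic recovers information the merchant never supplied.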

While we’ve come to believe there are no limits to full-text search wizardry, remember the best software in the world breaks down when the data is wrong or doesn’t exist. Google spent many years and millions of dollars trying to build online company directories, before finally admitting that even it couldn’t overcome missing and incorrect data.

Databases and data products are all about structure. Cleaning up and organizing data is slow, expensive and not a lot of fun, but it is a huge value-add. Indeed, one of the biggest complaints of those working in the Big Data arena is that the data they want to analyze is simply too inconsistent and undependable to use.

These days, anyone can aggregate giant pots of data. But increasingly, value is being created by making these pots of data more accessible by adding more structure. This is the essence of data publishing, and something successful data publishers fully appreciate and never forget.