With the arrival of a new administration in Washington, government websites have been in the news, with many groups watching them closely to read the political tea leaves. And the new administration has not been shy about making changes. Within days of the Inauguration, there were reports of substantial changes to the White House website, with whole categories of content suddenly disappearing. Similarly, controversy erupted over the removal of content from the Environmental Protection Agency website.
Leaving aside the highly charged politics driving these actions, there is an important point here for data publishers: online data doesn’t last forever. And that’s a big area of opportunity for publishers.
Company websites are a useful source of current information on companies. Companies generally do a great job of keeping information on their leadership teams, office locations, products and the like current and accurate. But while current data is what most people want, those who really want to understand a company also want to know what came before. Even more importantly, if you have enough history, you can start to see trends. But as I just noted, websites only tell you about the present, and they do so in a way designed to put the company in the best possible light.
For example, knowing the name of a company’s current CEO has some value. But there is often as much or greater value in knowing the name of the company’s prior CEO: perhaps she is a recruiting target, or perhaps her biography can provide insight into the changing focus and strategy of the company. And if you keep track of all prior CEOs and how long each served, you can, among other things, offer high-value insight into the stability of the company.
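To make the tenure idea concrete, here is a minimal sketch of how a publisher might turn a preserved leadership history into a simple stability measure. The record layout, the placeholder names and dates, and the `completed_tenures` helper are all illustrative assumptions, not a description of any particular product.

```python
from datetime import date
from statistics import mean

# Hypothetical leadership history: (name, start date, end date).
# An end date of None marks the current CEO.
ceo_history = [
    ("CEO A", date(2000, 1, 1), date(2010, 1, 1)),
    ("CEO B", date(2010, 1, 1), date(2012, 1, 1)),
    ("CEO C", date(2012, 1, 1), None),
]

def completed_tenures(history):
    """Return the length in years of each finished CEO tenure."""
    return [
        (end - start).days / 365.25
        for _name, start, end in history
        if end is not None
    ]

def average_tenure_years(history):
    """Average length of completed tenures; a rough stability signal."""
    return mean(completed_tenures(history))

print(round(average_tenure_years(ceo_history), 1))
```

A short average tenure, or a recent run of brief tenures, is the kind of signal that only becomes visible once the history has been kept.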
It’s the same idea with product information. Companies generally announce new products with great fanfare on their websites – usually a press release and often much more. But when new products fail and are discontinued, most companies scrub their websites to remove all traces of these products. There are lots of use cases where knowing what a company is no longer doing is at least as valuable as knowing what it is currently doing. But this kind of information disappears quite quickly online, except in cases where a savvy publisher held onto it.
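As a rough illustration of what holding onto it can look like in practice, the sketch below compares dated snapshots of a company's product list and reports which products have vanished and when each was last seen. The snapshot format, the product names, and the `discontinued_products` helper are assumptions made up for this example.

```python
# Hypothetical snapshots, oldest first: (capture label, products listed).
snapshots = [
    ("2015-01", {"Widget", "Gadget"}),
    ("2016-01", {"Widget", "Gizmo"}),
    ("2017-01", {"Widget"}),
]

def discontinued_products(snaps):
    """Map each product missing from the latest snapshot to the
    label of the last snapshot in which it appeared."""
    latest = snaps[-1][1]
    last_seen = {}
    for label, products in snaps:
        for name in products:
            # Later snapshots overwrite earlier ones, so this ends up
            # holding the most recent sighting of each product.
            last_seen[name] = label
    return {n: seen for n, seen in last_seen.items() if n not in latest}

print(discontinued_products(snapshots))
```

The comparison itself is trivial; the value lies in having captured the earlier snapshots before the company removed the pages.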
Perhaps the most intriguing example of preserving online data is The Internet Archive, which takes periodic snapshots of millions of websites. This non-profit project has become a goldmine for researchers, lawyers, investigators, historians, analysts and even savvy salespeople looking to understand how companies have grown and evolved over time.
While it’s easy to conclude that “everything is online now,” the fact is that a lot of information, particularly company information, disappears fairly quickly from the web, both by accident and by design. Smart publishers are the ones who understand this and who set themselves up to capture and preserve this information as a way to enhance the value of their own online data products.