Last week, I discussed how the Internet of Things opens up all sorts of opportunities to create highly valuable, highly granular data. The Billion Prices Project, which is based at MIT, provides another route to the same result. Summarized very simply, two MIT professors, Alberto Cavallo and Roberto Rigobon, collect data from hundreds of online retailers all over the world to build a massive database of product-level pricing data, updated daily. It’s an analytical goldmine that can be applied to a broad range of problems.
One obvious example is the measurement of inflation. Currently, the U.S. Government develops its Consumer Price Index inflation data the old-fashioned way: mail, phone and field surveys. Inherently, this process is slow. Contrast that with the Billion Prices Project, which can measure inflation on a daily basis, and do so for a large number of countries.
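To make the idea of a daily inflation measure concrete, here is a minimal sketch of how a day-over-day price index could be chained together from product-level prices. It is not the Billion Prices Project's actual methodology, which isn't described here. It assumes a simple unweighted geometric mean of price changes (often called a Jevons-style index), and the function name `daily_index` and the sample products and prices are invented purely for illustration.

```python
import math

def daily_index(prices_by_day):
    """Chain a daily price index from per-product prices.

    prices_by_day: list of dicts mapping product id -> price, one dict per day.
    Returns a list of index levels, starting at 100.0 on the first day.
    """
    index = [100.0]
    for prev, curr in zip(prices_by_day, prices_by_day[1:]):
        # Only compare products observed (with positive prices) on both days.
        common = [p for p in curr if p in prev and prev[p] > 0 and curr[p] > 0]
        if not common:
            index.append(index[-1])  # no overlap: carry the level forward
            continue
        # Unweighted geometric mean of price relatives (assumed method).
        mean_log = sum(math.log(curr[p] / prev[p]) for p in common) / len(common)
        index.append(index[-1] * math.exp(mean_log))
    return index

# Hypothetical sample data: three days of scraped prices.
sample = [
    {"milk": 2.00, "bread": 3.00},
    {"milk": 2.10, "bread": 3.00},
    {"milk": 2.10, "bread": 3.15, "eggs": 4.00},
]
levels = daily_index(sample)
print([round(x, 2) for x in levels])
```

Note that a product appearing for the first time (eggs, in the sample) is simply skipped until it has a prior-day price to compare against. A real index would also need to weight product categories, which this sketch ignores.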
But measuring inflation is just the beginning. The Billion Prices Project is exploring a range of intriguing questions, such as the premiums that are charged for organic foods and the impact of exchange rates on pricing. You’re really only limited by your specific business information needs – and your imagination.
The Billion Prices Project also offers some useful insights for data publishers. First, the underlying data is scraped from websites. The Billion Prices Project didn’t ask for it or pay for it. That means you can build huge datasets quickly and economically. Second, the dataset is significantly incomplete. For example, it entirely ignores the huge service sector of the economy. But it’s better than the existing dataset in many ways, and that’s what really matters.
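For readers who haven't seen what scraping looks like in practice, the sketch below pulls prices out of a snippet of product-page HTML using only Python's standard library. Everything about the markup is an assumption: the `price` class name, the dollar formatting and the sample products are hypothetical, and real retailer pages vary widely. It works on an inline string rather than fetching a live site.

```python
from html.parser import HTMLParser

class PriceParser(HTMLParser):
    """Collect numeric prices from <span class="price"> elements (assumed markup)."""

    def __init__(self):
        super().__init__()
        self._in_price = False
        self.prices = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag.
        if tag == "span" and ("class", "price") in attrs:
            self._in_price = True

    def handle_data(self, data):
        if self._in_price:
            # Strip the currency symbol and thousands separators.
            text = data.strip().lstrip("$").replace(",", "")
            if text:
                self.prices.append(float(text))

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_price = False

# Hypothetical page fragment standing in for a retailer's product listing.
SAMPLE_HTML = (
    '<div><span class="name">Milk</span><span class="price">$2.10</span></div>'
    '<div><span class="name">Laptop</span><span class="price">$1,299.00</span></div>'
)

parser = PriceParser()
parser.feed(SAMPLE_HTML)
prices = parser.prices
print(prices)
```

A production scraper would also have to handle fetching, differing page layouts, multiple currencies and each site's terms of use, none of which is shown here.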
If you are considering building a database, new web extraction technology gives you the ability to build massive, useful and high-quality datasets quickly and economically. And as we have seen time after time, the old aphorism “don’t let the perfect be the enemy of the good” still holds true. If you can do better than what’s currently available, you generally have an opportunity. Don’t focus on what you can’t get. Instead, focus on whether what you can get meaningfully advances the ball.