Tuesday, October 12, 2010

Google is creating a Google Price Index using the vast data they have on prices of goods and services available online. This provides a daily measure of inflation.

Hal Varian, Google's Chief Economist, says that the GPI shows a “very clear deflationary trend” for web-traded goods in the US since last year. In contrast, the GPI rose during the same period a year ago. Meanwhile, the official government "core" consumer price index shows a small 0.9% increase since last year.

When I met with Ben Bernanke last fall, I encouraged the Fed to rely more heavily on the vast amounts of "nanodata" and "nowcasting" available via the web and other sources to get more fine-grained and timely information about the state of the economy. Given the damage that deflation can cause to an economy, I have no doubt he's looking closely at the GPI as he thinks about how to manage monetary policy.

12 comments:

This is a very interesting concept, and it fits with what Google's CEO has stated: "we are in the business of making all the world's information accessible and useful". But I do not believe that this index could be a benchmark index after all. First of all, it does not include all the goods needed to build a reliable index. Furthermore, what if another company tries to do the same thing, e.g. Yahoo! or Microsoft via their search engines? Could Amazon or eBay do the same thing for more specialized goods? An interesting project would be to build a weighted index with all internet companies participating, thus covering the majority of goods, in order to create a really reliable internet-universal price index.

Did you know that there is already a price index, created by an ex-Google software engineer, available at the website Numbeo.com? It's crowdsourced like Wikipedia. It might be interesting to your readers as well.

Per our discussion today, it is clear that nanodata offers information that is simply unavailable (or very difficult to find) elsewhere. Given inevitable inaccuracy in current benchmarks, it seems silly to move forward without taking into account the GPI.

Whether the GPI should be a sole benchmark is unlikely to be determined immediately. But it should certainly be evaluated to see if it stands the test of time (and effectiveness).

Assuming that the GPI is eventually seen as valuable by the Fed and/or the private sector, I would be most curious about Google's thoughts on its pricing.

Google obviously has a competitive advantage in its data capabilities, and I'm not sure the best answer is to publish information such as the GPI for free. Even though there are benefits to the economics of free (as we discussed in class), I can imagine a large number of businesses that would pay to have these daily insights into pricing trends. Although this would represent a slight shift in business model for Google, the quantity of information that goes into the GPI and the potential power it offers may justify it.

I agree with Kostis about the challenge of finding suitable weights for the GPI. I do assume that housing and transport (currently taking up More than anything, however, I love how the GPI is offering a competing definition to measures of inflation, and certainly begs the question of why the federal government is still retrieving stats manually and not using technology to collect and analyze real-time data for the CPI. Could it be that there are expenditures that the fed felt could not be accurately attained from the internet?

As with most things, there will be a lot of resistance to changing the status quo. Especially in government matters, saying that we are no longer the authoritative source and are relying on a bunch of tech "geeks" to measure core inflation would be politically damaging. There clearly is a better way to measure CPI, but getting consensus on how it should be changed will be extremely difficult.

My prediction is that CPI will change in the next few years to reflect better methods of data collection. The agency won't openly admit to using someone else's method (like Google's GPI) but will instead copy 90%+ of it and make a few excuses to tweak it as they see fit.

I think more data, as long as it's true, is always a good thing, as it gives us more angles from which to look at things. Google data may be a good weatherglass for some key measures, like consumer confidence and CPI. But I definitely agree with Kostis that this set of data has some inherent problems that keep it from fully reflecting the economy, for example, the population bias, the goods selection

However, it would seem to me, reading the paper, that the authors are not conducting the correct statistical tests to arrive at the conclusion that a model including Twitter data represents an improvement over the current methodology.

They calculate a p-value of 3.4% to the hypothesis that an up-down accuracy of 87.6% could have arisen randomly using a classifier that achieves 50% accuracy over 10.9 instances of 20 day periods. I don't know how that test proves the point - it is a straw man argument.

First of all, they are not comparing their model against a model with 50% accuracy; they are comparing it against an existing base model (I0) for predicting the stock market, to which they add the Twitter factors. In this paper, I0 is a model that doesn't include Twitter data, but does include the DJIA from the past three time periods. The accuracy of this model is given as 73.3% in the same time period.

This is an important point because evaluating whether a test with 73.3% accuracy can randomly generate 87.6% accuracy for a given time period is very different from increasing accuracy from 50% to 87.6%. In fact, using the very same statistical test they propose (which is also a bit off; they should be looking at the probability of achieving an accuracy of 87.6% OR HIGHER), the p-value is 91.2%, which means there is a 91.2% chance this accuracy could have arisen randomly using the base model. In other words, there is a 91.2% chance that adding Twitter results is completely useless.
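For anyone who wants to try this kind of comparison themselves, here is a minimal sketch of the "OR HIGHER" tail test mentioned above: the chance of getting at least a given number of correct up/down calls under a baseline hit rate. The sample size of 20 days, the count of 18 correct calls (roughly 87.6% of 20), and the treatment of days as independent are all assumptions chosen for illustration. They are not taken from the paper, so the output is not expected to reproduce the 3.4% or 91.2% figures quoted here.

```python
from math import comb


def binom_tail(n, k, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Assumed values for illustration only (not from the paper).
n_days = 20      # assumed length of the test period
n_correct = 18   # about 87.6% of 20, rounded up

# Compare a coin-flip baseline with the 73.3% base model accuracy.
for baseline in (0.5, 0.733):
    tail = binom_tail(n_days, n_correct, baseline)
    print(f"baseline {baseline:.1%}: P(at least {n_correct}/{n_days}) = {tail:.4f}")
```

Whatever counts are plugged in, the tail probability under the 73.3% baseline is larger than under the 50% baseline, which is the crux of the objection.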

However, even if the baseline model WAS 50% accurate, they are still using the wrong approach for testing. They are basing their accuracy metrics on a test set of one 20 day period in December, which, they even admit in their paper, is a period of "stabilization" (of course only known in retrospect). A more convincing analysis would have tested a number of 20 day periods. Even if they couldn't have gotten more data, which seems surprising given the nature of the datasets, they could have used simple cross validation methods to create a variety of training and test sets from the same data and combined them using readily available methods. This would have given a more precise sense of the overall accuracy. Instead, they use a tortured approach to calculate their p-value, but why not just go ahead and classify 10.9 periods, and see what happens? I doubt their classifier will be 87.6% accurate for every one of those 10.9 periods, but that is essentially what they are assuming when they calculate that p-value.
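As a rough sketch of what "classify several periods and see what happens" could look like, the snippet below scores up/down predictions separately for each consecutive 20 day window. The function name `window_accuracies` and the toy series are made up for illustration. It also assumes the predictions for each window come from a model fitted only on earlier data. None of this is the paper's actual procedure.

```python
from statistics import mean


def window_accuracies(predicted, actual, window=20):
    """Up/down accuracy for each consecutive, non-overlapping window."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    scores = []
    # Step through the series one full window at a time; a trailing
    # partial window is ignored.
    for start in range(0, len(actual) - window + 1, window):
        pairs = zip(predicted[start:start + window], actual[start:start + window])
        hits = sum(p == a for p, a in pairs)
        scores.append(hits / window)
    return scores


# Toy data (made up): 1 = market up, 0 = market down.
actual = [1] * 20 + [0] * 10 + [1] * 10
predicted = [1] * 40

scores = window_accuracies(predicted, actual)
print(scores, mean(scores))
```

A classifier that looks excellent in one window can look like a coin flip in the next, which is why a single 20 day test set says little about overall accuracy.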

My main concern is that no intuition at all is given for why it may be that a public who feels "calm" as measured by Google via Twitter has any causal influence over whether the stock market goes up or down, let alone why the other emotions don't seem to matter. This is why I look at all this type of research with skepticism, and it doesn't help when hypotheses are not clearly identified or irrelevant to the point of the research.

Sorry, had to vent. I congratulate those of you who stayed through the end.

Thanks Leo :) I do not believe you did the calculations. I have to admit I did not read the paper. Just the article. Thanks to your analysis, I will read the article now with better understanding. Glad I made you practise your analytical skills!