Are Chinese statistics manipulated?

In a comment on one of my previous blog posts, CrisisMaven asked me whether Chinese statistics are manipulated to meet political objectives. Indeed, it is not unusual for foreign observers to question Chinese data in this way. Given the importance of the question (if the underlying data is manipulated, all analysis based on it will be distorted), I would like to use this blog post to answer it.

Is the data manipulated by the authorities to meet political objectives? My answer, at least for the data of recent years, is a firm NO. How do I know? The reason is simple: in an economy as large as China’s, it is extremely hard, if not impossible, to massage the data without leaving a trace. Any manipulation will quickly be spotted by observers, and I personally haven’t found any signs of data manipulation in recent years.

In China, a great many government departments, semi-government organizations, and private sector institutions publish statistics. For example, in addition to the National Bureau of Statistics (NBS), which is the main official data provider in China, dozens of other government bodies, including the People’s Bank of China (the central bank), the Ministry of Finance, the Ministry of Commerce, and the National Development and Reform Commission, compile and release their own data. Meanwhile, most sector associations, such as the China Iron and Steel Association and the China Association of Automobile Manufacturers, make their data available to the public. Moreover, some companies, such as Soufun and CLSA, also report data that they compile themselves. This is not to mention the high-quality price data generated every day by various markets (Shanghai Futures Exchange, Dalian Commodity Exchange, etc.) now that China has become a market-based economy. Just as different parts of the economy are connected to each other within an organic whole, the data released by different institutions is interrelated and needs to be mutually consistent. Hence, any attempt to manipulate data (say, flattering the GDP number a little) requires the concerted effort of various government departments as well as private sector institutions. If some participants do not massage their own data in the same way, inconsistencies arise, leaving the manipulators’ footprints in the statistics.

To be concrete, I’ll use the V-shaped recovery of China’s GDP growth over the last several quarters as an example. As the dark blue line in figure 1 shows, China’s GDP growth started to bottom out in the second quarter of 2009. At the time, however, many foreign observers believed that the growth number was overstated and the recovery was not real. In fact, to tell whether the improvement in GDP growth is real, one need only compare the GDP series with other related indicators. As figure 1 shows, the uptick in the GDP number is corroborated by industrial value added, electricity consumption, and value-added tax revenue, which all posted V-shaped recoveries almost simultaneously. If one is not satisfied with the series listed above, there are more at one’s disposal, ranging from domestic transportation and product price numbers to international trade data released by other countries. All of these show similar V-shaped patterns, suggesting that the improvement in the GDP number is driven by underlying growth momentum. This example shows how one can detect manipulation by comparing related indicators.
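The comparison described above can be sketched in a few lines of Python. This is only a minimal illustration, not the procedure behind figure 1: the function names and the one-quarter tolerance are my own assumptions, and the growth rates are made-up placeholders rather than actual Chinese data.

```python
from typing import Dict, List


def trough_index(series: List[float]) -> int:
    """Return the position of the lowest value in a series."""
    return min(range(len(series)), key=lambda i: series[i])


def troughs_agree(indicators: Dict[str, List[float]], tolerance: int = 1) -> bool:
    """True if all indicators bottom out within `tolerance` periods of each other."""
    positions = [trough_index(values) for values in indicators.values()]
    return max(positions) - min(positions) <= tolerance


# Placeholder year-on-year growth rates (percent) by quarter.
# These numbers are invented for illustration and are NOT real statistics.
illustrative = {
    "gdp": [8.0, 6.0, 5.0, 7.0, 9.0],
    "industrial_value_added": [10.0, 6.0, 4.0, 8.0, 11.0],
    "electricity_consumption": [4.0, -2.0, -1.0, 2.0, 6.0],
}

print(troughs_agree(illustrative))
```

A series that kept falling while the others recovered would make the check fail, which is the kind of footprint the comparison is meant to reveal. In practice one would also have to account for differences in definition and coverage before reading anything into a mismatch.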

As a consumer of a wide range of Chinese statistics, I carry out exercises like the one above fairly frequently. I personally haven’t found any significant inconsistency in Chinese data over the last decade, which makes me believe that data manipulation has not been a big problem in China in recent years. Indeed, this is the consensus in the academic field: most researchers on this topic agree that there is little sign that China is manipulating its economic data.

Comments

An interesting topic. Seems that the arguments mainly focus on the national level data consistency among various institutions. However, the possibility of local government level data overstatement or “manipulation” is not discussed at all – this might be the crux of the matter.

Thanks for your comment. You are right that the post focuses entirely on national level data. I avoided discussing local level data in the post because the piece would otherwise have become too long. It is an interesting topic, though, and I have two points to make here.
First, at the local level, government officials have strong incentives to manipulate economic data, which is why the quality of local level data is generally worse than that of national level data. Among all the criteria on which local officials are evaluated, economic growth (GDP growth, to be specific) plays the predominant role, giving local officials strong incentives to flatter local GDP numbers. It is well known that the sum of the provincial GDP numbers is larger than the national figure. In 2009, the provincial GDP numbers totaled 36.2 trillion RMB, 8% more than the national figure (33.5 trillion) reported by the central government.
Second, I don’t think the local data problem is the crux of the matter. The national level data is not compiled by aggregating the local level data reported by local governments. Instead, it is worked out by the central government largely independently of local governments. That’s why we can see discrepancies between local and national data in the first place. Hence, data manipulation at the local government level does not necessarily lead to distorted national level data.
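For readers who want to check the 2009 arithmetic quoted above, here is a minimal sketch. The function name `relative_gap` is only an illustrative choice; the two inputs are the figures given in the reply.

```python
def relative_gap(sum_of_parts: float, whole: float) -> float:
    """Excess of the summed parts over the reported whole, as a fraction of the whole."""
    return (sum_of_parts - whole) / whole


provincial_sum = 36.2  # trillion RMB, sum of provincial GDP for 2009, as quoted above
national = 33.5        # trillion RMB, national GDP for 2009, as quoted above

gap = provincial_sum - national
print(f"gap: {gap:.1f} trillion RMB")                                 # gap: 2.7 trillion RMB
print(f"relative gap: {relative_gap(provincial_sum, national):.1%}")  # relative gap: 8.1%
```

The result rounds to the 8% mentioned above.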

Thank you Gao, this is copious material and a lot of work to correlate. First of all, many thanks for this painstaking argument! I will look more closely at these figures, several of whose sources I didn't yet have in my Statistical Reference List (thanks for that too), and will eventually report back on what feeling I have after that.

Mr Gao:
Thanks for your detailed blog posts.
My experience of Chinese companies is that the profitability numbers look rosy, but some may account aggressively for accounts receivable and inventories. Eventually some of them will have to take a haircut on these fronts.
A case in point is the Chinese operation of telecom equipment vendor UT Starcom. For many years they had rosy revenue growth and profit numbers from supplying equipment to Chinese telecom SOEs. However, this picture was built on inventories and accounts receivable, quite a lot of which eventually had to be written off. The division was eventually sold off.
Official macroeconomic statistics may be accurate but I am not so sure about the micro-economic details that go into that macro-economic story.
thanks,
Raj

Hi Raj, thanks for your comment. I’m not a microeconomic expert, but I’ll give my thoughts on it anyway.
I agree with you that in China, as in other economies, there might be some companies manipulating their books to get rosy numbers for whatever reason. However, apart from anecdotal evidence, there is little data to quantify the situation. So we don’t know whether book manipulation is more severe in China than in other countries.
Regarding the macro implications of possible book manipulation at the micro level, I think the quantitative effects are not very big. Macroeconomic data is not merely an aggregation of microeconomic data reported by individual firms. Instead, a substantial part of macro data is compiled by the statistical agencies through sample surveys. Besides, data such as money supply, fiscal revenue, and foreign trade is not based on information from companies at all. Hence, even if there are some data problems at the micro level, they do not necessarily translate into problems in the macro data.

Local government level data manipulation would be a very interesting topic. Systematically, I can't see the direction of the biases in local government level data. There are incentives for some local governments to "overstate" to get the performance credit, and for some to "understate" to get more subsidies from the central government.

Hi Yueqing, thanks for your comment. I personally see the direction of the biases tilting to the upside. As I explained in my response to the first comment, the evaluation system for local government officials gives them a strong incentive to flatter the growth numbers. I cannot completely rule out the possibility that some local officials understate local economic performance to get more subsidies. But I think this is confined to lagging areas, which account for only a small share of the national economy.

Dear Mr. Gao,
thank you for your interesting post.
In general, I agree with you.
However, I don't understand why you "personally haven't found any significant inconsistency in Chinese data...", given that the sum of regional GDP in the CSY and the "national" GDP are two very different numbers!
For 2007 (the latest data available in the CSY), the difference is approx. 10% (i.e. the sum of regional GDP is 10% higher than the GDP given in the same statistics).
How do you explain that?

Hi Phma, thanks for your comment and good observation. I should have been clearer in the post that the inconsistency I mentioned referred only to inconsistency among national level data. Between national and local level data there are substantial and persistent discrepancies, with the provincial GDP numbers a widely cited example. Please see my response to the first comment for my views on this topic.

Mr. Xu,
In the third paragraph you mentioned CLSA data to support your argument. Ironically, there was actually a huge discrepancy between the CLSA data and the government data. Someone even had to jump out to “explain” the difference. Please refer to the following Caijing article (in Chinese).
http://www.caijing.com.cn/2009-04-03/110133317.html

Thanks a lot for your good observation. In fact, it highlights the difficulties in interpreting Chinese economic statistics. In principle, related economic indicators should be consistent with each other: their developments should be roughly the same. Hence, checking consistency among economic series can be used as a tool to detect data manipulation, as I said in the post. However, this is easier said than done. In practice, one should be very clear about the data (its definition, coverage, compilation method, etc.) before carrying out the consistency check. Otherwise, one might end up comparing apples with oranges, and the “inconsistency” found could simply be the result of different data definitions.
The PMI is a very good example. In China, there are two PMIs: one is released by the National Bureau of Statistics (NBS) and the other is compiled by CLSA. In early 2009, the two diverged visibly, with the NBS one recovering to 50 and the CLSA reading slipping below 45. For some observers, this was a clear sign that one of the two PMIs must be wrong (and for them it was more plausible that the official PMI released by the NBS was flattered). For me, however, the discrepancy looks more like the result of different data coverage. The NBS PMI includes survey information from over 700 enterprises, most of which are big, state-owned enterprises. The CLSA PMI, meanwhile, surveys about 400 enterprises, of which small and medium firms make up a substantial part. As the big SOEs felt the forceful policy stimulus more directly in early 2009, it is no wonder that the NBS PMI had a better reading than the CLSA one at that time. In fact, with the benefit of a year’s hindsight, it is fair to say that the official PMI (released by the NBS) was more informative in predicting economic developments.
The PMI case underlines the importance of a clear understanding of the data in interpreting Chinese economic statistics. Sometimes the accusations of manipulation made by observers are merely the result of their misunderstanding of the data. In a future blog post, “Are Chinese statistics reliable?”, I will talk about this issue and, more broadly, about how to interpret Chinese statistics effectively. That will complete my discussion of Chinese economic data.
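The coverage point in the PMI example can be illustrated with a toy calculation. This sketch assumes the common diffusion-index convention (share of firms reporting improvement plus half the share reporting no change); the actual NBS and CLSA methodologies, including their weights and adjustments, are not modelled here, and the survey answers are invented placeholders, not real responses.

```python
from typing import List


def diffusion_index(answers: List[str]) -> float:
    """Share answering "up" plus half the share answering "same", on a 0-100 scale."""
    if not answers:
        raise ValueError("no survey answers")
    up = answers.count("up")
    same = answers.count("same")
    return 100.0 * (up + 0.5 * same) / len(answers)


# Hypothetical answer patterns for ten firms of each type (NOT real survey data).
large_firms = ["up"] * 4 + ["same"] * 4 + ["down"] * 2   # index of 60 on their own
small_firms = ["up"] * 2 + ["same"] * 4 + ["down"] * 4   # index of 40 on their own

# Two hypothetical samples drawn from the same economy with different coverage.
mostly_large = large_firms * 8 + small_firms * 2
mostly_small = large_firms * 3 + small_firms * 7

print(diffusion_index(mostly_large))  # 56.0
print(diffusion_index(mostly_small))  # 46.0
```

Both samples describe the same hypothetical economy and neither is manipulated, yet one reads above 50 and the other below it purely because of who was surveyed.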

Thank you for bringing up an important issue in the debate on the Chinese economy.
"this is the consensus in the academic field as most researchers on this topic agree that there is little sign that China is manipulating its economic data"
I believe you are right: if "manipulating" means the direct, willful act of changing the numbers reported to it, then no, the NBS is not doing that.
The question might be: is the NBS asking the right statistical questions?
There have been accusations that parts of the Chinese economy (tourism, small getihu enterprises, restaurants) are not measurable in any meaningful way because they don't register correctly with the authorities.
That is an issue you have not addressed in this blog: how strong are the verifications of the data that the NBS sends out?
The question is not whether the NBS "manipulates" the numbers, but whether those that report to the NBS (SOEs, private companies, local energy-producing units) could be acting in their own interest when reporting.
I can think of the R&D numbers in 2007, when it was in a company's interest to classify work as R&D because it earned a tax cut. I am not sure that was meaningful statistical data.
I could be wrong, and both the external and internal logic of the NBS numbers may fit.

Can't speak for economic statistics but health statistics are most certainly manipulated for political reasons. The incidence and mortality figures reported for H1N1 pandemic influenza by China were wildly different to those of neighbouring countries during the peak activity of the infection. At a time when India had almost a thousand cases and Hong Kong had around 50, China had reported about five cases. Ultimately, China reported only a handful of H1N1 strain-related deaths, even when the 'normal' seasonal influenza would be expected to cause many deaths. No doubt, statistics were manipulated.

As someone who needs to work with data down to a neighborhood level, I find it interesting how the better packaged data always comes from districts with better infrastructure and real estate projects. Just yesterday in Changsha, I visited two neighboring district government statistics offices. After some detective work, the first office produced a small, iPhone-sized, 20-page photocopied pamphlet which was for "internal use only," while 5km away at another district they provided me with published bound editions of their local data with color photographs.
Guess which district the new hi-speed rail station was in?
I think local data is only as good as the interests and aspirations of the local government offices. Accurate? That's a different topic entirely. As much of the macro level data is built up from local level data, it makes me wonder how large the gap is once it gets carried all the way up to State level.