I think the FT case is blown out of proportion. It is well known
that wealth data are uncertain. I for one do not know where
Piketty's wealth data come from and I am sure very few
people do. There are also myriad choices you have to make
in estimating wealth (e.g. capitalization or not;
forward-looking or backward-looking) which you do not have to make
when you use income.

(Although with income too there are many issues and many
choices. If one were to go through my data, point by point, one
could also detect a number of problems or inconsistencies:
treatment of zero and negative incomes, imputation for housing,
imputation of home consumption, which prices to use for home
consumption, how to get correct self-employment income, etc. And
many of these decisions vary from survey to survey and are not
well documented, or the documentation is so immense that you
cannot go through it or figure it out.)

The situation with wealth data -- that much I know -- is
much worse. I twice refereed global wealth inequality papers by
Davies et al.: there were many assumptions used in their papers,
and there are many more things you have no idea about, e.g. how
wealth is defined in India, who is covered or not, how reliable
it is, what prices are used etc. You just have to accept the
numbers they (the Indian statistical office or Davies et al.) come
up with. People may not realize that behind one such summary
number there are thousands or even hundreds of thousands of
household-level data points, and no one can go through hundreds
of surveys and thousands of individual records to verify them
all.

And if you create (as Piketty did) a bunch of data for a bunch of
countries, there are bound to be issues. The question is whether
there was intentional data manipulation to get the answer one
desires. I do not know, but it strikes me as unlikely that
someone who wanted to do that would have posted all the data,
complete with formulas, on the Internet. And Thomas's data have
not been there only since the book was published; they were
there for months or even years before.

Now consider the FT's points one by one:

"One apparent example of straightforward transcription error
in Prof Piketty’s spreadsheet is the Swedish entry for 1920. The
economist appears to have incorrectly copied the data from the
1908 line in the original source."

Okay, quite likely. When you transcribe hundreds of data points,
transcribing some wrongly is very likely. They give only one
example. Are there more?

"A second class (sic!) of problems relates to unexplained
alterations of the original source data. Prof Piketty adjusts his
own French data on wealth inequality at death to obtain
inequality among the living. However, he used a larger adjustment
scale for 1910 than for all the other years, without explaining
why."

Piketty has to explain why he used a different adjustment
scale. Let's wait to hear from him.

"In the UK data, instead of using his source for the wealth
of the top 10 per cent population during the 19th century, Prof
Piketty inexplicably adds 26 percentage points to the wealth
share of the top 1 per cent for 1870 and 28 percentage points for
1810."

Same thing.

"A third problem is that when averaging different countries
to estimate wealth in Europe, Prof Piketty gives the same weight
to Sweden as to France and the UK – even though it only has
one-seventh of the population."

This is neither here nor there. Perhaps the weights should
be country wealth shares, not population shares. At times you
want unweighted averages and at times population-, income- or
wealth-weighted ones. The question is whether one or another
averaging makes more sense for the issue at hand and whether you
stick to whatever you have chosen.
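
To see how much the choice of weights can matter, here is a minimal
sketch in Python. The wealth shares are made-up illustrative numbers,
not Piketty's or anyone else's data; the relative population weights
only encode the FT's statement that Sweden has about one-seventh of
the population of France or the UK. The function name weighted_mean
is likewise just a label chosen for this illustration.

```python
def weighted_mean(values, weights=None):
    """Average `values`; with weights=None every value counts equally."""
    if weights is None:
        weights = [1.0] * len(values)
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total


# HYPOTHETICAL top-10% wealth shares (per cent), for illustration only.
top10_share = {"France": 60.0, "UK": 70.0, "Sweden": 50.0}

# Relative population weights (assumption: Sweden = 1/7 of the others).
population = {"France": 7.0, "UK": 7.0, "Sweden": 1.0}

countries = list(top10_share)
values = [top10_share[c] for c in countries]

unweighted = weighted_mean(values)
pop_weighted = weighted_mean(values, [population[c] for c in countries])

print(f"unweighted:          {unweighted:.1f}")
print(f"population-weighted: {pop_weighted:.1f}")
```

With these invented numbers the two averages differ by a few
percentage points. Neither is "the" right one; what matters is that
the choice fits the question and is applied consistently.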

"There are also inconsistencies with the years chosen for
comparison. For Sweden, the academic uses data from 2004 to
represent those from 2000, even though the source data itself
includes an estimate for 2000."

I do not understand this objection well. I have sometimes used
(say) a 2003 survey to stand for the benchmark year 2000,
sometimes for the benchmark year 2005. It just depends on which
countries you have data for, and when you get them. My data for
(say) benchmark year 2011 improve as time goes by and I get more
countries and more recent surveys. So if you compare my global
inequality estimate for a given year in the first draft of a
paper and in the final version, they would often differ a bit.
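
One way to picture this is the sketch below, in Python. It assumes a
simple rule (take the available survey closest to the benchmark year,
within a tolerance) that is only an illustration and not a
description of how I or Piketty actually assign surveys. The name
pick_survey, the max_gap parameter and the survey years are all
hypothetical.

```python
def pick_survey(benchmark_year, survey_years, max_gap=3):
    """Return the available survey year closest to benchmark_year.

    Returns None when no survey lies within max_gap years.
    Ties are broken in favour of the earlier survey.
    """
    candidates = [y for y in survey_years if abs(y - benchmark_year) <= max_gap]
    if not candidates:
        return None
    return min(candidates, key=lambda y: (abs(y - benchmark_year), y))


# Hypothetical availability for one country at two points in time.
first_draft = [2003]           # only one survey in hand
final_version = [2001, 2003]   # another survey obtained later

# The same 2003 survey can stand for either benchmark year.
print(pick_survey(2000, first_draft))    # 2003
print(pick_survey(2005, first_draft))    # 2003

# Once more data arrive, the survey used for benchmark 2000 changes.
print(pick_survey(2000, final_version))  # 2001
```

Under such a rule the estimate for a given benchmark year shifts
whenever a closer survey becomes available, without anyone having
manipulated anything.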

In conclusion, the only real issue is why Piketty adjusted
the data for several years differently, whether it is explained
in the files, whether that explanation is reasonable, and if it
is not explained, whether he can provide one. Out of the three
"classes" of issues raised by FT, only the second has some
validity. So far.