Meng Hu's Blog (https://menghublog.wordpress.com)
Anecdote is not data. Why do so many people not understand that?
Bad experience at MDPI (J. Intell.) journal
https://menghublog.wordpress.com/2015/06/18/bad-experience-at-mdpi-j-intell-journal/
Thu, 18 Jun 2015 21:59:12 +0000

After the OP journal (see here), I had problems with MDPI (J. Intell.) as well. Since I had decided not to publish at OP, my Wordsum article had to be published somewhere. I thought of MDPI because it is the journal that resembles OP the most.

MDPI journals are open access and have faster review times than most other journals. The journal charges a fee if there are many English (grammatical) errors to correct. MDPI waives its open-access fees, but only for young journals, i.e., those less than five years old (something I learned only later). Some MDPI journals even have the nice policy of making the reviews open access, although this is not a common practice even among MDPI journals.

Currently, the editor-in-chief of J. Intell. is Paul De Boeck, and I have a problem with him. I sent a first email on February 9th, 2015, and De Boeck was quick to reply (after 24 minutes). I asked whether it was a problem that I am not an affiliated researcher (I have no institution) and whether I could upload an ODT (OpenOffice) file for the manuscript, because I do not have Microsoft Word and I find LaTeX hard to use. The editor answered and asked if I could convert it to PDF. In reply, I sent the PDF, the ODT file and all of the related material (syntax, data, XLS file). He never replied again. I sent three more emails, but got nothing at all. And the third one (February 16th) was important enough that his silence is really troublesome, as I will explain below.

At that moment, I didn't know why he wasn't answering me. I looked at the website again and discovered that the manuscript needs to be uploaded by completing the submission form. First page (step): select the name of the journal, the section, the article type, the title, the abstract, the number of pages, the authors, etc. Second page: email address and name of the author. Third page: suggested reviewers, if there are any. Fourth page: upload the manuscript and materials (not mandatory but highly recommended). Fifth page: confirm. This is the usual procedure. I learned that only afterwards. Silly me. But whatever.

Normally, if you send the manuscript directly to the editor, he would, I guess, tell you to use the submission form. But De Boeck said nothing. I think I understand why. He probably opened the files, discovered I was talking about the "black-white cognitive difference", and perhaps didn't appreciate it.

But I wasn't sure about that. Anyway, I uploaded the paper on February 15th, 2015, and got this email:

Dear Dr. Hu,

Thank you very much for uploading the following manuscript to the MDPI submission and editorial system at http://www.mdpi.com. One of our editors will be in touch with you soon.

Then I discovered that there is a special template to use when submitting the manuscript: the MDPI template. It seemed to be a recommended step. Unfortunately, I was unable to edit or change the files I had already sent with the submission. So I decided to contact De Boeck on February 16th. I sent an email, explained the problem, and attached the corrected manuscript. I asked whether there was any possibility of changing the uploaded "manuscript" file in the submission menu.

I thought he would respond, because a non-response would be a lack of professionalism. Yet he didn't answer me. I started to have a bad feeling. And I was right. I had uploaded the manuscript on February 15th, and there was no news after several weeks. I got nothing: no email from MDPI saying that De Boeck would be in charge of handling it, or that some reviewers had been contacted, or anything else. If I'm not mistaken, the MDPI website says that the editor is the first to check the manuscript and, if it is admissible, the editor will invite several reviewers.

I had no response after several weeks, and I didn't think it would take an editor several weeks to decide whether a paper is admissible for peer review. So, on March 9th, 2015, I sent this email:

Dear Editor of J. Intell.,

I sent my article several weeks ago, and I do not know its current status. I would like to know whether it is admissible for peer review, or whether the article in its current form (presentation) needs to be modified in some way (e.g., use of the template).

The article is:

An update on the secular narrowing of the black-white gap in the Wordsum vocabulary test (1974-2012)

The answer I received was the following:

We are writing to inform you that we will not be able to process your paper further. Your manuscript was not given a high priority rating during the initial screening process. Please understand that we receive more interesting papers than we can publish. Hence only those papers most likely to be published in J. Intell. are sent for in-depth peer-review. Papers are selected on the basis of discipline, novelty and general significance in addition to the usual criteria for publication in specialized journals. Therefore, our decision is not necessarily a reflection of the quality of your research but rather of our stringent resource limitations.

"This must be a joke" was my reaction. Apparently, my paper was not given a high priority rating. In other words, the editorial board is not interested in publishing it. A lack of motivation, then. This is the first time I have heard of an author being rejected due to the editor's lack of motivation. And what is that last sentence about their "stringent resource limitations"? It is not as if J. Intell. publishes a lot of papers (in general, one per month). In fact, only a small number of people publish there. And most of the papers I have read there are uninteresting; that is, they are not big projects or big findings, just quick comments on questions about intelligence (and some of these papers were apparently published because their authors had been contacted by De Boeck, who invited them to publish). Papers published in Intelligence are infinitely better, and more numerous. I presume this is because most researchers prefer Intelligence (Elsevier), or they do not know about J. Intell. (MDPI).

Oh, I guess that part is not a problem. After all, the message above seems to be some sort of boilerplate: if you copy-paste the paragraph into a search engine, you will see that other people received exactly the same message when they were rejected before review. In my situation, it is even more frustrating that I had to ask De Boeck in person before he decided what to do. If I hadn't asked, how much longer would I have had to wait? And since De Boeck immediately forwarded my email to Wang, who in turn replied to me, I sometimes have the feeling that he had already made his decision about my paper well before I emailed him.

Yes, you read that right: it was Wang who sent me the message above, despite the fact that my email was sent to Paul De Boeck and no one else. I thought he should have had the responsibility to reply to my query, but no; he forwarded the email to someone else. Is this cowardice, or is there something I do not understand about the MDPI procedure? But I do not care anymore, because I have decided I will never publish at J. Intell. again. I now prefer Elsevier.

To be honest, my paper was also rejected by Detterman, editor of Intelligence (Elsevier), very recently. But the main reason is probably the first reviewer, who said that the paper should not be submitted to Intelligence because such papers must evaluate and test theories, whereas mine is mostly about which statistical method is most suitable for analyzing score changes over time in a cohort analysis of survey data such as the GSS. James Flynn was the second reviewer, and he disagreed 1) with my interpretation of Dickens & Flynn (2006), who did not admit there is no black-white narrowing for adult samples, and 2) with my speculation about the predicted black-white genetic gap based on the correlation (i.e., confounding) between genes and environment, for which he referred to Dickens & Flynn's (2001) social multiplier model. Oh, I could have answered that, or I could have avoided the exhaustion of the struggle by merely accepting the changes and modifications he suggested. The problem is that Flynn didn't request anything: he didn't reject, but he didn't accept either. It is very likely, though, that he would have rejected the paper if I had refused to make the necessary changes.

In any case, Elsevier took one month to process the review. But at least I got a response. Although I find it silly, I accept the rejection. I suppose it is my fault that the abstract (and, to a lesser extent, the conclusion) was written the way it was.

Given this unfair procedure, I do not consider J. Intell. a respectable journal. I do not know how well the other MDPI journals are run. For MDPI's sake, I hope they are doing better.

Pop Internationalism (Paul Krugman 1996)
https://menghublog.wordpress.com/2015/05/10/pop-internationalism-paul-krugman-1996/
Sun, 10 May 2015 22:50:02 +0000

In Pop Internationalism, Krugman defends international trade. Several ideas are put forward. The shrinkage of the manufacturing sector (and of its related jobs) has domestic causes, in particular the growing share of services in GDP; international trade with low-wage countries has nothing to do with it. The United States buys most of its imports from other advanced countries, whose workers have similar skills and wages. Imports are not much greater than exports in the United States. While foreign competition can reduce domestic income through the terms-of-trade effect, this effect has been negligible for the United States, notably because imports represent a small share of U.S. GDP. A country tends to export more of the goods in which its relative advantage over other countries is greatest. Comparative (rather than absolute) advantage is all that matters; that is, a country with lower productivity than another country in all of its industries will still gain from trade rather than from autarky. "Competitiveness" makes no sense, because U.S. income growth would be no different in a situation where all other countries grow at the same rate than in one where all other countries grow faster than the United States. A low-wage country receiving a large inflow of capital from high-wage countries cannot run trade surpluses: since its investment is greater than its savings (thanks to foreign capital), it must run trade deficits. The formidable growth of the Asian tigers, just like that of the Soviet Union before them, is accounted for by growth in inputs (which are subject to diminishing returns), not by growth in efficiency (i.e., output per unit of input).

Below, I have selected the important passages of the book. They are highlighted.

1. Competitiveness: A Dangerous Obsession

Mindless Competition

One might suppose, naively, that the bottom line of a national economy is simply its trade balance, that competitiveness can be measured by the ability of a country to sell more abroad than it buys. But in both theory and practice a trade surplus may be a sign of national weakness, a deficit a sign of strength. For example, Mexico was forced to run huge trade surpluses in the 1980s in order to pay the interest on its foreign debt since international investors refused to lend it any more money; it began to run large trade deficits after 1990 as foreign investors recovered confidence and began to pour in new funds. Would anyone want to describe Mexico as a highly competitive nation during the debt crisis era or describe what has happened since 1990 as a loss in competitiveness? […]

But surely this changes when trade becomes more important, as indeed it has for all major economies? It certainly could change. Suppose that a country finds that although its productivity is steadily rising, it can succeed in exporting only if it repeatedly devalues its currency, selling its exports ever more cheaply on world markets. Then its standard of living, which depends on its purchasing power over imports as well as domestically produced goods, might actually decline. In the jargon of economists, domestic growth might be outweighed by deteriorating terms of trade [2]. So “competitiveness” could turn out really to be about international competition after all.

[2] An example may be helpful here. Suppose that a country spends 20 percent of its income on imports, and that the price of its imports is set not in domestic but in foreign currency. Then if the country is forced to devalue its currency – reduce its value in foreign currency – by 10 percent, this will raise the price of 20 percent of the country’s spending basket by 10 percent, thus raising the overall price index by 2 percent. Even if domestic output has not changed, the country’s real income will therefore have fallen by 2 percent. If the country must repeatedly devalue in the face of competitive pressure, growth in real income will persistently lag behind growth in real output.
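
To make the footnote's arithmetic explicit, here is a quick check in Python (my own sketch, not Krugman's; the numbers are those of the footnote):

```python
# Terms-of-trade effect of a devaluation, using the footnote's numbers.
import_share = 0.20  # the country spends 20% of its income on imports
devaluation = 0.10   # the currency loses 10% of its foreign-currency value

# Import prices are set in foreign currency, so they rise by the full
# devaluation; the overall price index rises by import_share * devaluation.
price_index_rise = import_share * devaluation
print(f"{price_index_rise:.0%}")  # 2%

# With domestic output unchanged, real income falls by the same 2%.
# Repeated devaluations thus make real income lag behind real output.
```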

Careless Arithmetic

Trade Deficits and the Loss of Good Jobs. In a recent article published in Japan, Lester Thurow explained to his audience the importance of reducing the Japanese trade surplus with the United States. U.S. real wages, he pointed out, had fallen six percent during the Reagan and Bush years, and the reason was that trade deficits in manufactured goods had forced workers out of high-paying manufacturing jobs into much lower-paying service jobs.

This is not an original view; it is very widely held. But Thurow was more concrete than most people, giving actual numbers for the job and wage loss. A million manufacturing jobs have been lost because of the deficit, he asserted, and manufacturing jobs pay 30 percent more than service jobs.

Both numbers are dubious. The million-job number is too high, and the 30 percent wage differential between manufacturing and services is primarily due to a difference in the length of the workweek, not a difference in the hourly wage rate. But let’s grant Thurow his numbers. Do they tell the story he suggests?

The key point is that total U.S. employment is well over 100 million workers. Suppose that a million workers were forced from manufacturing into services and as a result lost the 30 percent manufacturing wage premium. Since these workers are less than 1 percent of the U.S. labor force, this would reduce the average U.S. wage rate by less than 1/100 of 30 percent, that is, by less than 0.3 percent.

This is too small, by a factor of 20, to explain the 6 percent real wage decline. Or to look at it another way, the annual wage loss from deficit-induced deindustrialization, which Thurow clearly implies is at the heart of U.S. economic difficulties, is, on the basis of his own numbers, roughly equal to what the U.S. spends on health care every week.
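
Krugman's back-of-the-envelope correction of Thurow can be reproduced in a few lines (my own sketch of the arithmetic in the passage):

```python
# Thurow's numbers, taken at face value.
total_employment = 100e6  # "well over 100 million workers"
displaced = 1e6           # one million manufacturing jobs lost
wage_premium = 0.30       # manufacturing pays 30% more than services

# Average-wage effect = (share of labor force displaced) x (premium lost).
avg_wage_drop = (displaced / total_employment) * wage_premium
print(f"{avg_wage_drop:.1%}")  # 0.3%

# The observed real-wage decline was 6%, so Thurow's mechanism falls
# short by a factor of:
print(round(0.06 / avg_wage_drop))  # 20
```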

2. Proving My Point

Sloppy Math: Part II

For example, Thurow says that imports are 14 percent of U.S. GNP, while exports are only 10 percent, and that reducing imports to equal exports would add $250 billion to the sales of U.S. manufacturers. But according to Economic Indicators, the monthly statistical publication of the Joint Economic Committee, U.S. imports in 1993 were only 11.4 percent of GDP, while exports were 10.4 percent. Even the current account deficit, a broader measure that includes some additional debit items, was only $109 billion. If the United States were to cut imports by $250 billion, far from merely balancing its trade as Thurow asserts, the United States would run a current account surplus of $140 billion, that is, more than the 2-percent-of-GDP maximum that U.S. negotiators have demanded Japan set as a target!
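
A quick check of this arithmetic (my own; note that the GDP figure is my assumption, roughly $6.6 trillion for nominal 1993 U.S. GDP, since the passage gives only percentages):

```python
# Figures from the passage (Economic Indicators, 1993).
current_account_deficit = 109e9  # dollars
import_cut = 250e9               # Thurow's proposed import reduction
gdp_1993 = 6.6e12                # assumed nominal U.S. GDP; not in the passage

# Cutting imports by $250B overshoots balanced trade by the difference.
surplus = import_cut - current_account_deficit
print(f"${surplus / 1e9:.0f} billion")  # $141 billion; Krugman rounds to $140B

# Compare with the 2%-of-GDP ceiling demanded of Japan.
print(surplus > 0.02 * gdp_1993)  # True: the surplus would exceed the ceiling
```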

Or consider Prestowitz, who derides my claim that high-technology industries, commonly described as “high value” sectors, actually have much lower value added per worker than traditional “high volume,” heavy industrial sectors. I have aggregated too much by looking at broad sectors like electronics, he says; I should look at the highest-tech lines of business, like semiconductors, where value added per worker is $234,000. Prestowitz should report the results of his research to the Department of Commerce, whose staff has obviously incorrectly calculated (in the Annual Survey of Manufactures) that in 1989 value added per worker in Standard Industrial Classification 3674 (semiconductors and related devices) was $96,487, closer to the $76,709 per worker in SIC 2096 (potato chips and related snacks) than to the $187,569 in SIC 3711 (motor vehicles and car bodies). […]

Beyond these petty, if revealing, errors of fact are a series of conceptual misunderstandings. For example, Prestowitz argues that productivity in sectors that compete on world markets is much more important than productivity in non-traded service sectors because the former determine wage rates throughout the economy. For example, because U.S. manufacturing workers are much more productive than their Third World counterparts, U.S. barbers, who do not have a comparable productivity advantage, also get high wages. But Prestowitz fails to notice that the converse is also true: service productivity affects the real wages of manufacturing workers. Because the high relative productivity of U.S. manufacturing is not matched in the haircut sector, haircuts by those well-paid barbers are much more expensive than haircuts in the Third World; as a result real wages of U.S. manufacturing workers (that is, wages in terms of what they can buy, including haircuts) are not as high as they would be if U.S. barbers were more productive. With careful thought, one realizes that real wages depend on the overall productivity of the economy, with no special presumption that productivity in manufacturing or in internationally traded sectors in general deserves any more attention or active promotion than productivity elsewhere.

Cohen makes essentially the same mistake when he complains that I underestimated the effects of competitive pressure because I focused only on import and export prices and did not consider the further impacts of that pressure on profits and wages. He somehow fails to realize that a change in wages or profits that is not reflected in import or export prices cannot change overall U.S. real income – it can only redistribute profits to one group within the United States at the expense of another. That is why the effect of international price competition on U.S. real income can be measured by the change in the ratio of export to import prices – full stop. And the effects of changes in this ratio on the U.S. economy have, as I showed in my article, been small.

Or consider Thurow’s analysis of the benefits that would accrue to the United States if it could roll back imports (leaving aside the inaccuracy of his numbers). He asserts that the United States could create five million new jobs in import-competing sectors, and he assumes that all five million jobs represent a net addition to employment. But this assumption is unrealistic. As this reply was being written, the Federal Reserve was raising interest rates in an effort to rein in a recovery that it feared would proceed too far, that is, lead to excessive employment, producing a renewed surge in inflation. Some people think that the Fed is tightening too soon, but the essential point is that the growth of employment is not determined by the ability of the United States to sell goods on world markets or to compete with imports, but by the Fed’s judgement of what will not set off inflation. So suppose that the United States were to impose import quotas, adding millions of jobs in import-competing sectors. The Fed would respond by raising interest rates to prevent an overheated economy, and most if not all of the job gains would be matched by job losses elsewhere.

Things Add Up

In each of these cases, my critics seem to have forgotten the most basic principle of economics: things add up. Higher employment in import-competing industries must come either through a reduction in unemployment, in which case one must ask whether the implied unemployment rate (about three percent in Thurow’s example) is feasible, or at the expense of jobs elsewhere in the economy, in which case no overall job gain takes place. If higher manufacturing wages lead to a higher wage rate for barbers without higher tonsorial productivity, the gain must come at someone else’s expense. Since it is hard to see how foreigners pay for more expensive American haircuts, that wage gain can only redistribute the benefits of manufacturing productivity from one set of American workers to another, not increase the total gains. In their haste to assign great importance to international competition, my critics, like the inventors of perpetual motion machines, have failed to realize that there are conservation principles that any story about the economy must honor. […]

Well, that’s not quite the real story. It is true that in the early 1980s professional economists became aware that one of the implications of new theories of international trade was a possible role for strategic policies to promote exports in certain industries. Confronted with a new idea that was exciting, potentially important but untested, these economists began a sustained process of research, probing the weak points, confronting the new idea with the data. After all, lots of things could be true in principle. For example in certain theoretical situations a tax cut could definitely stimulate the economy so much that government revenues would actually rise, and it would be very nice if that were the actual situation; but unfortunately it isn’t. Similarly, it is definitely possible to imagine a situation in which, because of all of the market imperfections Thurow dwells on, a clever strategic trade policy would sharply raise U.S. real income. And it would be very nice if the United States could devise such a policy. But is that possibility really there? To answer that question requires looking hard at the facts.

And so over the course of the last ten years a massive international research program has explored the prospects for strategic trade policy. Two broad conclusions emerge. First, to identify which industries should receive strategic promotion or the appropriate form and level of promotion is very difficult. Second, the payoffs of even a successful strategic trade policy are likely to be very modest, certainly far less even than Thurow’s “seven percent solution,” which is closer to the entire share of international trade in the U.S. economy.

3. Trade, Jobs, and Wages

The real wage of the average American worker more than doubled between the end of World War II and 1973. Since then, however, those wages have risen only 6 percent. Furthermore, only highly educated workers have seen their compensation rise; the real earnings of blue-collar workers have fallen in most years since 1973.

Why have wages stagnated? A consensus among business and political leaders attributes the problem in large part to the failure of the U.S. to compete effectively in an increasingly integrated world economy. This conventional wisdom holds that foreign competition has eroded the U.S. manufacturing base, washing out the high-paying jobs that a strong manufacturing sector provides. More broadly, the argument goes, the nation’s real income has lagged as a result of the inability of many U.S. firms to sell in world markets. And because imports increasingly come from Third World countries with their huge reserves of unskilled labor, the heaviest burden of this foreign competition has ostensibly fallen on less educated American workers.

Many people find such a story extremely persuasive. It links America’s undeniable economic difficulties to the obvious fact of global competition. In effect, the U.S. is (in the words of President Bill Clinton) “like a big corporation in the world economy” and, like many big corporations, it has stumbled in the face of new competitive challenges.

Persuasive though it may be, however, that story is untrue. A growing body of evidence contradicts the popular view that international competition is central to U.S. economic problems. In fact, international factors have played a surprisingly small role in the country’s economic difficulties. The manufacturing sector has become a smaller part of the economy, but international trade is not the main cause of that shrinkage. The growth of real income has slowed almost entirely for domestic reasons. And, contrary to what even most economists have believed, recent analyses indicate that growing international trade does not bear significant responsibility even for the declining real wages of less educated U.S. workers.

The fraction of U.S. workers employed in manufacturing has been declining steadily since 1950. So has the share of U.S. output accounted for by value added in manufacturing. (Measurements of “value added” deduct from total sales the cost of raw materials and other inputs that a company buys from other firms.) In 1950 value added in the manufacturing sector accounted for 29.6 percent of gross domestic product (GDP) and 34.2 percent of employment; in 1970 the shares were 25.0 and 27.3 percent, respectively; by 1990 manufacturing had fallen to 18.4 percent of GDP and 17.4 percent of employment.

Before 1970 those who worried about this trend generally blamed it on automation, that is, on rapid growth of productivity in manufacturing. Since then, it has become more common to blame deindustrialization on rising imports; indeed, from 1970 to 1990, imports rose from 11.4 to 38.2 percent of the manufacturing contribution to GDP.

Yet the fact that imports grew while industry shrank does not in itself demonstrate that international competition was responsible. During the same 20 years, manufacturing exports also rose dramatically, from 12.6 to 31.0 percent of value added. Many manufacturing firms may have laid off workers in the face of competition from abroad, but others have added workers to produce for expanding export markets.

To assess the overall impact of growing international trade on the size of the manufacturing sector, we need to estimate the net effect of this simultaneous growth of exports and imports. A dollar of exports adds a dollar to the sales of domestic manufacturers; a dollar of imports, to a first approximation, displaces a dollar of domestic sales. The net impact of trade on domestic manufacturing sales can therefore be measured simply by the manufacturing trade balance: the difference between the total amount of manufactured goods that the U.S. exports and the amount that it imports. (In practice, a dollar of imports may displace slightly less than a dollar of domestic sales because the extra spending may come at the expense of services or other nonmanufacturing sales. The trade balance sets an upper bound on the net effect of trade on manufacturing.)

Undoubtedly, the emergence of persistent trade deficits in manufactured goods has contributed to the declining share of manufacturing in the U.S. economy. The question is how large that contribution has been. In 1970 manufactured exports exceeded imports by 0.2 percent of GDP. Since then, there have been persistent deficits, reaching a maximum of 3.1 percent of GDP in 1986. By 1990, however, the manufacturing deficit had fallen again, to only 1.3 percent of GDP. The decline in the U.S. manufacturing trade position over those two decades was only 1.5 percent of GDP, less than a quarter of the 6.6 percentage point decline in the share of manufacturing in GDP.

Moreover, the raw value of the trade deficit overstates its actual effect on the manufacturing sector. Trade figures measure sales, but the contribution of manufacturing to GDP is defined by value added in the sector, that is, by sales minus purchases from other sectors. When imports displace a dollar of domestic manufacturing sales, a substantial fraction of that dollar would have been spent on inputs from the service sector, which are not part of manufacturing’s contribution to GDP.

To estimate the true impact of the trade balance on manufacturing, one must correct for this “leakage” to the service sector. Our analysis of data from the U.S. Department of Commerce puts the figure at 40 percent. In other words, each dollar of trade deficit reduces the manufacturing sector’s contribution to GDP by only 60 cents. This adjustment strengthens our conclusion: if trade in manufactured goods had been balanced from 1970 to 1990, the downward trend in the size of the manufacturing sector would not have been as steep as it actually was, but most of the deindustrialization would still have taken place. Between 1970 and 1990 manufacturing declined from 25.0 to 18.4 percent of GDP; with balanced trade, the decline would have been from 24.9 to 19.2 percent, about 86 percent as large.
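
The counterfactual arithmetic in this passage checks out; here is a small verification (my own, not from the book):

```python
# Manufacturing share of GDP, in percentage points.
actual_decline = 25.0 - 18.4  # 6.6 points between 1970 and 1990

# The trade position swung from +0.2% to -1.3% of GDP: 1.5 points,
# "less than a quarter" of the decline in manufacturing's share.
trade_swing = 0.2 - (-1.3)
print(f"{trade_swing / actual_decline:.2f}")  # 0.23

# Balanced-trade counterfactual, after the 40% service-sector leakage:
counterfactual_decline = 24.9 - 19.2
print(f"{counterfactual_decline / actual_decline:.0%}")  # 86% as large
```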

International trade explains only a small part of the decline in the relative importance of manufacturing to the economy. Why, then, has the share of manufacturing declined? The immediate reason is that the composition of domestic spending has shifted away from manufactured goods. In 1970 U.S. residents spent 46 percent of their outlays on goods (manufactured, grown or mined) and 54 percent on services and construction. By 1991 the shares were 40.7 and 59.3 percent, respectively, as people began buying comparatively more health care, travel, entertainment, legal services, fast food and so on. It is hardly surprising, given this shift, that manufacturing has become a less important part of the economy.

In particular, U.S. residents are spending a smaller fraction of their incomes on goods than they did 20 years ago for a simple reason: goods have become relatively cheaper. Between 1970 and 1990 the price of goods relative to services fell 22.9 percent. The physical ratio of goods to services purchased remained almost constant during that period. Goods have become cheaper primarily because productivity in manufacturing has grown much faster than in services. This growth has been passed on in lower consumer prices.

Ironically, the conventional wisdom here has things almost exactly backward. Policymakers often ascribe the declining share of industrial employment to a lack of manufacturing competitiveness brought on by inadequate productivity growth. In fact, the shrinkage is largely the result of high productivity growth, at least as compared with the service sector. The concern, widely voiced during the 1950s and 1960s, that industrial workers would lose their jobs because of automation is closer to the truth than the current preoccupation with a presumed loss of manufacturing jobs because of foreign competition.

Because competition from abroad has played a minor role in the contraction of U.S. manufacturing, loss of jobs in this sector because of foreign competition can bear only a tiny fraction of the blame for the stagnating earnings of U.S. workers. Our data illuminate just how small that fraction is. In 1990, for example, the trade deficit in manufacturing was $73 billion. This deficit reduced manufacturing value added by approximately $42 billion (the other $31 billion represents leakage: goods and services that manufacturers would have purchased from other sectors). Given an average of about $60,000 value added per manufacturing employee, this figure corresponded to approximately 700,000 jobs that would have been held by U.S. workers. In that year, the average manufacturing worker earned about $5,000 more than the average nonmanufacturing worker. Assuming that any loss of manufacturing jobs was made up by a gain of nonmanufacturing jobs (an assumption borne out by the absence of any long-term upward trend in the U.S. unemployment rate), the loss of “good jobs” in manufacturing as a result of international competition corresponded to a loss of $3.5 billion in wages. U.S. national income in 1990 was $5.5 trillion; consequently, the wage loss from deindustrialization in the face of foreign competition was less than 0.07 percent of national income.
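
And the passage's 1990 calculation, step by step (my own check; the small discrepancies come from rounding in the text):

```python
deficit = 73e9   # 1990 manufacturing trade deficit, in dollars
leakage = 0.40   # share of each deficit dollar leaking to service inputs

value_added_lost = deficit * (1 - leakage)
print(f"{value_added_lost / 1e9:.1f}")  # 43.8; the text reports ~$42 billion

jobs_lost = value_added_lost / 60_000  # $60,000 value added per employee
print(f"{jobs_lost:,.0f}")  # 730,000, which the text rounds to ~700,000

wage_loss = 700_000 * 5_000  # jobs times the $5,000 manufacturing premium
print(wage_loss / 1e9)       # 3.5 ($ billion)

national_income = 5.5e12
print(f"{wage_loss / national_income:.3%}")  # 0.064%, under 0.07%
```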

Many observers have expressed concern not just about wages lost because of a shrinking manufacturing sector but also about a broader erosion of U.S. real income caused by an inability to compete effectively in world markets. But they often fail to make the distinction between the adverse consequences of slow productivity growth (which would be bad even for an economy that did not have any international trade) and the additional adverse effects that might result from productivity growth that lags behind that of other countries.

To see why that distinction is important, consider a world in which productivity (output per worker-hour) increases by the same amount in every nation around the world – say, 3 percent a year. Under these conditions, all other things remaining equal, workers’ real earnings in all countries would tend to rise by 3 percent annually as well. Similarly, if productivity grew at 1 percent a year, so would earnings. (The relation between productivity growth and earnings growth holds regardless of the absolute level of productivity in each nation; only the rate of increase is significant.)

Concerns about international competitiveness, as opposed to low productivity growth, correspond to a situation in which productivity growth in the U.S. falls to 1 percent annually while elsewhere it continues to grow at 3 percent. If real earnings in the U.S. then grow at 1 percent a year, the U.S. does not have anything we could reasonably call a competitive problem, even though it would lag other nations. The rate of earnings growth is exactly the same as it would be if other countries were doing as badly as we are.

The fact that other countries are doing better may hurt U.S. pride, but it does not by itself affect domestic standards. It makes sense to talk of a competitive problem only to the extent that earnings growth falls by more than the decline in productivity growth.

Foreign competition can reduce domestic income by a well-understood mechanism called the terms of trade effect. In export markets, foreign competition can force a decline in the prices of U.S. products relative to those of other nations. That decline typically occurs through a devaluation of the dollar, thereby boosting the price of imports. The net result is a reduction in real earnings because the U.S. must sell its goods more cheaply and pay more for what it buys.

During the past 20 years, the U.S. has indeed experienced a deterioration in its terms of trade. The ratio of U.S. export prices to import prices fell more than 20 percent between 1970 and 1990; in other words, the U.S. had to export 20 percent more to pay for a given quantity of imports in 1990 than it did in 1970. Because the U.S. imported goods whose value was 11.3 percent of its GDP in 1990, these worsened terms of trade reduced national income by about 2 percent.

Real earnings grew by about 6 percent during the 1970s and 1980s. Our calculation suggests that avoiding the decline in the terms of trade would have increased that growth to only about 8 percent. Although the effect of foreign competition is measurable, it can by no means account for the stagnation of U.S. earnings.
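The two-step calculation in the preceding paragraphs (the import share times the terms-of-trade decline, then the counterfactual earnings growth) can be laid out as a short sketch; all inputs are the article's figures:

```python
# Article's figures for the U.S., 1970-1990.
import_share_of_gdp = 0.113  # goods imports as a share of 1990 GDP
tot_decline = 0.20           # fall in the export/import price ratio

# Income lost to worsened terms of trade: ~0.023, "about 2 percent".
income_loss = import_share_of_gdp * tot_decline

# Real earnings grew ~6% over the 1970s and 1980s; without the
# terms-of-trade decline they would have grown only ~8%.
actual_earnings_growth = 0.06
counterfactual_growth = actual_earnings_growth + income_loss  # ~0.083
```

The gap between 6 and 8 percent is measurable but clearly cannot account for two decades of near-stagnation.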

A more direct way of calculating the impact of the terms of trade on real income is to use a measure known as command GNP (gross national product). Real GNP, the conventional standard of economic performance, measures what the output of the economy would be if all prices remained constant. Command GNP is a similar measure in which the value of exports is deflated by the import price index. It measures the quantity of goods and services that the U.S. economy can afford to buy in the world market, as opposed to the volume of goods and services it produces. If the prices of imports rise faster than export prices (as will happen, for example, if the dollar falls precipitously), growth in command GNP will fall behind that of real GNP.

Between 1959 and 1973, when U.S. wages were rising steadily, command GNP per worker-hour did grow slightly faster than real GNP per hour - 1.87 percent per year versus 1.85. Between 1973 and 1990, as real wages stagnated, command GNP grew more slowly than output, 0.65 percent versus 0.73. Both these differences, however, are small. The great bulk of the slowdown in command GNP was caused by the slower growth of real GNP per worker – by the purely domestic impact of the decline in productivity growth.
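A toy illustration of the command-GNP idea may help (the numbers here are hypothetical, not the article's data): real GNP values exports at export prices, while command GNP deflates them by the import price index, so command GNP falls behind whenever import prices outrun export prices.

```python
# Hypothetical economy: domestic absorption of 900 (real terms), exports
# of 100 at base-year prices. Export prices stay flat; import prices rise 25%.
domestic_absorption = 900.0
exports_nominal = 100.0
export_price_index = 1.00
import_price_index = 1.25

real_gnp = domestic_absorption + exports_nominal / export_price_index     # 1000.0
command_gnp = domestic_absorption + exports_nominal / import_price_index  # 980.0
# Command GNP trails real GNP by 2%: the economy produces as much as
# before but can afford fewer imports with its export earnings.
```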

If foreign competition is neither the main villain in the decline of manufacturing nor the root cause of stagnating wages, has it not at least worsened the lot of unskilled labor? Economists have generally been quite sympathetic to the argument that increased integration of global markets has pushed down the real wages of less educated U.S. workers.

Their opinion stems from a familiar concept in the theory of international trade: factor price equalization. When a rich country, where skilled labor is abundant (and where the premium for skill is therefore small), trades with a poor country, where skilled workers are scarce and unskilled workers abundant, the wage rates tend to converge. The pay of skilled workers rises in the rich country and falls in the poor one; that of unskilled workers falls in the rich country and rises in the poor nation. Given the rapid growth of exports from nations such as China and Indonesia, it seems reasonable to suppose that factor price equalization has been a major reason for the growing gap in earnings between skilled and unskilled workers in the U.S. Surprisingly, however, this does not seem to be the case. We have found that increased wage inequality, like the decline of manufacturing and the slowdown in real income growth, is overwhelmingly the consequence of domestic causes.

That conclusion is based on an examination of the evidence in terms of the underlying logic of factor price equalization, first explained in a classic 1941 paper by Wolfgang F. Stolper and Paul A. Samuelson. The principle of comparative advantage suggests that a rich country trading with a poor one will export skill-intensive goods (because it has a comparative abundance of skilled workers) and import labor-intensive products. As a result of this trade, production in the rich country will shift toward skill-intensive sectors and away from labor-intensive ones. That shift, however, raises the demand for skilled workers and reduces that for unskilled workers. If wages are free to rise and fall with changes in the demand for different kinds of labor (as they do for the most part in the U.S.), the real wages of skilled workers will rise, and those of unskilled workers will decline. In a poor country, the opposite will occur.

All other things being equal, the rising wage differential will lead firms in the rich country to cut back on the proportion of skilled workers that they employ and to increase that of unskilled ones. That decision, in turn, mitigates the increased demand for skilled workers. When the dust settles, the wage differential has risen just enough to offset the effects of the change in the industry mix on overall demand for labor. Total employment of both types of labor remains unchanged.

According to Stolper and Samuelson’s analysis, a rising relative wage for skilled workers leads all industries to employ a lower ratio of skilled to unskilled workers. Indeed, this reduction is the only way the economy can shift production toward skill-intensive sectors while keeping the overall mix of workers constant.

This analysis carries two clear empirical implications. First, if growing international trade is the main force driving increased wage inequality, the ratio of skilled to unskilled employment should decline in most U.S. industries. Second, employment should increase more rapidly in skill-intensive industries than in those that employ more unskilled labor.

Recent U.S. economic history confounds these predictions. Between 1979 and 1989 the real compensation of white-collar workers rose, whereas that of blue-collar workers fell. Nevertheless, nearly all industries employed an increasing proportion of white-collar workers. Moreover, skill-intensive industries showed at best a slight tendency to grow faster than those in which blue-collar employment was high. (Although economists use many different methods to estimate the average skill level in a given industrial sector, the percentage of blue-collar workers is highly correlated with other measures and easy to estimate.)

Thus, the evidence suggests that factor price equalization was not the driving force behind the growing wage gap. The rise in demand for skilled workers was overwhelmingly caused by changes in demand within each industrial sector, not by a shift of the U.S.’s industrial mix in response to trade. No one can say with certainty what has reduced the relative demand for less skilled workers throughout the economy. Technological change, especially the increased use of computers, is a likely candidate; in any case, globalization cannot have played the dominant role.

It may seem difficult to reconcile the evidence that international competition bears little responsibility for falling wages among unskilled workers with the dramatic rise in manufactured exports from Third World countries. In truth, however, there is little need to do so. Although the surging exports of some developing countries have attracted a great deal of attention, the U.S. continues to buy the bulk of its imports from other advanced countries, whose workers have similar skills and wages. In 1990 the average wages of manufacturing workers among U.S. trading partners (weighted by total bilateral trade) were 88 percent of the U.S. level. Imports (other than oil) from low-wage countries (those where workers earn less than half the U.S. level) were a mere 2.8 percent of GDP.

Finally, increasing low-wage competition from trade with developing nations has been offset by the rise in wages and skill levels among traditional U.S. trading partners. Indeed, imports from low-wage countries were almost as large in 1960 as in 1990 – 2.2 percent of GDP – because three decades ago Japan and most of Europe fell into that category. In 1960 imports from Japan exerted competitive pressure on labor-intensive industries such as textiles. Today Japan is a high-wage country, and the burden of its competition falls mostly on skill-intensive sectors such as the semiconductor industry.

4. Does Third World Growth Hurt First World Prosperity?

Model 1: A One-Good, One-Input World

Imagine a world without the complexities of the global economy. In this world, one all-purpose good is produced – let’s call it chips – using one input, labor. All countries produce chips, but labor is more productive in some countries than in others. In imagining such a world, we ignore two crucial facts about the actual global economy: it produces hundreds of thousands of distinct goods and services, and it does so using many inputs, including physical capital and the “human capital” that results from education.

What would determine wages and standards of living in such a simplified world? In the absence of capital or differentiation between skilled and unskilled labor, workers would receive what they produce. That is, the annual real wage in terms of chips in each country would equal the number of chips each worker produced in a year – his or her productivity. And since chips are the only good consumed as well as the only good produced, the consumer price index would contain nothing but chips. Each country’s real wage rate in terms of its CPI would also equal the productivity of labor in each country.

What about relative wages? The possibility of arbitrage, of shipping goods to wherever they command the highest price, would keep chip prices the same in all countries. Thus the wage rate of workers who produce 10,000 chips annually would be ten times that of workers who produce 1,000, even if those workers are in different countries. The ratio of any two nations’ wage rates, then, would equal the ratio of their workers’ productivity.

What would happen if countries that previously had low productivity and thus low wages were to experience a large increase in their productivity? These emerging economies would see their wage rates in terms of chips rise - end of story. There would be no impact, positive or negative, on real wage rates in other, initially higher-wage countries. In each country, the real wage rate equals domestic productivity in terms of chips; that remains true, regardless of what happens elsewhere.

What’s wrong with this model? It’s ridiculously oversimplified, but in what ways might the simplification mislead us? One immediate problem with the model is that it leaves no room for international trade: if everyone is producing chips, there is no reason to import or export them. (This issue does not seem to bother such competitiveness theorists as Lester Thurow. The central proposition of Thurow’s Head to Head is that because the advanced nations produce the same things, the benign niche competition of the past has given way to win-lose head-to-head competition. But if the advanced nations are producing the same things, why do they sell so much to one another?)

While the fact that countries do trade with one another means that our simplified model cannot be literally true, this model does raise the question of how extensive the trade actually is between advanced nations and the Third World. It turns out to be surprisingly small despite the emphasis on Third World trade in such documents as the Delors white paper. In 1990, advanced industrial nations spent only 1.2% of their combined GDPs on imports of manufactured goods from newly industrializing economies. A model in which advanced countries have no reason to trade with low-wage countries is obviously not completely accurate, but it is more than 98% right all the same.

Another problem with the model is that without capital, there can be no international investment. We’ll come back to that point when we put capital into the model. It’s worth noting, however, that in the U.S. economy, more than 70% of national income accrues to labor and less than 30% to capital; this proportion has been very stable for the past two decades. Labor is clearly not the only input in the production of goods, but the assertion that the average real wage rate moves almost one for one with output per worker, that what is good for the United States is good for U.S. workers and vice versa, seems approximately correct.

One last assertion that may bother some readers is that wages automatically rise with productivity. Is this realistic?

Yes. Economic history offers no example of a country that experienced long-term productivity growth without a roughly equal rise in real wages. In the 1950s, when European productivity was typically less than half of U.S. productivity, so were European wages; today average compensation measured in dollars is about the same. As Japan climbed the productivity ladder over the past 30 years, its wages also rose, from 10% to 110% of the U.S. level. South Korea’s wages have also risen dramatically over time. Indeed, many Korean economists worry that wages may have risen too much. Korean labor now seems too expensive to compete in low-technology goods with newcomers like China and Indonesia and too expensive to compensate for low productivity and product quality in such industries as autos.

The idea that somehow the old rules no longer apply, that new entrants on the world economic stage will always pay low wages even as their productivity rises to advanced-country levels, has no basis in actual experience. (Some economic writers try to refute this proposition by pointing to particular industries in which relative wages don’t match relative productivity. For example, shirtmakers in Bangladesh, who are almost half as productive as shirtmakers in the United States, receive far less than half the U.S. wage rate. But as we’ll see when we turn to a multigood model, that is exactly what standard economic theory predicts.)

Our one-good, one-input model may seem silly, but it forces us to notice two crucial points. First, an increase in Third World labor productivity means an increase in world output, and an increase in world output must show up as an increase in somebody’s income. And it does: it shows up in higher wages for Third World workers. Second, whatever we may eventually conclude about the impact of higher Third World productivity on First World economies, it won’t necessarily be adverse. The simplest model suggests that there is no impact at all.

Model 2: Many Goods, One Input

In the real world, of course, countries specialize in the production of a limited range of goods; international trade is both the cause and the result of that specialization. In particular, the trade in manufactured goods between the First and Third worlds is largely an exchange of sophisticated high-technology products like aircraft and microprocessors for labor-intensive goods like clothing. In a world in which countries produce different goods, productivity gains in one part of the world may either help or hurt the rest of the world.

This is by no means a new subject. Between the end of World War II and the Korean War, many nations experienced a series of balance-of-payments difficulties, which led to the perception of a global “dollar shortage.” At the time, many Europeans believed that their real problem was the overwhelming competitiveness of the highly productive U.S. economy. But was the U.S. economy really damaging the rest of the world? More generally, does productivity growth in one country raise or lower real incomes in other countries? An extensive body of theoretical and empirical work concluded that the impact of productivity growth abroad on domestic welfare can be either positive or negative, depending on the bias of that productivity growth – that is, depending on the sectors in which such growth occurs.

Sir W. Arthur Lewis, who won the 1979 Nobel Prize in economics for his work on economic development, has offered a clever illustration of how the effect of productivity growth in developing countries on the real wages in advanced nations can work either way. In Lewis’s model, the world is divided into two regions; call them North and South. This global economy produces not one but three types of goods: high-tech, medium-tech, and low-tech. As in our first model, however, labor is still the only input into production. Northern labor is more productive than Southern labor in all three types of goods, but that productivity advantage is huge in high-tech, moderate in medium-tech, and small in low-tech.

What will be the pattern of wages and production in such a world? A likely outcome is that high-tech goods will be produced only in the North, low-tech goods only in the South, and both regions will produce at least some medium-tech goods. (If world demand for high-tech products is very high, the North may produce only those goods; if demand for low-tech products is high, the South may also specialize. But there will be a wide range of cases in which both regions produce medium-tech goods.)

Competition will ensure that the ratio of the wage rate in the North to that in the South will equal the ratio of Northern to Southern productivity in the sector in which workers in the two regions face each other head-to-head: medium-tech. In this case, Northern workers will not be competitive in low-tech goods in spite of their higher productivity because their wage rates are too high. Conversely, low Southern wage rates are not enough to compensate for low productivity in high-tech.

A numerical example may be helpful here. Suppose that Northern labor is ten times as productive as Southern labor in high-tech, five times as productive in medium-tech, but only twice as productive in low-tech. If both countries produce medium-tech goods, the Northern wage must be five times higher than the Southern. Given this wage ratio, labor costs in the South for low-tech goods will be only two-fifths of labor costs in the North for this sector, even though Northern labor is more productive. In high-tech goods, by contrast, labor costs will be twice as high in the South.
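The relative labor costs in this example follow mechanically once competition in the shared medium-tech sector pins down the wage ratio; a minimal sketch using the article's own numbers:

```python
# Northern productivity advantage by sector (the article's example).
advantage = {"high": 10.0, "medium": 5.0, "low": 2.0}

# Both regions produce medium-tech, so competition there pins the wage ratio.
wage_ratio = advantage["medium"]   # Northern wage / Southern wage = 5

# Southern unit labor cost relative to the North: Southern workers earn
# 1/wage_ratio as much but are only 1/advantage as productive, so the
# relative cost is advantage / wage_ratio.
south_cost_relative = {sector: adv / wage_ratio for sector, adv in advantage.items()}
# {"high": 2.0, "medium": 1.0, "low": 0.4}: the South is twice as costly in
# high-tech, equally costly in medium-tech, and two-fifths as costly in low-tech.
```

The pattern of specialization falls out directly: the South exports only where its relative cost is below one.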

Notice that in this example, Southern low-tech workers receive only one-fifth the Northern wage, even though they are half as productive as Northern workers in the same industry.

Many people, including those who call themselves experts on international trade, believe that kind of gap shows that conventional economic models don’t apply. In fact, it’s exactly what conventional analysis predicts: if low-wage countries didn’t have lower unit labor costs than high-wage countries in their export industries, they couldn’t export.

Now suppose that there is an increase in Southern productivity. What effect will it have? It depends on which sector experiences the productivity gain. If the productivity increase occurs in low-tech output, a sector that does not compete with Northern labor, there is no reason to expect the ratio of Northern to Southern wages to change. Southern labor will produce low-tech goods more cheaply, and the fall in the price of those goods will raise real wages in the North. But if Southern productivity rises in the competitive medium-tech sector, relative Southern wages will rise. Since productivity has not risen in low-tech production, low-tech prices will rise and reduce real wages in the North.

What happens if Southern productivity rises at equal rates in low- and medium-tech? The relative Southern wage will rise, but the rise will be exactly offset by the productivity increase. The prices of low-tech goods in terms of Northern labor will not change, and thus the real wages of Northern workers will not change either. In other words, an across-the-board productivity increase in the South in this multigood model has the same effect on Northern living standards as productivity growth had in the one-good model: none at all.

It seems, then, that the effect of Third World growth on the First World, which was negligible in our simplest model, becomes unpredictable once we make the model more realistic. There are, however, two points worth noting.

First, the way in which growth in the Third World can hurt the First World is very different from the way it is described in the Schwab letter or the Delors White Paper. Third World growth does not hurt the First World because wages in the Third World stay low but because they rise and therefore push up the prices of exports to advanced countries. That is, the United States may be threatened when South Korea gets better at producing automobiles, not because the United States loses the automobile market, but because higher South Korean wages mean that U.S. consumers pay more for the pajamas and toys that they were already buying from South Korea.

Second, this potential adverse effect should show up in a readily measured economic statistic: the terms of trade, or the ratio of export to import prices. For example, if U.S. companies are forced to sell goods more cheaply on world markets because of foreign competition or are forced to pay more for imports because of competition for raw materials or a devalued dollar, real income in the United States will fall. Because exports and imports are about 10% of GNP, each 10% decline in the U.S. terms of trade reduces U.S. real income by about 1%. The potential damage to advanced economies from Third World growth rests on the possibility of a decline in advanced country terms of trade. But that hasn’t happened. Between 1982 and 1992, the terms of trade of the developed market economies actually improved by 12%, largely as a result of falling real oil prices.

In sum, a multigood model offers more possibilities than the simple one-good model with which we began, but it leads to the same conclusion: productivity growth in the Third World leads to higher wages in the Third World, end of story.

Model 3: Capital and International Investment

Let’s move a step closer to reality and add another input to our model. What changes if we now imagine a world in which production requires both capital and labor? From a global point of view, there is one big difference between labor and capital: the degree of international mobility. Although large-scale international migration was a major force in the world economy before 1920, since then all advanced countries have erected high legal barriers to economically motivated immigration. There is a limited flow of very highly skilled people from South to North – the notorious “brain drain” – and a somewhat larger flow of illegal migration. But most labor does not move internationally.

In contrast, international investment is a highly visible and growing influence on the world economy. During the late 1970s, many banks in advanced countries lent large sums of money to Third World countries. This flow dried up in the 1980s, the decade of the debt crisis, but considerable capital flows resumed with the emerging-markets boom that began after 1990.

Many of the fears about Third World growth seem to focus on capital flows rather than trade. Schwab’s fear that there will be a “massive redeployment of productive assets” presumably refers to investment in the Third World. The famous estimate by the Economic Policy Institute that NAFTA would cost 500,000 U.S. jobs was based on a completely hypothetical scenario about diversion of U.S. investment. Even Labor Secretary Robert Reich, at the March 1994 job summit in Detroit, attributed the employment problems of Western economies to the mobility of capital. In effect, he seemed to be asserting that First World capital now creates only Third World jobs. Are those fears justified?

The short answer is yes in principle but no in practice. As a matter of standard textbook theory, international flows of capital from North to South could lower Northern wages. The actual flows that have taken place since 1990, however, are far too small to have the devastating impacts that many people envision.

To understand how international investment flows could pose problems for advanced-country labor, we must first realize that the productivity of labor depends in part on how much capital it has to work with. As an empirical matter, the share of labor in domestic output is very stable. But if labor has less capital at its disposal, productivity and thus real wage rates will fall.

Suppose, then, that Third World nations become more attractive than First World nations for First World investors. This might be because a change in political conditions makes such investments seem safer or because technology transfer raises the potential productivity of Third World workers (once they are equipped with adequate capital). Does this hurt First World workers? Of course. Capital exported to the Third World is capital not invested at home, so such North-South investment means that Northern productivity and wages will fall. Northern investors presumably earn a higher return on these investments than they could have earned at home, but that may offer little comfort to workers. […]

How much capital has been exported from advanced countries to developing countries? During the 1980s, there was essentially no net North-South investment – indeed, interest payments and debt repayments were consistently larger than the new investment. All the action, then, has taken place since 1990. In 1993, the peak year of emerging-markets investment so far, capital flows from all advanced nations to all newly industrializing countries totaled about $100 billion.

That may sound very high, but compared with the First World economy, it isn’t. Last year, the combined GNPs of North America, Western Europe, and Japan totaled more than $18 trillion. Their combined investment was more than $3.5 trillion; their combined capital stocks were about $60 trillion. The record capital flows of 1993 diverted only about 3% of First World investment away from domestic use and reduced the growth in the capital stock by less than 0.2%. The entire emerging-market investment boom since 1990 has reduced the advanced world’s capital stock by only about 0.5% from what it would otherwise have been.

How much pressure has this placed on wages in advanced countries? A reduction of the capital stock by 1% reduces productivity by less than 1%, since capital is only one input; standard estimates put the number at about 0.3%. A back-of-the-envelope calculation therefore suggests that capital flows to the Third World since 1990 (and bear in mind that there was essentially no capital flow during the 1980s) have reduced real wages in the advanced world by about 0.15% – hardly the devastation that Schwab, Delors, or the Economic Policy Institute presume.
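The back-of-the-envelope calculation in the last two paragraphs can be laid out explicitly. All inputs are the article's estimates, including the standard 0.3 elasticity of output with respect to capital:

```python
# Article's estimates for the advanced economies, 1993.
capital_flows_1993 = 100e9          # peak-year North-to-South capital flows
first_world_investment = 3.5e12     # combined First World investment
first_world_capital_stock = 60e12   # combined First World capital stocks

investment_diverted = capital_flows_1993 / first_world_investment        # ~3%
stock_growth_reduction = capital_flows_1993 / first_world_capital_stock  # <0.2%

cumulative_stock_reduction = 0.005  # article's estimate for all flows since 1990
output_elasticity_of_capital = 0.3  # standard estimate of capital's contribution
wage_effect = cumulative_stock_reduction * output_elasticity_of_capital  # ~0.15%
```

Even using the record 1993 flows, the implied drag on First World wages is about one-seventh of one percent.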

There is another way to make the same point. Anything that draws capital away from business investment in the advanced countries tends to reduce First World wages. But investment in the Third World has become considerable only in the last few years. Meanwhile, there has been a massive diversion of savings into a purely domestic sink: the budget deficits run up by the United States and other countries. Since 1980, the United States alone has run up more than $3 trillion in federal debt, more than ten times the amount invested in emerging economies by all advanced countries combined. The export of capital to the Third World attracts a lot of attention because it is exotic, but the amounts are minor compared with domestic budget deficits.

At this point, some readers may object that one cannot compare the two numbers. Savings absorbed by the federal budget deficit simply disappear; savings invested abroad create factories that make products that then compete with ours. It seems plausible that overseas investment is more damaging than budget deficits. But that intuition is wrong: investing in Third World countries raises their productivity, and we’ve seen in the first two models that higher Third World productivity per se is unlikely to lower First World living standards.

The conventional wisdom among many policymakers and pundits is that we live in a world of incredibly mobile capital and that such mobility changes everything. But capital isn’t all that mobile, and the capital movements we have seen so far change very little, at least for advanced countries.

Model 4: The Distribution of Income

We seem to have concluded that growth in the Third World has almost no adverse effects on the First World. But there is still one more issue to address: the effects of Third World growth on the distribution of income between skilled and unskilled labor within the advanced world.

For our final model, let’s add one more complication. Suppose that there are two kinds of labor, skilled and unskilled. And suppose that the ratio of unskilled to skilled workers is much higher in the South than in the North. In such a situation, one would expect the ratio of skilled to unskilled wages to be lower in the North than in the South. As a result, one would expect the North to export skill-intensive goods and services (that is, goods whose production employs a high ratio of skilled to unskilled labor), while the South exports goods whose production is intensive in unskilled labor.

What is the effect of this trade on wages in the North? When two countries exchange skill-intensive goods for labor-intensive goods, they indirectly trade skilled for unskilled labor; the goods that the North ships to the South “embody” more skilled labor than the goods the North receives in return. It is as if some of the North’s skilled workers migrated to the South. Similarly, the North’s imports of labor-intensive products are like an indirect form of low-skill immigration. Trade with the South in effect makes Northern skilled labor scarcer, raising the wage it can command, while it makes unskilled labor effectively more abundant, reducing its wage.

Increased trade with the Third World, then, while it may have little effect on the overall level of First World wages, should in principle lead to greater inequality in those wages, with a higher premium for skill. Equally, there should be a tendency toward “factor price equalization,” with wages of low-skilled workers in the North declining toward Southern levels.

What makes this conclusion worrisome is that income inequality has been rapidly increasing in the United States and to a lesser extent in other advanced nations. Even if Third World exports have not hurt the average level of wages in the First World, might they not be responsible for the steep declines since the 1970s in real wages of unskilled workers in the United States and the rising unemployment rates of European workers?

At this point, the preponderance of the evidence seems to be that factor price equalization has not been a major element in the growing wage inequality in the United States, although the evidence is more indirect and less secure than the evidence we brought to our earlier models. In essence, trade with the Third World is just not that large. Since trade with low-wage countries is only a little more than 1% of GDP, the net flows of labor embodied in that trade are fairly small compared with the overall size of the labor force.

More careful research may lead to larger estimates of the effect of North-South trade on the distribution of wages, or future growth in that trade may have larger effects than we have seen so far. At this point, however, the available evidence does not support the view that trade with the Third World is an important part of the wage inequality story.

Moreover, even to the extent that North-South trade may explain some of the growing inequality of earnings, it has nothing to do with the disappointing performance of average wages. Before 1973, average compensation in the United States rose at an annual rate of more than 2%; since then it has risen at a rate of only 0.3%. This decline is at the heart of our economic malaise, and Third World exports have nothing to do with it.

5. The Illusion of Conflict in International Trade

Who Is Right?

The World Competitiveness Report puts that threat starkly: “Today, the so-called industrialized nations employ 350 million people who are paid an average hourly wage of $18. However, during the past ten years, the world economy gained access to large and populated countries, such as China, the former Soviet Union, India, Mexico, etc. Altogether, it can be estimated that a labour force of some 1,200 million people has thus become reachable, at an average hourly cost of $2, and in many regions, under $1 […]

This offers a clear and compelling vision. Low-wage nations are now able to attract capital and technology from the advanced world. As a result, they can achieve productivity close to Western levels, while paying much lower wages. The result seems obvious: the low-wage countries will run huge trade surpluses, creating either large-scale unemployment or sharply falling wages in the erstwhile high-wage nations.

Sounds persuasive, doesn’t it? There’s only one problem: it is a vision that quite literally makes no sense. The reason lies in a basic fact of accounting, perhaps the most essential equation in international economics:

Savings - Investment = Exports – Imports

This is not a hypothetical theory: it is an unavoidable accounting identity, a statement of an adding-up constraint that any consistent story about any economy must honor. And yet it is an equation that the story in the World Competitiveness Report clearly violates.

Consider that story again. It asserts that capital will move from Western nations to low-wage countries – that is, that those nations will be able to invest more than their domestic savings because foreign capital will also be investing there. So for these economies the left-hand side of the equation is negative: investment exceeds savings. At the same time, it asserts that low-wage countries will export much more than they import, “deindustrializing” the advanced nations. So the right-hand side is… positive?
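The contradiction can be made concrete with a two-line check; the savings and investment figures below are hypothetical, chosen only to represent a country absorbing foreign capital:

```python
# A minimal sketch of the accounting identity above, with
# hypothetical numbers (illustrative, not real data).

def trade_balance(savings: float, investment: float) -> float:
    """Savings - Investment = Exports - Imports, by identity."""
    return savings - investment

# The Report's story: a low-wage country attracts foreign capital,
# so its investment exceeds its domestic savings.
savings, investment = 100.0, 130.0   # hypothetical, in billions

balance = trade_balance(savings, investment)
print(balance)  # -30.0: a trade DEFICIT of 30

# The same story claims the country also runs a large trade surplus,
# but the identity forces the balance to be negative whenever
# investment > savings -- the two claims cannot both hold.
assert balance < 0
```

The point is not that one number is wrong; it is that any story asserting both a capital inflow and a trade surplus for the same country violates the adding-up constraint.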

When I have tried to explain this problem to people who find the story about low-wage competition persuasive, their first reaction is to ask what alternative story I propose. The obvious answer is that as capital and technology flow to low-wage nations, their wage rates will rise along with their productivity. As a result they will not run huge trade surpluses with advanced nations; indeed, they will run deficits, as the counterpart to the capital inflows. The usual reaction to this is that it is implausible, and that it is a typical economist’s assertion that markets will always do the right thing. I then ask what the questioner proposes; he replies that he believes that low-wage countries will run big trade surpluses. “So you think that low-wage countries are going to export large quantities of capital to high-wage nations?” At this point the conversation gets unpleasant, with some remark about this kind of thing being the reason why people hate economists.

It might also be worth noting that in these arguments people often bring in the observation that when multinational corporations have opened plants in low-wage countries, they often achieve near-First-World productivity but continue to pay Third World wages. The economist’s answer to this is that it is exactly what one should expect: wage rates should reflect average national productivity, not productivity in a particular factory; if only a few modern factories have opened in a country, they will not raise that country’s average productivity by much and should therefore not be expected to pay high wages. (And of course a country with low overall productivity that is able to achieve near-U.S. productivity in a few goods will tend to export those goods; it’s called comparative advantage). But no matter how much one tries to explain that this outcome is exactly what the standard model predicts, it seems to be viewed as somehow a decisive rejection of the economist’s optimism about the trade balance.

6. Myths and Realities of U.S. Competitiveness

Myths of Competitiveness

The classic analysis of the equilibrating forces in international trade is more than two centuries old. David Hume, living in a world in which precious metals were still the principal medium of exchange, pointed out that a country that had for some reason become uncompetitive, and as a result was importing more than it exported, would suffer a steady drain of gold and silver coins. This fall in the money supply, however, would lead to a fall in the level of prices and wages in that country; eventually goods and labor would become sufficiently cheap in the deficit nation that its goods would again become attractive to buyers, and the trade deficit would be corrected.
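Hume’s adjustment mechanism can be caricatured in a few lines of code; the starting figures and the 20%-per-year adjustment speed below are arbitrary, standing in only for the price response he described:

```python
# Toy caricature of Hume's price-specie-flow mechanism. The numbers
# and the 20%-per-year adjustment speed are arbitrary illustrations.

money = 100.0         # deficit country's stock of gold and silver coin
trade_deficit = 10.0  # imports minus exports, per year

years = 0
while trade_deficit >= 0.1:
    money -= trade_deficit  # coin drains abroad to pay for net imports
    # A smaller money stock means lower prices and wages, which make
    # the country's goods more attractive and shrink the deficit.
    trade_deficit *= 0.8
    years += 1

print(years)  # 21 with these illustrative numbers
assert money > 0 and trade_deficit < 0.1
```

However fast or slow the price response, the direction is the same: the drain of specie is self-limiting, and the deficit corrects itself.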

In the modern world the adjustment process is more complex and less automatic. In a world of national currencies no longer backed by gold, deficit countries usually adjust by depreciating their currencies rather than by letting wages and prices fall. Also, international capital movements have as their counterpart trade imbalances: A country that is able to attract an inflow of foreign capital will (as a matter of sheer accounting identity) also run a trade deficit, whereas a country that is exporting capital will run a surplus. Nonetheless, over the long term, major industrial countries show a strong tendency toward equality of imports and exports, regardless of their productivity and technological performance. Table 6.1 shows the balance on current account (a broad definition of trade in goods and services) of the three major industrial countries as a percentage of their national incomes for selected time periods. The average imbalances over the long term are quite small. During the mid-1980s large imbalances emerged, attributed by many economists to the unprecedented U.S. budget deficit and other special factors. By early 1991 about half of this divergence had again been eliminated (due in large part to a sharp rise of the dollar value of the yen and the mark), and the United States in particular was experiencing a broad-based export recovery.

Suppose that a country lags behind other nations in productivity. The equilibrating forces first noticed by Hume ensure that it will nonetheless be able to find a range of goods and services to export. But what will it export? The answer, pointed out by David Ricardo in 1817, is that a country whose productivity lags that of its trading partners in all or almost all industries will export those goods in which its productivity disadvantage is smallest. In the standard terminology of international economics, a country will always find a range of goods in which it has a “comparative advantage” even if there are no goods in which it has an “absolute advantage.”

The classic empirical example of the principle of comparative advantage at work comes from the early post-war comparison of Britain and the United States. At that time, British productivity was far less than that of the United States – labor productivity in manufacturing was below U.S. levels in all major industries, and on average was less than half the U.S. level. The British economy, however, was much more dependent on foreign trade, and therefore was obliged to generate approximately the same dollar value of export earnings. If one looks at the comparative pattern of exports, one sees a clear picture of comparative advantage at work. Figure 6.1, plotted from data for a set of 22 industries, shows that there is a clear-cut association between relative productivity and relative exports. U.S. productivity was higher in all cases; but only in industries in which U.S. productivity was more than about 2.5 times U.K. productivity did the United States have larger exports. That is, Britain did not have an absolute advantage in anything, but it had a comparative advantage in those goods in which its productivity exceeded 40% of the U.S. level.

Britain’s ability to outsell the United States in industries in which its productivity was inferior depended, of course, on the fact that British workers were paid less than U.S. workers – a pay differential that was greatly widened by the 1949 devaluation of the pound from $4.80 to $2.80. A common reaction to this observation, and to such events as the recovery of U.S. exports that followed the decline in the dollar between 1985 and 1988, is that coping with international competition by lowering relative wages must lower a country’s living standards. Ricardo’s 1817 discussion of comparative advantage showed, however, that trade between two nations ordinarily raises the standard of living of both, even if one must compete on the basis of low wages.

We may see this point with a hypothetical example, similar to one introduced by Ricardo. Imagine a world in which the United States and Britain are the only trading countries and that there are only two goods, wool and aircraft. Suppose also that labor is the only input into production, and that U.S. labor is more productive than British in both. The U.S. advantage is, however, much more pronounced in aircraft. Table 6.2 illustrates a hypothetical set of productivity numbers.

Clearly, if these two countries are going to be able to sell goods to each other, the U.S. wage rate must be at least 1.5 times that of Britain – otherwise both goods would be cheaper to produce in America – but no more than 6 times as high. The actual wage rate would depend on demand conditions and the relative size of the economies, but let us simply suppose that the relative wage rate is 3. At that wage rate, wool would be cheaper to produce in Britain, which would therefore export it, whereas aircraft would be cheaper to produce in the United States. If prices are proportional to labor cost, one unit of wool, which requires one-half unit of British labor, would trade for one unit of aircraft, which requires one-sixth unit of the more expensive U.S. labor.

Now we ask, “Is Britain better or worse off trading with the United States, on the basis of a wage rate only one-third as high, than it would be in the absence of trade?” The answer is that it is better off. In the absence of trade, it would take one unit of British labor to produce one unit of aircraft. By trading with America, Britain can acquire an aircraft by trading a unit of wool for it, which requires the use of only one-half unit of labor. That is, the opportunity to trade with America raises the purchasing power of British labor.
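A minimal sketch of this two-good example, assuming the unit labor requirements implied by the text (Britain: one-half unit of labor per unit of wool and one per aircraft; the United States: one-third and one-sixth):

```python
# Ricardo's two-good example in code. Labor requirements are inferred
# from the text; exact fractions avoid any floating-point noise.

from fractions import Fraction as F

# Units of labor needed to produce one unit of each good.
labor_req = {
    "UK": {"wool": F(1, 2), "aircraft": F(1)},
    "US": {"wool": F(1, 3), "aircraft": F(1, 6)},
}

relative_wage = F(3)  # U.S. wage / U.K. wage, as assumed in the text
wage = {"UK": F(1), "US": relative_wage}

# Unit cost of each good in each country, in British wage units.
cost = {c: {g: wage[c] * req for g, req in reqs.items()}
        for c, reqs in labor_req.items()}

for good in ("wool", "aircraft"):
    cheaper = min(cost, key=lambda c: cost[c][good])
    print(good, "is produced more cheaply in", cheaper)
# wool is produced more cheaply in UK
# aircraft is produced more cheaply in US

# Britain's gain from trade: at these prices one aircraft trades for
# one unit of wool, i.e. 1/2 unit of British labor -- versus the full
# unit of labor needed to build an aircraft at home.
wool_per_aircraft = cost["US"]["aircraft"] / cost["UK"]["wool"]
labor_via_trade = wool_per_aircraft * labor_req["UK"]["wool"]
print(labor_via_trade, "<", labor_req["UK"]["aircraft"])  # 1/2 < 1
```

Trying any relative wage below 1.5 or above 6 in this sketch makes one country the cheaper producer of both goods, reproducing the wage band stated above.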

This is a grossly simplified example, but it makes a crucial point. A country that is less productive than its trading partners across the board will be forced to compete on the basis of low wages rather than superior productivity. But it will not suffer catastrophe, and indeed will normally still benefit from international trade. The point is that international trade, unlike competition among businesses for a limited market, is not a zero-sum game in which one nation’s gain is another’s loss. It is a positive-sum game, which is why the word “competitiveness” can be dangerously misleading when applied to international trade.

Although this is a crucial point to appreciate, it is also important to understand what the example has and has not demonstrated. Returning to our thought experiment, we have not shown that the United States, with its 1% annual productivity growth, is as well off as it would be if it shared the rest of the world’s 4% growth; clearly, it is not. Nor have we even shown that the United States is better off with the rest of the world growing at 4% than at 1%. In fact, it could be either better or worse off; this depends on details, specifically on whether rest-of-world growth is biased toward goods the U.S. exports (in which case the United States is hurt) or toward goods that the United States imports (in which case the United States is helped) [5]. All that we have shown is that low productivity does not pose a worse problem for a country that is engaged in international trade than for one that is not. Britain in 1950 had a productivity problem (and still does); the negative impact of that problem on Britain’s standard of living, however, was no greater, and in fact less, because Britain was a trading nation rather than a self-sufficient society.

We should also note that the discussion here has so far omitted a factor that is critical in the real-world politics of international trade: income distribution. Changes in international trading patterns often have strong effects on the distribution of income within countries, so that even a generally beneficial change produces losers as well as winners (at least in the short run). If foreigners are willing to sell us high-quality goods cheaply, that is a good thing for most of us, but a bad thing for the domestic industry that competes with the imports. This observation cuts both ways. On one side, economists sometimes blithely speak of the benefits of free trade, ignoring the sometimes substantial costs of adjustment. On the other hand, much opposition to free trade represents special interest pleading, and an appeal to the need for “competitiveness” is often used as a cloak for narrow self-interest.

Realities of Competitiveness

The discussion so far seems to suggest that competitiveness, if it means anything, is a non-issue: Even unproductive countries have a range of goods in which they have a comparative advantage, and more or less automatic forces will always ensure that a country is competitive in industries in which it has a comparative advantage. Yet we should not be too quick to dismiss the idea that there is some real problem to which concerns about competitiveness are a response. For in the discussion above I have made an implicit assumption that is clearly untrue in some instances: that countries’ comparative advantages determine their pattern of trade, rather than the other way around.

Much international trade is driven by enduring national differences in resources, climate, and society. Brazil is a coffee exporter because of soil and climate, Saudi Arabia an oil exporter because of geology, Canada a wheat exporter because of the abundance of land relative to labor, and so on. Trade in manufactured goods among advanced industrial countries, however, particularly in more sophisticated products, is harder to explain [6]. In many cases industries seem to create their own comparative advantage, through a process of positive feedback.

The process through which comparative advantage can be created is illustrated in Fig. 6.2. Suppose that a country has for whatever reason established a strong presence in a particular industry. Then this presence may produce what in standard terminology are called “external economies” that reinforce the industry’s strength. External economies come in two main variants. So-called technological external economies involve the spillover of knowledge between firms: to the extent that firms can learn from each other, a strong national industry can give rise to a national knowledge base that reinforces the industry’s advantage. Pecuniary external economies depend on the size of the market: a strong domestic industry offers a large market for specialized labor and suppliers, and the availability of a flexible labor pool and an efficient supplier base reinforces the industry’s strength.

When external economies are powerful, international specialization can have a strong arbitrary quality. During an industry’s formative years, or during a transitional period when shifts in technology or markets have invalidated existing patterns of advantage, a country may establish a lead in an industry due to historical accident or government support. Once this lead is established, it becomes self-reinforcing and tends to persist.

The importance of external economies is obvious in interregional specialization within the United States. Such famous industry clusters as Silicon Valley and Route 128, as well as less well-known examples like the cluster of carpet manufacturers around Dalton, Georgia (or the insurance cluster in Hartford, Connecticut) clearly reflect the self-reinforcing effects of success rather than underlying resources. International examples include Swiss watches, Italian ceramic tiles, and the role of London as a financial center.

It is probably true that external economies are a more important determinant of international trade in high-technology sectors than elsewhere, although they are by no means restricted to high tech. There is some dispute over whether the basis of international trade has shifted away from traditional comparative advantage toward created advantage. What is definitely true is that although the idea of external economies is an old one, going back to Marshall, recent developments in the analysis of international trade have placed increasing emphasis on the role of history, accident, and government policy in producing trade patterns.

The proposition that comparative advantage may be created rather than exogenously given somewhat qualifies the generally benign picture of international competition given in the first part of this paper. It suggests that under some circumstances countries may lose, or fail to establish, industries in which in the long run they might have been able to acquire a comparative advantage. This, in turn, provides a potential case for government intervention.

The traditional version of this line of reasoning is the infant industry argument for developing countries. Countries new to industrialization, the argument goes, face established competitors who already have the knowledge base, suppliers, and specialized skills in industries where these are important. Absent government intervention, the new entrants will therefore find themselves producing only goods in which external economies are unimportant, and will be stuck with permanently lower wages. By promoting targeted industries, they can in principle escape from this trap.

The new version of the argument involves established countries but new industries. Let us set up an exaggerated case, bearing in mind that it overstates the reality. Suppose that the United States trades with Japan and that Japan systematically promotes new high-technology industries as they emerge. This promotion may take the form of government subsidy, but it can also take the form of explicit or implicit protection of the domestic market, which both denies U.S. firms an important market and ensures Japanese firms of sales. Then, other things equal, Japan will tend to establish a competitive advantage in emerging high-technology sectors. This will not be catastrophic for the United States: the principle of comparative advantage still applies, and the United States will still find a range of goods to export. It will, however, increasingly be forced to compete on the basis of low wages rather than high productivity.

This story bears enough resemblance to reality to touch some raw nerves. Japan does not engage in extensive subsidy to industry, and on paper its markets are quite open to imports of manufactured goods. In practice, however, as indicated in Table 6.3, the Japanese market for high-technology goods has remained a virtually closed preserve for Japanese firms, whereas such markets have become increasingly internationalized not only in the United States but also in Europe.

This, then, is the real competitiveness issue: The possibility that international competition will exclude the United States from some industries in which it could or should have had a comparative advantage. Having identified this as a valid argument, we need to offer some strong warnings against overuse.

First, although government subsidy and unequal access to markets have surely played an important role in determining the outcome of international competition in a few industries, they are unlikely to be the major explanation of disappointing U.S. economic performance. Most of the output of the U.S. economy is not traded internationally: in 1990, imports and exports were only 13 and 12.3% of gross national product, respectively. Furthermore, as Table 6.4 shows, since 1980 the United States has actually experienced a striking revival of productivity growth in manufacturing, which is precisely the sector most exposed to international competition. To the extent that the United States continues to perform poorly compared with other major industrial nations, this has a great deal to do with a low national savings rate, low spending on R&D, and low-quality basic education. Failure to create advantage is at best a contributing factor.

Second, the national pursuit of competitive advantage should not be unrestrained, because unilateral pursuit of advantage can work to everyone’s disadvantage. For example, the United Kingdom undoubtedly derives significant benefits from London’s role as the financial capital of Europe, benefits that would be lost if that capital were in, say, Frankfurt instead. Yet Europe as a whole would almost surely be worse off if nationalistic policies led to a fragmented financial system divided among Frankfurt, Paris, Milan, and London. That is, it is better for the British that the City be in Britain rather than elsewhere; but it is in the common interest that there be a City (or a Silicon Valley or Route 128) somewhere, so that the advantages of such a cluster’s external economies can be realized.

Finally, competitiveness is one of those issues, like national defense, that can easily be used as a patriotic cloak for special interest politics. The infant industry argument, mentioned above, is intellectually impeccable. In practice, however, it has been used in many developing countries to justify policies that maintain highly inefficient industries and generate large economic benefits for a politically influential elite. The risks of a similar misuse of intellectually legitimate concerns about U.S. competitiveness mean that arguments for a more nationalistic trade policy, while they should not be dismissed out of hand, need to be treated with caution.

5. For analysis of the effects of foreign growth on domestic welfare, see H. Johnson, Manch. Sch. Econ. Stud. 23, 95 (1955). Any adverse effects would come through a worsening of the terms of trade, that is, the price of exports relative to that of imports. Excluding oil and agricultural goods, U.S. terms of trade have in fact shown a slight downward trend, but the trend is too small to have a significant negative effect on U.S. welfare [R. Lawrence, Brookings Pap. Econ. Activity 2: 1990, 343 (1990)].

6. Most trade in manufactured goods among industrial countries consists of “intra-industry” trade, that is, exchange of goods that seem to be produced using similar ratios of capital to labor and of skilled to unskilled workers. Thus it is difficult to explain the pattern of comparative advantage among industrial countries by differences in their resource mixes, which are in any case quite similar [H. Grubel and P. Lloyd, Intra-Industry Trade (Wiley, New York, 1975); E. Helpman, J. Jpn. Int. Econ. 1, 62 (1987)].

What’s going to happen if these countries begin to trade with each other? The answer obviously depends on the ratio of their wage rates. If the Mexican wage is too high, almost everything will be more cheaply produced by the more productive American workers. If, on the other hand, Mexican wages are low enough, most goods will be cheaper to produce in Mexico. But relative wage rates do not fall from the sky; they are determined in the marketplace. And they will therefore tend to settle somewhere in the middle, at a level at which each country has a range of goods that it can produce more cheaply. In the useful jargon of international trade theorists, America has an absolute advantage in producing just about everything, but each country has a range of goods in which it has a comparative advantage.

Both sides may complain about the resulting pattern of trade. Mexicans will lament that they are able to compete only on the basis of low wages; Americans will worry that their standard of living will be dragged down by the necessity of competing with cheap Mexican labor. In fact, however, in our example, trade raises real incomes in both countries. Each country imports only those goods in which the other country’s relative productivity is higher than its relative wage, implying that the imports cost less in terms of the importing country’s labor than what would be required to produce them at home. That is, each country is better off specializing in producing a limited range of goods and importing the rest than it would be if it cut itself off from trade – and this is true regardless of the relative wage rates in the two countries.
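The way market-determined wages settle “somewhere in the middle” can be sketched with hypothetical productivity ratios; the goods and numbers below are invented, not drawn from any table in the text:

```python
# Hypothetical U.S. productivity advantage over Mexico, by good.
ratios = {"aircraft": 8.0, "machinery": 5.0, "autos": 3.0,
          "apparel": 1.5, "produce": 1.2}

def exporters(relative_wage: float) -> dict:
    """Which country produces each good more cheaply at a given
    U.S./Mexico relative wage? The U.S. wins wherever its productivity
    advantage exceeds its wage disadvantage."""
    return {g: ("US" if r > relative_wage else "Mexico")
            for g, r in ratios.items()}

# A wage ratio inside the band of productivity ratios splits the goods.
print(exporters(4.0))
# {'aircraft': 'US', 'machinery': 'US', 'autos': 'Mexico',
#  'apparel': 'Mexico', 'produce': 'Mexico'}

# Outside the band, one country undersells the other in everything --
# which cannot be an equilibrium, so market wages settle in between.
assert set(exporters(10.0).values()) == {"Mexico"}
assert set(exporters(1.0).values()) == {"US"}
```

The equilibrium relative wage is pinned down by demand and market size, but it must lie strictly inside the band, so each country always ends up with a range of goods it exports.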

8. What Do Undergrads Need to Know about Trade?

II Common Misconceptions

2. “Competing in the world marketplace:” One of the most popular, enduring misconceptions of practical men is that countries are in competition with each other in the same way that companies in the same business are in competition. Ricardo already knew better in 1817. An introductory economics course should drive home to students the point that international trade is not about competition, it is about mutually beneficial exchange. Even more fundamentally, we should be able to teach students that imports, not exports, are the purpose of trade. That is, what a country gains from trade is the ability to import things it wants. Exports are not an objective in and of themselves: the need to export is a burden that a country must bear because its import suppliers are crass enough to demand payment.

6. “A new partnership:” The bottom line for many pop internationalists is that since U.S. firms are competing with foreigners instead of each other, the U.S. government should turn from its alleged adversarial position to one of supporting our firms against their foreign rivals. A more sophisticated pop internationalist like Robert Reich (1991) realizes that the interests of U.S. firms are not the same as those of U.S. workers (you may find it hard to believe that anyone needed to point this out, but among pop internationalists this was viewed as a deep and controversial insight), but still accepts the basic premise that the U.S. government should help our industries compete.

What we should be able to teach our students is that the main competition going on is one of U.S. industries against each other, over which sector is going to get the scarce resources of capital, skill, and, yes, labor. Government support of an industry may help that industry compete against foreigners, but it also draws resources away from other domestic industries. That is, the increased importance of international trade does not change the fact that the government cannot favor one domestic industry except at the expense of others.

Now there are reasons, such as external economies, why a preference for some industries over others may be justified. But this would be true in a closed economy, too. Students need to understand that the growth of world trade provides no additional support for the proposition that our government should become an active friend to domestic industry.

9. Challenging Conventional Wisdom

The Perils of Success in the Developing World

In between are a number of countries that did very badly during much of the 1980s but are doing much better recently. In this class are Chile, Argentina, and, of course, Mexico.

Why did these countries do so much better in the last few years? The short answer is, of course, debt reduction and policy reform; but that’s too short an answer, because it fails to reveal the strangeness of the process and some of the weaknesses involved.

In Mexico there was a dramatic trade liberalization between 1985 and 1989. The fraction of imports subject to licenses fell from more than 90 to less than 25 percent, the maximum tariff was cut by 3/4, and even the average tariff fell by half. Add in the wave of privatization, and one has a major economic reform.

Why were such reforms politically possible? It is clear that the conventional wisdom played a crucial role. If trade liberalization is presented as a detailed, microeconomic policy, the industries that stand to lose will be well-informed and vociferous in their opposition, while those who stand to gain will be diffuse and usually ineffective. What reformers in a number of countries were able to do, however, was to present trade liberalization as part of a package that was presumed to yield large gains to the country as a whole. That is, it wasn’t presented as “Let’s open up imports in these 20 industries and there will be efficiency gains”; that kind of argument doesn’t work very well in ordinary times. Instead, it was “We have to follow the strategy that everyone serious knows works: free markets including free trade and sound money, leading to rapid economic growth.” It is a unified package, and it has been adopted by countries where one would have thought such change was impossible.

The packages have also, by and large, worked – if anything, worked too well.

Let’s consider Mexico. The turning point for Mexico was the debt-reduction package negotiated under the Brady Plan. That debt reduction was intelligently handled: Mexico negotiated effectively and toughly with its creditors, and the mechanism of debt reduction was a good one. (In fact, I give Mexico’s debt negotiators credit not just for making a good deal for themselves but for saving the whole Brady Plan. The original U.S. plan was confused and unworkable; it was Mexico that devised an intelligent way of reducing debt without giving banks a windfall, providing a blueprint for subsequent debt deals.)

Everyone realized, however, that the actual debt relief under the Mexican debt package was fairly small. It was not nearly enough to make much direct difference to Mexico’s growth prospects.

And yet what actually followed the debt reduction was a transformation of the economic picture. With stunning speed, Mexico’s problems seemed to melt away. Internal real interest rates were 30-40 percent before the debt deal, with the payments on internal debt a major source of fiscal pressure; they fell to 5-10 percent almost immediately. Mexico had been shut out of international financial markets since 1982; soon after the debt deal, large-scale voluntary capital inflows resumed on an ever-growing scale. And, of course, growth resumed in the long-stagnant economy.

Why did a seemingly modest debt reduction spark such a major change in the economic environment? I think we all know the answer: international investors saw the debt deal as part of a package of reforms that they believed would work. Debt reduction went along with free markets and sound money; free markets and sound money mean prosperity; and so capital flows into the country that follows the right path.

10. The Uncomfortable Truth about NAFTA

The Gains from NAFTA

NAFTA will neither create nor destroy jobs, but it will make the existing North American labor force slightly more productive. No serious study – defined as a study by someone whose mind could conceivably have been changed by the evidence – has failed to find that NAFTA will produce a small net gain for the United States. This benefit will come from the usual sources of gains from international trade. First, each country will tend to increase its output in industries in which it is relatively productive, raising the efficiency of the North American economy as a whole. Second, larger markets will allow for better exploitation of economies of scale. Finally, the larger market will lead to greater competition, reducing the inefficiency associated with monopoly power.

The operative word, however, is “small.” Few studies indicate that NAFTA could add much more than 0.1 percent to U.S. real income.

Why are the gains so small? First, the United States and Mexico have already moved most of the way to free trade in advance of NAFTA; the agreement does not do all that much more to integrate markets. Second, Mexico’s economy is so small – its GDP is less than four percent that of the United States – that for the foreseeable future it will be neither a major supplier nor a major market.

The gains to Mexico from NAFTA are, not surprisingly, much larger as a percentage of that country’s national income, if only because the Mexican economy is so much smaller to start with. One recent estimate is representative: it finds that the dollar value of gains from NAFTA will be roughly equally divided between the United States and Mexico (about $6 billion each annually). But this represents a gain of only a little more than 0.1 percent of U.S. GDP, compared with more than 4 percent for Mexico.

NAFTA and Low-wage U.S. Workers

When a country with a highly skilled labor force increases its trade with a country in which skill is at a greater premium, it can expect a decline in the real wages of its own unskilled workers. As a matter of economic principles, we should expect to see at least some adverse impact of NAFTA on the wages of American manual workers.

All the evidence suggests, however, that this effect will be extremely small. For one thing, since the existing barriers to trade between the United States and Mexico are already quite low, it is hard to see how removing them could have any dramatic effect on wage rates.

Moreover, while economic theory suggests that trade between the United States and Mexico should involve an exchange of skill-intensive for labor-intensive products, such a bias in trade against low-wage U.S. workers is surprisingly elusive in the actual trade data. Most notably, the widely cited study of NAFTA by Gary Hufbauer and Jeffrey Schott finds that U.S. industries that compete with imports from Mexico pay almost exactly the same average wage as industries that export to Mexico.

It’s worth pointing out that this lack of evidence that trade really does worsen American income distribution is not unique to the Mexican case. Two economists who expected to find a significant effect of trade on wages have concluded that virtually none of the growth in wage inequality in the United States since 1979 is due to international factors. A survey by Lawrence Katz reaches the same conclusion.

11. Asia’s Miracle

While the growth of communist economies was the subject of innumerable alarmist books and polemical articles in the 1950s, some economists who looked seriously at the roots of that growth were putting together a picture that differed substantially from most popular assumptions. Communist growth rates were certainly impressive, but not magical. The rapid growth in output could be fully explained by rapid growth in inputs: expansion of employment, increases in education levels, and, above all, massive investment in physical capital. Once those inputs were taken into account, the growth in output was unsurprising or, to put it differently, the big surprise about Soviet growth was that when closely examined it posed no mystery.

This economic analysis had two crucial implications. First, most of the speculation about the superiority of the communist system – including the popular view that Western economies could painlessly accelerate their own growth by borrowing some aspects of that system – was off base. Rapid Soviet economic growth was based entirely on one attribute: the willingness to save, to sacrifice current consumption for the sake of future production. The communist example offered no hint of a free lunch.

Second, the economic analysis of communist countries’ growth implied some future limits to their industrial expansion – in other words, implied that a naive projection of their past growth rates into the future was likely to greatly overstate their real prospects. Economic growth that is based on expansion of inputs, rather than on growth in output per unit of input, is inevitably subject to diminishing returns. It was simply not possible for the Soviet economies to sustain the rates of growth of labor force participation, average education levels, and above all the physical capital stock that had prevailed in previous years. Communist growth would predictably slow down, perhaps drastically.

Accounting for the Soviet Slowdown

It is a tautology that economic expansion represents the sum of two sources of growth. On one side are increases in “inputs:” growth in employment, in the education level of workers, and in the stock of physical capital (machines, buildings, roads, and so on). On the other side are increases in the output per unit of input; such increases may result from better management or better economic policy, but in the long run are primarily due to increases in knowledge.

The basic idea of growth accounting is to give life to this formula by calculating explicit measures of both. The accounting can then tell us how much of growth is due to each input – say, capital as opposed to labor – and how much is due to increased efficiency.

We all do a primitive form of growth accounting every time we talk about labor productivity; in so doing we are implicitly distinguishing between the part of overall national growth due to the growth in the supply of labor and the part due to an increase in the value of goods produced by the average worker. Increases in labor productivity, however, are not always caused by the increased efficiency of workers. Labor is only one of a number of inputs; workers may produce more, not because they are better managed or have more technological knowledge, but simply because they have better machinery. A man with a bulldozer can dig a ditch faster than one with only a shovel, but he is not more efficient; he just has more capital to work with. The aim of growth accounting is to produce an index that combines all measurable inputs and to measure the rate of growth of national income relative to that index – to estimate what is known as “total factor productivity.”2
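The growth-accounting logic described above can be sketched in a few lines of code: total factor productivity growth is output growth minus the weighted sum of input growth rates. The growth rates and income shares below are illustrative numbers of my own, not figures from the text:

```python
def tfp_growth(output_growth, input_growths, income_shares):
    """Solow-residual style accounting: TFP growth is output growth
    minus the income-share-weighted sum of input growth rates.
    All rates are annual fractions; income shares must sum to 1."""
    assert abs(sum(income_shares) - 1.0) < 1e-9
    weighted_inputs = sum(s * g for s, g in zip(income_shares, input_growths))
    return output_growth - weighted_inputs

# Illustrative (made-up) numbers: output grows 5%/yr, labour input 2%/yr,
# capital 8%/yr, with a 0.7 labour share and 0.3 capital share.
residual = tfp_growth(0.05, [0.02, 0.08], [0.7, 0.3])
print(f"TFP growth: {residual:.3%}")  # → TFP growth: 1.200%
```

With these numbers, only 1.2 of the 5 points of growth is "efficiency"; the rest is input accumulation, which is exactly the decomposition the Soviet studies performed.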

So far this may seem like a purely academic exercise. As soon as one starts to think in terms of growth accounting, however, one arrives at a crucial insight about the process of economic growth: sustained growth in a nation’s per capita income can only occur if there is a rise in output per unit of input.3

Mere increases in inputs, without an increase in the efficiency with which those inputs are used – investing in more machinery and infrastructure – must run into diminishing returns; input-driven growth is inevitably limited.

How, then, have today’s advanced nations been able to achieve sustained growth in per capita income over the past 150 years? The answer is that technological advances have led to a continual increase in total factor productivity – a continual rise in national income for each unit of input. In a famous estimate, MIT Professor Robert Solow concluded that technological progress has accounted for 80 percent of the long-term rise in U.S. per capita income, with increased investment in capital explaining only the remaining 20 percent.

When economists began to study the growth of the Soviet economy, they did so using the tools of growth accounting. Of course, Soviet data posed some problems. Not only was it hard to piece together usable estimates of output and input (Raymond Powell, a Yale professor, wrote that the job “in many ways resembled an archaeological dig”), but there were philosophical difficulties as well. In a socialist economy one could hardly measure capital input using market returns, so researchers were forced to impute returns based on those in market economies at similar levels of development. Still, when the efforts began, researchers were pretty sure about what they would find. Just as capitalist growth had been based on growth in both inputs and efficiency, with efficiency the main source of rising per capita income, they expected to find that rapid Soviet growth reflected both rapid input growth and rapid growth in efficiency.

But what they actually found was that Soviet growth was based on rapid growth in inputs – end of story. The rate of efficiency growth was not only unspectacular, it was well below the rates achieved in Western economies. Indeed, by some estimates, it was virtually nonexistent.

The immense Soviet efforts to mobilize economic resources were hardly news. Stalinist planners had moved millions of workers from farms to cities, pushed millions of women into the labor force and millions of men into longer hours, pursued massive programs of education, and above all plowed an ever-growing proportion of the country’s industrial output back into the construction of new factories. Still, the big surprise was that once one had taken the effects of these more or less measurable inputs into account, there was nothing left to explain. The most shocking thing about Soviet growth was its comprehensibility.

Paper Tigers

Consider, in particular, the case of Singapore. Between 1966 and 1990, the Singaporean economy grew a remarkable 8.5 percent per annum, three times as fast as the United States; per capita income grew at a 6.6 percent rate, roughly doubling every decade. This achievement seems to be a kind of economic miracle. But the miracle turns out to have been based on perspiration rather than inspiration: Singapore grew through a mobilization of resources that would have done Stalin proud. The employed share of the population surged from 27 to 51 percent. The educational standards of that work force were dramatically upgraded: while in 1966 more than half the workers had no formal education at all, by 1990 two-thirds had completed secondary education. Above all, the country had made an awesome investment in physical capital: investment as a share of output rose from 11 to more than 40 percent.
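As a quick check on the arithmetic: per capita income growing at 6.6 percent a year does indeed double roughly every decade:

```python
import math

# Singapore's per capita income grew 6.6%/yr over 1966-1990.
# How long does compounding at that rate take to double income?
g = 0.066
doubling_years = math.log(2) / math.log(1 + g)
print(f"Doubling time at {g:.1%}/yr: {doubling_years:.1f} years")  # ≈ 10.8 years
```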

Even without going through the formal exercise of growth accounting, these numbers should make it obvious that Singapore’s growth has been based largely on one-time changes in behavior that cannot be repeated. Over the past generation the percentage of people employed has almost doubled; it cannot double again. A half-educated work force has been replaced by one in which the bulk of workers has high school diplomas; it is unlikely that a generation from now most Singaporeans will have Ph.D.s. And an investment share of 40 percent is amazingly high by any standard; a share of 70 percent would be ridiculous. So one can immediately conclude that Singapore is unlikely to achieve future growth rates comparable to those of the past.

But it is only when one actually does the quantitative accounting that the astonishing result emerges: all of Singapore’s growth can be explained by increases in measured inputs. There is no sign at all of increased efficiency. In this sense, the growth of Lee Kuan Yew’s Singapore is an economic twin of the growth of Stalin’s Soviet Union: growth achieved purely through mobilization of resources. Of course, Singapore today is far more prosperous than the U.S.S.R. ever was – even at its peak in the Brezhnev years – because Singapore is closer to, though still below, the efficiency of Western economies. The point, however, is that Singapore’s economy has always been relatively efficient; it just used to be starved of capital and educated workers.

Singapore’s case is admittedly the most extreme. Other rapidly growing East Asian economies have not increased their labor force participation as much, made such dramatic improvements in educational levels, or raised investment rates quite as far. Nonetheless, the basic conclusion is the same: there is startlingly little evidence of improvements in efficiency. Kim and Lau conclude of the four Asian “tigers” that “the hypothesis that there has been no technical progress during the postwar period cannot be rejected for the four East Asian newly industrialized countries.” Young, more poetically, notes that once one allows for their rapid growth of inputs, the productivity performance of the “tigers” falls “from the heights of Olympus to the plains of Thessaly.”

2. At first, creating an index of all inputs may seem like comparing apples and oranges, that is, trying to add together noncomparable items like the hours a worker puts in and the cost of the new machine he uses. How does one determine the weights for the different components? The economists’ answer is to use market returns. If the average worker earns $15 an hour, give each person-hour in the index a weight of $15; if a machine that costs $100,000 on average earns $10,000 in profits each year (a 10 percent rate of return), then give each such machine a weight of $10,000; and so on.
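The footnote's weighting rule is easy to sketch: each input enters the index at its market return. The input quantities below are made up for illustration; the $15 wage and $10,000 machine return are the footnote's own figures:

```python
def input_index(person_hours, machines, wage=15.0, machine_return=10_000.0):
    """Aggregate input index: person-hours weighted at the wage,
    each machine weighted at its annual profit (its market return)."""
    return person_hours * wage + machines * machine_return

# Hypothetical economy: compare the index in a base year and a later year.
base = input_index(person_hours=1_000_000, machines=500)
later = input_index(person_hours=1_100_000, machines=600)
growth = later / base - 1
print(f"Input index growth: {growth:.1%}")  # → Input index growth: 12.5%
```

Output growth in excess of this 12.5 percent would count as total factor productivity growth; output growth at or below it is pure input accumulation.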

3. To see why, let’s consider a hypothetical example. To keep matters simple, let’s assume that the country has a stationary population and labor force, so that all increases in investment in machinery, etc., raise the amount of capital per worker in the country. Let us finally make up some arbitrary numbers. Specifically, let us assume that initially each worker is equipped with $10,000 worth of equipment; that each worker produces goods and services worth $10,000; and that capital initially earns a 40 percent rate of return, that is, each $10,000 of machinery earns annual profits of $4,000.

Suppose, now, that this country consistently invests 20 percent of its output, that is, uses 20 percent of its income to add to its capital stock. How rapidly will the economy grow?

Initially, very fast indeed. In the first year, the capital stock per worker will rise by 20 percent of $10,000, that is, by $2,000. At a 40 percent rate of return, that will increase output by $800: an 8 percent rate of growth.

But this high rate of growth will not be sustainable. Consider the situation of the economy by the time that capital per worker has doubled to $20,000. First, output per worker will not have increased in the same proportion, because capital stock is only one input. Even with the additions to capital stock up to that point achieving a 40 percent rate of return, output per worker will have increased only to $14,000. And the rate of return is also certain to decline – say to 30 or even 25 percent. (One bulldozer added to a construction project can make a huge difference to productivity. By the time a dozen are on-site, one more may not make that much difference.) The combination of those factors means that if the investment share of output is the same, the growth rate will sharply decline. Taking 20 percent of $14,000 gives us $2,800; at a 30 percent rate of return, this will raise output by only $840, that is, generate a growth rate of only 6 percent; at a 25 percent rate of return it will generate a growth rate of only 5 percent. As capital continues to accumulate, the rate of return and hence the rate of growth will continue to decline.
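The footnote's arithmetic can be reproduced in a short sketch, using the same numbers as above:

```python
def growth_rate(output, capital_return, investment_share=0.20):
    """One period of the footnote's arithmetic: investment adds
    investment_share * output to the capital stock; the new capital
    earns capital_return; growth = added output / current output."""
    added_capital = investment_share * output
    added_output = capital_return * added_capital
    return added_output / output

# Year 1: output $10,000 per worker, capital earning a 40% return.
print(f"{growth_rate(10_000, 0.40):.0%}")  # → 8%
# After capital per worker doubles: output $14,000, return fallen to 30%...
print(f"{growth_rate(14_000, 0.30):.0%}")  # → 6%
# ...or to 25%.
print(f"{growth_rate(14_000, 0.25):.0%}")  # → 5%
```

Note that the growth rate collapses to investment share times rate of return, so with a fixed investment share the growth rate must fall one-for-one with the return on capital – the diminishing-returns point in miniature.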

12. Technology’s Revenge

But the past 20 years have not been good ones for ordinary workers. Even as the earnings of many college-educated workers soared in the United States, young men without college degrees have seen their real wages drop by 20 percent or more – this in spite of productivity growth which, while disappointing, nonetheless allowed the average American worker to produce about 25 percent more in 1993 than in 1973. In Europe, the growth of wage inequality has been less dramatic, but there has been a steady, seemingly inexorable rise in unemployment, from less than three percent in 1973 to more than 11 percent today (versus six percent in the United States). […]

Most people who read intellectual magazines or watch public television know why this is happening. Growing international competition, especially from low-wage countries, is destroying the good manufacturing jobs that used to be the backbone of the working class. Unfortunately, what these people “know” happens to be flatly untrue. The real reason for rising wage inequality is subtler: Technological change since 1970 has increased the premium paid to highly skilled workers, from data processing specialists to physicians. The big question, of course, is whether this trend will continue. […]

Economists use the word “technology” somewhat differently from normal people. Webster’s defines technology as “applied science,” which is pretty much the normal usage. When economists speak of technological change, however, they mean any kind of change in the relationship between inputs and outputs. If, for example, a manufacturer discovers that “empowering” workers by giving them a voice in how the factory is run improves quality – and allows the plant to employ fewer supervisors – then in the economic sense this would be an improvement in the technology, one that is biased against employment of managers. If, however, a manufacturer discovers that workers will produce more when there are many supervisors constantly checking on them, this is also a technological improvement, albeit one biased toward employment of managers.

In this economist’s sense, it seems undeniable that over the past 20 years the advanced nations have experienced technological change that is strongly biased in favor of skilled workers. The evidence is straightforward. The wages of skilled workers, from technicians to corporate executives, have risen sharply relative to the wages of the less skilled. In 1979, a young man with a college degree and five years on the job earned only 30 percent more than one with similar experience and a high school degree; by 1989, the premium had jumped to 74 percent. If the technology of the economy had not changed, this sharp increase in the relative cost of skilled workers would have given employers a strong incentive to cut back and substitute less-skilled workers where they could. In fact, exactly the opposite happened: Across the board, employers raised the average skill level of their work forces.

It is hard not to conclude that this technologically driven shift in demand has been a key cause of the growth of earnings inequality in the United States as well as much of the rise in unemployment in Europe. It is not the only possible explanation. It could have been the case that rising demand for skilled workers was not so much the result of greater demand for skill within each industry as of a shift in the mix of industries toward those sectors that employ a high ratio of skilled to unskilled workers. That sort of shift could, for example, be the result of increased trade with labor-abundant Third World countries. But in fact the overwhelming evidence is that the demand for unskilled workers has fallen not because of a change in what we produce but because of a change in how we produce.

Is it really possible for technological progress to harm large numbers of people? It is and it has been. Economic historians confirm what readers of Charles Dickens already knew, that the unprecedented technological progress of the Industrial Revolution took a long time to be reflected in higher real wages for most workers. Why? A likely answer is that early industrial technology was not only labor saving but strongly capital using – that is, the new technology encouraged industrialists to use less labor and to invest more capital to produce a given amount of output. The result was a fall in the demand for labor that kept real wages stagnant for perhaps 50 years, even as the incomes of England’s propertied classes soared. […]

Probably the simplest story about how modern technology may promote inequality is that the rapid spread of computers favors those who possess the knowledge needed to use them effectively. Anecdotes are easy to offer. Economist Jagdish Bhagwati cites the “computer with a single skilled operator that replaces half a dozen unskilled typists.” Anecdotes are no substitute for real quantitative evidence, but for what it is worth, serious studies by labor economists do suggest that growing computer use can explain as much as one-half of the increase in the earnings edge enjoyed by college graduates during the 1980s.

Yet there is probably more to the story. The professions that have seen the largest increases in incomes since the 1970s have been in fields whose practitioners are not obviously placed in greater demand by computers: lawyers, doctors, and, above all, corporate executives. And the growth of inequality in the United States has a striking “fractal” quality: Widening gaps between education levels and professions are mirrored by increased inequality of earnings within professions. Lawyers make much more compared with janitors than they did 15 years ago, but the best-paid lawyers also make much more compared with the average lawyer. Again, this is hard to reconcile with a simple story in which new computers require people who know how to use them.

One intriguing hypothesis about the relationship between technology and income distribution, a hypothesis that can explain why people who do not operate computers or fax machines can nonetheless be enriched by them at the expense of others, is the “superstar” hypothesis of Sherwin Rosen, an economist at the University of Chicago. Almost 15 years ago, before the explosion of inequality had become apparent, Rosen argued in the Journal of Political Economy that communication and information technology extend an individual’s span of influence and control. A performance by a stage actor can be watched by only a few hundred people, while one by a television star can be watched by tens of millions. Less obviously, an executive, a lawyer, or even an entrepreneurial academic can use computers, faxes, and electronic mail to keep a finger in far more pies than used to be possible. As a result, Rosen predicted, the wage structure would increasingly come to have a “tournament” quality: A few people, those judged by whatever criteria to be the best, would receive huge financial rewards, while those who were merely competent would receive little. The point of Rosen’s analysis was that technology may not so much directly substitute for workers as multiply the power of particular individuals, allowing these lucky tournament winners to substitute for large numbers of the less fortunate. Television does not take the place of hundreds of struggling standup nightclub comedians; it allows Jay Leno to take their place instead. […]

So here is a speculation: The time may come when most tax lawyers are replaced by expert systems software, but human beings are still needed – and well paid – for such truly difficult occupations as gardening, house cleaning, and the thousands of other services that will receive an ever-growing share of our expenditure as mere consumer goods become steadily cheaper. The high-skill professions whose members have done so well during the last 20 years may turn out to be the modern counterpart of early-19th-century weavers, whose incomes soared after the mechanization of spinning, only to crash when the technological revolution reached their own craft.

I suspect, then, that the current era of growing inequality and the devaluation of ordinary work will turn out to be only a temporary phase. In some sufficiently long run the tables will be turned: Those uncommon skills that are rare because they are so unnatural will be largely taken over or made easy by computers, while machines will still be unable to do what every person can. In other words, I predict that the current age of inequality will give way to a golden age of equality. In the very long run, of course, the machines will be able to do everything we can. By that time, however, it will be their responsibility to take care of the problem.

[Attachments omitted: Pop Internationalism (Krugman 1996), Table 6.1, Figure 6.1, Table 6.2]

The Great Depression in Britain (1873-1896): the Myth that Deflation Lowers Economic Growth
https://menghublog.wordpress.com/2015/05/03/the-great-depression-in-britain-1873-1896-the-myth-that-deflation-lowers-economic-growth/
Sun, 03 May 2015 19:03:40 +0000

The problems of the Great Depression (1873-1896) have been falsely attributed to secular declines in prices. To begin with, it is not clear that there were any serious troubles during this period, and even if there were, they were of trivial importance. The idea that the 1870s-90s experienced lower growth rates than the 1850s-70s receives only weak support. Furthermore, the period of inflation (1897-1913) was, if anything, associated with lower growth, lower profits, lower real wages, higher unemployment, and higher interest rates. This undermines the Keynesian argument that mild inflation is better than mild deflation. The Great Depression period, whether taken as 1873-1896 or 1880-1896, disproves the idea that unemployment rises as the price level declines. The 1855-1872 period of stable prices did not necessarily show greater growth than the 1880-1896 period of deflation. The deflation of 1873-1896 probably did not influence growth; it was not a burden on investment, since interest rates declined along with price levels; it did not induce people to consume less; and it was associated not with worsening living conditions but with a rising standard of living. If Britain had difficulties during the Great Depression, they could be related to the lack of skilled workers and, more significantly, to the lack of major innovations and a decline in exports. The causes are generally not well understood, but at least two conclusions are well supported. First, the most important factor explaining slower growth is a fall in labour productivity. Second, the so-called problems of the Great Depression have nothing to do with declining prices.

An introduction to the theories about deflation

I also talked about it at length in my post on the so-called Great Depression in the U.S. economy of 1873-1896. I’ll summarize the ideas.

Selgin (1997) argued that deflation driven by productivity growth does not increase debt as a percentage of income, because the heavier nominal debt burden is matched by higher real income. Under deflation, consumers can buy more as prices decline, just as lenders gain from the greater real value of debt repayments. Where borrowers expect further price declines, they may be able to negotiate a somewhat lower nominal interest rate. One example of this arrangement is the U.S. economy of 1873-1896 (Higgs, 1971, pp. 97-99; Beckworth, 2007, Figure 4). Furthermore, allowing prices to decline, instead of enforcing price stability, reduces the need for downward wage adjustments, which are more likely to cause labor-market frictions and consequent labor misallocation than price declines are. The enforcement of price stability can itself cause business cycles. Nominal prices do not adjust sluggishly to productivity changes but almost immediately. For this reason, changes in the demand for real money balances driven by innovations to aggregate productivity are accommodated immediately by falling prices, well ahead of any possible monetary policy response. Monetary injections may thus put excess money in people’s hands and cause malinvestment, as predicted by the Austrian business cycle theory (ABCT), which has both theoretical and empirical relevance.
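The lower-nominal-rate point is just the Fisher relation between real and nominal interest rates. A minimal sketch with illustrative numbers of my own (a 4 percent real rate and 2 percent expected deflation):

```python
def nominal_rate(real_rate, expected_inflation):
    """Fisher relation in its exact form: (1 + i) = (1 + r)(1 + pi)."""
    return (1 + real_rate) * (1 + expected_inflation) - 1

# With a 4% real rate and 2% expected deflation (pi = -0.02), the
# negotiated nominal rate falls while the real rate stays unchanged.
i = nominal_rate(0.04, -0.02)
print(f"Nominal rate: {i:.2%}")  # → Nominal rate: 1.92%
```

The point is that expected deflation is priced into the nominal rate, so lenders are not handed a windfall at borrowers' expense.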

The evidence, moreover, does not support the view that productivity-driven deflation is harmful. Friedman & Schwartz (1963 [1971], p. 93) demonstrate that the forces making for economic growth over the course of several business cycles are largely independent of the secular trend in prices. A similar conclusion is reached by Ryska (2014). Indeed, the argument that deflation causes lower growth (just because it has coincided with a relatively more difficult period) does not have strong empirical grounds. In a study using VAR analyses on the U.S. and three European countries (U.K., Germany, France) over 1880-1913, Bordo et al. (2010, pp. 536-542) found that world money supply shocks have an impact on country-specific price levels, but that output was essentially driven by country-specific supply-side factors in the European countries (Figures 11, 14, 17). This implies that money was neutral there, whereas money was non-neutral in the U.S. (Figure 8) only because of the presence of crises, probably due to an unstable banking system (Beckworth, 2007). Overall, it cannot be concluded from these studies that deflation (arising from fluctuations in country-specific monetary gold stocks) causes slower economic growth. In their own words:

Our key finding is that the European economies were essentially classical in the sense that money was neutral and output was mainly supply driven. We find that the price level shock was dominated by money supply factors, which in turn were partially explained by gold shocks. Typically these shocks did not affect output, which was largely explained by supply shocks.

For example, their Figure 11 shows the decomposition of UK output. If the solid-and-dot line (“baseline+shock”) is close to the solid line (“actual”), then the variable has an impact on output; but if the solid-and-dot line is close to the dashed line (“baseline”), then it does not.

The money shock is the world price level originating from world money stock changes (p. 535); the supply shock covers domestic supply shocks, e.g., reflecting productivity advances (p. 536); the demand shock is the (residually defined) domestic money demand shock (p. 537). We see that the only variable of importance for output (real GDP) is the supply shock (i.e., country-specific supply-side factors). Figure 22 adds the UK gold stock, but the behavior is essentially identical: adding UK gold stock to the “baseline” or “baseline+shock” line does not help match the “actual” line for any of the three variables. Figure 20 adds world gold production, and here the “baseline” and “baseline+shock” lines move a little closer to the “actual” line in the case of the money shock and demand shock, which means that world gold production explains a small portion of UK real GDP. But the supply shock is still the dominant factor. The analysis uses aggregate rather than per capita real GDP, but their footnote 14 notes that using real GDP per capita does not change the results.
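The reading rule for these decomposition figures – a shock "matters" when adding it moves the counterfactual line closer to the actual series – can be sketched as a simple closeness comparison. The series below are made up to mimic the pattern in Figure 11, not Bordo et al.'s data:

```python
import math

def rmse(a, b):
    """Root-mean-square distance between two equal-length series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

def shock_matters(actual, baseline, baseline_plus_shock):
    """A shock helps explain output if adding it moves the
    counterfactual closer to the actual series than the baseline is."""
    return rmse(actual, baseline_plus_shock) < rmse(actual, baseline)

# Made-up series: supply shocks track actual output, money shocks do not.
actual      = [1.0, 1.40, 1.10, 1.80, 2.00]
baseline    = [1.0, 1.10, 1.20, 1.30, 1.40]
with_supply = [1.0, 1.38, 1.12, 1.75, 1.98]
with_money  = [1.0, 1.08, 1.22, 1.28, 1.38]

print(shock_matters(actual, baseline, with_supply))  # → True
print(shock_matters(actual, baseline, with_money))   # → False
```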

The above analysis can be compared with Capie et al.’s (1991, pp. 275-276, Figure 9.7a) results from vector ARMA modeling, which show that, in Great Britain between 1870 and 1913, real output (aggregate, not per capita) responded negatively to a shock in M0 and to a shock in prices, but did not respond to a shock in M3. Price levels were positively affected by a shock to M0 but not at all by a shock to M3 (see fn. 8).

The Great Depression of 1873-1896 in Britain

Deflation

Given Bordo et al.’s (2010) research, the idea that deflation had any causal relationship to output in Britain must be rejected. But we have yet to go into the details and understand what really happened.

Capie et al. (1991, p. 253) and Capie & Wood (1997, pp. 287-288) showed that the money stock (on a broad definition including coin and bank deposits) grew by 1.3% a year from 1873 to 1896, and 2% a year from 1896 to 1913. Over the downswing as a whole the money stock grew by 33% and real output rose by just over 53%, while, over 1896-1913, the money stock grew by 40% and output by 36%. For 1873-1896 (or 1896-1913), they indicate that wholesale prices fell by 39% (rose by 40%) but Feinstein’s GNP deflator fell by 20% (rose by 17.6%). Capie & Wood (1997, pp. 287-288) then argue that such price declines (1873-1896) and price increases (1896-1913) would be immaterial if the demand for money were not stable relative to income. They say that numerous studies report that the income elasticity of money demand (i.e., the rate of response of the quantity of money demanded to a change in income) is always close to unity. This means that the demand for money increases in the same proportion as the rise in income, which in turn implies that the demand for money was stable relative to income. For the UK, specifically over 1871-1913, Turner (1991, Tables 10.4 & 10.5) found an income elasticity of demand of 0.896 in a multiple regression with M3 as the dependent variable and GNP as one of the independent variables (along with the bond rate and deposit rate). When M3 and non-bank financial institutions (NBFI) deposits are combined into a total quasi-money variable, the income (GNP) elasticity of money demand is 0.873. Additionally, Bordo & Schwartz (1981, Table 1, row 13) demonstrate that the real income elasticity of the demand for money shows only a little change between 1880-1896 and 1897-1913, in both the US and the UK; this means that money demand was not affected by the trend in price levels.
Capie & Wood (1997) said that in Britain (and also the United States) the trend growth rate of money depended closely on the trend growth rate of the monetary base, which was gold. This view has been confirmed by Bordo et al. (2010).
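In a log-linear money demand regression of the kind Turner runs, the income elasticity is simply the coefficient on log income. A sketch of the specification (in standard textbook notation, not necessarily Turner’s own):

```latex
\ln M_t = \alpha + \eta \,\ln Y_t + \beta_1\, i^{bond}_t + \beta_2\, i^{dep}_t + \varepsilon_t,
\qquad
\eta = \frac{\partial \ln M_t}{\partial \ln Y_t}
```

where \(M_t\) is M3, \(Y_t\) is GNP, and \(i^{bond}_t\), \(i^{dep}_t\) are the bond and deposit rates. An estimate of \(\eta\) close to 1 (0.896 here) means money holdings rise roughly in proportion to income, i.e., a demand for money that is stable relative to income.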

What is usually misunderstood by people who argue that this period of deflation caused many problems is that the economists (e.g., Selgin, 1997) who propose to allow prices to decline in the face of productivity growth are arguing for a productivity-driven deflation. But things were somewhat different in Britain, where the lack of major innovation during that period was obvious (Richardson, 1965, p. 141; Crafts, 1995b, pp. 756-759). First, the higher level of consumption was influenced heavily by the favorable terms of trade, which made import prices relatively cheaper (Musson, 1959, p. 217; Saul, 1969, pp. 19, 32, 38). And, according to Solomou & Catao (2000, p. 372), the declines (and increases) in import prices during the deflationary (and inflationary) periods of 1879-1913 are partly due to nominal exchange rate appreciation (and depreciation). Second, it is beyond dispute that the decline in labour productivity was not accompanied by a decline in labour cost (Musson, 1959, p. 218; Coppock, 1961, p. 230). There were many proposed causes of this decline. One is that labour was used less efficiently or that the gains from changes in technique and organisation fell off sharply, either of which would lead to a declining rate of profit (Coppock, 1961, p. 229). Another is the declining quality of natural resources (Crafts & Mills, 2004, p. 170). In any case, the sticky nature of wages should not be held responsible for this. One has to understand beforehand the reason for the decline in labour productivity. Yet such a decline argues against the idea that the period of declining prices (1873-1896), compared to the earlier period of stable prices (1855-1873), was brought about by productivity gains alone.
If the deflation had been productivity-driven, labour productivity growth would have increased, instead of declining, thanks to increased productive efficiency (through education and better skills, methods of production, a greater quantity of capital goods, etc.). Third, as Selgin (1997, pp. 29-31) demonstrated, prices would decline along with costs, thus leaving profits unchanged. And yet, the period of the Great Depression saw a decline in profits (Saul, 1969, p. 42). Selgin also argued that a productivity norm, a regime under which price levels are allowed to vary to reflect changes in goods’ unit costs of production (Selgin, 1995a, pp. 736-737; Selgin, 1997, pp. 26-27), would minimize changes in nominal wages, whereas a price-stability regime would induce greater changes in nominal wages.

And transport costs are not a good explanation for the price declines. Although Coppock (1961, p. 210) said that transport costs declined dramatically at that time, Coppock (1961, p. 211) also said that transport costs account for only about 1/6 (roughly 17%) of the fall in import prices from 1872-3 to 1895-9. But costs overall probably have an influential role. Coppock (1961, pp. 212-213) cited Brown & Ozga’s study showing some evidence that prices between 1830 and 1950 were driven by industrial demand (which is better termed “industrial needs” for making products, and should not be confused with consumer demand) rather than by industrial supply. These authors argue that when industrial demand (or capacity) grows more slowly than the supply of products, prices would fall because of a downward cost-push on prices. Both costs and prices are lowered. The reverse would hold if supply grows more slowly than industrial demand, in which case the rise in the cost of raw materials pushes up the prices of final products. Both costs and prices are greater. This was also Capie et al.’s (1991, pp. 257-258) interpretation. This view of cost-push prices opposes the modern quantity theory, which posits that inflation is solely a monetary effect, but it has been challenged by Bordo & Schwartz (1981).

Bordo & Schwartz (1981, Tables 2 & 3) analysed some of the proposed causes of the price changes in the United Kingdom between 1880 and 1913, notably the view that price levels were due to real factors (e.g., production costs) rather than monetary factors. The hypothesis they are challenging is the one positing that the price of wheat, having an important role both as an input and as final output, had the dominant role in the fluctuations of the overall price level. One particular feature of the analysis is that they combined the UK and USA, on the grounds that “under the classical gold standard it would be incorrect to treat each country as if it were a closed economy since each country in such a monetary system must be viewed as an open economy with its money supply tied to the world price level through its balance of payments”. They conducted a regression analysis (after first differencing the variables), with (log) price levels as the dependent variable and the (log) money/output ratio and the (log) terms of trade (i.e., relative prices) between agricultural (primary) and industrial (manufactured) products as independent variables. When each of these variables is entered individually, the R² amounts to, respectively, 0.241 and 0.042. When they are entered together, the R² is 0.349. The R² is not an effect size, and one should rely on the correlation, i.e., the square root of R², which amounts to 0.205 for the terms of trade. This is a small/modest effect. On the other hand, the money/output ratio was truly important, i.e., price levels were related to money stock changes relative to real output. They then test Rostow and Lewis’ hypothesis (which has theoretical problems according to Bordo & Schwartz, 1981, p. 116) that wheat prices were a key cause of the price movements in primary products.
They use 5 separate regressions, with either Irish potatoes, cotton, tobacco, corn or refined sugar as the dependent variable, and the (log) money/output ratio and (log) prices of wheat flour (or wheat for grain) as independent variables. Only when the dependent variable is cotton or sugar is the elasticity for wheat flour or wheat for grain of some importance (as seen by the unstandardized coefficient β’2). So, the Rostow-Lewis hypothesis of cost-push prices was rejected. Even so, that does not concern the point made by Brown & Ozga regarding reduced production costs.
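The step from R² to correlation in the terms-of-trade case is just a square root; a minimal arithmetic check (the R² values are those reported by Bordo & Schwartz):

```python
import math

# R-squared values reported by Bordo & Schwartz (1981) for the price-level regression
r2_money_output = 0.241    # (log) money/output ratio entered alone
r2_terms_of_trade = 0.042  # (log) terms of trade entered alone

# For a bivariate regression, the correlation is the square root of R-squared
r_money_output = math.sqrt(r2_money_output)
r_terms_of_trade = math.sqrt(r2_terms_of_trade)

print(round(r_money_output, 3))    # → 0.491
print(round(r_terms_of_trade, 3))  # → 0.205
```

The 0.205 figure for the terms of trade is exactly the “small/modest effect” quoted above.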

If one wants to explain why Britain lagged behind the U.S. and Germany, deflation can hardly be a relevant factor. Bordo et al. (2010) reported that prices fell by 22%, 10% and 6% in the U.S., U.K., and Germany, respectively. Not only was U.S. growth stronger, but the U.S. also experienced a series of banking crises, unlike the U.K., which would certainly have hurt economic growth (Capie & Mills, 1991).

Even if the said period was not one of prosperity (due to repeated boom-bust cycles), it was still a good period. Even so, people didn’t feel good about the idea of deflation, and this may be due to the fact that their nominal wages were being reduced (Musson, 1959, p. 201).

GDP (per capita)

The data on GDP growth is taken from Maddison’s (2003, pp. 59, 61) figures, based on Feinstein (1972) for the years 1855-1960. At first glance, it would seem that the period before 1873 is associated with stronger growth. Using Stata and the above numbers, I plot the following graph:

The spreadsheet is available here. I averaged the numbers for the following periods: 1855-1872, 1873-1896, 1880-1896, 1897-1913. The respective growth rate averages are 1.38%, 1.06%, 1.43%, 0.89%. Clearly, the period of inflation after 1896 was the worst. The period of 1855-1872 does not necessarily appear to have been better than the period of 1880-1896. A look at the figures given by Floud (1981, p. 7) allows us to reach the same conclusion:

Similarly, the numbers on real growth rates of industrial production per head show that the inflationary period didn’t do better, and probably did worse, than the deflationary period (Saul, 1969, p. 37).
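For readers who want to reproduce such period averages, the compound average annual growth rate between two index levels is computed as follows. The levels below are hypothetical, for illustration only, not Maddison’s actual figures:

```python
def avg_annual_growth(start_level, end_level, years):
    """Compound average annual growth rate (as a fraction) between two index levels."""
    return (end_level / start_level) ** (1.0 / years) - 1.0

# Hypothetical GDP-per-capita index levels over a 23-year span (e.g., 1873-1896)
growth = avg_annual_growth(100.0, 127.5, 23)
print(round(growth * 100, 2))  # → 1.06 (% per year)
```

Averaging annual growth rates (as in the spreadsheet) and compounding between endpoints give slightly different numbers; both are common conventions.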

Something seems to be happening at the end of the 19th century and, indeed, Musson (1963) and Coppock (1964, p. 390) revealed that there was no economic growth, after excluding building, in the inflationary period of 1896-1913. We can confirm this impression from Feinstein’s (1990) Table 2, which is also based on Feinstein (1972):

Concerning Figure 1, Feinstein (1990, p. 351) concluded that the GDP estimates based on income data are more reliable, because many indicators of industrial output are estimates of raw material inputs, and many indicators for the output of services are based simply on linear interpolation between decennial census figures for the numbers employed in those services, in some cases with the addition of an assumed steady increase in productivity. The latter, in particular, underestimates cyclical variations. It is clear from the income-based estimates that growth became stronger, and the deceleration more serious, at the end of the 19th century. Concerning Tables 2-3 shown above, all GDP series agree on one thing: the period of 1899-1913 was the worst by far. The estimates of GDP per worker between 1856-73 and 1882-99 seem very close when based on income data and expenditure data. In all three series, we don’t see a dangerous change of pace occurring in the Great Depression. The significant change really happened after the period of the Great Depression. Feinstein (1990, p. 340) splits the periods not at 1896-7 but at 1899 because it is after that year that the growth of real wages was decelerating, an event associated with slower growth and the turnaround in the terms of trade, which began to be unfavorable to Britain (Saul, 1969, p. 29).

There is more to say. Crafts et al. (1989a, Figure 1) criticized earlier studies for using inappropriate models. They say that the linear regression model commonly used takes the form Yt = α + βt + ut, with α the intercept, β the slope, and u an error term assumed to be a stationary process (i.e., its mean, variance and autocovariances are finite and constant through time). This model, which accounts for nonstationary behaviour in Y by the deterministic trend α + βt, leaving the cyclical component ut as a stationary series of deviations around this trend function, corresponds to what Nelson & Plosser (1982) call TSP, or trend-stationary processes. Another model takes the form Yt = Yt-1 + β + ɛt, where Yt-1 is the lagged value of the dependent variable, ɛt is stationary but not necessarily serially uncorrelated, with mean zero and constant variance, and β is the fixed mean of the differences, the growth rate in the present context. This model corresponds to DSP, or difference-stationary processes. The distinction is important because “If GDP is of the TSP class, then all variation in the series is attributable to changes in the cyclical component, whereas if GDP is a DSP its trend component must be a nonstationary stochastic process rather than a deterministic trend, so that an innovation of GDP has an enduring effect on the future path of the series. Thus treating GDP as a TSP rather than as a DSP is likely to lead to an overstatement of the magnitude and duration of the cyclical component and to an understatement of the importance and persistence of the trend component.” (pp. 109-110).
Crafts et al.’s trend-plus-cycle model takes the form Yt = αt + βtt + ut, where αt = αt-1 + at, βt = βt-1 + bt, and ut = p1ut-1 + p2ut-2 + et (the error process u seems to be adequately modelled by an AR(2) process, which allows a cyclical component to be incorporated; by definition, an AR(1) process is a first-order process, meaning that the current value depends on the immediately preceding value, while an AR(2) process has the current value depending on the previous two values; see Investopedia), with at, bt, et being zero-mean, serially uncorrelated, and individually independent processes. This model thus allows the slope and intercept (growth rate) to drift continuously from period to period by way of random walks. According to them, “Such a formulation avoids forcing the slope and intercept coefficients to change at discrete points in time; rather they are allowed to vary sequentially in a manner that has been found to provide a sensible explanation for the evolution of many economic time-series.” (p. 111). The reason for using this approach is, they say, that “it is generally agreed that the actual rate of growth was subject to fluctuations stemming from short-term shocks to aggregate demand” and that “It is therefore important that changes in the unobservable, but estimatable, trend rate of growth are the result of appropriate time-series decomposition procedures rather than the outcome of an arbitrary division of the time-series into cyclical and trend components” (Crafts, 1989a, p. 105). Figure 1 below shows the time path of β over time. The GDP (aggregate, not per capita) growth rate throughout the entire period of 1855-1915 barely changes at all. That is, there is no slowdown of growth rates after 1873 or 1899. The strongest change occurred after 1899, with the growth rate per year down by only 0.1%.
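The TSP/DSP distinction is easy to see in simulation; a minimal sketch (parameter values are arbitrary, chosen only to illustrate the two processes, not estimated from the historical data):

```python
import random

random.seed(1)
T = 200
alpha, beta, sigma = 10.0, 0.015, 0.03  # arbitrary illustrative values

# TSP: Y_t = alpha + beta*t + u_t, with u_t stationary (white noise here);
# deviations from the deterministic trend never accumulate
tsp = [alpha + beta * t + random.gauss(0.0, sigma) for t in range(T)]

# DSP: Y_t = Y_{t-1} + beta + e_t, a random walk with drift;
# every shock e_t shifts the entire future path of the series
dsp = [alpha]
for _ in range(1, T):
    dsp.append(dsp[-1] + beta + random.gauss(0.0, sigma))

# TSP deviations from trend stay bounded; DSP deviations can wander without limit
tsp_dev = max(abs(y - (alpha + beta * t)) for t, y in enumerate(tsp))
print(len(tsp), len(dsp), tsp_dev < 1.0)
```

Detrending the DSP series with a fitted line (as the criticized studies did) would mistake the wandering trend for a large, persistent “cycle”, which is exactly the overstatement Crafts et al. warn about.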

Crafts et al. (1989b) say, once more, that “it is common practice to regard the trend as a deterministic function of time and the cyclical component as a stationary process that exhibits stochastic movement around the trend” (p. 47). Indeed, the general practice of trend-cycle analysis in the literature on British economic history of the industrialization era has been to detrend series using a 9-year moving average and to investigate cycles by looking at deviations from this moving average. This approach should be avoided. Crafts et al. (1989b) replicated the above trend-plus-cycle model (which decomposes trend and cycle components) using Hoffman’s industrial production series, which covers the whole of the period 1700-1913. Those data have been criticized for applying inappropriate weights to some sectors of activity for certain years. The authors thus applied the corrections of Harley (1982) and Lewis (1978) to Hoffman’s data. Figure 4 below shows the trend growth component, using their preferred estimates (Q3), which adopt both Lewis’s and Harley’s corrections. It is clearly visible that the decline in growth started during the 1850s-1860s or so. In fact, the decline in growth rates is slower during the 1870s-90s than during the 1850s-60s, and there is no change between 1880 and 1896. Whatever events lowered the trend growth, they seem to be unrelated to deflation or the Great Depression.

Greasley (1992) argued that a proper test of the climacteric hypothesis must involve a cointegration test between GDP and factor input growth series. Two measures of GDP are used: an income-side (y) series from Greasley (1989) and a compromise (i.e., aggregated) series based on Feinstein’s (1972) output and expenditure data and Greasley’s (1989) income data. He concluded that “The findings suggest that British GDP series are integrated of order one, and hence that the rate of GDP growth reverted to a constant mean rate in the period 1856-1913, which militates against the climacteric. The cointegration results also show long-term convergence of GDP and factor input growth, which counters the view that the years to 1914 represent the productivity downswing phase within the longer ‘U’-shaped pattern of British productivity performance.” (p. 207). This implies that GDP growth stayed constant over time and that one can predict GDP growth by knowing factor input growth, and vice versa, because the two variables move together.
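Greasley’s cointegration logic can be sketched on simulated data; this is a hedged illustration of the Engle-Granger two-step idea on artificial series, not his actual procedure or data:

```python
import numpy as np

rng = np.random.default_rng(0)
T = 500

# Stand-in for factor inputs: an I(1) series (random walk, nonstationary)
x = np.cumsum(rng.normal(0.0, 1.0, T))

# Stand-in for GDP sharing the same stochastic trend plus a stationary
# disturbance: by construction the two series are cointegrated
y = 2.0 * x + rng.normal(0.0, 1.0, T)

# Engle-Granger step 1: OLS of y on x
slope, intercept = np.polyfit(x, y, 1)
resid = y - (slope * x + intercept)

# Step 2 (informal): under cointegration the residual is stationary, so its
# spread stays small relative to the wandering of the I(1) series itself
print(resid.std() < x.std())  # expected: True
```

In practice the second step uses a unit-root test on the residuals (e.g., augmented Dickey-Fuller) rather than this informal comparison of spreads.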

Whether we side with Feinstein or with Crafts and Greasley, it can only be concluded that deflation has not been associated with slower growth.

These authors argue that this can be easily understood by the fact that the expectation of deflation (when there are such expectations) will reduce long-term nominal interest rates. An analysis from ARMA modeling confirmed that prices do not respond to shocks in interest rates (Capie et al., 1991, Figures 9.6c & 9.7c). But interest rates responded positively to shocks in prices (Capie et al., 1991, Figures 9.6d & 9.7d). This strongly suggests that variations in interest rates were caused by expectations about prices, and not the other way around. This is surely what happened in the deflationary U.S. economy of 1873-1896, and in the deflationary British economy of 1873-1896 as well. There was no debt burden, since Harley (1977, pp. 79-82) concluded that nominal interest rates moved in line with prices so that real rates of interest remained stable between 1873 and 1912, although it should be noted that external factors (e.g., gold discoveries in the mid-1880s) influenced the changes in interest rates. On average, the real interest rate was about 0.25% higher during the deflationary period than in the inflationary period.
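Harley’s finding, that nominal rates moved with prices while real rates stayed put, is the Fisher relation at work; in standard notation (not Harley’s own):

```latex
i_t \;\approx\; r_t + \pi^{e}_t
```

where \(i_t\) is the nominal interest rate, \(r_t\) the real rate, and \(\pi^{e}_t\) expected inflation. Under anticipated deflation (\(\pi^{e}_t < 0\)) the nominal rate falls roughly one-for-one while the real rate, and thus the real burden of borrowing, is left about unchanged, which is why the stability of real rates over 1873-1912 tells against a deflationary debt burden.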

The positive correlation between interest rates and the price level, known as the Gibson Paradox, is not a historical accident. Dowd & Harrison (2000) analyzed the U.K. over the 1821-1913 period. Using 4 price series, they tested the cointegration of the price and interest rate series. The long-run equilibrium relationship holds for 3 price series. They conclude that the Gibson Paradox is not an artifact resulting from the financial effects of war. Chadha & Perlman (2014) confirmed this long-term relationship in the U.K. for an even longer period (1702-1913).

One other argument used by Keynesians is that of idle resources, criticized by Hutt (1963 [2011], pp. 24-29, 54, 100), resulting from a lack of investment due to a depressed environment caused by a lack of consumption and gloomy expectations of profits. Ashworth (1966, p. 19) argues that there was a mis-use or under-utilization of capital. Production was becoming more capital intensive. The length of the working day and the limited use of shift working, which may be the result of the increased unemployment rate since the end of the boom in 1873, seem to explain the under-utilization of productive equipment. Another source of idle resources is building. Ashworth (1966, p. 21) also informs us that too many buildings were produced that didn’t contribute to economic growth. Most of the time, these buildings were used only during the appropriate season of the year. Ashworth writes:

An analogous (though lesser) influence appears in the disproportionate growth of residential towns and holiday resorts. This went on all through the nineteenth century, but it was not until fairly late that these towns were collectively big enough to have more than a negligible effect on the nation’s economy. But the increase in their size and in the number of holidaymakers involved them in capital investment and labour recruitment to deal with a short seasonal peak, with the consequence that resources were seriously under-utilized for the greater part of their existence.

Ashworth (1966, p. 19) remarked that the dormitory suburbs were prominent in the seventies, but their growth in the eighties and nineties was much greater in absolute terms. This shows, in any case, that the under-utilization of buildings has nothing to do with falling profits or deflation or gloomy expectations. It is due to an increasing amount of bad investments.

On a related note, Ashworth (1965, p. 72) mentions that the capital/output ratio (which assumes homogeneity of capital, which makes no sense from the Cambridge economists’ and Austrian economists’ point of view) showed a decline during the Great Depression. But he says that “Fuller utilization of existing capital, common in the later stages of a long period of heavy developmental investment such as occurred in Britain in the mid-nineteenth century, could bring down the average ratio. Changes in the type and purpose of new investment, inevitable in an industrial society, could bring lower incremental ratios.” (p. 72).

Saul (1969, p. 42) takes for granted that if profits were to fall, due, e.g., to an increase in the share of wages in the sum of wages plus profits (Saul, 1969, p. 33), industries would be less inclined to launch productive investment, profits being a major source of industrial finance. The first difficulty is theoretical. Hayek (1939) demonstrates that higher profits tend to favor investment that is less time-consuming (less capital-intensive); the consequence is a decline in profits among capital goods industries, which causes a shortening of the structure of production. Weber (2009) showed that the U.S. economy grew along with a lengthening of the production structure. The second difficulty is empirical. The period after 1896, when deflation was replaced by inflation, shows no increase in profits compared to the 1873-1896 period. Saul gives the following numbers, taken from Feinstein:

If anything, profits in this later period (the Edwardian era) of inflation were lower, not larger. The percentage corresponding to the period of 1870-74 is the highest due to the large railway boom, biasing upward the average share of profit in national income for the Middle Victorian Era. Now, the last period not yet considered is the one before 1873, e.g., the 1850s-1870s, the Middle Victorian Era. When looking at this entire period, one could perceive there was inflation. But Saul (1969, p. 13) correctly argues that this is wrong. There is a sharp increase in prices at the beginning of the period (1852-3) but nothing more after this. The 1855-1872 period is better characterized as one of price stability, as was noted by Coppock (1961, p. 224).

Until now, we have assumed that profits caused investment in the British case. Pesmazoglu (1951) re-analyzed Tinbergen’s multiple regression analysis, which had concluded that investment is highly influenced by profits. Unlike Tinbergen, Pesmazoglu (1951, pp. 53-56, 59-61) applied first differences to the variables. He established that current and past profits did not influence British home investment between 1870 and 1913, and found that variations in the prices of investment goods and in the long-term rate of interest did not have an important influence on fluctuations of British home investment. He suspects that variations in income and activity in the primary producing areas which were borrowers from the U.K. probably considerably affected business expectations at home and, thus, home investment.

One dissenting view is from Kennedy (1974, pp. 425-426, 429), according to whom British wealth-holders had an aversion to risk and favored lower-risk investments (thus with lower yields), which promoted slower growth. That risk aversion reduced capital formation (by raising its cost) compared to the U.S. and Germany (Kennedy, 1974, p. 434). It has been proposed that the reason has to do with imperfect information pertaining to the capital market, which made foreign investment unattractive due to higher risk (Kennedy, 1982, p. 112). British firms were more familiar with British engineering equipment than were foreigners. As a result, the existing foreign investment was skewed towards customers with a lower propensity to use British capital equipment (Kennedy, 1974, pp. 437-438). Even if we grant this point, it could also be that the cause of low risk-taking is related to the small size of many British firms and traditions of self-finance (Saul, 1969, p. 41). However, as argued above, the lower capital formation was not inherent to the Great Depression but prevailed even before and after (Saul, 1969, p. 41):

These numbers were taken from Kuznets (1961, Table 3). In terms of net national capital formation to net national product, the ratios for the three periods were, respectively, 10.6, 10.6, 11.8. This again provides no good evidence for the idea that deflation was associated with lower investment compared to the earlier period. Coppock (1961, p. 228, fn. 2) shows that the ratio of total investment (the sum of home and foreign investment) to national income declined modestly, by 1%, between 1856-74 and 1875-96. Still another source (Lenfant, 1951, Table III, pp. 160, 166-167; see also Musson, 1959, p. 211) reports rather stable investment rates. Between 1873 and 1896, the ratio of net total investment to net income oscillated between 8% and 11%, while the ratio of gross investment to gross income oscillated between 15% and 17%. For the inflationary period between 1897 and 1914, the numbers were 10%-12% and 16%-19%. Perhaps modestly higher.

Another version of Kennedy’s argument is that British banks were reluctant to do business in long-term loans. Kennedy (1987, p. 122) cited Jefferys:

But the shock of these failures in 1878 and the resulting turn toward timidity and amalgamation in banking, and the adoption of limited liability by industry, which lessened the demand for long term loans, brought to an end in the ‘eighties, this formative period of British banking. . . (1938:18)

The banks were by the ‘eighties no longer showing such a readiness to act as partners in industrial concerns. They were moving further and further away from the concept of long term loans and were concentrating on an efficient national short term credit system. (1939: 119)

So, according to Jefferys, there were two reasons. First is the series of banking failures which reached a peak in 1878, and second is the absence of unlimited liability on the part of the industries. Regarding 1878, it could well be that the crash of the boom of 1870-1873 (and with it, the rising unemployment after 1873) was the cause of these banking failures. If so, deflation could not be held responsible for these failures. Regarding liability, as argued by Saul (1969, p. 41), the small size of many firms could have been a serious hindrance. Aldcroft (1964, pp. 126, 131-132) and Ashworth (1966, p. 23) indeed remarked that British firms failed to generate economies of scale, which made it difficult to establish selling organizations and agencies for dealing with foreign markets. Two forces may be working to reinforce each other: the lack of finance prevented large-scale expansion, and the small size of the firms did not attract long-term banking loans. But this has nothing to do with the supposed lack of investment. Again, the entire argument is dislocated by the fact that capital formation stayed rather stable between 1855 and 1914 (for an explanation of capital formation, see Machlup, 1940 [2007], pp. 26-27). McCloskey & Sandberg (1971, p. 105) also come to the conclusion that there is no known evidence of under-investment in research, in the new industries, in marketing or in what they call the formation of cartels.

Overall, investments, whether at home or abroad, show a cyclical pattern of ups and downs. One plausible reason is the change in interest rates. Ford (1969, p. 110; 1981, p. 42) argues that the direct cost effects of the Bank Rate on domestic investment spending were probably weak; variations in the Bank Rate brought changes in short-term interest rates, to which investment spending was insensitive, but did not substantially influence longer-term rates, to which such spending might be more sensitive. Furthermore, the financing of home investment from undistributed profits (but see Pesmazoglu, 1951) and private loans rather than the Stock Exchange would lessen sensitivity to interest-rate changes and their associated cost effects. Even though Ford (1981, pp. 38-39) found a link between exports and loans, as British exports followed one or two years behind the fluctuations of British loans abroad, he suspects the relationship may also be driven by a third factor. For instance, rising economic activity could have caused both exports and loans. Yet, it seems that overseas issues precede exports and income, while the Bank Rate precedes overseas issues (Ford, 1981, p. 47).

In sum, there is no evidence that a continuous decline in prices would have brought about a continuous decline in profits and investment.

Unemployment

It could be argued that when prices fall, nominal wages don’t follow because of wage stickiness (workers are resistant to wage cuts). As argued before, in the case of productivity-driven deflation, there is no increase in costs associated with sticky wages. It is not real wages that matter but the gap between real wages and productivity. Instead of blaming sticky wages, one could have blamed the decline in labour productivity. But if one really wants to blame deflation for the apparent wage stickiness in Britain that supposedly caused the rise in unemployment, one needs to explain why the nominal wage in the U.S. between 1873 and 1879 fell by 16.11% (Newman, 2014, p. 495). One plausible reason for the difference could be the strong labour unions in Britain (Saul, 1969, pp. 32-33).

Another related complaint is that deflation tends to increase debt and, ultimately, unemployment. That may be true of a recession, which is brought about by excessive debt (or, equivalently, insufficient savings), but not in the case of productivity-induced deflation. The data does not even support such an idea.

Table 6 from Boyer & Hatton (2002) shows that unemployment increased between 1873 (2.8%) and 1879 (9.1%) but not thereafter, and yet wholesale prices continued to decline in a consistent way. This refutes the idea that continuous declines in prices cause continuous rises in unemployment. Looking at each subsequent year after 1879 until the early 1890s, a decline in unemployment was the rule, not the exception. Boyer & Hatton even said that the unemployment rate among unskilled laborers was below 10% in every year from 1870 to 1892 and above 10% in all but four years from 1893-1913. Worse is the fact that the average unemployment rate in the period of 1892-1913 was greater (6.2%) than during the period of 1870-1891 (5.4%) (see Table 7 below). Even 5% is not particularly high (but not low either).

All the other periods considered by Boyer & Hatton (2002) show that the British Great Depression has the lowest unemployment rate. These unemployment series are comparable to postwar unemployment series, thanks to adjustments for the differing sources of data used to make the unemployment estimates (Boyer & Hatton, 2002, p. 664). It is not reasonable to consider the 1946-1973 period, as its very low unemployment rate (coupled with high economic growth; Crafts, 1995a, Table 1) can be attributed to an economy recovering from war, the so-called Golden Age, as demonstrated by Vonyó (2008, p. 239). Crafts (1995a) quoted Abramovitz:

Indeed, Abramovitz has suggested ‘The post-World War II decades … proved to be the period when – exceptionally – the three elements required for rapid growth by catching-up came together. The elements were large technological gaps, enlarged social competence . . . and conditions favouring rapid realization of potential’.

Besides, Saul (1969, p. 31) shows that real wages went through a strong, steady increase over time. Unemployment rates do not seem to follow either of these trends. Feinstein (1990) concluded, in his revised estimates of the trend of real wages, that the slowdown in real incomes at the beginning of the 20th century could be attributed to slower economic growth, although Crafts & Mills (1994, p. 192) argue it should also be attributed to a rise in the cost of living relative to the GDP deflator (a measure of price inflation/deflation with respect to a specific base year) that occurred during the Edwardian period.

This period, we must remember, was marked by inflation. If we take the data at face value, we are tempted to conclude that deflation is more conducive to economic growth and employment. Feinstein's conclusion is consistent with McCloskey's (1970a, 1974, 1979), in that Victorian Britain did not fail. The failure was Edwardian.

Although one could argue from the above numbers that the period of 1855-1872 could have been accompanied by lower unemployment, the data reported by Musson (1959, p. 201) and Ford (1981, pp. 28-30, Figures 2.1-2.2) tell no such thing. For the periods 1855-1873, 1874-1900 and 1901-1913, Musson reported unemployment rates of 4.8%, 4.9% and 4.5%, respectively. Assuming we can really trust these numbers, it seems that prices (stability, deflation, inflation) do not affect unemployment very much. But Coppock (1964, p. 394) shows, among other things, that the period 1851-1866 is dominated by the engineering, metal and shipbuilding unions, which show higher unemployment rates than the other trades, although he gives no logical reason for removing them. He also said that the periods 1851-1887 and 1893-1914 were not comparable, although he provides little information on this point. Having made these corrections, he reports unemployment rates for 1851-1873, 1874-1895 and 1896-1914 of 5.0%, 7.2% and 5.4%, respectively. If we compare still other periods, e.g., 1867-1874, 1875-1883, 1884-1889, 1890-1899 and 1900-1907, the unemployment rates are 3.9%, 4.8%, 7.0%, 4.1% and 3.9%, respectively. What is extraordinary is that the growth rates for 1875-1883 and 1884-1889 were, respectively, 0.829% and 1.696% (see spreadsheet). Given Coppock's numbers, one would not have expected greater growth for 1884-1889. Besides, Coppock's numbers and those reported by Boyer & Hatton (2002) simply do not correspond. In any case, even accepting Coppock's numbers, he said that the causes of the high unemployment of 1884-1889 were a decelerating growth of exports and a decline in house-building. Wilson (1965, p. 186) explains that the figures Coppock quotes derive in large measure from exceptionally high unemployment in the heavy industries in one or two particular years (1879, 1885 and 1886 especially), and that it is equally noticeable that the unions with a sizeable or complete stake in the consumption industries (wood workers and, especially, printers and binders) contributed little to the swollen unemployment averages of the years of the Great Depression. This, then, refutes the idea that the higher unemployment was due to underconsumption. Wilson (1965, p. 186) also points to the unrepresentativeness of Musson's and Coppock's figures, on the ground that the labour force was 9.3 million in 1871 and 13.8 million thirty years later, while the total membership of the trade unions was only 1.9 million. But under- and over-representation of industries in the unemployment index is still another issue, and Boyer & Hatton (2002, pp. 647, 649) attempted to correct for such biases by applying appropriate weights and adjusting for, e.g., changes in the composition of the unemployment index. If more recent estimates are usually corrections to earlier ones, it is logical to believe that the most recent estimates are more reliable; for instance, Boyer & Hatton (2002, fn. 45) argued that the estimates from the Board of Trade (Feinstein) are inaccurate: “The method of weighting adopted by the Board of Trade causes textiles to be underrepresented in their index in 1894 and overrepresented in 1913.” (p. 657). Furthermore, their estimates are much more consistent with the pattern of growth rates than are Coppock's numbers. What seems fairly certain is that the inflationary period after the Great Depression saw a higher unemployment rate.
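Part of the disagreement between Coppock's averages and Boyer & Hatton's illustrates how sensitive such comparisons are to where the period boundaries are drawn. A toy example (with an entirely invented annual unemployment series, not the historical data) shows the same years yielding quite different sub-period averages under different partitions:

```python
# Invented annual unemployment series (percent) to show how the choice of
# period boundaries changes sub-period averages computed from the same data.
series = {1875: 4.0, 1876: 5.0, 1877: 7.0, 1878: 9.0,
          1879: 9.5, 1880: 6.0, 1881: 4.5, 1882: 3.5}

def period_mean(start, end):
    """Average unemployment over the inclusive year range [start, end]."""
    vals = [series[y] for y in range(start, end + 1)]
    return sum(vals) / len(vals)

# Splitting at 1878/1879 puts the worst year in the second sub-period...
print(period_mean(1875, 1878), period_mean(1879, 1882))  # 6.25 5.875
# ...while splitting at 1879/1880 keeps it in the first, reversing the picture.
print(period_mean(1875, 1879), period_mean(1880, 1882))  # 6.9 vs ~4.67
```

Moving the boundary by a single year flips which sub-period looks depressed, which is one reason period averages built on different partitions (1874-1895 vs 1875-1883/1884-1889, say) need not correspond.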

Interestingly, Saul (1969, pp. 11, 31-32) argued that a revival in house-building in the mid-1890s (1897-1900) may have helped to reduce unemployment. We see from Boyer & Hatton's Table 6 that unemployment declined between 1895 (7.3%) and 1900 (4.3%). Now, suppose, as we have seen, that the early 1870s were still undergoing an economic boom, specifically a railway boom (Musson, 1959, p. 215), but that the boom ended in 1873, as Saul (1969, pp. 13, 21, 25, 54) argued; then we may have our explanation for the sharp increase in unemployment from 1873 to 1879. As Austrian economists understood, if monetary over-expansion causes an artificial increase in investment (i.e., one not backed by corresponding savings), and this boom draws some unused resources (e.g., unemployed workers) into the market, the end of the boom means that unemployment tends to return to its previous, usual levels. The ABCT also predicts higher nominal wages during the boom, and this is confirmed by data showing a dramatic rise in nominal wages precisely in the years 1870-1873 (although data on wages in the railway-related sectors seem to be lacking), followed by a non-trivial decline in nominal wages between 1874 and 1879. The boom period (1870-1873) saw an increase in wholesale prices, which accompanied wages, although the ABCT makes no assumption about absolute prices (only relative prices).

The railway boom of 1870-1873 had severe repercussions on the U.S. economy (Newman, 2014, pp. 490-491) and probably on the U.K. as well, especially if Saul's words that “the boom to 1873 was unusually pronounced in several different ways” can be trusted. There are at least two reasons for this. First, it has been argued (Saul, 1969, pp. 32-33) that the workers had the ability to force their wages up above prices in the boom years, but also that they were able to maintain their wages even when unemployment was high. For this reason, nominal wages fell only slowly. The Austrian economists could then argue that without what is suspected to be a credit-induced boom, the harmful and sluggish downward adjustment in wages would have been avoided. Second, Saul (1969, p. 25) remarked that profit margins fell among British coal owners who were facing lower prices after their investment spree during the boom years of the early 1870s. Yet something more is probably needed to explain the long, sustained rise of unemployment from 1873 until 1879.

Another important factor explaining the trend in unemployment could be monetary policy. The Bank of England, in response to gold losses, restricted the credit supply, which caused a depression in investment and employment (Brown, 1965, p. 54). Indeed, Brown (1965, pp. 57-59) put his emphasis on the Bank of England holding a sufficient reserve ratio to prevent or end a boom, by raising the interest rate so as to attract gold and prevent it from flowing out. Interest rates were higher during the boom and lower during the slump, but it would be more accurate to say that the central bank raised interest rates at the end (not the beginning) of the boom, thereby putting an end to it (Ford, 1969, pp. 108, 111-112).

Brown (1965, p. 52) says that even when foreign lending varied over a wide range, there were only minor gold movements. A theory is needed to explain the simultaneity actually observed between boom and slump in different overseas investment areas. One candidate is that central bank policies were synchronized across countries (Ford, 1969, pp. 109-110). For instance, the central banks of France and Britain moved their discount rates in the same direction at the same time, although not to the same extent. Brown (1965, p. 54) noted:

At such times, the overseas booms which drew our capital abroad did not draw enough of our goods abroad to effect the transfer of our intended loans. Incipient gold-loss therefore forced the Bank of England to apply restrictive policies which stopped the loss, but only at the cost of depressed total investment and consequent unemployment.

Brown (1965, p. 51) cites a study showing that the interest rate correlates with capital exports (net foreign investments). This means that money became scarce at home because it was lent abroad. However, during the lending booms of 1872 and 1890, foreign lending (and the external drain of gold it would otherwise have caused) was largely offset by an additional rise in the demand for British exports (Brown, 1965, p. 58). This should have prevented the external drain of gold, the rise in the interest rate by the Bank of England, and thus the rise in unemployment.

In general, the argument that interest rates covary with unemployment rates is somewhat attenuated because, if interest rates (through their effect on the money supply) affect unemployment, one would also expect the money supply to affect economic growth. Capie & Mills (1991) conducted a vector autoregression (VAR) analysis showing a strong cyclical money-output relationship in the U.S. but only a weak one in the U.K., and they argue that the difference may be due to the fact that the U.S. experienced several banking crises, probably caused by a weak banking system with no branch banking (Beckworth, 2007). There were, however, no banking crises between 1870 and 1913 in the U.K.

Consumption

Generally, falling profits and rising wages are considered symptoms of a recession, but so is declining consumption. The latter does not apply here either. Falling prices did not induce people to consume less (Musson, 1959, pp. 199-200; Saul, 1969, p. 14). Musson describes it this way:

Prices certainly fell, but almost every other index of economic activity — output of coal and pig iron, tonnage of ships built, consumption of raw wool and cotton, import and export figures, shipping entries and clearances, railway freight and passenger traffic, bank deposits and clearances, joint-stock company formations, trading profits, consumption per head of wheat, meat, tea, beer, and tobacco — all these showed an upward trend. These facts were visible to observant contemporaries such as Giffen and Marshall, who, despite the loud complaints of falling prices and profits, of overproduction and unemployment, pointed out that the country was in no real sense depressed. … On the other hand, there was an overwhelming mass of opinion … that conditions were bad. The complaints were not, of course, continuous: the depression, we know, was not unbroken, the clouds periodically lifted, and the atmosphere brightened. There were, in fact, cyclical fluctuations, with booms reaching peaks in 1882 and 1890, and slumps descending to troughs in 1879, 1886, and 1893. But the booms were short-lived, the slumps prolonged, and business never really escaped from the atmosphere of uncertainty and depression.

… There is no doubt that, on the whole, the condition of the working classes improved during this period. Real wages rose considerably, there was a redistribution of the national income in favor of wage earners, pauperism declined, deposits in savings banks grew steadily, and consumption per head of foodstuffs, beer, tobacco, and similar products rose.

And Wilson (1965, p. 185) reported that the increase in imports, which certainly contributed to the decline in prices, pointed to a growth of miscellaneous wants among consumers. To support this view, Wilson says that the number of persons engaged in transport, commerce, art and amusement, and literary, scientific and educational functions rose between 1871 and 1881 from 947,000 to 1,387,000, or from 8.8% to 11.7% of the self-supporting population. It is hard to believe that a depressed economy would experience a rise in such miscellaneous activities. This is more a symptom of prosperity than of depression.

Supple (1981, p. 129) reports data from Feinstein (1972) on consumers' expenditure as a proportion of UK GNP. For the periods 1870-79, 1880-89, 1890-99, 1900-09 and 1904-13, the percentages were 87.8, 87.8, 87.9, 85.5 and 84.5. In other words, the propensity to consume was stable during the deflationary period but declined (along with growth) after 1900, i.e., during the inflationary period. Consumption, in fact, is not the leading factor in economic growth, as there is empirical evidence that consumption crowds out investment and, thus, that higher consumer spending leads to lower economic growth (Emmons, 2012). This, too, is consistent with Weber's (2009) study showing that early capitalism (1900) had much slower growth than modern capitalism (1958), the latter period being characterized by a lower time preference (i.e., more savings) than the earlier one.

The key element in the anti-deflationist argument promoted by Keynesians is that deflation induces people to consume less, which causes declines in profits, employment and, consequently, prosperity. But instead of declining, standards of living were rising. And even the logic that current consumption increases investment is erroneous. Investment (especially long-term investment) is meant to satisfy consumption in the future. And investment is possible only by deferring consumption.

The debated causes of the Great Depression in the literature

Then, if price declines were not responsible for slower growth, what could be the explanation(s)? One serious argument is inadequate knowledge and skills. Aldcroft (1964, pp. 118-120), Ashworth (1966, pp. 29-30), Saul (1969, pp. 43, 47-48) and Glynn & Gospel (1993, p. 115) agree that educational inefficiencies, e.g., focusing too much on theory at the expense of practice or producing clerks and shopkeepers instead of skilled workers, could have impeded economic growth. Ashworth noted:

Cf. Final Report of the Royal Commission on the Elementary Education Acts, England and Wales, 1888, C.5485 which points out (p. 142) that “it is commonly said” that the existing system of elementary education tended to discourage the production of skilled artisans and prepared too big a proportion of boys to become clerks and shopmen. The Commission apparently accepted this view and sought a remedy. The Report also stressed the serious inadequacy of the elementary schools in communicating knowledge to their pupils (p. 133). Cf. also Report of the Royal Commission on Secondary Education, 1895, C.7862, which stressed bad teaching methods and a shortage of trained and suitable teachers as the worst defect of secondary education (pp. 70-2 and 326). The report maintained (p. 72) that, in the average grammar school, science was taught so inefficiently as to be deprived of any real educational value.

The less efficient educational system, compared to that prevailing in the U.S. and Germany, may explain why Britain had a major problem keeping up with them (Floud, 1981, pp. 8-9), although the U.S. and Germany also experienced a deceleration in industrial production between 1860-74 and 1870-97 (Coppock, 1961, pp. 214, 221-222). Saul (1969, p. 46) argues that Britain was lagging behind in steel-making, and one possible cause is the lack of adequate skills on the part of steam-engine makers in, e.g., building diesel engines. The inability of engineers raised in craft traditions to undertake the wholesale rethinking of productive processes necessary for manufacture by mass-production methods was another. If one wonders how Britain led the rest of the world in the Industrial Revolution, Crafts (1995b, p. 765) reminds us that it was despite her formal education system. We must remember, however, that this is a qualitative, not a quantitative argument. We do not know how much it really impacted economic growth; only that it is a potential factor of unknown importance in explaining the decline in labour productivity during the Great Depression (Coppock, 1961, p. 229; Crafts & Mills, 2004, p. 170).

According to Musson (1959, p. 206), there is more. The inability of Britain to modernize her plant and develop new processes may be due to deficiencies in technical education, but also to conservatism and the heavy costs of replacing old plants. As for conservatism, British entrepreneurs (and workers too), unlike the Americans, seemed indifferent or resistant to the need to adopt new, e.g., labour-saving methods (Aldcroft, 1964, pp. 114, 126-128, 130-131). This evidence is best illustrated by the history narrated by Coleman & MacLeod (1986, pp. 591-593, 600-601), although a limitation of such an account is that it may not provide a random sample of British behavior weighted for the importance of each industry (McCloskey & Sandberg, 1971, pp. 96, 99). The reluctance to adopt labour-saving methods, if real, is relevant because when additions to the capital stock yield diminishing returns (but see the discussion below), capital accumulation is no longer an independent determinant of growth but is itself determined by the rate of technical progress (Aldcroft, 1964, p. 115). So technical progress (which includes new machines and processes but also better methods of organization and the use of more skilled labour) is the final determinant of growth. According to Aldcroft, British businessmen were unwilling to invest in new technologies, despite the fact that net capital formation as a percentage of net domestic product was 6.8% in 1875-1894 as against 7.0% in 1855-1874 (Saul, 1969, p. 41). But McCloskey & Sandberg (1971, pp. 103-104) affirmed that this theory is empirically rejected and that, on theoretical grounds, one can easily guess that the higher profits from adopting a new technique would undoubtedly alert and convince other entrepreneurs of its profitability. As for high costs, deflation may not be responsible. As Musson (1959, p. 206) noted, real costs in the iron industries rose in Britain but fell dramatically in the U.S., even though both economies were experiencing deflation. Saul (1969, p. 53) noted that deflation was not associated with a decline in investment in other countries.

The evidence of British retardation is overwhelming. For instance, Aldcroft (1964, pp. 121-122) tells us that Britain was the pioneer of machine tools but was outdistanced by the U.S. when, in the 1880s, the price of machine tools in the U.S. fell to half that of the equivalent British tools. Aldcroft provides an explanation:

The secret of the American and German success in machine tools was due to the fact that they concentrated on the production of large quantities of one or two standard tools in large, highly specialized and efficiently equipped plants. In contrast, in Britain a very large number of relatively small and inefficient firms existed producing a multiplicity of articles and some of them ‘seemed to take a pride in the number of things they turn out’. Costs of production in Britain could have been reduced appreciably if many of the older works had been well planned on a large scale, equipped with plant of the most efficient kind and if the character of the production had been standardized. But in fact there was ‘generally an absence of totally new works with an economic lay-out’, and it was not until the war ‘opened the eyes of manufacturers to the advantages of manufacturing in large numbers instead of ones and twos’, that British machine-tool makers made any serious attempt to streamline their methods of production.

This problem is not peculiar to the Great Depression. Even by 1939 the shipbuilding industry was badly out of date compared to the U.S. and Germany, and in general by 1914 there was hardly a basic industry in which Britain held technical superiority, except perhaps pottery (Aldcroft, 1964, p. 117). The outdated machinery and processes may well explain why Britain saw a large decline in the proportion of manufactured goods (products made from raw materials using machinery) in her exports (Musson, 1959, pp. 224-226). Since Britain was essentially exchanging her manufactured goods for primary goods (i.e., raw materials), foreign competition made Britain less able to pay for her food imports with manufactures, while at the same time foreign agricultural competition depressed her agriculture and increased her dependence on foreign imports. One potential cause is the smaller size of British firms, loosely coordinated, compared to her main competitors, the German and American firms (Glynn & Gospel, 1993, p. 114). As explained before, smaller firms can be expected to have more difficulty obtaining long-term loans for the purpose of intensive productive investment.

Overall, the idea that the adoption of new techniques could have improved British growth has been challenged by McCloskey & Sandberg (1971, p. 105), who note that even if the lost output was as much as 5% in the basic industries usually considered poorly managed (steel, coal, cotton, chemicals and railways), British national income would have been lower than it could have been by only a little over 1%.

Many authors (Musson, 1959, pp. 207-210; Aldcroft, 1964, pp. 129, 133; Richardson, 1965, pp. 131-135; Saul, 1969, pp. 44-45, 51) seem to agree that Britain lost her industrial leadership, as a result of slower growth, owing to her early and long-sustained start as an industrial power. It has been argued that although the early start hypothesis does not make sense theoretically speaking, it makes sense on practical grounds (Saul, 1969, pp. 44-45). The early start hypothesis is also accepted by Harley & McCloskey (1981, pp. 64-65) as an explanation for why Britain was more specialized in the less sophisticated industries. The fact that British exports during the 19th century were heavily concentrated in a few basic industries that had been in the forefront of the industrial revolution can account for why exports declined, given that Britain was unable to shift toward sophistication. Richardson (1965, pp. 133-134) put it this way:

The rate of growth of an industry is bound to fall off for a number of reasons. First, there is the simple fact that a high growth rate cannot be maintained for ever. ‘It is a natural development, and almost a truism, that the rate of expansion of industry, measured in per cent., must decline during the course of an industrialisation process. A rapid percentage increase in the beginning of an industry’s existence cannot continue indefinitely without retardation, as otherwise production would soon reach completely abnormal figures’. There is a tendency for technical progress to slacken as an industry expands; cost reductions in a new industry are limited by the character of the technological basis of the industry itself, and once the initial breakthrough is made, further refinements will tend to yield diminishing marginal cost reductions. Merton’s investigations in the 1930s showed that there is a skewness in the rate of innovation in a given industry weighted heavily towards the early phases of its growth. Again, as technical progress advances, many particular improvements are merely new ways of producing existing products: in the absence of a rapid resurgence of demand for these products, this will make for natural retardation in individual sectors. Similarly, the spread of industrialisation throughout the world will tend to retard growth in a given industry in any one economy. Exhaustion of raw materials may ultimately exert a dragging influence on an industry’s growth curve. Finally, on the demand side, retardation will follow from the fact that as products age the level of demand for each individual product (other than replacement demand) will tend to reach saturation point, and rising real incomes cannot stave off this point indefinitely.

In particular, Richardson (1965, p. 147) denounced the fallacy of considering a given period in isolation from what precedes and follows it. He argues that any tendency for the aggregate rate of growth to decelerate may be (and usually is) outweighed by compensating forces. To illustrate, Richardson (1965, p. 143) shows that industrial growth during the 1880s was incredibly slow, while foreign investment (as shown by the exports of capital) and the rate of income growth were both high. Such investment helps to increase income in at least three ways: by financing cost-reducing transport innovations abroad and thus improving the terms of trade (as shown by the data), by promoting the export industries (though not the new growth industries), and by adding to Britain's invisible receipts. Richardson (1965, pp. 144, 148) believes that the reason lies in the lower prospective rate of return on investment at home, while conditions abroad induced huge investments to be made, a view that McCloskey (1979, pp. 539-540) would certainly agree with. The reason invoked (Richardson, 1965, p. 141) for the lack of investment at home after the 1850s-60s is the absence of major innovations until the 1890s, which made possible the growth of motor-car, electrical engineering, rayon and other new industries. If investors seek the industries with the best growth opportunities, we can expect them to invest abroad once their own economy has reached a point where it awaits further innovations. One problem with this view is that diminishing returns on capital must assume, quite unrealistically, that capital is neither lumpy nor highly specific but is instead homogeneous and easily divisible. For this reason, Pollard (1985, p. 502) makes the point that there is no certainty that adding to its quantity will necessarily reduce its returns if investments are shifted from overseas to home. Pollard (1985, p. 502) also argues that diminishing returns on capital assume full employment and constant technology, and that, since these assumptions are rejected, capital invested at home might have served to reduce unemployment without endangering the rate of return on capital. At the same time, Musson (1959, p. 210) and Pollard (1985, p. 507) affirmed that investment abroad was particularly effective, and probably yielded more in real gains than investment at home would have done, because it was combined with idle or underemployed factors abroad. That hypothesis, however, makes the implausible assumption that Great Britain was at full employment (see also McCloskey, 2009, pp. 24-25). Still, consistent with Edelstein's (1981, pp. 78, 80, 86) conclusion, Chabot & Kurz's (2010) analysis confirms that British investors sent so much of their capital abroad precisely because that is where the returns were greater, and perhaps also for the diversification of risks. One reason is that foreign investment helped to reduce the costs of overseas transport (among other things) and ultimately the cost of British imports of food and raw materials (Edelstein, 1981, p. 70). Even granting this point about the heterogeneity of capital, the view that the lack of major, revolutionary inventions (such as the steamship in the 1850s-60s) accounted for the slower growth during the Great Depression remains very plausible (Musson, 1959, pp. 207-208). Richardson (1965, p. 147) expresses the idea this way:

The retardation after 1870 can best be explained by referring to the preceding period: the rate of growth was high between 1780 and 1870 because during this time the basic industries were being developed, the transport system built, urbanisation extended and the most consequential technical advance of the time, steam power, applied to a range of industries. Once these tasks neared completion, awaiting some new major technological solution, the growth rate was bound to fall.

At the same time, that argument loses force when we consider the steel industries alone, because the U.K., U.S. and Germany all started with comparatively low levels of output, and while the U.K. maintained her position in the 1890s, she was left behind in the race by 1913 (Coppock, 1964, p. 392). Also, Wilson (1965, p. 192) doubts that innovation has a “monopoly” on growth, although he provides no substantial argument. On the other hand, Crafts & Mills (2004) reported that the capital-deepening contribution of steam engines to industrial labour productivity and output growth rose steadily across 1800-30, 1830-50, 1850-70 and 1870-1910. This means that the slowdown in growth after 1870 cannot be explained by a waning contribution to growth from steam power. Instead, its growing contribution to growth at the end of the steam-engine revolution indicated a greater British dependence on steam. Although that does not mean there were great new innovations, its (negative) impact on GDP should have been mitigated, as the hypothesis of technological change predicts that “after the leading sector has been adopted for a number of years, its influence on aggregate output declines, and before new leading sectors can be put in place, overall output growth declines” (Bordo & Schwartz, 1981, p. 111). Yet there might be other factors explaining the declining rates of economic growth. On the one hand, Crafts & Mills (2004, p. 170) report that labour productivity growth in two of the steam-intensive staples, coal and cotton, accounted for much of the overall labour productivity slowdown after 1870, and that for coal the decrease in labour productivity growth was the result of the declining quality of natural resources and labour inputs. On the other hand, the slowdown in labour productivity has also been associated with a great decline in the growth rate of capital per worker in both coal and cotton (Crafts & Mills, 2004, p. 161, fn. 3), although the ratio of capital formation to national product shows no decline (Kuznets, 1961, Table 3). To defend the idea that capital formation was the main cause of the decline in economic growth would again be a difficult task, since Coppock (1961, p. 229) affirmed that it is doubtful whether the rate of capital accumulation per head in manufacturing industry fell sufficiently after 1870 to account by itself for the decline in labour productivity growth.

Some authors (Brown, 1965, pp. 48-49) argued that the slowdown in growth and the increase in consumption could be attributed to declines in exports. For Saul (1969, p. 38), total consumption per head was influenced significantly by external factors: the improving terms of trade and the rising trends in other sources of overseas income. Saul (1969, pp. 19, 53) reached the following conclusion:

The swings of home and foreign investment are said to have brought both low growth and sagging prices in the late 1870s and 1880s and again from 1901 to 1910. … The relative changes in international trade prices moved against Britain’s suppliers and by so doing helped to raise the standard of living at home, but the reduced purchasing power it entailed overseas may have been an important factor in the retardation of British exports and, through them, of growth.

The terms of trade (the ratio of export prices to import prices; see Harley & McCloskey (1981, p. 54)) seemed to be an important factor, so they deserve to be discussed more thoroughly. According to Musson, the terms of trade may explain some of the increase in unemployment. Musson (1959, p. 217) stated that the unfavorable shift in the terms of trade, together with the growing volume of imports and decline of exports, worsened Britain’s balance of payments position in the later 1870s, while the favorable movement from the early 1880s onwards eased it, since Britain was getting greater quantities of food and raw material imports in return for a given volume of manufactured exports. The improvement in the terms of trade, Musson argues, was the main factor in bringing about an improvement in real wages and the standard of life, for the decline in the prices of imported wheat and other foodstuffs led to a reduction in the cost of living. At the same time, this positive shift in the terms of trade had harmful effects on the export industries, causing their employment levels to go down in the late 1870s. The modest improvement from the 1880s onward didn’t coincide with lower rates of unemployment, however. Also, according to Coppock (1961, p. 218), the combined effect of the supply-side changes (growth of supply and falling transport costs) on the trend in the terms of trade must have been modest in size, at least over the full period of the Great Depression.

Musson (1959, pp. 218-219) explained that the terms of trade can easily be misunderstood. British imports are valued Cost, Insurance and Freight (“C.I.F.”) while exports are valued Free On Board (“F.O.B.”), meaning that the seller pays only for transporting the goods to the port of shipment, plus loading costs, while freight and insurance are excluded from the export price. So a “favorable” movement in the terms of trade does not necessarily mean what it seems to: a fall in freight rates alone lowers measured import prices relative to export prices.

British imports are valued “c.i.f.” in the Board of Trade returns, while exports are valued “f.o.b.,” so that the consequence of reduced shipping freights was naturally an improvement for Britain in the terms of trade, and Britain was certainly better off as a result; but, though the terms of trade moved “unfavorably” to the countries supplying her with imports, their real position was not necessarily worsened. For example, the chief factor in the great fall in the prices of American food products in the British market was the fall in freight rates: prices on the American farm fell much less. Moreover, as a result of the opening up of new territory and agricultural mechanization, farming costs were reduced, while output and exports were enormously increased. Britain, in particular, was importing greatly increased quantities of American farm products. It seems doubtful, therefore, if the “unfavorable” movement in the terms of trade reduced the purchasing power of American farmers and so checked imports of British manufacturers. The same was true for other primary producing countries, from which British imports continued to grow rapidly.
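
Musson’s c.i.f./f.o.b. caveat is easy to see with a toy calculation (the numbers below are purely hypothetical, chosen only to illustrate the mechanism, not drawn from Musson):

```python
# Toy illustration of the c.i.f./f.o.b. point: the measured terms of trade
# can "improve" almost entirely because freight rates fall, without the
# foreign supplier's own (farm-gate) price falling much.

export_price_fob = 100.0             # British exports, valued free on board

# British imports are valued c.i.f. = supplier's price + insurance & freight.
farm_price_before, freight_before = 80.0, 20.0
farm_price_after,  freight_after  = 78.0, 10.0   # freight halves, farm price barely dips

import_cif_before = farm_price_before + freight_before   # 100.0
import_cif_after  = farm_price_after  + freight_after    #  88.0

tot_before = export_price_fob / import_cif_before
tot_after  = export_price_fob / import_cif_after

print(f"Measured terms of trade 'improve' by {100 * (tot_after / tot_before - 1):.1f}%")
print(f"...but the supplier's farm price fell only "
      f"{100 * (1 - farm_price_after / farm_price_before):.1f}%")
```

The measured terms of trade improve by about 13.6% here, while the exporting farmer’s own price fell only 2.5%, which is exactly why the “unfavorable” movement need not have reduced the purchasing power of Britain’s suppliers.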

Another factor leading to changes in the terms of trade is the currency’s value. Solomou & Catao (2000, Figures 6 & 10) found that import price declines (and increases) during the period 1879-1913 were partly due to nominal exchange rate appreciation (and depreciation). In the U.K., when the real exchange rate increased (diminished), export growth diminished (increased). On the other hand, the terms of trade increased during the 1880s and remained stable during the 1890s, while the real exchange rate increased during the 1890s.

What is without dispute, however, is that Britain’s share of world trade diminished during the 1873-1896 period. Saul (1965, p. 17) suggested that since Britain was bound to lose some ground in world trade as others industrialised, there were several solutions. One was to shift to higher quality goods, another was to cut costs, the third was to switch to new markets, often helped by capital exports (a reasonable solution in the short but not in the long run). But he then argued that none of these propositions is convincing. Other proposed causes involved international competition and tariff protection (Musson, 1959, pp. 222-228). As described earlier, the U.K. failed to innovate and develop modern techniques of production, unlike the U.S. and Germany. The fact that growth in world trade in manufactures declined relative to world manufacturing production would also have severely impacted Britain, whose exports accounted for one quarter of GDP (Musson, 1959, pp. 219-220). And Hatton (1990) agreed. In addition, while Britain maintained a policy of free trade, some other countries reverted to protectionism in the environment of depression, and tariff protection grew during the 1870s-1890s, notably in European countries such as France, Germany, Spain and Italy, as well as in the USA, Brazil and elsewhere (Hatton, 1990, p. 583). Britain’s terms of trade were favorable with regard to unprotected countries but unfavorable with regard to protected countries. Curiously, Hatton (1990, p. 583) tells us that tariff protection may not have had a great impact on trade, despite showing evidence against such a view. Hatton (1990, p. 578) first said that the relationship between exports and GDP is clear, except at the end of the 1890s, when a housing boom complicates the relationship. Hatton (1990, p. 591) then demonstrated that what determined British exports, for the most part, was world trade rather than relative export prices (i.e., across countries). Specifically, 50% of the loss of Britain’s share of world trade is due to the inelasticity of exports with respect to world trade, and 30% of the loss is due to deteriorating competitiveness. This caused Britain to export less while importing more and more foreign goods, which may explain the decline in prices in Britain. Musson (1959, p. 225) writes:

Imports had been growing gradually for many years, but it was not until the seventies that the railway and steamship brought in a flood of cheap foreign imports, which seriously depressed certain sections of British agriculture and destroyed the balance of the British economy. Britain rapidly became dependent for most of her food on overseas supply.

Although it has long been recognized that all industrial countries experience a decline in the proportion of GNP invested during a recession, Britain was uniquely dependent on external stimuli to end her investment slumps.

It is disconcerting to see that exports have been thought of as a causal factor behind economic growth. If a relationship is found, it must be growth causing exports (through improvements in skills and technology) and exports then sustaining economic growth. Yet some economists (e.g., Tang et al., 2015, p. 230) believe the theory would lead us to expect exports to be a source of growth, even though the empirical evidence is mixed. Generally, when causality is found, it can run in either direction or be bi-directional. According to Xu (1996), the lack of empirical consistency is due to the arbitrariness of the number of lags specified (which would better be determined according to Akaike’s FPE criterion) when doing causality tests, because these tests are sensitive to model-selection criteria, functional form, and unit roots. Moosa (1996) tested the causality between exports and growth in Britain for the period 1885-1993. It is unfortunate that the study cuts in half the period of interest (1873-1896), but the periods 1885-1893 and 1885-1913 show no evidence of causality going from exports to growth, according to the non-significant p-value (at the 5% level). Generally, I am very hostile to significance testing because the p-value is a function of sample size and is not an effect size. Given that the period 1885-1893 has only 9 annual observations, I am not surprised by the non-significance of the p-value. The other problem is that the Y variable is real GDP, not real GDP per capita. Before looking at the statistical analysis, however, one always needs to think about how relevant the export-led-growth theory is. More likely than not, it doesn’t hold water (McCloskey, 2009).
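
To make concrete what a Granger-type causality test of the export-led-growth hypothesis involves, here is a minimal numpy-only sketch on simulated data. The series, the fixed lag length of 2, and the function names are my own illustrative choices, not Moosa’s actual specification (which, as noted, should pick lags by Akaike’s FPE criterion):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated annual growth rates: here growth drives exports, not the reverse.
T = 120
growth = rng.normal(size=T)
exports = np.empty(T)
exports[0] = rng.normal()
for t in range(1, T):
    exports[t] = 0.5 * growth[t - 1] + rng.normal()

def lags(z, p, T):
    """Matrix whose columns are z_{t-1}, ..., z_{t-p} for t = p..T-1."""
    return np.column_stack([z[p - k:T - k] for k in range(1, p + 1)])

def granger_f(y, x, p=2):
    """F-statistic of H0: lags of x add nothing to an AR(p) model of y."""
    T = len(y)
    yy = y[p:]
    X_r = np.column_stack([np.ones(T - p), lags(y, p, T)])   # restricted model
    X_u = np.column_stack([X_r, lags(x, p, T)])              # + lags of x
    rss = lambda X: np.sum((yy - X @ np.linalg.lstsq(X, yy, rcond=None)[0]) ** 2)
    df_den = (T - p) - X_u.shape[1]
    return ((rss(X_r) - rss(X_u)) / p) / (rss(X_u) / df_den)

# Compare each statistic against an F(p, df_den) critical value for a p-value.
f_g2e = granger_f(exports, growth)   # growth -> exports: should be large
f_e2g = granger_f(growth, exports)   # exports -> growth: should be small
print(f"F(growth->exports) = {f_g2e:.1f}, F(exports->growth) = {f_e2g:.1f}")
```

Note the degrees-of-freedom problem the text complains about: with only 9 observations, as in the 1885-1893 subperiod, the denominator degrees of freedom at p = 2 would be just 2, so a non-significant p-value there carries almost no information.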

Now, having discussed all of the proposed causes, whichever explanation holds would no longer matter if one accepts the argument of Crafts et al. (1989a, 1989b) and Greasley (1992) that economic growth rates were stationary.

Lord Keynes (LK) telling lies, once again

He is the blogger at socialdemocracy21stcentury. He has a blog article on that topic here. I hope the readers won’t click the link, however; I dislike the idea of giving his blog a “hit”. What I admire in him is his aptitude for telling lies despite knowing exactly that he is lying. Because he really does an excellent job.

The guy says that inflation leads to stronger economic growth, and reaches that conclusion by comparing growth in the 1850s-70s (inflation) and 1870s-90s (deflation). He read Saul’s book, as indicated by how many times he referred to specific pages of it, but he did not say that Saul argues there was no inflation (i.e., no long-term, sustained increase in prices) during the 1850s-1870s. Saul even shows a graph disproving this idea. Even more admirable is the fact that, for LK, when the so-called inflationary period of the 1850s-70s showed stronger growth than the deflationary period of the 1870s-90s, it must be due to the beneficial effects of inflation, but when the only real period of inflation (1890s-1910s) showed slower growth than the period of deflation, it must be due to other factors (factors whose workings in producing declining growth rates during the 1890s-1910s he fails to describe). It’s in this other blog article that LK used that lame argument. When he recognizes and agrees with writers that many factors could have affected growth so differently in Great Britain during the 19th century, he must understand that it is no easy task to disentangle and isolate the effects from each other. Yet he speaks as if a naive look at growth and price trends is sufficient to tell whether deflation is good or bad, i.e., as if a bivariate rather than multivariate analysis could rule out all the other plausible causes. The guy says that the economic problems were caused mainly by declines in profits between the 1850s-70s and 1870s-90s, but did not even say that Saul reported data showing that the ratio of profits to national income (or industrial income) was lower during the period of inflation (1890s-1910s) than during the period of deflation (1870s-1890s). Worse is the fact that LK himself displayed a graph showing that profits were lower during inflation, yet he ignored that detail when writing his text.
He then reported Boyer & Hatton’s (2002) unemployment estimates, pointing out that the Great Depression had an abnormal rate of unemployment, but he carefully avoided reporting that the period of inflation showed an even higher rate of unemployment.

It is not the first time he has made such omissions. I can understand that everyone can miss important information when reading a text. Random error occurs all the time. But curiously for LK, every time I saw him being so clumsy, it always concerned the missing pieces of information that disprove his own ideas. This is certainly not an accident. It’s not random error (i.e., chance) but systematic error (i.e., bias). He is truly despicable.

[Figures accompanying this post: Bordo (2010) Figures 11 & 20; U.K. GDP per capita, Maddison (2003), based on Feinstein (1972); Floud & McCloskey (1981) Table 1.1; Saul ([1969] 1972) Tables I, IV, V, VI and Diagrams I, II, III; Feinstein (1990) Tables 2-4 and Figure 1; Crafts (1989) Figures 1 & 4; Harley (1977) Figure 1; Boyer & Hatton (2002) Tables 3, 5, 6, 7 and Figure 3.]

An account of the good deflation in the American economy of 1870s-1890s
https://menghublog.wordpress.com/2015/01/13/an-account-of-the-good-deflation-in-the-american-economy-of-1870s-1890s/
Tue, 13 Jan 2015 23:56:21 +0000
The American economy of 1873-1896 is usually portrayed as the Great Depression of 1873-1896 (Wikipedia). But to say this is a myth is not very far from the truth (Higgs, 1971; Catalan, 2011). One feature of this period is deflation. And deflation is often viewed as bad, because it has been associated with periods of recession (which is curious given the proof to the contrary). And yet it is important to make the distinction between good and bad deflation (Bordo & Filardo, 2005). What is usually termed good deflation is productivity-driven, and what is termed bad deflation is demand-driven. What the U.S. economy of the 1870s-1890s had was a good deflation, which was found to have produced better economic outcomes than inflation (Beckworth, 2007). The characterization that this period was felt as distressful by most people is wrong. Equally false is the claim that the U.S. during this period went through many deep recessions (Beckworth, 2007).

Why deflation is better than inflation and constant-price regimes

The fact that deflation is associated with depression is curious. Friedman & Schwartz (1963 [1971], p. 93) reached the conclusion that the forces making for economic growth over the course of several business cycles are largely independent of the secular trend in prices. A similar conclusion is reached by Ryska (2014). In a study using VAR analyses on the U.S. and European countries (U.K., Germany, France) over 1880-1913, Bordo et al. (2010, p. 540) found that output was essentially supply-driven in the European countries, which implies that money was neutral, whereas money was non-neutral in the U.S. only because of the presence of crises, probably due to an unstable banking system. Overall, it cannot be concluded from these studies that deflation causes slower economic growth.

Supply-driven deflation may be considered good deflation, insofar as productivity gains make goods and services cheaper and cheaper. But the definition given to bad deflation is obviously obscure and inaccurate. The bad deflation owing to falling demand is caused by excessive debt, which is caused by malinvestments as described by the Austrian Business Cycle Theory. I insist on this point because Keynesians always speak about insufficient demand without any reference to malinvestments, and instead attribute (bad) deflation to savings, while Austrians consider that savings make the economy more productive.

Deflation is particularly feared due to the increasing debt burden. If one borrows at 10% during a year in which the price level declines by 5%, it is as if he had borrowed at 15.8% when prices are stable. As Selgin (1997, p. 42) noted, an unanticipated increase in debt is offset by an unanticipated increase in real income. But if deflation is the result of a recession (due to accumulated malinvestment), income won’t rise to compensate for the increasing debt burden. But perhaps price stability is better than deflation? Not even so. Selgin (1997, p. 41) argued that “The argument, like most arguments for a constant price level, is perfectly valid so long as aggregate productivity is unchanging”. If productivity increases, there is no specific advantage to price stability. But when productivity falls, such a rule requires a contraction (i.e., decline) of all non-fixed money incomes. Besides leading to a further depression of real activity (if prices and wages are sticky), such a rule might well result in certain debts not being paid at all. Some creditors might, in other words, escape the consequences of fallen productivity by letting others bear a disproportionate burden. Selgin (1999) argues that if it is admitted that debtors’ and creditors’ interests are best served by an increase in prices when productivity declines, then symmetrical reasoning suggests that those same interests are best served by allowing prices to fall as productivity advances. Furthermore, an additional issue with a zero inflation policy arises when debtors and creditors are inclined to index money rates of interest to the rate of inflation or deflation. Under zero inflation, productivity indexing would require an upward adjustment of nominal interest rates proportional to the higher growth rate of real (and, in this case, nominal) income. Otherwise, interest rates would be allowed to fall if debtors negotiate a lower interest rate in anticipation of falling prices.
Some of these arguments were also made in Selgin’s (1988, ch. 7 & 9) book on the theory of Free Banking.
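
The 15.8% figure follows from the Fisher relation between nominal and real interest rates; a quick check:

```python
# Real borrowing cost when prices fall, via the Fisher relation:
# real = (1 + nominal) / (1 + inflation) - 1.
nominal = 0.10        # contracted interest rate
inflation = -0.05     # 5% deflation

real = (1 + nominal) / (1 + inflation) - 1
print(f"real rate = {real:.1%}")   # 15.8%, as in the text
```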

One unexpected consequence of central bank policies for maintaining a constant-price regime is that they may push interest rates temporarily below their natural levels (Selgin, 1995b, pp. 714-715; Selgin, 1997, pp. 32-33). Because nominal prices adjust to productivity changes not sluggishly but almost immediately, no excess demand for money arises. For this reason, changes in the demand for real money balances based on innovations to aggregate productivity are accommodated by falling prices automatically and well ahead of any possible monetary policy response. A monetary injection in this situation will put excess money in the hands of people and cause malinvestments.

The belief that sellers are reluctant to adjust their prices downward is also wrong. Sellers actively seek ways to improve productivity precisely so that they can charge less than their rivals without sacrificing profits. Productivity-based price cuts are a healthy aspect of the competitive process (Selgin, 1999).

An interesting thing about the zero inflation policy is that it involves more, not fewer, money price adjustments than would be the case under a productivity norm, a regime under which the price level is allowed to vary to reflect changes in goods’ unit costs of production (Selgin, 1995a, pp. 736-737; Selgin, 1997, pp. 26-27). The greater the number of distinct factor prices, the larger the number of extra adjustments required under the zero inflation policy. Under a productivity norm, nominal wages will remain stable or increase slightly thanks to productivity gains, but under a price stability regime, productivity gains must take the form of more rapidly increasing nominal wages (Selgin, 1999). Suppose, for example, that labor productivity grows at an annual rate of 3%, while total factor productivity grows at an annual rate of 2%. Then the real wage rate, which reflects labor productivity, should also increase at an annual rate of 3%. If consumer prices are allowed to decline at a rate equal to the rate of growth of total factor productivity, money wage rates will still increase at a 1% annual rate. If, in contrast, the authorities insist on stabilizing the CPI, money wage rates must increase 3% a year. And because nominal wages are less flexible than output prices, a policy of avoiding or limiting wage-rate adjustments by allowing prices to fall is less likely to be a source of labor-market frictions and consequent labor misallocation. The productivity norm minimizes the need for changes in the least flexible prices (Selgin, 1995b, p. 720). In addition, price stability requires a contraction in the money supply in the face of a decline in productive efficiency. Not only will this result in problems with debt payments, but such a policy also needs nominal wages to decline (Selgin, 1988, ch. 7; Selgin, 1997, p. 30). In other words, price stability involves nominal wage adjustments that would be absent if prices were allowed to increase. Selgin (1995a, p. 738) even says that the combination of constant and fixed prices and nominal wages (due to, e.g., wage stickiness) will cause a rise in unemployment when productivity declines.
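
Selgin’s wage arithmetic can be written out, using the usual linear approximation that nominal wage growth equals real wage growth plus inflation:

```python
labor_prod_growth = 0.03   # real wages should track labor productivity: +3%/yr
tfp_growth = 0.02          # unit costs fall at the TFP growth rate

# Productivity norm: prices are allowed to fall at the TFP rate (2% deflation),
# so nominal wage growth = 3% - 2% = 1% a year.
wage_growth_prod_norm = labor_prod_growth - tfp_growth

# Zero-inflation (CPI-stabilizing) policy: prices are held flat,
# so the entire 3% must show up in nominal wages.
wage_growth_zero_infl = labor_prod_growth - 0.0

print(f"productivity norm: nominal wages +{wage_growth_prod_norm:.0%}/yr; "
      f"zero inflation: +{wage_growth_zero_infl:.0%}/yr")
```

Same numbers as in the text: the productivity norm asks nominal wages to rise 1% a year, CPI stabilization asks for 3%, and the latter concentrates the adjustment in the least flexible prices.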

Generally, authorities seek (mild) inflation more than price stability. But inflation, even in relative terms (i.e., maintaining prices constant when the prices should have declined), produces the so-called malinvestments described by the ABCT. Such boom-bust cycles certainly hurt economic growth and prosperity.

Although Selgin’s (1997) treatise offers the best argument for deflation, Reisman’s (1996) treatise is not bad at all. There is theoretically no reason to believe that deflation is harmful. Instead, deflation appears to be more conducive to economic growth and prosperity than either price stability or inflation.

Empirical evidence on the American economy of 1873-1896

The best evidence of economic prosperity comes from Beckworth (2007). Figures 1-2 show that the trends in real GNP, real GNP per capita and the real wage give an indication that the U.S. economy was clearly not in the period of distress portrayed by many historians and economists. The GNP deflator shows that the price level was declining at a modest rate. Interestingly, we don’t see any sign of deep recessions in these data. Between 1866 and 1897, secular deflation averaged just over 2% a year while real GNP grew almost 4% a year (Figure 1).

Beckworth (2007) also shows in Table 2 that deflation was more conducive to economic growth than inflation. The real wage growth during the deflation period clearly exceeds the real wage growth during the inflation period, particularly during the 1866-1879 period. These real wage gains are consistent with the productivity growth rates of Kendrick (1961a) shown in Table 2 which indicate greater productivity gains during the deflation period. Table 2 also shows the growth rates of the capital stock, the capital stock per capita, and the capital stock per worker for the postbellum period as reported in Kendrick (1961b). Since firms under normal market conditions will only take on additional investment expenditures if they expect a positive rate of return, the growth of the capital stock can be viewed as a proxy for firms’ expectation of current and future profitability.

Although deflation was benign, this outcome may be the result of an economy with little to no nominal rigidities (price or wage stickiness) that could easily have handled any price level regime. If so, the episode would not be meaningful, because it would have no relevance for understanding deflation in the modern world, where nominal rigidities are considered important. So Beckworth applied vector autoregressions (VARs) to identify economic shocks and their influence on economic activity, as studies in this literature usually do. A common identification strategy in these studies is to impose the Blanchard and Quah decomposition of VAR shocks into permanent and temporary components. Among other things, this approach allows for the identification of nominal shocks and their effect on the real economy. The idea is that if nominal wages are sticky, then real wages should be countercyclical (i.e., moving in the opposite direction of the overall economic cycle: rising when the economy is weakening and falling when it is strengthening), and this would show up in the impulse response function following an aggregate demand shock.

The variables used are the log of the Balke & Gordon (1989) real GNP and GNP deflator series, and the log of the ratio of the NBER’s composite wage index (2006) to the Balke & Gordon (1989) GNP deflator. The model is also re-estimated using the log of the Davis (2004) industrial production series as a measure of output. All three variables are first-differenced because standard unit root tests showed evidence of nonstationarity in the levels. Since the data are limited to an annual frequency and VAR lags use up observations, additional years are included up front to offset the lags and increase the degrees of freedom. The Ljung-Box Q statistic indicates at least two lags are needed to eliminate serial correlation and whiten the residuals in both versions of the VAR. Consequently, the VARs are estimated using two lags for the years 1864-1897.

Figure 6 shows the accumulated impulse response functions (AIRFs) of the three endogenous variables over an 11-year horizon to a 1 SD aggregate demand shock of about 2 percent. The top panel in this figure shows the VAR estimated with real GNP and the bottom panel shows the VAR estimated with industrial production. The top panel reveals that a 1 SD shock to aggregate demand increases output by 0.84% upon impact, while the bottom panel indicates output increases by 0.92%. Stated differently, a 1% shock to aggregate demand would be followed by an initial increase in output of 0.42-0.46%. This effect gradually unwinds and is close to zero by year four. This same 1 SD shock causes the real wage upon impact to fall 1.08% in the top panel and 1.00% in the bottom panel. The real wage, therefore, would initially fall by about half a percent following a 1% shock to aggregate demand, and would return to its original value by year six. The aggregate demand shock, therefore, has a short-run effect on output and the real wage, but permanently increases the price level. This contemporaneous increase in the price level and decline in the real wage indicates nominal wages were relatively slow to adjust to the aggregate demand shock during this time. Moreover, these AIRFs imply nominal wage rigidities were nontrivial during the postbellum period of deflation, since there was a decline in the real wage accompanied by an increase in output.
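
The mechanics of this kind of exercise can be sketched in a few lines of numpy. The data below are simulated stand-ins (not the Balke & Gordon series), and the shock is identified by a simple Cholesky factorization rather than the Blanchard-Quah long-run restrictions Beckworth actually imposes; the point is only to show how a VAR(2) on first-differenced data yields accumulated impulse responses:

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated stand-ins for d(log output), d(log deflator), d(log real wage).
T, k, p = 34, 3, 2                      # ~34 annual obs, 3 variables, 2 lags
A1 = np.array([[0.3, 0.1, 0.0],
               [0.0, 0.4, 0.0],
               [0.1, -0.2, 0.2]])
A2 = 0.1 * np.eye(k)
Y = np.zeros((T, k))
for t in range(p, T):
    Y[t] = A1 @ Y[t - 1] + A2 @ Y[t - 2] + rng.normal(scale=0.02, size=k)

# Equation-by-equation OLS estimation of the VAR(2).
X = np.column_stack([np.ones(T - p)] +
                    [Y[p - l:T - l] for l in range(1, p + 1)])  # const, lag 1, lag 2
B, *_ = np.linalg.lstsq(X, Y[p:], rcond=None)
resid = Y[p:] - X @ B
Sigma = resid.T @ resid / (T - p - X.shape[1])   # residual covariance

# Companion matrix built from the estimated lag coefficients.
A1_hat, A2_hat = B[1:1 + k].T, B[1 + k:1 + 2 * k].T
comp = np.zeros((k * p, k * p))
comp[:k, :k], comp[:k, k:] = A1_hat, A2_hat
comp[k:, :k] = np.eye(k)

# Accumulated responses to a 1 SD (Cholesky-orthogonalized) first shock.
P = np.linalg.cholesky(Sigma)
H = 11                                  # 11-year horizon, as in Beckworth's Figure 6
irf = np.zeros((H, k))
M = np.eye(k * p)
for h in range(H):
    irf[h] = (M[:k, :k] @ P)[:, 0]      # Phi_h @ P, column of shock 1
    M = comp @ M                        # M = comp**(h+1) for the next step
airf = irf.cumsum(axis=0)               # level responses, since data are differenced
print(airf[-1])                         # long-run (level) effect of the shock
```

Because the variables enter in first differences, it is the accumulated IRFs that answer the question the text discusses: whether the shock’s effect on the level of output unwinds while its effect on the price level is permanent.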

Figure 7 reports the VAR’s historical decomposition of the output growth rate into an actual path series (solid line), a baseline forecast series (long-dashed line), and a baseline forecast plus the effect of aggregate demand shocks on output (short-dashed line). The baseline forecast is a projection of the growth rate of output that does not include any of the structural shocks in the period being forecasted. We see that, although far less important than aggregate supply shocks (the difference between the baseline-plus-aggregate-demand series and the actual path series), aggregate demand shocks were still consequential to economic activity. The historical decomposition under both output series shows aggregate demand shocks increased output by as much as 3.4% and decreased output by as much as 2.6%. Aggregate demand shocks mattered to the postbellum economy.

One common complaint about deflation is that debts become more difficult to pay, and this was especially true for farmers at that time. One way to circumvent this problem is to allow nominal interest rates to decline, and this is what happened in the U.S. economy of the 1870s-1890s (Higgs, 1971, pp. 97-99; Beckworth, 2007, Figure 4). Interest rates fell for two reasons. First, with the increasing accumulation of savings, competition among lenders forced interest rates down. Farm mortgages were generally drawn for terms of 1 to 5 years, and when extensions or new loans were negotiated farmers usually contracted at a new, lower rate of interest. Secondly, the general downward trend in prices had persisted during the three decades before 1897. Recognizing that deflation seemed to be a fact of life, many farmers no doubt bargained with lenders for a lower rate of interest in anticipation of falling prices. Indeed, the interest rate on farm mortgages tended to fall everywhere throughout the last quarter of the nineteenth century, though rate differences among places persisted, reflecting differences in the risks attending loans.

More directly, Higgs (1971) explains that the debt burden wasn’t a problem at all:

In a recent study, Robert Fogel and Jack Rutner have argued, however, that the increased real burden of debt repayment attributable to falling prices was almost negligible for farmers in general. “[C]apital losses on mortgages due to unanticipated changes in the price level had only a slight effect on the average profit of farmers. [See Table 4-4] . . . [T]he debt to asset ratio was low (about 13 percent) for most farmers. It was only the farmer with a high debt to asset ratio who was badly hurt by the declining price level. But such farmers were atypical.”

Both Higgs (1971, p. 100) and Beckworth (2007, p. 206) make the same point: although farmers became substantially better off in absolute terms, they became worse off relative to others. Many farmers probably didn’t accept that outcome.

By far the most curious feature of the said period is the data on unemployment rates. Although economic growth was strong, unemployment rates weren’t low, according to most estimates (Lebergott, 1964; Romer, 1986). However, the numbers reported by Vernon (1994) are much lower. Vernon’s estimates should be trusted more than Lebergott’s because, among other things, Lebergott’s figures exaggerate the cyclical movement in the unemployment rate by neglecting a procyclical movement in the labor force. Secondly, the 1890s employment series is interpolated with a single employment index, the Frickey (1947, p. 212) series covering factory employment in Ohio and several northeastern states, but factory employment is much more volatile cyclically than total employment. Given Vernon’s (1994) figures, the unemployment rate peaks at about 8% for just a single year (1878).

Selgin et al. (2012) graphed the data on the unemployment rate from 1869 to 2009. Again, these rates were high only during the periods of economic crisis, i.e., 1874-79 and the mid-1890s (see, e.g., Rothbard, 2002, pp. 168-169).

Higgs (1971, pp. 123-124) reported high unemployment rates based on Lebergott’s data, but he believes this may be the cost of strong economic growth.

Though inventions led to increased efficiency in production, they often meant bankruptcy for those employing older processes. David A. Wells, despite his pervasive optimism, was perceptive enough to recognize that “nothing marks more clearly the rate of material progress than the rapidity with which that which is old and has been considered wealth is destroyed by the results of new inventions and discoveries.” Though migration allowed young people to obtain higher incomes, it often left their parents lonely and unhappy in the old home. Though the settlement of fertile Western lands provided cheaper food for urban dwellers, it often meant ruin for Eastern farmers. And similar contrasts might be recited at great length. We could say that people did adjust; ultimately everyone was better off. But such an interpretation is incomplete and ignores the costs imposed on people by the disruptive transformations that inevitably accompanied economic growth.

The inescapable fact is that economic growth hurt many people. Some recovered their losses, but others did not. Economic growth meant Progress from a social point of view because it created more wealth than it destroyed, but the distribution of the gains and losses was quite unequal. If we are interested in individual welfare, the answer to the question “Was progress worth its price?” must necessarily be that for some it was, and for others it was not. It will hardly do to say that individuals “freely chose to have economic growth,” because growth was a social process; the actions of a single individual simply did not matter one way or the other. An individual could determine his own program of saving and investment, but he could neither foresee nor control the future development of the market system. He could not know that the investments made in such hopeful expectations and based on the most reliable available information were often destined to become reductions in his wealth.

In any case, if many economists are right that falling prices cause a rise in production costs and unemployment, then a secular decline in prices should be accompanied by a secular increase in the unemployment rate. That pattern does not appear in the data.

It could be that the high unemployment rate in this period is not the result of deflation but has something to do with the great changes that accompanied economic growth. Crises usually share common features such as massive banking failures and layoffs, as well as a contraction in economic activity. But those features don't apply to the American period of 1873-1896. For example, the financial crises were probably due to a lack of branch banking, which made the banking system more susceptible to external shocks (Beckworth, 2007, p. 205). As Selgin (1988) remarked, the absence of branch banking is generally a sign that a banking system is not yet very developed. But not in this case: Beckworth noted that branch banking was literally prohibited, and that the imposition of a tax on state bank notes made banking under a state charter unprofitable. All of these features worked to slow economic growth. Concerning the economic transformation as a whole, Higgs (1971, pp. 47-48, 56, 65-66) has a good description of the situation. For instance, during the 1870s steam surpassed water as a source of power, and after 1890 electrical power was increasingly applied to industrial uses. Between 1865 and 1915 aggregate energy consumption increased more than five-fold, and mineral fuels (predominantly coal), which had provided less than 20% of this energy in 1865, furnished over 85% in 1915. The U.S. economy was experiencing an industrial revolution.

The composition of the economy’s total output changed dramatically in the post-Civil War era (Table 2.6). As their incomes rose, consumers increased their spending for the products of agriculture only slowly, and for manufactured products much more rapidly. Moreover, the rise in the fraction of total income used to finance material investment placed a greater demand on the manufacturing industries. As a result, the rate of return in manufacturing enterprises became relatively greater, and entrepreneurs moved to expand such production; at the same time many farmers, discouraged by their relative lack of success, sought to improve their condition by seeking nonfarm occupations. The upshot of these movements was the transformation of a predominantly agrarian economy into a great industrial economy – a transformation so sweeping and pregnant with implications that economic historians have called it an “industrial revolution” and made it the focus of a major part of their research for the past century. In 1870, after several decades of industrial growth, the United States had a manufacturing output equal to that of France and Germany combined, but only about three fourths as large as that of the United Kingdom; by 1913 the American manufacturing output equalled that of France, Germany, and the United Kingdom combined! Still the greatest producer of raw materials and foodstuffs, the United States had become the world’s industrial giant as well.

One consequence of this transformation is illustrated in Table 3.2:

Apparently, it is because of those great transformations and a frail banking system that unemployment rates weren't as low as we would have expected.

Still, there is another comment on this depression, notably the 1873-79 period. Newman (2014, p. 492) argued that the revised GNP estimates from Davis (2006) show that the depression was limited to 1873-75. Newman (2014, pp. 494-495) continues, saying that the belief in a depression lasting from 1873 to 1879 was mainly due to faulty economic statistics and reliance on nominal rather than real values. In fact, recovery began in 1875 (despite a declining money supply) without fiscal or monetary stimulus. Balke & Gordon's index didn't show this, he said, because their estimates were built on the railroad-output-dominated Frickey transportation and communications index. Because railroads suffered a severe decline, their estimated output is downwardly biased. Perhaps more importantly, he says that the high unemployment estimates from either Lebergott or Vernon are implausible. For instance, Vernon derived his estimates from Balke & Gordon's GNP series, which must certainly understate output growth (Newman, 2014, p. 496). Unemployment rates thus shouldn't be this high.

Similarity with the U.K. economy of 1873-1896

A similar story is reported by Selgin (1997, pp. 51-53), who describes the features of the UK economy in the period 1873-1896. Falling prices were inspiring people to go shopping. Real wages were rising. Almost all indices of economic activity show an upward trend. Deflation was caused by a fall in real unit production costs for most final goods throughout this period. Only certain branches of economic activity were depressed; in Britain these included foreign trade prior to 1875, agriculture in the late 1870s, and (as a result of increased foreign competitiveness) ‘basic industries’ such as the iron industry beginning in the 1880s.

[Figure sources for the above: Beckworth (2007), Figures 1-2, 6-7 and Table 2; Vernon (1994), Table 2; Selgin (2010), Figure 6; Higgs (1971), Table 3.2.]

The Bell Curve, 20 years after
https://menghublog.wordpress.com/2015/01/02/the-bell-curve-20-years-after/
Fri, 02 Jan 2015 23:59:59 +0000

Or nearly so. I was planning to publish this blog article for the 31st of December 2014. As you can see, I failed in this task and didn't finish in time. Anyway, I wrote this article mainly because I am bothered that when people cite The Bell Curve, the typical opponent responds with a link to Wikipedia, specifically the part related to the “controversy” over The Bell Curve. It goes without saying that these persons did not read the books written in response to The Bell Curve. In fact, they have certainly read none of them. It is ridiculous to cite a book you didn't read, but apparently it does not bother many people.

For the book's 20th anniversary, I found it appropriate to write a defense of the book, or more precisely, a critical comment on its critics. I decided to read carefully one of the response books I could access, and from what I have read here and there, it is probably the best book ever written against The Bell Curve. I know that Richard Lynn (1999) has already written a review of it. But I wanted to go into the details. The title of the book I'm reviewing is:

In fact, I have read that book some time ago, but didn’t find the need to read everything in detail. And I was unwilling to write a lengthy review. But I have changed my mind because of some nasty cowards.

Summary

Concerning Devlin's book, the title is somewhat disconcerting. “Scientists Respond to the Bell Curve” suggests to me that they do not regard Herrnstein & Murray as serious scientists, and perhaps not as scientists at all. In doing so, they make an appeal to authority. I prefer the scientific way of discrediting an argument rather than this.

Anyway, if I have to give a brief summary, I will say that I appreciate Carroll's chapter, and also Glymour's, even though I don't like the way he expresses his ideas, using complicated and obscure terms. I also appreciate Daniels et al.'s chapter, even though there are approximations in what they say and in what they have done. Winship & Korenman's chapter is not bad either. Finally, I am surprised by Belke's chapter, because I expected it to be a bad and inaccurate summary, but it's not. Generally, however, there are plenty of errors.

Now, concerning one of the main claims of the book, namely that the statistical methods employed by The Bell Curve are deeply flawed, I have some disagreement. The main argument is that the authors employed a weak measure of environment, i.e., SES (a composite of two parental occupation variables, two parental education variables, and a family income variable). But the authors already answered it: that was because IQ can mediate the link between these environmental variables and the outcome variable. Murray even believed they could have controlled for too many confounds. In fact, the real problem with Herrnstein & Murray's analysis is that they usually don't include interaction effects, whether between SES and IQ, age and SES (or IQ), race and IQ, gender and IQ, etc. Instead, they simply add age and/or gender as single variables. Usually, they ignore the variable of race and focus on the white population. They could have improved their analysis, but given the large number of peer-reviewed articles I have read, I am left with the impression that a great number of social scientists would have done the analysis the same way Herrnstein & Murray did, i.e., without interaction effects. I am very confident when I say that interaction effects are rarely used in social science. If Herrnstein & Murray are bad scientists, I am afraid there are many more bad scientists than Devlin et al. could believe.

Fienberg and Resnick (1997) present the authors. Murray was the author of Losing Ground, in which the argument was that the welfare system is costly, counterproductive, and discourages people from working. Herrnstein was the author of I.Q. in the Meritocracy, in which the argument was that intelligence and social status have a genetic component. They then (p. 5) present the main arguments of The Bell Curve: high heritability of IQ, high predictivity of IQ, and genetically mediated socioeconomic differences.

Fienberg and Resnick continue (pp. 6-8) by narrating the history of the eugenics movement. Karl Pearson and Francis Galton were specialists in statistics, through which they tried to understand the laws of inheritance. They complained that the government did not pay enough attention to biological factors, and believed that training and education cannot create intelligence: intelligence must be bred. Pearson was fully aware of the problem of inferring causation from correlation, so the statistical methods needed to be improved, and more sophisticated techniques were later employed. Ronald Aylmer Fisher, geneticist and statistician, was also an important figure in the eugenics movement; the statistical tools that underpin the study of genetics and IQ originate with Fisher. Although brief, the remarks made by Fienberg and Resnick give the impression that none of these three scientists lacked intellectual integrity.

The telling of how the paradigm shifted from nature to nurture is obscure. It seems to have occurred during the 1920s, and the reason advanced by Fienberg and Resnick (p. 12) was that scientific evidence had accumulated against the genetic theories. Like I said, this is odd. Jensen (1973, 1998) reviewed many studies of this kind, and none of them dated from the 1920s or earlier. I have no reason to believe those early studies carry more weight than recent ones, especially since modern IQ tests are more reliable, and reliability increases group differences (Jensen, 1980, pp. 383-384).

Chapter 2: A Synopsis of The Bell Curve (Terry W. Belke)

Belke (1997) provides a short summary of The Bell Curve. As someone who has read the book several times, I can tell that Belke's review is definitely a good one. He also cites all the pages he finds important, which is greatly appreciated. Here, I will describe how Belke summarizes The Bell Curve.

Chapter 1 tells us that between 1900 and 1990 the probability of going to college increased dramatically for students in the upper half of the IQ distribution but decreased slightly for students in the lower half. Chapter 2 demonstrates that there is also occupational sorting by intelligence: high-IQ professions have grown tremendously since 1940, as has the proportion of individuals from the top IQ decile within them. Chapter 3 tells us that the predictive power of IQ (especially general intelligence rather than specific skills) for job performance is high, and more important than either education or age. But Belke could also have mentioned that the authors (1994, p. 77) did say that the predictivity of IQ increases with tests' g-loadings. Chapter 4 argues that the value of intelligence in the marketplace has increased, with wages in high-IQ occupations growing more rapidly than wages in low-IQ occupations. The more complex a society becomes, the more important IQ becomes. The prediction is a trend toward more class stratification. It is unfortunate that Belke did not mention the authors' point that heritability can change if the conditions producing variation change.

Chapters 5-12 present the statistical analyses of The Bell Curve on the NLSY79 data. Intelligence is more important than SES in predicting a wide range of social outcomes (poverty, school, parenting, welfare dependency, crime, civility and citizenship).

Chapter 13 relates to racial differences. The black-white difference in IQ amounts to 1.08 SD. The theories of cultural bias are untenable; motivation is also irrelevant. When SES is partialled out, the gap is reduced by a third, but because SES has a genetic component, this method also underestimates the black-white difference. Black IQ increases with SES but does not converge to the white score (Belke is not accurate here, because the gap actually increases with SES). The NAEP reveals a narrowing in the black-white gap, although Belke did not say that the NAEP is not an IQ test. Belke could also have mentioned Herrnstein & Murray's (1994, pp. 276-277) review of the literature on black-white IQ studies showing no secular narrowing. But he does say that the authors cautioned that genetic differences within groups are not necessarily generalizable to genetic differences between groups. He also covers the discussion of Spearman's hypothesis: the higher the g-loading of a test, the larger the black-white gap. Belke likewise mentioned the authors' emphasis on IQ malleability even if the differences were genetic rather than environmental, and their belief that the assumption that environmentally induced deficits are less hardwired and less real than genetically induced deficits is wrong (and so do I). Chapter 14 covers the research on racial differences in social outcomes, most of which are considerably reduced when IQ is held constant. Chapter 15 covers the dysgenics of IQ, with lower-IQ people having more children (and at a younger age) than higher-IQ people; the reason has to do with the fact that women wish to take advantage of career opportunities. Chapter 16 illustrates the prevalence of low IQ among people who suffer from social problems.

Chapter 17 covers the topic of IQ gains. Belke mentioned two successful experiments, one in Venezuela and another in coaching for the SAT, but curiously he doesn't mention the authors' skepticism about the robustness of these results. Then Belke says that the authors believed adoption could be more effective than schooling programs, whose effects often fade out. Chapter 18 talks about the stagnation of American education, which is related to declining SAT scores among the most gifted students. The educational system has been dumbed down to meet the needs of average and below-average students. The pool of SAT takers was shrinking, not expanding, which makes the common view that the SAT decline was due to an expansion of the pool untenable. Chapter 19 touches on affirmative action in higher education. The white-black difference on the LSAT was 1.49 SD, and similar differences were reported for MCAT and GRE scores. This is far larger than the difference of 1 SD in IQ usually reported between blacks and whites. The authors suspect that a possible consequence of such a policy is the dropout of blacks who are aware of their own limited capacity to compete with smarter students. Chapter 20 treats affirmative action in the workplace. When blacks and whites are equated for IQ, blacks have been hired at higher rates since the 1960s, with the trend increasing into the 1980s; this concerns clerical, professional and technical jobs. The authors advocate that the goal of an affirmative action policy should be equality of opportunity rather than equality of outcome. Chapter 21 tells us about the possible scenarios associated with the expected further cognitive stratification in the future. Chapter 22 gives the authors' recommendation: a society must operate in such a manner as to allow individuals throughout the entire range of IQ to find a valued place. This could be done if the justice system adopted simpler rules, which would make living a moral life easier for low-IQ people. In the past, low-IQ people were able to find a valued place, but not anymore in the contemporary world.

Daniels et al. (1997) conduct a study showing that the heritability of IQ is upwardly biased because the usual figures do not remove non-additive genetic effects; the additive effects (narrow heritability) are what matter most. This study is the same as the Devlin et al. (1997) study often cited by environmentalists. They begin the chapter by saying that “In one eloquent volume, the authors serve up justification for that oft-heard refrain “the poor will always be poor.”” (p. 45). This is one thing I always find amusing among egalitarians: they can't resist slamming their moralizing speech right in the face of their opponents to make them appear the bad guys. They continue by saying that “According to H&M, it’s all in the genes, and there is little that can be done about it. Indeed, if IQ and intelligence are highly heritable, H&M’s vision is plausible; if they are not highly heritable, their vision is only a phantasm.” (p. 46). They reiterate when they claim that “It is narrow-sense heritability that is the critical quantity determining the likelihood of both of H&M’s nightmarish genetic visions: cognitive castes and dysgenics.” (p. 53). The first mistake is that H&M (1994, p. 106) never said it's all in the genes, and the second mistake is that H&M (1994, pp. 313-315) said explicitly that things would not change if group differences were environmental rather than genetic. Belke's chapter (1997, p. 29) mentioned that latter point as well.

Daniels et al. (1997, pp. 50-53) describe what additive and non-additive genetic effects are, and the importance of modeling the latter. Herrnstein & Murray (1994, pp. 105-108) accept a maximum value of 0.80 and a more plausible broad heritability estimate of 0.60. But Daniels et al. argue that the heritability is much lower if one considers only the narrow heritability. Their study focuses on the estimation of shared early external environment (they call it the preseparation environment, or prenatal maternal effect). They expect such an effect to emerge because twins share the womb concurrently whereas siblings share the same womb serially. They argue that even if mothers may have similar personal habits from one pregnancy to another, the temporal separation between progeny ensures a diminished correlation of sibling IQ (p. 57). They explain in detail the necessity of evaluating non-additive (i.e., dominance) effects. They believe that only the additive genetic portion of the heritability has any predictive value, and that non-additive effects make it far more difficult to predict the outcome of a given mating based on knowledge of the parents' phenotypes (pp. 52-53). For example, while the predicted IQ of a child of parents with IQs of 100 and 120 would be 110 under additivity, the child's expected IQ might be far higher or lower than either parent's if there were substantial interactions between genes.

The expected correlations, which are determined by the degree of genetic relatedness, between children and midparent and among siblings (reared apart or together) and among dizygotic twins (opposite or same sex) are all 0.50. The observed correlation is 0.50 between children and midparent and 0.60 between DZ twins. And the observed correlations between siblings reared together and apart are 0.46 and 0.24. Given their non-additive model, it is expected that shared maternal effect will be substantial.

In their analysis, assortative mating was modeled (i.e., adjusted for). They compare models (e.g., III & IV) which allowed early common environment to be higher for twins than for siblings with models which constrain maternal effects to zero (e.g., I & II). Models I & III, unlike models II & IV, assume c² to be equal across the twin, sibling and parent-child correlations. The maternal (shared) effect turned out to be essential to achieving the best fit (according to the Bayes factor). In the best-fitting model (III), the total (broad) genetic effect was 0.48, with additive and non-additive components of 0.34 and 0.15 respectively. The maternal environment effects for twins and siblings were 0.20 and 0.05, figures that illustrate the extent of this effect. The shared environment (c²) estimate was 0.17. Having reported these results, they claimed to have resolved a puzzle that Plomin & Loehlin (1989) were never able to resolve and that, as far as I know, is still unresolved today: why direct methods of estimating heritability consistently lead to higher estimates than indirect methods. They argue that accounting for maternal effects and non-random mating explains this curious pattern (p. 58).
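As a rough sanity check on these figures, the fitted variance components can be plugged into the textbook expected-correlation formulas for first-degree relatives (a coefficient of 0.5 on additive variance and 0.25 on dominance variance for DZ twins and full siblings). This is my own simplified sketch, not the authors' actual model: it ignores their adjustment for assortative mating.

```python
# Fitted variance components reported for Daniels et al.'s best model (III):
# additive A2 = 0.34, non-additive (dominance) D2 = 0.15, shared environment
# C2 = 0.17, maternal effect 0.20 for twins and 0.05 for siblings.
A2, D2, C2 = 0.34, 0.15, 0.17
M_TWIN, M_SIB = 0.20, 0.05

def expected_r(shared_env: float, maternal: float) -> float:
    """Expected IQ correlation for DZ twins or full siblings.

    First-degree pairs share half the additive variance and a quarter of
    the dominance variance, plus whatever environment they have in common.
    """
    return 0.5 * A2 + 0.25 * D2 + shared_env + maternal

# Siblings reared apart share no family environment (C2 = 0) but, on the
# prenatal-maternal-effect hypothesis, still share the womb effect.
print(f"DZ twins:          {expected_r(C2, M_TWIN):.3f} (observed ~0.60)")
print(f"Sibs together:     {expected_r(C2, M_SIB):.3f} (observed ~0.46)")
print(f"Sibs reared apart: {expected_r(0.0, M_SIB):.3f} (observed ~0.24)")
```

Even this crude version lands close to the observed correlations reported above (0.60, 0.46, 0.24), which shows why allowing a larger shared effect for twins than for siblings improves the fit.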

One problem with these results is that heritability increases with age, and they did not restrict the analysis to adults. They also do not correct for measurement error, as Lynn (1999) cogently noted. Furthermore, heritability (h²) is an r-squared measure and so is not an effect size on the correlation scale; its square root should have been used. The square roots of these figures are SQRT(0.20)=0.45 and SQRT(0.05)=0.22. More problematic is that Bishop et al. (2003, Table 3) did not succeed in replicating their analysis. The likely reason is that Daniels and Devlin did not consider age effects, at least for the maternal environment. Bishop discovered that the DZ correlation was indeed higher than the non-adoptive sibling correlation at ages 2-4, but lower than it at ages 7-10. These numbers indicate a diminishing of the special twin shared-environment effect over time. In any case, Daniels and Devlin's indirect evidence for the impact of the so-called “prenatal” maternal effect must be supported by direct evidence, i.e., interventions. Finally, the last blow was administered by Segal & Johnson (2009, p. 89). The relevant passage reads:

A common assumption is that sharing a womb enhances twins’ phenotypic similarity because the fetuses are equally affected by the mother’s diet, health, medications, and other factors. However, the unique effects of the prenatal environment tend to make twins less alike, not more alike, especially in the case of MZ twins. Furthermore, twins’ prenatal situation cannot be considered vis-à-vis measured traits without reference to twin type (MZ or DZ) and the presence in MZ twins of separate or shared placentas and fetal membranes. Devlin, Daniels, and Roeder (1997) overlooked these distinctions, incorrectly concluding that twins’ shared prenatal environments contribute to their IQ similarity. (It was found that 20% of the covariance between twins and 3% of the covariance between siblings was explained by shared prenatal factors.) Thus, this analysis produced lower estimates of genetic effects than most other studies.

Having reported their analysis, they provide some comments on Herrnstein & Murray's predictions of cognitive castes and IQ dysgenics, among other subjects. Concerning castes, they argue there is no proof of ever-increasing assortative mating; Murray's (2012) latest book, Coming Apart, suggests his prediction was correct, for the United States at least. Concerning IQ dysgenics, they think the Flynn effect contradicts the idea of a dysgenic effect. But there is no evidence that the Flynn effect is a real intelligence gain. Measurement invariance does not hold, which means that the IQ gains were not unidimensional. And except for Beaujean & Osterlind (2008), no one has been able to decompose the IQ gains into real and contaminated (real + artifactual) gains. Even the techniques used to decompose IQ gains, i.e., IRT, have their own problems, such as ipsativity (Clauser & Mazor, 1998, pp. 286, 292; Nandakumar, 1994, p. 17; Richwine, 2009, p. 54; Penfield & Camilli, 2007, pp. 161-162). In the end, it is premature to say anything conclusive about the Flynn effect. Still, they argue (p. 62) that the Flynn effect cannot refute the idea of IQ dysgenics; in that case, I do not understand why they resorted to this argument. With regard to race differences, they noted that “It is not clear to us why IQ would be positively selected in Caucasians but not in Africans.” (p. 62). Lynn (2006) proposed an evolutionary theory to explain how these differences could emerge. Daniels et al. also cite the Scarr et al. (1977) study failing to confirm the genetic hypothesis, but Jensen (1998, pp. 479-481) remarked that there are problems with that study. See also Chuck's post. Finally, Daniels et al. commented on Herrnstein & Murray's view that environmental homogeneity increases heritability. They deem it false because the realized heritability is determined by the complex interplay of genes and environments, so that heritability can be zero, one, or in between when environments are homogeneous (p. 64).

In the epilogue, their comment that “the subtle interplay of environment and genes rarely comes across in these writings, either because the authors judge the subject too complex for their readership or because they don’t grasp it themselves” suggests they failed to notice footnote 32 on page 107 of The Bell Curve, where the authors mention assortative mating, genetic dominance and epistasis as additional sources of genetic variation. So Herrnstein & Murray do seem aware of this subtlety; they simply didn't go into depth on the topic. After all, this was not the subject of the book.

Chapter 4: The Malleability of Intelligence Is Not Constrained by Heritability (Douglas Wahlsten)

Wahlsten (1997) attacks Herrnstein & Murray's ideas on heritability. He begins (p. 73) by saying that they are wrong in affirming that IQ malleability is limited by heritability estimates. The problem here is that Wahlsten is not a careful reader. The authors did say that “the heritability of a trait may change when the conditions producing variation change” (Herrnstein & Murray, 1994, p. 106). They also understand that, “as environments become more uniform, heritability rises” (Herrnstein & Murray, 1994, p. 106). So Wahlsten's quotation of Herrnstein & Murray must be placed in the right context, and he failed to do that. Furthermore, Sesardic (2005, pp. 154-156) has a good treatment of this subject. One element that has confused many people is that the emergence of an effect (genetically caused) is not the same as the persistence of that effect (environmentally caused). Thus, the above claim by Herrnstein & Murray does not imply that a phenotypic characteristic cannot be changed by environmental manipulation even when the emergence of that characteristic is entirely genetic. We can conclude that what is genetic (the emergence of an effect) is not readily modifiable, and what is readily modifiable (the persistence of an effect) is ipso facto not genetic.

Wahlsten then uses the example of PKU (phenylketonuria) to illustrate the fact that heritability does not constrain modifiability (p. 74). PKU is caused by a deficiency of the enzyme phenylalanine hydroxylase (PAH). Persons who lack an active form of this enzyme suffer from abnormally high levels of phenylalanine in the blood and severe brain damage, eventually leading to mental retardation, because they are unable to metabolize phenylalanine (an amino acid that is a necessary part of a normal human diet). But it was later found that the effects of PKU can be averted thanks to a special diet low in phenylalanine. The story is nicely told, the argument well made. But once again, Wahlsten missed the target, owing to a misreading of The Bell Curve (1994, p. 106), and again failed to distinguish between the onset of an effect and its continuous presence. It should also be noted that environmental does not automatically mean modifiable, especially if we don't know how to detect and manipulate the environmental causes of a characteristic.

Wahlsten (p. 78) continues by arguing that the Flynn effect counters The Bell Curve's main idea, and even seems surprised that the authors dismiss it. This is another misreading, because Herrnstein & Murray (1994, p. 308) tell us that some researchers found that the Flynn effect could be due partly, if not entirely, to narrow skills rather than general intelligence per se. They had every reason to remain skeptical, as researchers did not at the time (and do not even today) understand the nature of the Flynn effect.

Wahlsten (p. 79) then resorts to a ridiculous argument: “The December 29, 1915 issue of the Chicago Herald trumpeted to its public: “Hear how Binet-Simon method classed mayor and other officials as morons” (reprinted in Ref. 16, p. 241)”. If you haven't guessed what he means, he is simply saying that IQ has poor predictivity. But the correlation of IQ with occupation and education is one of the strongest and most robust findings in the field of psychometrics (Schmidt & Hunter, 2004; Strenze, 2007). Wahlsten proves to be very dishonest in citing such an old article. At that time, IQ tests certainly had many more imperfections than they do today.

The author (p. 81) cites several studies showing IQ gains for groups of children having one more year of school than a comparison group. The meta-analytic effect size for these four studies amounts to 4 IQ points (grade 1 versus kindergarten). Among these studies is Morrison et al. (1995), which has been commented on by Rowe (1997, pp. 142-143). Rowe noticed that the two groups differed in reading achievement by 0.90 SD before schooling, by 2.63 SD for young grade-1 children versus old kindergarteners, and by 0.36 SD for grade 1 versus grade 2. The sample sizes were small (N=10 per group). If generalizable, however, this finding suggests simply that the group with a one-year delay in schooling will soon catch up with the other group despite this deficit. In any case, this pattern is expected given what usually happens in schooling programs: a large and rapid cognitive advantage for the experimental group over the control group, but progressive fade-out over the years. In other words, schooling can boost intellectual growth but may not affect the level finally attained. This is why follow-up studies are so important.

Wahlsten (pp. 82-83) cites the Abecedarian, IHDP and MITP studies to illustrate how IQ can be boosted dramatically. I have covered this topic in my earlier blog post. But I note that Wahlsten ignores follow-up studies and offers no caution about the need for them. And even this fantastic IQ gain can be artifactual. Jensen (1969, p. 100) provides a magnificent illustration:

In addition to these factors, something else operates to boost scores five to ten points from first to second test, provided the first test is really the first. When I worked in a psychological clinic, I had to give individual intelligence tests to a variety of children, a good many of whom came from an impoverished background. Usually I felt these children were really brighter than their IQ would indicate. They often appeared inhibited in their responsiveness in the testing situation on their first visit to my office, and when this was the case I usually had them come in on two to four different days for half-hour sessions with me in a “play therapy” room, in which we did nothing more than get better acquainted by playing ball, using finger paints, drawing on the blackboard, making things out of clay, and so forth. As soon as the child seemed to be completely at home in this setting, I would retest him on a parallel form of the Stanford-Binet. A boost in IQ of 8 to 10 points or so was the rule; it rarely failed, but neither was the gain very often much above this. So I am inclined to doubt that IQ gains up to this amount in young disadvantaged children have much of anything to do with changes in ability. They are largely a result simply of getting a more accurate IQ by testing under more optimal conditions. Part of creating more optimal conditions in the case of disadvantaged children consists of giving at least two tests, the first only for practice and for letting the child get to know the examiner.

Like I said. Absolutely magnificent. This detail is truly important, but I have never seen anyone else make this point.

Singer & Ryff (1997) devote a great deal of space to anecdotal stories of black people in South Africa who suffer stress and humiliation (pp. 106-111). I don’t see the need to add such an emotional touch. In any case, they use these stories to illustrate their focus on psychological resilience and, eventually, the possible effects of such stress on health and, by the same token, IQ. Specifically, there are the black mothers without their husbands who feel anxiety and insecurity about not having enough basic resources, and there are the black men who feel their work and life in the mines is degrading and humiliating, which pushes them toward antisocial activity and violence (p. 99), and which inevitably depresses IQ. They explain that the kind of disabilities associated with age-related cognitive impairment differs between whites and blacks (p. 92), and that overall whites have better health than blacks (p. 93). They say explicitly that “racial discrimination is a central social structural feature of the processes involved in the transmission of tuberculosis” because “It is the convergence of high crowding, dilapidated housing, airborne particulates, poor nutrition, and compromised immunity that are the requisite conditions for the spread of this disease” (p. 94). They focus mainly on tuberculosis (pp. 99-100) and tell us that 70% of cases in the US occur among minorities, and that blacks live in environments conducive not only to the transmission of tuberculosis (p. 100) but also to hypertension (p. 113), blacks being the only racial group shown to experience high levels of isolation. And in South Africa, a society stratified by racial differences, the incidence of all forms of tuberculosis in 1979, per 100,000 people, was 18 among whites, 58 among Asians, 215 among coloureds, and 1,465 among blacks (p. 97).

With regard to hypertension, they admit its familial nature (p. 117), since the correlation between adult siblings usually varies between 0.2 and 0.3 for both systolic and diastolic blood pressures, and is similar or somewhat lower for the parent-offspring relationship. The twin correlations for systolic and diastolic blood pressure are, respectively, 0.55 and 0.58 for MZ twins, versus 0.25 and 0.27 for DZ twins. But they argue that the additive genetic effects are probably lower because of plausible GxE interaction effects.
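As a quick back-of-the-envelope check, the classical Falconer formula, h² ≈ 2(r_MZ − r_DZ), can be applied to the twin correlations quoted above. This is my own illustrative calculation, not one the authors perform, and it assumes pure additivity, which is exactly what their GxE caveat disputes:

```python
# Falconer's approximation applied to the quoted blood-pressure twin
# correlations (my illustration, not the authors' analysis).
r_mz = {"systolic": 0.55, "diastolic": 0.58}
r_dz = {"systolic": 0.25, "diastolic": 0.27}

# h^2 ~ 2 * (r_MZ - r_DZ); assumes additive effects and no GxE.
h2 = {trait: round(2 * (r_mz[trait] - r_dz[trait]), 2) for trait in r_mz}

# Shared environment under the same assumptions: c^2 ~ 2*r_DZ - r_MZ.
c2 = {trait: round(2 * r_dz[trait] - r_mz[trait], 2) for trait in r_mz}
```

Under these (strong) assumptions, heritability comes out around 0.60-0.62, which is precisely why the authors must invoke GxE interaction to argue that the additive genetic effect is lower.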

The main problem with the proposed argument is that these authors see a low IQ score as a product of poor environments, without considering the reverse causal path. The argument also does not account for the fact that the racial IQ gap increases with SES level. Another curiosity is the finding by Yeargin-Allsopp et al. (1995, Table 4) that the odds ratio (blacks over whites) of mild mental retardation among children increases with SES. That is, the higher their SES, the further blacks fall behind whites. Overall, blacks were almost twice (OR=1.7) as likely as whites to be mentally retarded after adjusting for SES and birthweight. One must also read Currie’s (2005) paper, which shows mathematically how lead exposure, ADHD and poverty explain almost nothing of the black-white gap in school readiness. For a more general picture, see this article and this other one.

And, of course, their argument (p. 115) needs to assume that the black-white IQ gap increases with age. Jensen’s (1974, pp. 998, 1000) review of longitudinal IQ studies shows no evidence for this. Farkas & Baron (2004) concluded the same with regard to the PPVT vocabulary test. Yet one limitation of these studies could be the (very probable) absence of correction for measurement error.

Curiously enough, the authors end the chapter with a focus on the term “race”, which in their opinion is an arbitrary social construct (p. 117). I don’t see the point of mentioning this, and it is not relevant to the debate on race differences in IQ. Anyway, since they bring it up, my answer is that interested readers should read John Fuerst’s essay on The Nature of Race.

Chapter 6
Theoretical and Technical Issues in Identifying a Factor of General Intelligence
John B. Carroll

Carroll (1997) attempts to show by means of CFA whether there is a g factor or not. But he first narrates (p. 129) the early days of psychometrics. Spearman and colleagues believed that a single factor can account for the correlations among mental tests, but they later had to acknowledge the existence of other factors as well. Holzinger developed the bifactor model, in which the group factors and the general factor are independent. The bifactor model assumes that test scores can be modeled as linear combinations of scores on a general factor and one or more group factors. But the bifactor model was never widely accepted. Later, Thurstone advanced the idea that intelligence is composed of multiple factors, and he developed a method of factoring a correlation matrix and a method of rotating the axes of factor matrices to “simple structure” in order to facilitate the interpretation of factors. And the g factor vanished. But Spearman and Eysenck disputed this result and found a g factor, along with several group factors. In later publications, Thurstone admitted the possible existence of a general factor. Another advocate of multiple intelligence theories was Guilford, with his Structure-of-Intellect (SoI) model. Carroll reanalyzed many of Guilford’s datasets and found numerous higher-order factors, including factors that could be regarded as similar to Spearman’s g. Jensen (1998, pp. 115-117) discusses Guilford in more detail.

Carroll says that these earlier factorial methods were what has come to be known as exploratory factor analysis (EFA). Today, he continues, we have powerful techniques known as confirmatory (or structural) factor analysis. And he correctly makes the point (p. 131) that exploratory (e.g., EFA) and confirmatory (e.g., CFA) analyses are both needed. The first, exploratory step serves to identify possible and coherent models and theories, while the second, confirmatory step serves to compare the models by fitting them to the observed data. But he also mentions something even more important: exploratory analysis (e.g., factor analysis) cannot prove the existence of a g factor. Just because PC1 or PAF1 has a higher eigenvalue does not at all demonstrate the existence of a g factor. The mere fact that cognitive variables are positively correlated does not validate the presence of a single general factor, as it might instead indicate the presence of multiple general factors (p. 143). The first principal component derived from a matrix of randomly generated correlations is necessarily larger than the remaining components. He says that “factor analysis seeks to determine a factor matrix with the least number of factors, m, that will satisfactorily reproduce the given R” (p. 132), where R stands for the observed correlation matrix. Satisfactory reproduction of R can be defined in several ways, e.g., “the extent to which m factors appear to account for the common factor variance (the communalities of the variables), or the extent to which the residuals (the differences between the observed correlations and the reproduced correlations) are close to zero” (p. 132) through CFA modeling.
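Carroll’s point about a dominant first component can be illustrated with a small simulation (my own sketch, not Carroll’s): data generated from two correlated group factors, with no general factor in the model, still yield a first eigenvalue that towers over the rest.

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulate 8 tests from TWO correlated group factors -- no g in the model.
n = 20000
factors = rng.multivariate_normal([0, 0], [[1.0, 0.5], [0.5, 1.0]], n)
tests = np.column_stack(
    [0.7 * factors[:, i // 4] + rng.normal(0, 1, n) for i in range(8)]
)

# Eigenvalues of the correlation matrix, largest first.
eigvals = np.linalg.eigvalsh(np.corrcoef(tests, rowvar=False))[::-1]
# PC1 dominates (roughly 2.6 vs 1.3 for PC2) even though no single
# general factor generated the data.
```

So a large first eigenvalue is consistent with a g model, but it cannot by itself discriminate between a single general factor and several correlated group factors, which is exactly why Carroll insists on confirmatory modeling.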

There are fundamental differences between exploratory and confirmatory factor analysis; the methods are actually complementary. The former is concerned with analyzing correlational data to suggest a satisfactory model for describing those data. The latter is concerned with appraising the probability that any given proposed model, even one that might seem quite unlikely, could generate the observed data. Exploratory factor analysis is essentially descriptive, while confirmatory factor analysis appeals to statistical significance testing. Confirmatory factor analysis cannot proceed until one proposes a model that can be tested. One source of such a model is a model produced by exploratory analysis, but this is not the only source; a model could be suggested by a psychological theory.

One important detail here is that if a data set is found to measure a single factor, that factor is a general factor only if the variables are all drawn from different parts of the cognitive domain (p. 144). This assumption would be violated if the variables were drawn from a limited portion of the cognitive domain, because they might then serve to define only a single first-stratum or second-stratum factor. A large and representative battery of tests is thus needed. Having said that, Carroll (1997) uses three old data sets on large cognitive test batteries. The variance-covariance matrices of these (sub)tests were submitted to LISREL CFA modeling, and he attempted to attain a satisfactory model fit. He succeeded in every case, and in each of these analyses the preferred model included a third-order g factor. Carroll (pp. 143-145, 151) warns that CFA does not prove that g is a “true ability” independent of the more specific cognitive abilities defined by various types of psychological tests and observations. Carroll concludes the chapter by saying that Herrnstein & Murray’s view on the general factor of intelligence is accurate. Readers interested in digging into this topic may want to read this essay.

Chapter 7
The Concept and Utility of Intelligence
Earl Hunt

Hunt (1997) tries to downgrade the meaningfulness of IQ tests. He does this in a very clumsy way. He says (p. 162) that IQ is not the only important predictor of job performance; personality matters as well. Good to know, but there is nothing original in what he says. I can also say that Gottfredson (1997) argued that personality is important only in a restricted range of jobs. Anyway, Hunt continues and says (p. 163) that psychometricians disagree on whether there is a general factor of intelligence (Spearman and Jensen versus Thurstone, Cattell, Horn), but curiously enough he doesn’t mention that the theorists who favored multiple intelligence theories were wrong.

Hunt (p. 164) discusses the Gf-Gc theory. To sum up, it’s an interactionist model: the greater the Gf (fluid), the greater the Gc (crystallized). Gf reflects the capacity to solve problems for which prior experience and learned knowledge are of little use, while Gc reflects consolidated knowledge gained through education, cultural information, and experience. The causality runs from Gf to Gc. They are correlated because one’s Gc at time t will be an increasing function of Gf at time t-1. And he then reveals something more significant: “the fact that there is a correlation between tests intended to draw out pure reasoning ability and tests intended to evaluate cultural knowledge does not distinguish between g and Gc-Gf theories” (p. 165). Equally astonishing is the claim that the fact that Gf and Gc measures respond differently to outside influences (Gf generally declines from early adulthood whereas Gc measures increase throughout most of the working years) is enough to disprove general intelligence theories (p. 165).

But I wonder in what way. When Hunt says that Gf and Gc have different age trajectories, he should be aware that Gc can differ between people of similar Gf purely because of cultural differences (Jensen, 1980, p. 235; Jensen, 1998, p. 123). So, while the rising trend in Gc at middle age may be a cultural effect (consolidated knowledge gain), the earlier decline in Gf may not be cultural at all. There is therefore no point in making this comparison.

That Gf and Gc have different properties does not invalidate g. I can provide another illustration. Braden (1994) explains that deaf people have a deficit of 1 SD on verbal IQ tests, compared to the general population, but (virtually) no deficit at all on nonverbal IQ tests. In Braden’s book (1994, p. 207), Jensen commented on Braden’s work and explained the compatibility of g with modularity in light of Braden’s findings. The relevant passage reads:

A simple analogy might help to explain the theoretical compatibility between the positive correlations among all mental abilities (hence the existence of g) and the existence of modularity in mental abilities. Imagine a dozen factories (persons), each of which manufactures the same five different gadgets (modular abilities). Each gadget is produced by a different machine (module). The five machines are all connected to each other by a common gear chain which is powered by one motor. But each of the factories uses a different motor to drive the gear chain, and each factory’s motor runs at a different constant speed than the motors of every other factory. This will cause the factories to differ in their rates of output of the five gadgets (scores on five different tests). The factories will be said to differ in overall efficiency or capacity (g), because the rates of output of the five gadgets are positively correlated. If the correlations between output rates of the gadgets produced by all of the factories were factor analyzed, they would yield a large general factor (g). The output rates of gadgets would be positively correlated, but not perfectly correlated, because the sales demand for each gadget differs for each factory, and the machines that produce the gadgets with the larger sales are better serviced, better oiled, and kept in consistently better operating condition than the machines that make low-demand gadgets. Therefore, even though the five machines are all driven by the same motor, they differ in their efficiency and consistency of operation, making for less than a perfect correlation between their rates of output. Then imagine that in one factory the main drive-shaft of one of the machines breaks, so it cannot produce its gadgets (e.g., localized brain damage affecting a single module, but not of g). 
Or imagine a factory where there is a delay in the input of the raw materials from which one of the machines produces gadgets (analogous to a deaf child not receiving auditory verbal input). In still another factory, the gear chain to all but one of the machines breaks and they therefore fail to produce gadgets. But the one machine that remains powered by the motor receives its undivided energy and produces gadgets faster than if the motor had to run all the other machines as well (e.g., an idiot savant).

And finally, Hunt’s argument relies strongly on the Gf-Gc model of Cattell and Horn, but we know today that this model is not the best approximation of the structure of human intelligence; the VPR model is (Johnson & Bouchard, 2005). These authors even argued that their findings “call into question the appropriateness of the pervasive distinction between fluid and crystallized intelligence in psychological thinking about the nature of the structure of intellect”. In any case, it is odd to claim that “There is really no way that one can distinguish between g and correlated Gf-Gc factors on the basis of psychometric evidence alone.” (p. 165). Carroll (2003) clearly disproved this idea: he confirmed the existence of a third-order g factor on top of several second-order factors, including Gf and Gc.

Hunt (p. 167) affirms that the correlation between IQ and job performance decreases as experience accumulates over time. He cites two books, to which, of course, I do not have access. But he also cited Ackerman (1987). However, Schmidt & Hunter (2004) have reviewed these studies, and they conclude that the predictive validity of IQ does not decrease over time and, if anything, increases with worker experience. It is interesting to note that all the studies cited by Schmidt & Hunter were published many years before the Devlin et al. (1997) book. Hunt either missed them or ignored them.

Hunt says (pp. 169-171) that the correlations between IQ subtests are higher in the bottom half than in the top half of the IQ distribution. In other words, Hunt discovered the so-called Spearman’s Law of Diminishing Returns (SLODR). And he then concludes that “As predicted by cognitive theory, but virtually ignored by psychometricians, the data from conventional intelligence tests indicate that lack of “general intelligence” is pervasive, but that having high competence in one field is far from a guarantee of high competence in another.” (p. 170). But a better interpretation is that low-IQ people rely more on g than high-IQ people, because high-IQ people have more room for specialization, being relieved from the stress associated with having a low IQ (Woodley, 2011, pp. 234-236). Indeed, low-IQ people presumably face some barriers in the labour market; basic needs depend more on general cognitive ability, while secondary needs are more related to narrow cognitive abilities (situational competence).
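The SLODR pattern itself is easy to mimic in a toy simulation (again my own sketch, with invented loadings and variances): if the variance of g is larger in the low-ability stratum than in the high-ability stratum, the average subtest intercorrelation comes out higher in the low stratum.

```python
import numpy as np

rng = np.random.default_rng(7)

def mean_intercorrelation(g_sd, n=20000, loadings=(0.7, 0.6, 0.5, 0.4)):
    """Average pairwise correlation among 4 simulated subtests."""
    g = rng.normal(0.0, g_sd, n)
    tests = np.column_stack([lo * g + rng.normal(0, 1, n) for lo in loadings])
    r = np.corrcoef(tests, rowvar=False)
    return r[np.triu_indices(4, k=1)].mean()

# Hypothetical difference in g variance between ability strata:
r_low = mean_intercorrelation(g_sd=1.2)   # low-ability stratum
r_high = mean_intercorrelation(g_sd=0.8)  # high-ability stratum
# r_low > r_high: subtests intercorrelate more where g varies more.
```

This is only one of several mechanisms that can produce SLODR, but it shows how the differentiation of abilities at the top of the distribution can arise without any change in the tests themselves.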

I am very disappointed in this article, but I appreciate the fact that Hunt (p. 161) refuses to see R² as a measure of effect size and prefers the correlation coefficient. He is perfectly right.

Cawley et al. (1997) use regression to predict wages from the principal components of the ASVAB subtests alone, and then along with variables of SES and/or human capital. They begin by saying (p. 180) that g is “an artifact of linear correlation analysis, not intelligence”. This passage is just hopeless. Did they read Carroll’s chapter? They even go on (p. 180) to say that Herrnstein & Murray claimed there is only one significant intelligence factor, called g, and that they fail to mention that other factors of intelligence exist. Are these guys serious? Herrnstein & Murray (1994, pp. 14-15) already acknowledge the existence of these factors. And they continue: “They raise the immutability of cognitive ability when arguing against the effectiveness of social interventions.” (p. 181). Once again, where did Herrnstein & Murray (1994, p. 106) say that IQ is immutable? Nowhere. Cawley et al. (1997) then say that IQ is not immutable because IQ rises with schooling. They do not answer the question of causality.

Concerning their analysis, they note that in the background model (local and national unemployment rates and a linear time variable), ability (either AFQT or g) contributes between 0.118 and 0.174 to the R² change. In the human capital model (grade completed, potential experience with its quadratic term, job tenure with its quadratic term), the marginal increase in R² due to ability (either AFQT or g) falls to between 0.011 and 0.034.

Having reported these numbers, the authors conclude that “payment is not made for “ability” alone, which violates the definition of meritocracy advanced by H&M” (p. 191). Are these guys doing it on purpose? Throughout the book, Herrnstein and Murray repeatedly say that IQ is not the only predictor of social outcomes. And this has nothing to do with H&M’s idea of meritocracy, which can be easily understood by reading pages 510-512 and 541-546 of The Bell Curve.

One obvious problem with the analysis of Cawley et al. (1997) is that it is not easily interpretable for non-initiated readers. They do not explain the meaning of the regression coefficients, e.g., what is the percentage change in wage when PC1 increases by 1 SD? Fortunately, this can be computed. When the dependent variable is log-transformed, we simply exponentiate the coefficient of the independent variable(s). For example, the unstandardized coefficients for PC1 in the IQ-only model are 0.1952, 0.1647, 0.1823, 0.1531, 0.1965, 0.1535, respectively, for black females, black males, hispanic females, hispanic males, white females, white males. The exponentiated coefficients (multiplicative effects on wage, not odds ratios) are 1.22, 1.18, 1.20, 1.17, 1.22, 1.17. A coefficient of 1.17-1.22 means that for each SD gain in PC1, wage increases by 17-22%. This is a modest effect in my opinion. Now look at the corresponding coefficients for the model including all the covariates mentioned above: 0.1235, 0.1045, 0.0904, 0.1084, 0.0903, 0.0828. Their exponentiated values are, respectively, 1.13, 1.11, 1.09, 1.11, 1.09, 1.09, i.e., an expected wage gain of around 10% per SD gain in PC1. Agreed, the effect of PC1 is smaller. But as everyone should know, when we adjust for SES variables, we may remove some of the effect due to IQ, if IQ exerts an indirect effect on wages through these SES variables. Finally, even if measurement error attenuates the coefficients, the exponentiated values won’t change that much.
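The arithmetic is trivial to reproduce. Here is the conversion for the IQ-only coefficients, expressed directly as percent change in wage per 1 SD of PC1:

```python
import math

# PC1 coefficients from the IQ-only log(wage) model, in the order:
# black females, black males, hispanic females, hispanic males,
# white females, white males.
coefs = [0.1952, 0.1647, 0.1823, 0.1531, 0.1965, 0.1535]

# For a log-linear model, exp(b) is the multiplicative change in wage
# per 1 SD of PC1, and 100 * (exp(b) - 1) is the percent change.
pct_gain = [round(100 * (math.exp(b) - 1), 1) for b in coefs]
# pct_gain -> [21.6, 17.9, 20.0, 16.5, 21.7, 16.6]
```

Note that the approximation “a coefficient of 0.20 means a 20% gain” only works for small coefficients; exponentiating is the exact conversion.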

Interestingly, an earlier publication by the same Cawley et al. (1996, Tables 7-8) shows that the importance of PC1 increases with occupational level. Among blue-collar workers, PC1 has weak effects on log(wage): the coefficients are, in the same order as defined above, 0.066, 0.047, 0.014, 0.090, 0.029, 0.038, corresponding to exponentiated coefficients of 1.07, 1.05, 1.01, 1.09, 1.03, 1.04. Among white-collar workers, PC1 has coefficients of 0.217, 0.195, 0.150, 0.189, 0.122, 0.119, which correspond to exponentiated coefficients of 1.24, 1.22, 1.16, 1.21, 1.13, 1.13. These effects already adjust for variables of human capital, so the effect of PC1 is underestimated. It’s curious they didn’t mention that their earlier work provides some evidence for the theory favored by Herrnstein & Murray, i.e., that IQ predictivity increases with job complexity.

Finally, a little quibble concerns the insertion of all the principal components of the ASVAB into the wage regression equation. This is a weird practice, especially since most of these components are meaningless. Even the authors acknowledge it: “The signs of the coefficients of the second through tenth principal components are irrelevant because each principal component can be reconstructed using the negative of its ASVAB weights to explain an equal amount of ASVAB variance.” (p. 186). Personally, I would have simply used PC1, without considering PC2-PC10.

Cavallo et al. (1997, Table 9.6) reanalyze Herrnstein & Murray’s (1994, p. 324) analysis and their conclusion that the variable set IQ+age is enough to erase the black-white wage gap. The authors fault Herrnstein & Murray (1994) for having ignored race*age interaction effects, i.e., race differences should not be calculated at a single age. Their Figure 9.1 shows the predicted 1989 earnings by age, with one regression line per race. The regression equation used for computing these predicted lines contains age, AFQT and parental SES as independent variables. At age 28-29, there is virtually no difference, but the black regression line is above the white regression line at younger ages (25-27), while the reverse holds at later ages (30-32). This explains Herrnstein & Murray’s (1994, p. 323, footnote 13) result, because they calculated the wage gap at the average age of the NLSY sample; according to Cavallo et al., the mean age was 28.7. At the same time, Cavallo et al. didn’t use the NLSY79 data in a longitudinal manner, i.e., with repeated measures of wage (Cole & Maxwell, 2003). So, their analysis may not be trusted either. Anyway, their analysis repeated for each separate gender group is still worth mentioning. Their Figures 9.2-9.4 show that white men gain an advantage over black men with increasing age, but that black women gain an advantage over white women with increasing age. All these subgroups have different intercepts and slopes. But most people may not have a stable job and economic situation in their late 20s. I would have restricted the sample to people aged 30+. But at that time, the data hadn’t been collected for this age category.

Cavallo et al. subsequently repeat the regressions, using four separate regressions: black males, white males, black females, white females. Economists know full well that wages differ hugely by gender (due in fact to gender roles), so the standard practice is to compute separate regressions (pp. 207-208). This time, they include as independent variables age, age^2, AFQT, SES, education, full-time experience, full-time experience^2, part-time experience, part-time experience^2. The estimated coefficients are used to decompose the black-white wage gap into the independent effect of each variable, following the Oaxaca-Blinder wage decomposition (pp. 210, 213). The analysis shows (p. 211) that 92% of the racial wage gap for men is attributable to premarket factors (of which 38% is due to AFQT, 18% to SES, 17% to education, and 19% to experience), while the remaining 8% of the wage gap is attributable to other factors. The authors attribute this 8% to racial wage discrimination, because their view is that earnings can be truly color-blind only if the regressions are similar across races when controlling for all the relevant background variables (pp. 204-205).
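The decomposition itself is mechanically simple. A minimal sketch on made-up data (the groups, means and coefficients are all invented for illustration): fit separate OLS regressions per group, then split the mean gap into an “endowments” part (differences in predictor means, the premarket factors) and a “coefficients” part (differences in returns, the residual gap).

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate(n, means, beta):
    """Simulated log wages: intercept plus two predictors."""
    X = np.column_stack([np.ones(n)] + [rng.normal(m, 1, n) for m in means])
    return X, X @ beta + rng.normal(0, 0.3, n)

# Hypothetical groups A and B: A has higher predictor means and intercept.
Xa, ya = simulate(5000, [0.5, 0.3], np.array([1.0, 0.20, 0.10]))
Xb, yb = simulate(5000, [0.0, 0.0], np.array([0.9, 0.20, 0.10]))

ba = np.linalg.lstsq(Xa, ya, rcond=None)[0]  # group-specific OLS fits
bb = np.linalg.lstsq(Xb, yb, rcond=None)[0]

gap = ya.mean() - yb.mean()
explained = (Xa.mean(axis=0) - Xb.mean(axis=0)) @ ba  # endowments ("premarket")
unexplained = Xb.mean(axis=0) @ (ba - bb)             # coefficients (residual)
# gap == explained + unexplained exactly, because OLS residuals
# average to zero when an intercept is included.
```

The well-known caveat is that the split depends on which group’s coefficients are used as the reference, which is one reason the 92%/8% figures should not be read as precise quantities.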

I agree with Cavallo et al. (1997) that Herrnstein & Murray’s analysis on this point was clumsy, and I would never have recommended anyone to do it this way. On the other hand, Cavallo et al. used linear regression with log(wage). Even if this is what most researchers in social science would have done, the correct procedure would be Poisson regression (probably with robust standard errors). They also did not analyze the age effect longitudinally, nor did they attempt to separate cohort and age effects by using the multilevel regression method suggested by Miyazaki & Raudenbush (2000) for longitudinal survey data.

Chapter 10
Does Staying in School Make You Smarter? The Effect of Education on IQ in The Bell Curve
Christopher Winship and Sanders Korenman

Winship & Korenman (1997) begin by summarizing the earlier studies showing that education can improve IQ (pp. 220-224). Being out of school for a given period of time is associated with substantial IQ loss, and several longitudinal studies show that IQ increases by 2 to 4 points for each additional year of education. Then, they reanalyze Herrnstein & Murray’s (1994, Appendix 3) analysis of the effect of education on IQ. They attempt to predict AFQT (given in 1981) from educational attainment (measured in 1980), an age variable, and earlier IQ tests given in 1979 (e.g., Stanford-Binet and/or WISC). They applied several corrections to Herrnstein & Murray’s analysis, e.g., proper handling of missing data, the addition of age (and eventually parental SES) as a covariate, the use of Huber’s robust standard errors (for clustered data), and the use of errors-in-variables (EIV) regression. The reason given for EIV regression is that the independent variables, education and IQ, certainly have a reliability lower than 100%; this artifact attenuates the estimated effect size. EIV regression allows the specification of “assumed” reliability estimates (usually specified on the basis of past research). Their preferred model (which also includes parental SES as a covariate) is one assuming that both early IQ and education have a reliability of 0.90. In this model (the 10th), a year of education adds 2.7 IQ points. The analysis is well done but, unfortunately, the data have no repeated measures of the relevant variables. In this situation, any conclusion can only be suggestive.
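The logic of the EIV correction can be sketched in a few lines. This is a one-predictor simulation under assumed values (not Winship & Korenman’s actual multi-predictor procedure): with a predictor reliability of 0.90, the naive OLS slope is attenuated by a factor of about 0.90, and dividing the attenuation back out recovers the true coefficient.

```python
import numpy as np

rng = np.random.default_rng(2)

n = 100_000
rel = 0.90        # assumed reliability of the predictor
true_beta = 2.7   # hypothetical true effect

x_true = rng.normal(0, 1, n)
# Observed score = true score + error; error variance chosen so that
# var(true) / var(observed) = rel.
x_obs = x_true + rng.normal(0, np.sqrt((1 - rel) / rel), n)
y = true_beta * x_true + rng.normal(0, 5, n)

b_naive = np.cov(x_obs, y)[0, 1] / np.var(x_obs, ddof=1)  # attenuated, ~2.43
b_eiv = b_naive / rel                                     # corrected, ~2.7
```

With several fallible predictors the correction works on the whole covariance matrix rather than a single slope, but the principle is the same, and it shows why the result hinges entirely on the assumed reliabilities.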

The authors do not distinguish between IQ gains due to real gains in intelligence and IQ gains due to knowledge gains. It is hard (if not impossible), in my opinion, to conceive of an educational gain that does not incorporate a knowledge gain. The consequence is that educational gain causes measurement bias. If school-related knowledge is elicited (at least in part) by the IQ test, a disadvantaged group who lack the knowledge required to answer the items correctly will fail those items even when their latent ability equals that of the advantaged group. This inevitably makes the meaning of group differences ambiguous (Lubke et al. 2003, pp. 552-553). In situations of unequal exposure, the test may be a measure of learning ability (intelligence) for some and of opportunity to learn for others (Shepard, 1987, p. 213). This is the reason why test-retest effects are not measurement invariant. The between-group difference is not entirely unidimensional, because the score reflects a component (e.g., knowledge) that is present in one group but not in the other. That is the very essence of measurement bias, and it is an element that most (if not all) researchers whose focus is on education-induced IQ gains fail to grasp. The question of how much intelligence (not IQ) can be improved with schooling has never been answered, not even today. Because of this, all research on educational gains to date has been hopelessly worthless.

Winship & Korenman’s belief that intelligence is malleable through cultural improvement (e.g., schooling) is also contradicted by Braden’s (1994) finding that deaf people have a large deficit in verbal IQ and scholastic achievement compared to normal-hearing children, but virtually no impairment in nonverbal IQ. This indicates that intelligence is not affected even by a drastic cultural change, including a deficit in schooling. The fact that deaf people have a deficit on (culturally loaded) verbal tests supports the idea that education-induced IQ gain is essentially a knowledge gain, not an intelligence gain.

Even if education has an impact on IQ, the authors’ argument (1994, p. 394) is that inequality in outcomes may increase, because wider availability of resources makes high-IQ persons learn more and faster than low-IQ persons (unless resources are provided only to low-IQ persons). This is also the conclusion reached by Ceci & Papierno (2005).

Manolakes (1997) reanalyzes Herrnstein & Murray’s (1994) analysis of the relationship between IQ and crime among men. Both use logistic regression with a dichotomized variable of self-reported crime (more accurately, the sum of many delinquency variables, after appropriate recoding), coded 1 if the score is in the top decile of criminal behavior and 0 otherwise. The independent variables are the AFQT, parental education, type of residence (urban or rural, whether stayed or moved), race, IQ*parental education, IQ*race, and parental education*race. Although the author could (and probably should) have included the three-way interaction IQ*parental education*race, she says that this variable wasn’t significant. Given that the sample size is fairly large, I will not object, and it is likely that this variable would have had a small coefficient.

While Herrnstein & Murray excluded black people from their analysis because they are known to underreport the frequency with which they engage in criminal acts, Manolakes argued that they did so on the basis of the work of Hindelang et al., who emphasize that criminal self-report scales are predominantly based on less serious crimes of high frequency (the type of crimes whites are more likely to admit), whereas blacks are more likely to admit to less frequent but more serious crimes.

To build the aggregated delinquency variable, Manolakes dropped some items, e.g., running away from home, skipping school, underage drinking, and using marijuana, because they are the least serious delinquent activities and are better not counted as criminal.

Manolakes says (p. 243) that the data are inconsistent with Herrnstein & Murray’s presumption that IQ is the only predictor of crime. Of course, if you distort your opponent’s ideas, you can refute them more easily.

Manolakes also argued (p. 243) that because the NLSY variables do not include questions about white-collar crime, organized crime, corporate crime, consumer fraud, etc., the propensity for criminal behavior among high(er)-IQ people is understated. According to Manolakes, this omission may explain the relationship between lower IQ and higher crime.

The parameter estimates (coefficients) are reported in Table 11.2, but I don’t think it makes sense to report these coefficients or their exponentiated values (i.e., odds ratios), because the author has included many interactions. The effect of IQ thus cannot be understood from its own parameter alone, but only together with the other continuous variables. However, Figures 11.2 and 11.3 show the relevant predicted plots. When IQ increases, whites (blacks) are less (more) likely to commit delinquent acts. When parental SES increases, whites (blacks) are more (less) likely to commit delinquent acts. It is not clear how to explain these divergent patterns. Figure 11.4 is also very curious. It shows the probability of being in the top decile of criminal activity by parental education, for each quartile (lower, median, upper) of IQ. For the lower quartile of IQ, the likelihood increases from 12.3% to 22.1%. For the median quartile of IQ, the likelihood increases from 14.4% to 17.3%. For the upper quartile of IQ, the likelihood decreases from 16.8% to 13.4%. Figures 11.5 and 11.6 report the same plots, but separately for whites and blacks. For whites, delinquency increases with parental education only at low and median IQ. For blacks, delinquency decreases with parental education only at median and upper IQ. As we can see, the regression lines are all very different, and difficult to interpret. Manolakes does not even attempt to explain this; she leaves that hard task to the criminologists, and ends the chapter by saying that such results contradict Herrnstein & Murray’s assumption that the justice system needs to be made simpler in order to avoid violent or illegal acts due to limited intelligence. But this conclusion can be correct only if the analysis is correct. And it is not.

The likely reason why Manolakes’s results differ from Herrnstein & Murray’s is that the latter did not include blacks in the regression and did not use interaction effects either.

The big problem with both Manolakes’s and Herrnstein & Murray’s analyses is that dichotomizing a continuous variable can cause substantial loss of information and even “misclassification” (MacCallum et al., 2002). For example, people involved in delinquent activities 2 times per month can be very different from those involved 10 or 20 times per month, yet logistic regression treats them as if they were no different from each other. Given the typical distribution of crime and delinquency variables (e.g., 70% of cases with score 0, 15% with score 1, 10% with scores 2 and 3, and 5% with scores 4 and more), the distribution is highly skewed, and the most appropriate analysis is undoubtedly a Poisson regression for “rare events” variables. I am planning to correct both Manolakes and Herrnstein & Murray in the near future…
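A minimal sketch of the Poisson alternative, on simulated counts (nothing here comes from the NLSY; the sample size and coefficients are made up for the example):

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
n = 5000
iq = rng.normal(0, 1, n)                 # standardized IQ (simulated)
X = np.column_stack([np.ones(n), iq])

# simulate a skewed delinquency count: most cases at 0, a few high
lam = np.exp(-0.3 - 0.4 * iq)            # lower IQ -> higher expected count
counts = rng.poisson(lam)

def negloglik(b):
    eta = X @ b
    # Poisson negative log-likelihood, up to a constant in the data
    return np.sum(np.exp(eta) - counts * eta)

fit = minimize(negloglik, np.zeros(2), method="BFGS")
intercept, slope = fit.x                 # slope recovers roughly -0.4
```

Unlike the dichotomized logistic model, the Poisson model uses the full count distribution, so a respondent reporting 20 delinquent acts is no longer treated the same as one reporting 2.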

Glymour’s (1997) chapter is definitely obscure. I dislike the author’s style, but I think I still understand the main idea. He spends a lot of time and energy explaining that factor analysis and regression say nothing about causality, and that any result from modeling has no consequences whatsoever if one cannot define a proper theory, principle, logic, or mechanism to explain the pattern in the data.

Glymour writes “What troubles me more is that the principal methods of causal analysis used in The Bell Curve and throughout the social sciences are either provably unreliable in the circumstances in which they are commonly used or are of unknown reliability.” (p. 259). This guy is not even funny. The authors never treated multiple regression as a causal analysis, and many other practitioners understand this as well.

Glymour then writes (p. 263) what is the most important passage of the chapter:

When social scientists speak of “theory,” however, they seldom mean either common-sense constraints on hypotheses or constraints derived from laboratory sciences or from the very construction of instruments. What they do mean varies from discipline to discipline, and is often at best vaguely connected with the particular hypotheses in statistical form that are applied to data. Suffice it to say “theory” is not the sort of well-established, severely tested, repeatedly confirmed, fundamental generalizations that make up, say, the theory of evolution or the theory of relativity. That is one of the reasons for the suspicion that the uses of “theory” or its euphemism, “substantive knowledge,” are so many fingers on the balance in social chemistry, but there are several other reasons. One is the ease of finding alternative models, consistent with common-sense constraints, that fit nonexperimental data as well or better than do “theory”-based models. (I will pass on illustrations, but in many cases it’s really easy.) Another is that when one examines practice closely, “theory”-based models are quite often really dredged from the data – investigators let the data speak (perhaps in a muffled voice) and then dissemble about what they have done.

I do not disagree with him this time. My impression is that he is probably right on this one. A lot of people in social science seem to commit the kind of fallacy usually derided by Austrian economists, an idea illustrated throughout Ludwig von Mises’s Human Action (1949), e.g., “History cannot teach us any general rule, principle, or law” and “If there were no economic theory, reports concerning economic facts would be nothing more than a collection of unconnected data open to any arbitrary interpretation”. In short, what Mises (1949, pp. 41, 49-51) says is that one should not derive and infer a theory from the data, but use theories to interpret data. A theory that is data-driven is not worth calling a theory. And probably many theories advanced in the social sciences are hollow. So, testing these pseudo-theories through CFA/MGCFA models, SEM, and other regression techniques is an enterprise doomed to fail. If this is what Glymour meant, I fully agree with him.

Glymour continues, “Factor analysis and regression are strategems for letting the data say more, and for letting prior human opinion determine less” (p. 264). I have nothing to say, but this is interesting.

Glymour (pp. 265, 268) lists eight necessary assumptions of factor analysis:

(1) There are a number of unmeasured features fixed in each person but continuously variable from person to person.

(2) These features have some causal role in the production of responses to questions on psychometric tests, and the function giving the dependence of measured responses on unmeasured features is the same for all persons; this is supported by the high test-retest correlations of IQ, but that argument meets a number of contrary considerations, e.g., the dependence of scores on teachable fluency in the language in which the test is given.

(3) Variation of these features within the population causes the variation in response scores members of the population would give were the entire population tested; that the function giving the dependence of manifest responses on hidden features is the same for all persons is without any foundation – if the dependencies were actually linear, however, differing coefficients for different persons would not much change the constraints factor models impose on large-sample correlation matrices.

(4) Some of these unmeasured features cause the production of responses to more than one test item; that other features of persons influence their scores on psychometric tests is uncontroversial.

(5) The correlation among test scores that would be found were the entire population to be tested is due entirely to those unmeasured features that influence two or more measured features; that all correlations are due to unmeasured common causes is known to be false of various psychometric and sociometric instruments, in which the responses given to earlier questions influence the responses given to later questions.

(6) The measured variables must be normally distributed linear functions of their causes; normality and linearity are harder to justify, but at least indirect evidence could be obtained from the marginal distributions of the measured variables and the appearance of constraints on the correlation matrix characteristic of linear dependencies, although tests for such constraints seem rarely to be done. In any case, the other issues could be repeated for nonlinear factor analysis.

(7) Measurement of some features must not influence the measures found for other features; that is, there is no sample selection bias (the data are missing at random).

(8) Two or more latent factors must not perfectly cancel the effects of one another on measured responses.

He focuses (p. 269) on the following objection however :

There is another quite different consideration to which I give considerable weight. I have found very little speculation in the psychometric literature about the mechanisms by which unmeasured features – factors – are thought to bring about measured responses, and none that connects psychometric factors with the decomposition of abilities that cognitive neuropsychology began to reveal at about the same time psychometrics was conceived. Neither Spearman nor later psychometricians, so far as I know, thought of the factors as modular capacities, localized in specific tissues, nor did they connect them with distributed aspects of specific brain functions. (It may be that Spearman thought of his latent g more the way we think of virtues of character than the way we think of causes.) One of the early psychometricians, Godfrey Thomson, thought of the brain as a more or less homogeneous neural net, and argued that different cognitive tasks require more or less neural activity according to their difficulty. Thomson thought this picture accounted not only for the correlations of test scores but also for the “hierarchies” of correlations that were the basis of Spearman’s argument for general intelligence. The picture, as well as other considerations, led Thomson to reject all the assumptions I have listed. I think a more compelling reason to reject them is the failure of psychometrics to produce predictive (rather than post-hoc) meshes with an ever more elaborate understanding of the components of normal capacities. Psychometrics did nothing to predict the varieties of dyslexias, aphasia, agnosias, and other cognitive ills that can result from brain damage.

I can agree that the mechanisms of those factors have been poorly articulated. But Jensen (1998, p. 130) explains that some modules may be reflected in the primary factors while other modules may not show up as factors, e.g., the ability to acquire language, quick recognition memory for human faces, and three-dimensional space perception, because individual differences among normal persons are too slight for these virtually universal abilities to emerge as factors, or sources of variance.

Glymour also writes “If we adopt for the moment the first four basic psychometric assumptions, then on any of several pictures the distribution of unmeasured factors should be correlated. Suppose, for example, the factors have genetic causes that vary from person to person; there is no reason to think the genes for various factors are independently distributed. Suppose, again, that the factors are measures of the functioning or capacities of localized and physically linked modules. Then we should expect that how well one module works may depend on, and in turn influence, how well other modules linked to it work. Even so, a great number, perhaps the majority, of factor analytic studies assume the factors are uncorrelated; I cannot think of any reason for this assumption except, if wishes are sometimes reasons, the wish that it be so.” (p. 268). Indeed, why do the principal components necessarily need to be uncorrelated in an unrotated PC analysis? This is a difficult question. But, as noted above, Jensen (1998, pp. 119-121, 130-132) has an answer to this puzzle: some modules may not show up as factors. Jensen also discussed the so-called idiot savants, i.e., people who typically have a low IQ and can barely take care of themselves but can nevertheless perform incredibly well on some specialized tasks, e.g., mental calculation or playing the piano by ear, although rarely, if ever, does one find a savant with more than one of these narrow abilities.

Glymour writes “If both regressor and outcome influence sample selection, regression applied to the sample (no matter how large) will produce an (expected) nonzero value for the linear dependence, even when the regressor has no influence at all on the outcome variable.” (p. 271). There are regression models in econometrics that deal with sample selection, such as tobit and truncated regressions. Maximum likelihood and multiple imputation (if correctly used) are also possible solutions to the problem of non-random missing data. Glymour continues (pp. 271-273) by saying that omitted-variable bias causes the coefficients of all independent variables in a regression to be biased, and can also introduce or remove spurious correlations. Probably all practitioners of regression know the problem of omitted variables. But if there are omitted variables, one needs to explain and demonstrate through logical reasoning that a plausible factor was omitted, rather than simply assume there must be one. A third problem pointed out by Glymour (p. 271) is that of reverse causation, i.e., Y causes X instead of X causing Y. For what it’s worth, econometricians have an old technique called instrumental variable (IV) regression (also called 2-Stage Least Squares, or 2SLS) that deals with this problem. Econometricians can also use the Granger causality test in time-series regression, which statistically evaluates the hypothesis Y->X against X->Y by using lagged values of the independent variables, even though Granger tests are usually applied as simple bivariate analyses. In psychometrics and psychology, it is more common to use path analysis and Structural Equation Models (SEM). But ideally, one has to use them in conjunction with longitudinal data in order to get around the problem of equivalent causal models (MacCallum et al., 1993; Cole & Maxwell, 2003). In any case, the reverse causality problem can be alleviated, more or less well.
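To make the IV/2SLS idea concrete, here is a minimal sketch on simulated data; the instrument z, the confounder u, and the true effect of 0.5 are all invented for the example.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 10000
z = rng.normal(size=n)                 # instrument: affects x, not y directly
u = rng.normal(size=n)                 # unobserved confounder linking x and y
x = 0.8 * z + u + rng.normal(size=n)
y = 0.5 * x + u + rng.normal(size=n)   # true causal effect of x on y is 0.5

def ols(X, target):
    # ordinary least squares via the normal equations
    return np.linalg.lstsq(X, target, rcond=None)[0]

# stage 1: project x on the instrument
X1 = np.column_stack([np.ones(n), z])
x_hat = X1 @ ols(X1, x)

# stage 2: regress y on the projected x; consistent for the causal effect
X2 = np.column_stack([np.ones(n), x_hat])
beta_iv = ols(X2, y)[1]

# naive OLS on the observed x is biased upward by the confounder u
beta_ols = ols(np.column_stack([np.ones(n), x]), y)[1]
```

Because x_hat only contains the variation in x induced by the instrument, the second-stage slope is purged of the confounding that inflates the naive OLS estimate.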

Glymour believes (p. 274) the result provided by Cawley et al. (1997) in this book illustrates what he says, and that the model preferred by The Bell Curve cannot be trusted, even though he does not believe that Cawley’s model has captured the influence of cognitive ability. Glymour certainly does not trust the current statistical methods.

Chapter 13
A “Head Start” in What Pursuit? IQ Versus Social Competence as the Objective of Early Intervention
Edward Zigler and Sally J. Styfco

Zigler & Styfco (1997) propose some corrections to Herrnstein & Murray’s review of educational intervention programs. They noted “What is forgotten in these rash judgments is that most intervention programs were never intended to raise intelligence.” (p. 284). I’m puzzled. The key point is that if you boost education through repeated cognitive activities but intelligence remains unchanged despite this, the conclusion of The Bell Curve is left unchanged. Education simply does not boost intelligence.

Notably, in the case of Head Start, Zigler & Styfco say (pp. 286-287) that the goals were to improve the child’s physical health and mental processes and skills (with particular attention to conceptual and verbal skills), to help the emotional and social development of the child, to establish patterns and expectations of success, to increase the child’s capacity to relate positively to family members and others, to develop in the child and his family a responsible attitude toward society, and to increase the sense of dignity and self-worth within the child and his family. As far as I know, Head Start does not concentrate all of its resources on improving the child’s mental processes, but it is wrong to claim that cognitive ability was never targeted by the programs. I agree with the authors that it makes no sense to say that Head Start fails to improve IQ on the assumption that all of the funding was directed toward improving IQ, since that is not the case. Still, the goals of Head Start were to improve the child’s cognitive environment, so it is legitimate to observe that Head Start does not improve IQ despite the huge financial resources allocated to improving that environment. On the other hand, one can still argue that if financial resources had been allocated more efficiently, the outcomes could have been different. But in what way? For example, the authors (p. 287) say that each Head Start center has six components: health screening and referral, mental health services, early childhood education, nutrition education and hot meals, social services for the child and family, and parent involvement. If one of these components is suspected to be inefficient at improving the child’s IQ, the administration can decide to drop it. But if all of these components are believed to play a role in the cognitive development of children, changing them will be difficult.

Anyway, perhaps for this reason, Zigler & Styfco (p. 293) believe that the initial IQ gains following preschool experience were due to non-cognitive factors, as they say: “The physical and socioemotional aspects of development are more strongly controlled by the environment and, therefore, more effectively targeted by intervention”. They continue by saying (p. 293) that IQ and academic measures correlate at 0.70, which corresponds to an R² of 0.49. They conclude that IQ explains only 49% of the total variance in academic achievement and, thus, that IQ is not a very robust predictor. The problem is that R² is not an appropriate effect size measure.
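The arithmetic behind their figure is worth spelling out, because squaring the correlation makes the association look weaker than it is: a correlation of 0.70 still leaves the residual standard deviation at about 71% of the original, not 51%.

```python
import numpy as np

r = 0.70
r2 = r ** 2                  # 0.49: Zigler & Styfco's "only 49% of variance"
resid_sd = np.sqrt(1 - r2)   # residual SD after conditioning on IQ,
                             # as a fraction of the outcome's SD (about 0.714)
```

In other words, "percent of variance explained" is on a squared scale, while prediction errors live on the standard-deviation scale, which is one reason R² understates the practical strength of a 0.70 correlation.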

Although they admit (p. 294) the fade-out in IQ gains, they insist on the improvements observed in scholastic achievement and other social outcomes. More importantly, they observe (p. 295) that the current problem with reviews and meta-analyses is that the results of many different programs are combined, so that robust results are diluted by null effects from others. My own review of the reviews of other investigators tells me that there is probably no exception to the rule: education improves social outcomes but not IQ.

Zigler & Styfco (p. 297) affirm that Herrnstein and Murray explain criminal behavior by low IQ alone. This is a caricature of their work. They never said that anywhere in their book.

Zigler & Styfco (p. 299) inform us that Head Start has improved not only the health status of the children but also the psychological well-being of their parents.

Zigler & Styfco now question the meaningfulness of IQ tests. They write “The value of the IQ construct as a predictor of performance outside of the realm of school also became suspect. For example, Mercer referred to the “6-hour” retarded child – one whose IQ and school test scores are low but who functions perfectly adequately before and after the school day. By the same token, there are many individuals who achieve very high IQ scores but do not behave competently at home, work, or in social settings. IQ, then, was just not measuring what early intervention specialists hoped to achieve.” (p. 300). In other words, they say that IQ has poor predictive validity. As I said previously, the predictive validity of IQ has been one of the most robust findings in mental testing for quite a long time. If they are so eager to prove the absence of IQ predictivity, I wish them good luck.

Zigler & Styfco (p. 303) say that the quality of the programs may not be sufficiently high to meet the needs of very young at-risk children. Such quality involves features like good teacher/child ratios, staff trained in early childhood education, small group sizes, and a developmentally appropriate curriculum. As far as I know, all these programs typically have such features, so it is curious that they resort to this kind of argument. They say that in many public preschools, the number of children per teacher and the curriculum used often mirror typical kindergartens and are simply inappropriate for preschoolers. But the Abecedarian and Perry Preschool programs have a low child/teacher ratio (6/1), and yet both are disappointing.

They also write “In our chapter we have railed against this narrow focus because we do not believe that intelligence is the only important human trait.” (p. 307). This is excellent. But who seriously believes that intelligence is the only important human trait ?

Chapter 14
Is There a Cognitive Elite in America?
Nicholas Lemann

Lemann (1997) analyzes Herrnstein & Murray’s idea that a cognitive elite is dominating the United States and, more generally, modern societies. Lemann (p. 320) says that the argument that a cognitive elite emerges with assortative mating implies that such an elite should have taken form gradually over time, rather than overnight in the 1950s. However, Herrnstein & Murray say (1994, p. 111) that assortative mating increased especially among college-educated persons between 1940 and 1987, and they also note (1994, p. 112) that a smart wife in the 1990s has a much greater dollar payoff for a man than she did fifty years ago. They believe that the feminist revolution (which began in the 1950s) has increased the likelihood of mating by cognitive ability, notably by increasing the odds that bright young women will be thrown into contact with bright young men during the years when people choose spouses. But Lemann also says that the evidence is weak that high-IQ people were beginning to accumulate in elite colleges during the 1950s. And he writes: “For example, Herrnstein and Murray’s figures on the (relatively low) average IQ scores at Ivy League schools in 1930, when pursued through the footnotes, turn out to come from the first administration of the Scholastic Aptitude Test to 8,040 students on June 23, 1926, and then the conversion of the scores to an IQ scale. But the takers were not actually students at Ivy League colleges; they were a self-selected group of high school students thinking of applying to Ivy League colleges. What Herrnstein and Murray report as the average IQ of Radcliffe College students is actually the average IQ of 233 high school girls who told the test administrators they’d like their scores sent to Radcliffe College.” (p. 320).

While Herrnstein & Murray believe that only people from a fairly narrow range of cognitive ability can become lawyers, Lemann (p. 321) argues that this is because only people who score above average on the Law School Aptitude Test (LSAT) are allowed to become lawyers, not because of high IQ. True enough, but the authors never made this claim. They say that someone with a high IQ can fail at school, but someone who succeeds at school can hardly be a dumb person.

Lemann (p. 323) also says that as income rises above $100,000, the percentage of it derived from salaries, wages, and business/profession steadily declines, replaced by long-term capital gains. In other words, Lemann has the impression that the top income shares are composed of inheritors and financiers, not high-IQ professionals. The reason, I suspect, lies in the consequences of economic bubbles caused by the over-expansion of the money supply, along the lines of the scenarios articulated in the Austrian Business Cycle Theory. There are certainly many resource misallocations due to this monetary injection; inheritors and financiers would probably constitute a smaller share of top incomes were it not for the monetary injections of the central banks. In any case, if Lemann is correct, we should find that IQ is not predictive (or loses predictive power) when income rises to very high levels. And this should be examined empirically.

Resnick & Fienberg (1997) seem a little annoyed, as they are afraid that Herrnstein & Murray’s book may revive the kind of opinions held by Galton, Pearson, and Fisher.

They also write (p. 330) “Some, such as Stephen Gould in the new edition of The Mismeasure of Man, have argued that Herrnstein and Murray’s reliance on factor analysis is the Achilles heel of their entire effort and have dismissed it accordingly”. Oh God. That looks interesting. Let me remember. The passage of Gould’s (1996, p. 373) book… where was it already…

Charles Spearman used factor analysis to identify a single axis – which he called g – that best identifies the common factor behind positive correlations among the tests. But Thurstone later showed that g could be made to disappear by simply rotating the factor axes to different positions. In one rotation, Thurstone placed the axes near the most widely separated of attributes among the tests – thus giving rise to the theory of multiple intelligences (verbal, mathematical, spatial, etc., with no overarching g). This theory (the “radical” view in Herrnstein and Murray’s classification) has been supported by many prominent psychometricians, including J. P. Guilford in the 1950s, and most prominently today by Howard Gardner. In this perspective, g cannot have inherent reality, for g emerges in one form of mathematical representation for correlations among tests, and disappears (or at least greatly attenuates) in other forms that are entirely equivalent in amounts of information explained. In any case, one can’t grasp the issue at all without a clear exposition of factor analysis – and The Bell Curve cops out completely on this central concept.

Ouch. That must be the one, no? So, what now? If my memory is correct, Carroll, in Devlin’s book (1997), wrote a chapter validating the existence of g, which, by the same token, destroyed the multiple intelligence theories. But, more importantly, I would have sworn that Thurstone and Guilford were discredited quite a long time ago. And Gould resorts to this same argument in 1996?

Resnick & Fienberg write “The Bell Curve has been a “Pied Piper” for some segments of the social sciences. It deserves kudos for its way with words, drawing in readers who would otherwise be intimidated by numbers, whether or not they agree with Herrnstein and Murray’s conclusions. This success, however, has its price. We have looked a lot harder at the numbers than the typical reader is invited to do, and we are disappointed.” (p. 330). That’s good. Because I am also disappointed in your book, you see. So, we are even now.

They continue and say “Cultural transmission of traits that influence IQ is certainly possible. But they present no evidence for cultural stability and persistence, which is what their argument requires, and we find such stability unlikely given the rapid rate that culture can and does evolve.” (p. 333). Herrnstein and Murray believe otherwise, but I will not add any personal comment here.

They ask “If IQ scores correlate only in the 0.2 to 0.4 range with occupational success and income, how much importance can we assign to the cultural transmission of traits affecting the IQ of progeny?” (p. 334). These numbers are too low, and I recommend having a look at Strenze’s (2007) paper.

They say “Other variables, such as nutrition, quality of education, and peer culture, are, for various reasons, excluded from the analysis.” (p. 334). This, if I’m not mistaken, is what Jensen (1973, p. 235) termed the sociologist’s fallacy: the assumption that environmental variables have no genetic component.

They summarize (p. 335) the policy recommendation of Herrnstein & Murray. As expected, it’s a ridiculous caricature. The interested reader should read the chapters 21-22 of The Bell Curve instead of this scam. It is amusing that Resnick & Fienberg have noted (p. 334) that many critics of The Bell Curve have used straw man arguments. But these guys are not necessarily doing better than the others.

They dislike Herrnstein & Murray’s opinion that common people should need and hope for less government and more free market. They end the book with these final words: “Because of their principled opposition to government, Herrnstein and Murray have denied Americans the support of public institutions in the struggle against rising inequality. Without government, it will be a very unequal struggle.” (p. 338). But this distrust may have nothing to do with their being hereditarian. They (in particular Charles Murray) view the government as an inefficient allocator of resources for the same reasons generally advanced by libertarians, and especially Austrian economists: government action causes resource misallocations, waste, and unintended consequences. They (especially Charles Murray) believe in the efficiency of free markets, but Resnick & Fienberg apparently do not.


I decided to cover this topic because I have applied this kind of analysis in my paper on the black-white score changes in the GSS Wordsum test. These techniques are not available in SPSS. One reason may be that they are mainly applied by economists (who mostly use Stata), not by psychologists (who mostly use SPSS and may not even be aware of them). However, the problems raised by data censoring and data truncation are also relevant in psychology.

The tobit (or censored) regression is proposed for a dependent variable censored at the lower end or the upper end of its distribution, or both. Censoring is essentially a problem of floor and ceiling effects: some individuals are stacked at a certain threshold value (τ) because they cannot obtain a higher or lower score on the variable. This may have different causes; the test may be too easy or too difficult. But censoring can take another form. An income variable may have been coded into categories, e.g., $10,000-$20,000, and so on, but the last category may be something like “$100,000 and above”. In this case, the variable is censored at the upper end. As mentioned earlier, it is possible to have data censored at both ends, in which case we specify a two-limit tobit regression (by setting the lower and upper censoring values); see Long (1997, pp. 212-213) for a development. For instance, in insurance coverage, there is a minimum coverage, a maximum coverage, and values in between.
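A minimal sketch of the tobit likelihood for lower censoring at a threshold τ, on simulated data (the coefficients 1.0 and 0.5 and the sample size are invented for the example): uncensored observations contribute the normal density, censored ones the probability mass below the threshold.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(3)
n = 4000
x = rng.normal(size=n)
y_star = 1.0 + 0.5 * x + rng.normal(size=n)  # latent Y*
tau = 0.0
y = np.maximum(y_star, tau)                  # floor effect: scores stack at tau

def negloglik(theta):
    b0, b1, log_s = theta
    s = np.exp(log_s)                        # keep sigma positive
    mu = b0 + b1 * x
    cens = y <= tau
    ll = np.where(cens,
                  norm.logcdf((tau - mu) / s),            # censored cases
                  norm.logpdf((y - mu) / s) - np.log(s))  # observed cases
    return -ll.sum()

fit = minimize(negloglik, np.zeros(3), method="BFGS")
b0_hat, b1_hat, _ = fit.x
```

Running OLS directly on the censored y would flatten the slope, because all the stacked observations are treated as genuine scores of τ rather than as "at most τ".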

The truncated regression is proposed for a dependent variable whose distribution is not representative of the entire population. Truncation is essentially a problem of range restriction (although it is inaccurate to equate truncation with range restriction). For instance, the data may have been collected only for people having purchased durable goods. People who did not purchase these goods, e.g., because of their price levels, are thus said to be truncated from below. This is not to say that OLS is necessarily biased; it depends on the goal of the analysis. If we are interested in the value of Y for the entire population, OLS is biased. But if we are merely interested in our subsample, OLS is sufficient (see the Stata manual). However, we must be aware that when we omit a portion of the data in this manner, the truncated data points are also missing not at random (because the value of Y differs between truncated and untruncated observations).
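The difference can be sketched with a toy example (the scores and limits below are made up) : censoring clips out-of-range values at the limits while keeping every case, whereas truncation removes those cases entirely.

```python
# Toy illustration (hypothetical scores): censoring keeps every case but
# stacks out-of-range values at the limit; truncation drops them entirely.
scores = [2, 5, 7, 9, 12, 15, 18]
lower, upper = 5, 15  # two-limit setup, as in the insurance-coverage example

censored = [min(max(s, lower), upper) for s in scores]   # clip at both ends
truncated = [s for s in scores if lower <= s <= upper]   # observations disappear

print(censored)   # [5, 5, 7, 9, 12, 15, 15] -- same N, values stacked at the limits
print(truncated)  # [5, 7, 9, 12, 15] -- smaller N
```

Note that the censored sample retains its full size, which is why tobit can model the probability of being at the limit, while the truncated sample has lost those observations for good.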

A graphical representation of censoring and truncation is given by Long (1997) :

In Panel A is the “latent” variable Y* that tobit and truncated regressions try to estimate (based on the set of independent variables). With censoring, the observations are stacked at the censoring point when they fall at or below the threshold τ=1. With truncation, the observations literally disappear when they are at or below the threshold value τ=1.

Both techniques use maximum likelihood (ML) to estimate the effect of changes in the independent variables (Xs) on the expected (i.e., “potential”) value of the dependent variable (Y), assuming a gaussian (i.e., normal) distribution. Because the expected value of the dependent variable is latent (i.e., not observed), it is not possible to obtain standardized coefficients, unless we apply a special procedure (Long, 1997, pp. 207-208).

As for tobit, the technique allows a decomposition of the effect of X on the latent Y (i.e., the tobit coefficient) into two parts : the change in the probability of being above the censored value multiplied by the expected value of Y if above, plus the change in the expected Y for the cases above the censored value multiplied by the probability of being above the censored value (McDonald & Moffitt, 1980). Mathematically, the decomposition is written :

δEy/δXi = F(z) x (δEy*/δXi) + Ey* x (δF(z)/δXi)

where F(z) is the proportion of cases (i.e., the probability) above the threshold, δEy*/δXi is the change in the expected value of Y for cases above the threshold associated with a change in the independent variable, and δF(z)/δXi is the change in the probability of being above the threshold associated with a change in the independent variable.

Long (1997, p. 196) presents the formula in a more intuitive way :

E(y) = [Pr(Uncensored) x E(y|y>τ)] + [Pr(Censored) x E(y|y=τy)]

Pr stands for probability, E(y) for expected y, y>τ for y conditional on being above τ, and τy is the value taken by y when y* is censored (see Long, 1997, p. 197).
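As a numerical sketch of Long’s formula, with made-up values for the probability of being uncensored and the conditional expectation :

```python
# Numerical sketch of Long's formula with hypothetical values:
# E(y) = Pr(uncensored)*E(y|y>tau) + Pr(censored)*tau_y
p_uncensored = 0.8   # hypothetical share of cases above the threshold
e_y_above    = 5.0   # hypothetical E(y | y > tau)
tau_y        = 1.0   # value y takes when y* is censored

e_y = p_uncensored * e_y_above + (1 - p_uncensored) * tau_y
print(round(e_y, 3))  # 4.2
```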

If we are only interested in the changes of the Xs on the latent Y, the coefficients obtained from tobit regression can be interpreted in the same way as those obtained from OLS regression (Roncek, 1992).

The formula for truncated regression can be found in Long (1997, p. 194) and in the Stata manual for truncreg function.

We haven’t yet explained in detail why OLS is inconsistent with truncated data when our interest focuses on the population estimates. One crucial assumption of OLS regression is the independence of the errors (residuals) : the residuals must have mean zero and be uncorrelated with all explanatory variables. The problem is that truncation causes the sample selection (s) to be correlated with the error term (u). Wooldridge (2012, pp. 616-617) provides an example with a selection indicator s, i.e., s=1 if we observe all of the data and s=0 otherwise, where s=1 if Y is lower than or equal to the threshold (considering that the data is truncated from above). Equivalently, s=1 if u≤τ-Xβ, where Xβ is a shorthand for β0 + β1X1 + β2X2, … . This means that the value of s covaries with u.

Long (1997) illustrates the consequences of censoring and truncation for OLS estimation with Figure 7.2. The solid line is given by the OLS estimate of Y that is not censored. The long dashed line, OLS with censored data, has a lower intercept and a steeper slope because of the many values set at zero (shown as triangles), just below the threshold horizontal line τ=1, that pull down the left side of the long dashed line. The short dashed line is given by an OLS estimate with data points below τ=1 being truncated (i.e., removed) instead of being censored and shows a higher intercept and smaller slope.

Figure 7.7 (page 202) also shows in a very simple manner the effects of censoring and truncation. The difference here is that the censored data points are equal to the threshold rather than being below it. The dots below the threshold τ=2 are truncated data points. E(y*|x), the solid line, is the correct estimate. E(y|y>2, x) is given by the long dashed line. We see that the long dashed line is indistinguishable from the solid line as we move toward the right side, but lies above the solid line as we move to the left side. This is because there are few (many) data points truncated at the right (left) side. The long dashed line gets closer and closer to τ as we move to the left. We also see circles along the horizontal line τ=2 : these are censored data points. The short dashed line, E(y|x), is slightly below the long dashed line at the left side of the x axis, because the censored cases were not eliminated.

Both types of regression require normality and homoscedasticity of the residuals, even though the observed censored distribution is itself non-normal. But since Y* is not an observable variable, we cannot obtain the residuals simply by computing Y minus Yhat, because we would need Y* instead of Y. In tobit regression, a more involved procedure must be applied to get the generalized residuals and conduct the test of normality (Cameron & Trivedi, 2009, pp. 535-538).

A particular feature of these kinds of regression is that a standardized coefficient is usually not reported by statistical software, because its calculation is not straightforward. Normally, the fully standardized coefficient is obtained with the operation coeff(X)*SD(X)/SD(Y). In the case of tobit regression, Roncek (1992, p. 506) shows that the standardized tobit coefficient can be obtained as coeff(X)*f(z)/“sigma”, where f(z) is the unit normal density; this is (in my opinion) a complicated way of presenting the formula, because one could have replaced the ambiguous f(z) by the more intuitive notation SD(X). “Sigma” is the estimated standard error of the tobit regression model (usually reported by the software) and is comparable to the estimated root mean squared error in OLS regression. But since sigma is the standard deviation of Y* conditional on the set of X variables, and need not equal the unconditional one, which is what we need, Long (1997, pp. 207-208) argues that the unconditional variance of Y* should be computed with the quadratic form :

σ^²y* = β^′ Var^(x) β^ + σ^ε²

where Var^(x) is the estimated covariance matrix among the x’s and σ^ε² is the ML estimate of the variance of ε. Thus, Long suggests we use the formula coeff(X)*SD(X)/σ^y*.
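Long’s procedure can be sketched numerically. The coefficients, covariance matrix and error variance below are made up; only the formulas come from the text :

```python
import math

# Hypothetical tobit estimates: two predictors, their coefficients,
# the estimated covariance matrix of the x's, and the ML error variance.
beta       = [0.5, 1.2]
var_x      = [[4.0, 1.0],
              [1.0, 9.0]]
sigma_eps2 = 2.25

# Long's quadratic form: Var(y*) = b' Var(x) b + sigma_eps^2
v_beta = [sum(var_x[i][j] * beta[j] for j in range(2)) for i in range(2)]
var_ystar = sum(beta[i] * v_beta[i] for i in range(2)) + sigma_eps2
sd_ystar = math.sqrt(var_ystar)

# Fully standardized coefficient for x1: b1 * SD(x1) / SD(y*)
beta_std_x1 = beta[0] * math.sqrt(var_x[0][0]) / sd_ystar
print(round(var_ystar, 2), round(beta_std_x1, 3))  # 17.41 0.24
```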

Even though standardized coefficients seem to be preferred by psychologists, economists (and particularly econometricians) dislike them and probably won’t recommend their use.

Finally, it should be noted that OLS is not always inconsistent under sample selection (Wooldridge, 2012, pp. 615-616). We re-use his example of the selection indicator s. If sample selection is random in the sense that s is independent of X and u, OLS is unbiased. OLS remains unbiased even if s depends on the explanatory X variables and on additional random terms that are independent of X and u. If IQ is an important predictor but is missing for some people, such that s=1 if IQ≥v and s=0 if IQ<v, where v is an unobserved random variable independent of IQ, u and the other X variables, then s is still independent of u. It is not required that s be uncorrelated with the X variables, provided the X variables are uncorrelated with u, because this implies that the product of s and X is also uncorrelated with the residuals u.

I recommend downloading my subset of the GSS data : http://openpsych.net/datasets/GSSsubset.7z. The syntax I am using here is for Stata and SPSS. If you don’t have either of these packages, you may want to use the free software R. You will have to read my introduction to R for advanced statistics and use my R syntax if you want to reproduce the analyses below.

The results are identical across packages, but the reported sample sizes are not. This is because when a sampling weight is used, Stata continues to report N as the observed sample size, while SPSS reports N as the weighted sample size.
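The two conventions can be sketched with made-up weights :

```python
# Sketch of the two N conventions (hypothetical weights): Stata reports the
# observed number of cases, SPSS reports the sum of the sampling weights.
weights = [0.5, 1.5, 2.0, 1.0, 0.8]

n_observed = len(weights)      # Stata-style N
n_weighted = sum(weights)      # SPSS-style N
print(n_observed, n_weighted)  # 5 5.8
```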

The attentive reader will have noticed that I dropped all observations having missing data on either wordsum or bw1. This is because most (if not all) statistical packages are flatly stupid here. When we compute the predicted Y, or Yhat, the software will compute these values even for observations that weren’t included in the regression model. Let’s say the age variable has 50118 cases and my wordsum variable has 23817 cases, and assume that everyone having data on wordsum also has data on age. The listwise deletion employed in regression will keep only 23817 cases. But Yhat will nonetheless be computed for all 50118 observations, just because age has 50118 observations. From what I have seen, this does not bias the computed values, but it adds useless, fictitious values for the observations removed by the listwise deletion procedure. We do not want these additional values.
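A minimal sketch of the fix, with hypothetical rows and coefficients : apply the listwise deletion yourself before computing Yhat, so that no fictitious predictions are produced.

```python
# Hypothetical rows with missing data (None): compute Yhat only for the
# cases that actually entered the regression, i.e. after listwise deletion.
rows = [
    {"age": 25, "wordsum": 6},
    {"age": 40, "wordsum": None},   # missing Y -> excluded by listwise deletion
    {"age": 33, "wordsum": 8},
]
b0, b1 = 4.0, 0.05  # made-up intercept and age coefficient

complete = [r for r in rows if None not in r.values()]
yhat = [b0 + b1 * r["age"] for r in complete]
print(yhat)  # predictions for the 2 complete cases only
```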

Mean-centering

Here, one question must be answered. Why am I using mean-centering ? Or, at least, why center at some other value ? Because the intercept is the value of Y predicted by the regression model (Yhat) when all independent X variables equal zero. The minimum age in the sample is 18, which implies the value 0 is not an observable value of my age variable. So, if I use the original age variable, the intercept will be the predicted wordsum score of people aged 0. This obviously makes little sense. If wordsum increases with age, using the original age variable will cause the predicted values at the intercept to be too low. The value zero should be an observable value in all of the independent variables.

The other advantage of mean-centering is to avoid extreme multicollinearity. Multicollinearity is a problem because interaction terms and squared/cubic terms usually have an extremely high correlation with the main-effect variables. When this happens, the software will typically remove the sources of that collinearity, and several variables will be dropped from the analysis.
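This can be demonstrated with the age variable itself. The minimum of 18 comes from the text; the maximum of 89 is made up, and the correlation function below is just a generic Pearson correlation :

```python
import math

# Correlation between age and age^2 before and after mean-centering,
# showing why centering tames the collinearity (age range partly made up).
def corr(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

age = list(range(18, 90))
centered = [a - sum(age) / len(age) for a in age]

print(round(corr(age, [a * a for a in age]), 3))            # close to 1
print(round(corr(centered, [c * c for c in centered]), 3))  # close to 0
```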

Descriptive analysis

Before using regression, let’s request a scatter plot of wordsum against age, by race. This is simply a descriptive analysis, but it will prove useful for reasons explained later.

GRAPH
/SCATTERPLOT(BIVAR)=cohort WITH wordsum BY bw1
/MISSING=LISTWISE
/SUBTITLE='Wordsum score across age by race'.

We want a loess curve, not a fitted straight line. In SPSS, it’s a bother : there is no way to obtain loess curves directly from syntax. We must edit the graph manually (properties -> loess -> apply). But anyway…

Usually, when we do a scatter plot, we get a cloud of data points, because we display the raw data without requesting or computing a fitted regression line. However, we can request a loess curve without the scatter dots. This makes the graph easier to read. Because otherwise, we would have something like this :

In any case, there is a large black-white gap narrowing, which seems to have stopped quite recently. And we see that the gap widens considerably with age.

At this point, one may ask a question. Why use statistical modeling when a simple descriptive analysis already answered our question ?

Let’s say var1 is age, var2 is race, and var3 is the age*race interaction. We first use a model with only age as the independent (X) variable. After running this model, I save the predicted values (Yhat) and plot Yhat against age by race.

Hm… What happened ? There is only one regression line. The reason is simple. The first plot was a summary/descriptive analysis, while this plot is about modeling. In our regression, only age has been included. In other words, wordsum is allowed to vary only across ages, not across races.

Adding race to the model, we now have two separate lines, but they are parallel. We didn’t include the interaction between age and race, so the race difference was not allowed to vary across ages.

Now, I specify age, race, and age*race in my set of independent (X) variables.

This time, we get two separate regression lines with two separate slopes, because our regression equation allows age effects to differ across races. If we had ignored the interaction, we would get two separate lines that are perfectly parallel. But even with this interaction, these lines are straight lines. As we have seen in the descriptive loess plot, the change in wordsum score with age is not linear. Let’s add age^2 and race*age^2.

This is much better, but still not the best. If we add age^3 and race*age^3, we get the plot I graphed in Figure 2 of my paper. And that graph is the best approximation of the descriptive loess curves presented above.

That’s what modeling is all about. By specifying a set of X variables, we allow the Y variable to vary only according to the values of the included X variables. The advantage of this approach, however, is that we can hold constant all the variables we suspect to be confounding factors, which we cannot do with a simple descriptive plot.

Linear prediction

If we have the parameter values, and assuming that a linear regression model is a good approximation of the data, we can easily make predictions from the slopes.

Imagine a regression with the black-white Wordsum gap as the Y variable and year as the X variable. The intercept is 1.04 and the slope (unstandardized coefficient) is -0.004. Because there are 22 years in the sample, I can guess that 1.04 is the Wordsum gap at year 1, and the Wordsum gap at year 22 is 1.04 + (-0.004*22) = 1.04 - 0.088 = 0.952. These numbers are from Lynn’s (1998) paper, which I have commented on.

Let’s assume a more complex model, one that involves race (dichotomy), year/cohort and the interaction of race with year/cohort. How should we interpret the coefficients ? Huang & Hauser (2000) attempted to replicate Lynn (1998) using this specification. They get an intercept of 2.641, a coefficient of 3.037 for race, 0.024 for year, and -0.0176 for the interaction. If black=0 in the race variable and year1=0 in the year variable, the calculation is straightforward. The intercept is the Wordsum score of blacks at year 1, while the Wordsum score of whites at year 1 is the sum of the intercept and the race slope, i.e., 2.641+3.037=5.678. This calculation is possible because the coefficient of race is the effect of race net of the effect of cohort and its interaction with race. Now, what about the changes in the black-white gap over time ? The slope of 0.024 for year tells us that Wordsum for blacks increases by 0.024 per year. The slope of -0.0176 for the interaction term tells us that an increase of one unit of year corresponds to a decline of the black-white gap of 0.0176 per year. With these parameters, we can easily create two columns (one for blacks, one for whites) which can be plotted together, because we can predict the black and white scores at each subsequent year. For example, at year 24, the black score and white score are predicted to be, respectively, 2.641+(0.024*24) and 5.678+(0.024*24)+(-0.0176*24), that is, 3.217 and 5.832.
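The arithmetic above can be checked directly, using the coefficients quoted from Huang & Hauser (2000) :

```python
# Reproducing the prediction arithmetic with Huang & Hauser's (2000)
# estimates as quoted in the text.
b0, b_race, b_year, b_inter = 2.641, 3.037, 0.024, -0.0176

def wordsum(race, year):
    # race: 0 = black, 1 = white; the text's example plugs in year = 24.
    return b0 + b_race * race + b_year * year + b_inter * race * year

black_24 = wordsum(0, 24)
white_24 = wordsum(1, 24)
print(round(black_24, 3), round(white_24, 3))  # 3.217 5.832
```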

But everything becomes more complicated if we add squared and cubic terms.

OLS regression

As we have seen, the Yhat is computed by summing the intercept and all the products of the coefficients with their variables. But until now, we have just requested the Yhat from the software, and we didn’t compute it manually. Let’s say we want to graph the black-white changes in wordsum score across cohort.

The attentive reader will ask : why use numbers with as many as 8 or 9 digits after the decimal point ? The problem is that with squared and cubic terms, even a small value matters. We shouldn’t stop at 2 or 3 digits after the decimal.

Now, let’s explain the calculation. What does it mean to do Yhat = intercept + coeff*var1 ? Let’s say we have a sample of 10 individuals. Each of them having different values in all variables. And let’s use a simpler model.

In this model, I obtained an intercept of 4.8957, and coefficients of 1.369, 0.00808, and -0.01537. Now, let’s look at our data.

The Yhat column above is the Yhat requested directly from the software, not computed by hand. Look closely at the first row. This individual has a value of 1 for bw1, cohortc and bw1cohortc. Then, his Yhat(wordsum) is given by :
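A sketch of the arithmetic, using the intercept and coefficients reported above :

```python
# The first row's Yhat computed by hand from the coefficients reported above
# (intercept 4.8957; bw1, cohortc and bw1*cohortc coefficients as given).
b0 = 4.8957
b_bw1, b_cohortc, b_inter = 1.369, 0.00808, -0.01537

bw1, cohortc = 1, 1  # the first row's values
yhat = b0 + b_bw1 * bw1 + b_cohortc * cohortc + b_inter * (bw1 * cohortc)
print(round(yhat, 5))  # 6.25741
```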

You see the values are close to the ones displayed in my data window, but not exactly the same, due to rounding error. Still, the calculation of Yhat is straightforward. If we want to control for covariates such as SES variables, things become tedious. Let’s add the variable gender to our regression. We get this graph.

This is a graph of Yhat vs cohort by gender values (0=male, 1=female). Now, imagine we use degree (mean-centered) variable instead of gender. Since degree has 5 values, we should get 5 curves for whites and 5 curves for blacks. Let’s see…

No. It’s impossible to read. So how can we get a readable curve when plotting Yhat ? If you haven’t guessed yet, we can do it by simply ignoring the SES variables in our calculation of Yhat. After all, the regression coefficients for bw1, cohortc, cohortc2, cohortc3, bw1cohortc, bw1cohortc2, bw1cohortc3 are the mean predicted effects of these respective variables when degree, educ, logincome and age are held constant. So, Yhat should be computed simply as follows :

That graph is nice. We see, again, that the black-white difference has been halved.

Before closing this section, I want to mention another way to use regression : dummy variables. They are good substitutes for non-linear terms. The drawback is that the range of values assigned to each dummy can be arbitrary. In my case it is, but only because I wanted to avoid low sample sizes in the lowest and highest categories of my dummy variables. The advantage is that we do not need the raw data set to make predictions. Dummy variables are coded 0 and 1, so multiplying a coefficient by the variable’s value is not needed. In dummy variable regression, we must drop one dummy in order to avoid perfect collinearity among the variables.

My intercept is 4.506666, and the coefficient for bw1 is 1.595729. What are these values ? Since cohortdummy1 and bw1C1 are omitted (reference categories), and the intercept is the value of Y when all Xs are zero, the intercept is the black score for cohortdummy1. And bw1 is the black-white difference net of the influence of all other X variables; so bw1 is effectively bw1C1. We get the following columns :

In the first column, the first six rows are the black scores (intercept, intercept+cohortdummy2, intercept+cohortdummy3, etc.), the last six rows are the white scores (intercept+bw1, intercept+bw1+cohortdummy2+bw1C2, intercept+bw1+cohortdummy3+bw1C3, etc.), the second column is the racial category (black=0, white=1) and the third column is the category of the cohort dummy variables.
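The construction of the two score columns can be sketched as follows. The intercept and bw1 coefficient are the ones reported above, but the cohort-dummy and interaction coefficients are made up for illustration (the real ones come from the regression output) :

```python
# Sketch of the column construction. b0 and b_bw1 are the reported values;
# the dummy and interaction coefficients below are hypothetical.
b0, b_bw1 = 4.506666, 1.595729
cohort_dummies = [0.0, 0.12, 0.25, 0.31, 0.40, 0.47]      # dummy1 = reference
bw1_inter      = [0.0, -0.05, -0.11, -0.18, -0.22, -0.30]  # hypothetical bw1C terms

black = [b0 + d for d in cohort_dummies]
white = [b0 + b_bw1 + d + i for d, i in zip(cohort_dummies, bw1_inter)]
print(round(black[0], 6), round(white[0], 6))  # 4.506666 6.102395
```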

Logistic regression

The method is similar to OLS regression, except that the intercept is a logit. And a logit is not a probability. So, to obtain the probability, we exponentiate the logit to obtain the odds. Finally, we get the probability by computing odds/(1+odds).
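A minimal sketch of the logit-to-probability conversion (the logit value is made up) :

```python
import math

# Logit -> odds -> probability, as described above (hypothetical logit).
logit = 0.8
odds = math.exp(logit)
p = odds / (1 + odds)
print(round(p, 3))  # 0.69
```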

Here, I have the opportunity to introduce the so-called Differential Item Functioning (DIF) methods. The idea behind the different kinds of DIF techniques is similar. We want to know if the group variable contributes to the probability of a correct response on any given item when the total score (sum of all items) is included as a regressor. DIF methods are complicated, so I will shorten the description of the LR method.

In logistic regression (LR), we typically have three models. One involves only the total score, the second adds the group variable, the third adds the interaction of group with total score. The first model can be regarded as the Rasch model, which makes the assumption of unidimensionality (i.e., only the total score accounts for the variation in the probabilities of correct response). The second model is aimed at detecting uniform DIF. The third model is aimed at detecting non-uniform DIF. If we want a measure of “effect size” for the model with both uniform and non-uniform DIF, we calculate the difference in R² between model 1 and model 3 (though I would rather use its square root). A better, simpler method consists of plotting the probabilities, or Item Characteristic Curves (ICC). Plotting the ICCs from model 3 shows both forms of DIF. So, we will disregard R² and look at the ICCs. I select word g as the dependent variable because it is by far the most biased item of all. The interested reader can try other items.
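Model 3’s ICCs can be sketched with hypothetical logistic coefficients; a gap between the two group curves at the same total score signals DIF :

```python
import math

# Sketch of model-3 ICCs for one item: probability of a correct answer as a
# function of total score and group, with made-up logistic coefficients.
b0, b_total, b_group, b_inter = -4.0, 0.6, 1.0, -0.2  # hypothetical

def icc(total, group):
    logit = b0 + b_total * total + b_group * group + b_inter * total * group
    return 1 / (1 + math.exp(-logit))

# A gap between the two curves at the same total score signals DIF.
for total in range(0, 11, 5):
    print(total, round(icc(total, 0), 3), round(icc(total, 1), 3))
```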

I calculate the logit and probabilities manually, but we can ask the software to do it for us.

The probability of a correct answer on word g is clearly different between blacks and whites. Of course, this is not all. We need to know if the black-white difference in the TCC, or total characteristic curve (i.e., the sum of all ICCs), differs from the black-white difference in the raw wordsum score. We also need to answer whether the DIF methods really require that (almost) all items be DIF-free, and whether these DIFs introduce biases in prediction (Roznowski & Reith, 1999). These are complex questions, and I plan to answer them in a forthcoming article.

Multilevel (linear mixed-effects) regression

Multilevel regression is exceedingly complex, probably no less complex than SEM, MGCFA or IRT. If you need a description, go read my paper. Here, I will be using a 2-level model. In the fixed (level-1) component, I have bw1, the aged1-aged9 dummies and the interaction of bw1 with the aged1-aged9 dummy variables. The fixed portion can be compared to a classical OLS regression. The random (level-2) component includes cohort21 as random intercept and bw1 as random slope. These random effects are in fact deviations from their respective fixed coefficients. That is, a random coefficient is estimated for each of the 21 categories of my cohort21 variable, and these coefficients are the variations around the mean intercept estimated in the fixed component. The bw1 random slope will also have 21 random coefficients, because this slope is specified to vary across each value of the cohort21 variable. The random coefficients of the random slope bw1 are the variations around the average fixed slope of bw1 estimated in the fixed component of the multilevel model.

By holding age constant and letting cohort vary in the population, we are able to break the linear dependency between age and cohort, i.e., separate the effect of age from the effect of cohort with respect to the black-white score gap. This is important because, as we have seen, the black-white gap increases with age and decreases with cohort, while age decreases in more recent cohorts. There is a potential confounding here that classical regression cannot disentangle, because it assumes that the effects of all X variables are additive (see Yang & Land, 2008, p. 322).

In this kind of analysis, there are two ways of making graphs.

The first is to use the estimated random coefficients of the intercept and slope (in case your model involves a random slope). Since the intercept is the value of Y when all Xs=0, the random effects of cohort are in fact the random variations of the black score over time. And since the random slope is on the dichotomized race variable, the variation in the white score over time can be obtained by adding the random coefficients of the cohort intercept to the respective random coefficients of the bw1 slope. We then obtain two columns (i.e., vectors) of values, which we can plot by group (race and cohort) categories.
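The first approach can be sketched with made-up fixed estimates and random effects (EBLUPs) for four cohorts :

```python
# Sketch of the first plotting approach with hypothetical estimates:
# black score per cohort = fixed intercept + cohort random intercept;
# white score adds the fixed bw1 slope and its cohort-specific random slope.
fixed_intercept, fixed_bw1 = 5.0, 1.2          # hypothetical fixed estimates
re_intercept = [-0.30, -0.10, 0.05, 0.20]      # hypothetical EBLUPs, 4 cohorts
re_bw1       = [0.15, 0.05, -0.05, -0.10]      # hypothetical random slopes

black = [fixed_intercept + u for u in re_intercept]
white = [fixed_intercept + fixed_bw1 + u + v for u, v in zip(re_intercept, re_bw1)]
gap = [w - b for b, w in zip(black, white)]
print([round(g, 2) for g in gap])  # [1.35, 1.25, 1.15, 1.1] -- the gap narrows
```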

The second is more straightforward. It simply requires computing the fitted values, and any software can do this automatically. But in multilevel regression, the fitted values are the sum of the fixed component and the random component. Usually, the software lets us request either the fitted values, the fixed-effects predictions, or the random-effects (EBLUP) predictions.

Since I have included many variables (nine age dummies and a race dichotomy), a simple scatter of the fitted wordsum against cohort21 will show variations of data points corresponding to the fixed effects of age and race. One possibility is to graph the scatter separately for each value of the age dummy. So, here, I request a series of scatter plots by age.

If you find the SPSS syntax a little bit complex, you should definitely read Bickel’s (2007, p. 113) guide. In SPSS, it seems difficult to request a multi-panel plot, so I use Stata again :

Here, we don’t see the black-white score gap narrowing. So, what happened ? Nothing fishy. It’s a question of which method is the most adequate, a question I have answered in my paper. But if that explanation is not simple enough, I will make it even simpler in my forthcoming article on multilevel models.

The 1920-1921 Depression and Recovery
https://menghublog.wordpress.com/2014/12/17/the-1920-1921-depression-and-recovery/
Wed, 17 Dec 2014 23:32:34 +0000

Let’s recall the story. Some Austrian economists (Woods, 2009; Powell, 2009; Murphy, 2009) claimed that Warren Harding cut taxes and thereby promoted the economic recovery. But Kuehn (2010) challenges this view. Kuehn (2012) believes that it was the reduction in interest rates by the Fed that helped the economy recover. And Selgin (2014) answered that it was not the Fed’s monetary easing but gold flows that contributed to the recovery.

For a description of the 1920-1921 crisis, Vernon (1991) reports that industrial production fell 25.6% below its January 1920 peak, bottoming out at 32.6% below that level in July 1921, while the unemployment rate was 1.4% for both 1918 and 1919, 5.2% for 1920, and 11.7% for 1921. These figures make the 1920-1921 crisis a serious one indeed. On the other hand, Vernon indicates that the dramatic deflation was due to both a decline in aggregate demand and a positive aggregate supply shock.

The ratio of the percentage decline in the GNP deflator for 1920-21 to the percentage decline in real GNP is 2.6 using the Department of Commerce figures, 3.7 using the Balke and Gordon data, and 6.3 using the Romer data. By contrast, during 1929-30, the first year of the Great Depression, the GNP deflator declined by 2.7 percent and real GNP by 9.4 percent, for a ratio of 0.3. The ratios of the percentage decline in GNP prices to the percentage decline in real GNP for 1930-31, 1931-32, 1932-33, and 1937-38, the other Great Depression years in which real GNP declined, were 1.0, 0.9, 1.2, and 0.3, respectively, all well below the 1920-21 figures.

Romer (1988, Figure 1) argues that the Commerce series was less reliable than the revised Kendrick series : the first shows a 15% decline in GNP between 1919 and 1921, the second only a 3% decline (Romer, 1988, pp. 108-109). Although Romer admits there was a sharp decline in aggregate demand, GNP fell by little, so aggregate demand is unlikely to have driven the economy down. Furthermore, real GNP rose 5% between 1917 and 1918 while the GNP deflator rose 15%; in contrast, between 1920 and 1921 a fall in real GNP of only 2% was associated with a price decline of 16%. Romer suspects there may have been some type of aggregate supply shock, either during the war or in 1921. What this tells us is that the Keynesian prescription (i.e., fiscal stimulus) wouldn’t apply here, because there seemed to be no expectation of a further decline in aggregate demand.

In response to the Woods, Powell and Murphy articles, Kuehn (2010) points out that the Harding administration did cut tax rates for higher-income families in 1922 (the highest bracket’s rate was reduced from 73% to 58%) and implemented an across-the-board rate reduction in 1923 (from 58% to 43.5% for the highest bracket and from 4% to 3% for the lowest). However, these rate cuts were accompanied by a considerable expansion of the income taxable at any given rate. While the top bracket’s rate was reduced by 15 percentage points from 1921 to 1922 in Harding’s Revenue Act of 1921, the income taxable at that rate was expanded from all income over $1,000,000 to all income over $200,000. The net effect was that the percent of individual income collected as revenue through the income tax actually increased from 3.67% to 3.95%. Kuehn (2010, pp. 12-13) argues that the decline in the tax burden came too late to be considered a factor in the recovery from the 1920-21 downturn. Harding entered office as the contours of the new “normalcy” were already emerging.

With regard to the fall in the general price level after the Federal Reserve began increasing the discount rate, it was dramatic. In January 1921, a year after the rate increases began, prices had already fallen by 23.7% to a level last seen in November 1918. The deflation was also accompanied by a sharp decline in wages. The NBER index of average weekly earnings across 12 manufacturing industries declined by 34% from June 1920 to its lowest point in January 1922, a considerably steeper drop than the 19% decline recorded in the BLS’s consumer price index over the same period. Other wage indices showed a more measured decline, such as the New York Federal Reserve’s Composite Wage Index, which fell by only 12%. King (1923) reports a drop in wages of 8.9% from the fourth quarter of 1920 to the first quarter of 1922, with factory wages declining by 14.5% over the same period and wages in Metals and Metal Products falling 20.5%. Other figures are available from the NICB’s 1922 report, Wages and Hours in American Industry, July 1914-July 1921 (see pages 8, 10-11, 16-17, 21, 26, 31). The nominal wage declines did not occur before the beginning of 1921.

Most of the federal spending declines occurred before the beginning of the crisis, even though there was a slight further decline in federal expenditures starting at the beginning of 1921. What Kuehn was pointing out is that the formidable slash in public spending occurred before Harding was even elected president (he took office in March 1921). Kuehn then suggests that the recovery was accomplished by the Federal Reserve System when it initiated discount rate reductions in May 1921 to keep the recovery on track. The graph below shows the discount rates (taken from Jon Catalan’s blog; the old blog, not the one in link).

Can a mere 0.5-point reduction in interest rates really account for the recovery? My impression is that if the reduction leads to a quick recovery, Keynesians will claim that the reduction was sufficient, and if there is no recovery, they will claim that the reduction was not enough. With this kind of logic, they always win. In any case, according to the Bank of Canada, the effect of a change in interest rates on aggregate output takes several months:

The Bank of Canada’s policy actions relating to the overnight interest rate have almost immediate effects on the exchange rate and interest rates, but current estimates suggest that it takes between 12 and 18 months for most of the effect on aggregate output to be observed. Most of the effect on inflation is not apparent for between 18 and 24 months (Duguay 1994). And even these estimates are subject to considerable variation. … In particular, these long time lags mean that central banks must be forward-looking in their policy decisions. … If, on 1 January 2005, the Bank of Canada observes an event in the world economy that is likely to reduce aggregate demand beginning in June of the same year, there is nothing the Bank can do in January to fully offset that shock. Even if it responded immediately and lowered its policy rate in early January, there simply would not be enough time for its policy to stimulate aggregate demand sufficiently to offset the effects of the shock by June.

As noted by Selgin (2014), however, it is even worse in the case of the Fed of the 1920s, whose policy rate was the discount rate rather than the federal funds rate (whose target is achieved through overnight loans of bank reserves). Thus, in the 1920s, a lowering of the Fed’s policy rate (the discount rate) might not even imply an increase in Fed lending or security purchases. In reducing its discount rate, the Fed merely allowed banks possessing the requisite commercial paper to discount that paper with it at the newly reduced rates. Whether they would do so, however, depended on whether the rates in question were low, not merely compared to previous rates, but relative to market rates or to natural rates. If not, the volume of discounting might not budge, and the lower rates would not imply any actual monetary expansion, except perhaps relative to the contraction that might have ensued had rates remained high.

There is another interpretation of the 1921 crisis. Selgin, at freebanking.org, left a comment on The Forgotten Depression by Jim Grant (2014): monetary expansion helped the post-1921 recovery, but not owing to the Fed’s easing.

As you can see from the chart, although there was some increase in “bills discounted” in response to the Fed’s lowering of its discount rate, the increase was slight compared to the massive decline in total Fed non-gold assets since 1920. What’s more, it was more-or-less perfectly–and by implication quite intentionally–offset or “sterilized” by means of Fed sales of government securities. The Fed’s contribution to recovery, in short, consisted, not of any actual monetary stimulus, but of a mere cessation of what had been a precipitous decline in its interest-earning asset holdings.

This isn’t to say that monetary expansion played no part in the post-1921 recovery. In fact, it played a significant part. But the expansion that took place was due solely to gold inflows, which were themselves encouraged by relatively high interest rates as well as by falling prices–that is, by the normal working of the price mechanism rather than by activist Fed policy. (In the 30s as well, by the way, such recovery as took place was entirely the result not of Fed easing–or of fiscal stimulus–but of the dollar’s devaluation and subsequent gold inflows from Europe.) That gold flows (as opposed to Fed easing) contributed to the post-1921 recovery is itself a fact that Jim Grant readily acknowledges; his book’s 17th chapter is called “Gold Pours into America.”

[Figure: A note on America's 1920-21 depression as an argument for austerity (Kuehn 2012), Figure 1; discount rates during the 1920 depression; Fed 1920s assets]

Historical evidence of anti-Gresham’s Law
https://menghublog.wordpress.com/2014/12/16/historical-evidence-that-greshams-law-is-false/
Tue, 16 Dec 2014 17:04:43 +0000

The book Good Money (Selgin, 2008) already showed us that the idea of Gresham’s law as a natural feature of the free market is just plain wrong. Historical evidence of anti-Gresham’s law is so rare that it is all the more important to report it.

But first, let’s recall the principle of this law. Simply put, it says that bad money drives out good money, which reminds us of Akerlof’s lemon markets. The difference is that lemon markets are inherent to free markets, whereas Austrian economists usually believe that only legal tender laws activate Gresham’s mechanism. Imagine that 1 ounce of gold is worth 15 ounces of silver at the market rate, and the government fixes the exchange rate at 1/20. Gold is overvalued: people get more silver for each ounce of gold than they would in a free market. Because silver is undervalued, people will stop making contracts that stipulate silver payments. Silver will disappear from circulation in this country and may be sold in another country where it fetches a higher price.
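The arbitrage at work can be sketched with the hypothetical numbers above (a 15:1 market ratio against a 20:1 legal ratio; the figures are illustrative, not historical):

```python
# Hypothetical bimetallic ratios from the example above.
silver_per_gold_market = 15  # ounces of silver per ounce of gold on the market
silver_per_gold_legal = 20   # ounces of silver per ounce of gold at the fixed legal rate

# Settle a 20-ounce silver obligation with 1 ounce of gold (legal rate), then
# replace that ounce of gold on the open market for only 15 ounces of silver.
profit_per_gold_oz = silver_per_gold_legal - silver_per_gold_market

print(profit_per_gold_oz)  # 5 ounces of silver gained per ounce of gold cycled
```

As long as the legal rate can be enforced, each cycle nets 5 ounces of silver, which is why the undervalued metal leaves circulation.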

When a given currency has the privilege of legal tender laws, it has a fixed exchange rate and thus circulates at a given face value. If the price of coin M is not allowed to vary according to its intrinsic value (its metallic content), then when coin M is traded against coin N, people seeking profits can clip coins M of part of their content, because the debasement of the coin is not reflected in its price. The debased or lightweight coins M are overvalued and full-bodied coins N are undervalued. As a result, coins of type N are melted down to be turned into lighter ones. The ultimate consequence is that if people are not allowed to freely set the price of each coin, the good coins (N) will be withdrawn from circulation. Hülsmann (2008) discusses legal tender laws.

The idea that legal tender laws activate Gresham’s law has been attacked by various authors. We will review their arguments and provide answers.

Fixed transaction cost (Rolnick & Weber, 1986)

Rolnick & Weber (1986) call the law a fallacy, believing that laws enforcing a fixed rate of exchange cannot be maintained because “it would imply potentially unbounded profits for currency traders at the expense of a very ephemeral mint or a very naive public” (p. 186). Bad money will drive good money out of circulation, according to them, but only when use of the good money at its market (nonpar) price is too expensive. Since small change is expensive to use at a nonpar price, they expect small denominations of the money undervalued at the mint to be scarce while large denominations circulate at a premium. They cite several historical accounts to show that despite legal tender laws, people did not follow the rules by adopting the fixed exchange rates, and that fixed transaction costs (Rolnick & Weber’s law) better explain these facts.

Rolnick & Weber (1986) cite several experiences that appear to contradict Gresham’s law. The period between 1792 and 1853 in the U.S.A. contains two such exceptions. One is the U.S. experience with the Spanish milled dollar (good money), which was a heavier coin than the U.S. silver dollar (bad money), containing about 373.5 grains of pure silver compared with 371.25 grains in the U.S. dollar, and which had legal tender standing over this period. From 1792 to 1811, the Spanish dollar circulated at a premium (of 0.25% to 1%) over the U.S. dollar, and it continued to circulate at a premium in later years. The U.S. silver coins failed to drive out the Spanish dollars: instead of being exported or hoarded, that good money circulated at a premium. The other U.S. experience involves gold and silver. Between 1792 and 1834, the U.S. mint overvalued silver. On April 2, 1792, Congress passed a coinage act establishing a ratio of 15 to 1, the par price, between silver and gold coins, which was the market price in 1792. But soon after, the market price for gold rose and remained higher than the par price until June 24, 1834, when the second major coinage act raised the par price to 16 to 1. After mid-1834 and until the early 1850s, when Congress reduced the silver content of all small-denomination coins, the status of gold and silver currency was reversed. The ratio of 16 to 1 was higher than the market price for gold and remained so for the rest of the century. In this period, gold became the mint’s overvalued money. In reality, when gold was undervalued at the mint (1793-1833), 25% of the coinage was still gold, and when silver was undervalued at the mint (1834-46), 45% of the coinage was silver.

Rolnick & Weber (1986) cite two more examples from the U.S.A. One was during the early part of the greenback era (1862-79). Greenbacks were legal tender. Because of speculation on the outcome of the war and resumption, the gold price of these notes fell from their par value when first issued to 91 cents on the dollar by June 27, 1862, to 84 cents by July 22, 1862, and below 40 cents by July 22, 1864. Specie (gold and silver) was the undervalued money. In the West, despite the presence of greenbacks, gold remained the unit of account and a medium of exchange; greenbacks were current there but at a discount. In the East the money system was reversed: greenbacks were accepted as the unit of account and specie circulated at a premium. The other experience came just after the Bland-Allison Act of 1878, when Congress authorized the minting of another silver dollar, the so-called Bland dollar (412.5 grains of silver), which circulated alongside the trade dollar (420 grains of silver). Both were U.S. silver dollars. The Bland dollar was current at par, and the trade dollar circulated at its gold price, which varied around 93 cents. By 1880, the lighter-weight Bland dollar (the legal tender) had failed to drive out the heavier-weight trade dollar, while circulating at a higher price than it.

Rolnick & Weber (1986) cite one experience from England in the 17th century, when the English mint began producing a new gold coin (the guinea) along with the silver shilling. The guinea was first issued in 1663 at the mint price of 20 shillings, yet it never circulated at that price; although not inscribed with any shilling denomination, it was legal tender for all payments, including taxes, at 20 shillings. In 1663 this mint price was well below the guinea’s market price; that is, the guinea was undervalued at the English mint and the shilling was overvalued. Yet the guinea kept circulating: for many years the premium was no more than a couple of shillings, and the price of the guinea thus remained at about 21.5 shillings.

These examples are used to show that legal tender laws didn’t force people to trade at par. But these examples, they say, also show that transaction costs matter. Their theory predicts that undervalued large-denomination currency would circulate at a premium while undervalued small-denomination coins would disappear. This tendency stems from the fact that paying premiums on small-denomination currency tends to be more costly than paying them on large-denomination currency; there are economies of scale in using currency at nonpar prices. During the silver standard period (1792-1833), of the undervalued currency, only the large denominations seem to have circulated. At that time, undervalued large-denomination currency consisted of gold coins and Spanish dollars that contained more silver than the U.S. dollar. While most of the gold was exported, the Spanish dollar circulated for many years at a premium. The small change available during this period consisted of U.S. silver coins and a substantial amount of Spanish coins. The small-denomination Spanish coins contained less silver than the U.S. coins, and the undervalued small U.S. coins had trouble circulating.

Selgin (1996) answered Rolnick & Weber’s (1986) criticism by saying that their theory of transaction costs (Rolnick & Weber’s law) does not replace Gresham’s law; both effects can co-exist. Selgin argues that the failure to adopt par exchange, in the examples cited by Rolnick & Weber (1986), is due to the fact that there was no punishment (and thus no cost) for infringing the law: “Such laws operate, not by actually making legally fixed exchange rates operational, but by making it costly or at least risky for sellers to communicate their monetary preferences to buyers.” (p. 641). So, when nonpar exchange is costly, due to legal tender laws, exchange costs are minimized by employing only money that trades at par. The sellers who attempt to place a discount on bad money or refuse it altogether may be punished, while the would-be buyers who report such discrimination may be rewarded. This situation is akin to a prisoner’s dilemma: sellers would price their goods in terms of bad money rather than good money because they want to avoid the legal penalties involved in refusing bad money, while buyers would incur losses if they offered good money to a seller whose prices are in terms of bad money. This results in an equilibrium, in which bad money is used as both medium of account and medium of exchange, that does not depend on laws explicitly favoring bad money or making the use of good money illegal.

Selgin (1996, p. 644, fn. 9) notes that sovereign authorities may sometimes fail to enforce the circulation of their coins at par. Selgin (1996, p. 646) also notes that the bimetallic regimes (the United States from 1792 to 1834) cited by Rolnick & Weber (1986) did not involve any explicit sanctions against most nonpar exchanges involving either metal. Selgin finally cites three episodes in which legal tender laws explicitly punished infringements, and the results were in accordance with the conventional view of Gresham’s law, contra Rolnick & Weber. He concedes that how strict legal tender laws must be to give effect to Gresham’s law remains an open question.

Information asymmetry (Dutu, 2004)

Dutu (2004) and Dutu et al. (2005) argue that asymmetric information can also activate Gresham’s law. They cite several instances in Europe where the practice was widespread, focusing on moneychangers because of their major role in exchanges. Moneychangers exchanged large-denomination coins for small-denomination ones, and cried-down (demonetized) coins for authorized ones. They were thus the main metal suppliers for the mints, thanks to the metal gathered through their activity.

So, what were they guilty of? Moneychangers appeared in the middle of the 12th century and were widespread all over Europe by the 13th century. They specialized in gathering and selling information on money, and sought profits through two means: billonnage, comparing the intrinsic content of two supposedly identical coins, paying with the bad ones and keeping the heavier ones; and arbitrage, taking coins that were undervalued in one place and bringing them where they were relatively overvalued.

This was possible because the intrinsic content of the coins was difficult for most people to assess. Dutu (2004) attributes this to the great variety of coins, imperfect coinage techniques, frequent mutations, wear, and a poor communication network. As we shall see, the story told by Selgin (2008) reveals that the key element of this theory is plain wrong. It is competition that reduces the benefit of making fake coins, through the multiplication and variety of coins, not the contrary. And it is competition that helps to improve coinage techniques.

Dutu (2004) argues that even if moneychangers were allowed to price two supposedly identical coins differently, they had no interest in doing so because buyers could not tell the coins apart. But in that case, it would also have been profitable for them to sell their services in detecting good and bad coins, as was the case in Great Britain (Selgin, 2008). So, the scenario described by Dutu (2004) is clearly not the only possible outcome. Perhaps there were even laws that hindered the circulation of information and led people to adopt moral hazard behaviors, as happened with Enron when accounting experts were found to be dishonest.

In light of what has been said, if Gresham’s law is still activated despite the absence of legal tender laws, one may wonder whether some restrictions at another level are operating. According to Dutu (2004, p. 561), money changing was the monopoly of a small group in some regions (e.g., Paris and Bruges), while in others moneychangers operated freely (e.g., Brussels and Liege). But that does not mean that the mints faced free competition. As historians have noted, the reason the few episodes of free banking and free coinage did not last is that the government would have lost an extraordinary source of seigniorage profits. When it comes to money, regulation is the rule.

Even if Gresham’s law is activated, it may not last very long. The private sector may find substitutes, as it did in Great Britain when the government failed to supply enough coins of good quality (Selgin, 2008), unless thick layers of regulation prevent the emergence of these substitutes.

Today, we no longer use such valued coins but paper money instead. And Dutu et al. (2005) admit explicitly that Gresham’s law cannot apply to modern economies. Gold and silver coins were melted and recoined as lighter ones, but today’s coins, bills and deposits have no intrinsic value and thus are not melted. Certainly, the government’s control over money could reduce its intrinsic value to nothing. Still, even in a free market economy, paper money would be the medium of exchange, and valued coins or commodity money would remain in the banks, used mainly for interbank clearing (Selgin, 1988). The relevance of Gresham’s law also diminishes with the use of credit cards and electronic money today.

Historical evidence of Gresham’s Law

Tudor England

Selgin (1996) tells us that in Tudor England there was no such thing as freedom of contract in the modern sense. Persons caught profiteering from the internal exchange of coin at other than its par value were subject to fines and imprisonment as well as the forfeiting of any unlawfully exchanged sum. Such laws meant that a buyer had the right to offer in cash payment any coin issued by the mint, and that a seller had to accept such coin at face value regardless of its intrinsic value. The sellers adopted bad coin as their medium of account – a choice reflected in the substantial increase in prices following each episode of debasement. Buyers in turn offered sellers payment in bad money only, returning good coins to the mint in exchange for a share of the (nominal) seigniorage. By early 1549, most of the good (nine-tenths or ten-twelfths fine) silver coins that had been in circulation prior to the debasements had been reminted into coins containing less than half as much silver. By the time of Elizabeth’s accession, debasement had seriously eroded the real value of other nominally fixed government revenues, as well as the public’s real demand for money. Gresham advised the Queen that her only recourse was to restore the coinage by “bringing your basse mony into fine of xi ounces,” a standard last seen in 1527. In September 1560, the Queen took Gresham’s advice, while insisting at the same time that taxes be paid in new coin. As soon as its legal-tender status was revoked, bad money began to be supplanted once again by good money.

Continentals (U.S.A.)

Selgin (1996) tells us that on January 11, 1776, when the Continentals were only five months old, Congress resolved that whoever should refuse to receive Continentals at par should be deemed and treated as an enemy of his country, and be precluded from all trade and intercourse with its inhabitants. States adopted similar resolutions. For instance, on December 27, 1776, the Pennsylvania Council of Safety decided that if any person shall refuse to take continental currency in payment, or for any goods or commodity offered for sale, or shall ask a greater price for any such commodity in continental currency than in any other kind of money or specie, the person so offending shall be considered a dangerous member of society, and forfeit the goods offered for sale or bargained for, to the person to whom the goods were offered for sale or by whom they were bargained for, and shall moreover pay a fine of 5 pounds to the state; and every person so offending, shall for the second offence be subject to the aforementioned penalties, and be banished from this state, to such a place and in such manner, as this Council shall direct. The reward for informing on discriminating sellers and lenders offered an incentive to make legal tender laws even more effective. The Pennsylvania resolution was, moreover, as harsh on sellers who would place a premium on good money as it was on those who would place a discount on bad money, explicitly contradicting Rolnick and Weber’s assertion (1986, p. 193) that a premium on good money would not be in violation of legal-tender laws. These legal-tender laws gave effect to Gresham’s Law precisely to the extent that they failed to secure the actual exchange of good money at par, while successfully discouraging its circulation at a premium. Specie was seldom seen except upon its initial disbursement by newly arrived English and French troops.
For the most part specie went into hoards, emerging only after Congress officially recognized its free-market value relative to paper money on 16 March 1781.

French Revolution

Selgin (1996) tells us that during the French Revolution the 1793 Convention, finding that paper Assignats had depreciated substantially relative to specie, decreed that any person selling gold or silver coins, or making any difference in any transaction between paper and specie, should be imprisoned for 6 years; that anyone who refused to accept a payment in assignats, or accepted assignats at a discount, should pay a fine of 3 thousand francs; and that anyone committing this crime a second time should pay a fine of 6 thousand francs and suffer imprisonment for 20 years. Some months later the same crimes were made punishable by death along with the confiscation of the criminal’s property. As in revolutionary America, rewards were given to informants, while penalties grew more severe. Such laws caused Assignats to replace specie as France’s medium of account while suppressing open quotations of any premium on specie. Instead of being offered in exchange, specie went into hoards, prompting the government to threaten confiscation of any concealed metals. Twelve men actually lost their heads for hoarding specie, on the grounds that they had intended to pay it to the enemy.

Chosun Korea

Kim (2004) relates one episode from late Chosun Korea. The standard copper currency, called sangp’yongt’ongbo (ever-normal cash), had circulated since 1678 in the Chosun dynasty. As the debasement of sangp’yongt’ongbo was not sufficient to meet ever-increasing fiscal expenditure, the Taewongun administration circulated a large-denomination currency named tangbeckchon (100-cash) in 1867. It was not formally convertible into ever-normal cash. The nominal value of 100-cash (bad money) was 100 times that of ever-normal cash (good money), whereas its intrinsic value was only 5-6 times higher: the fixed rate was 100:1 while the market rate was 5-6:1. The state council declared a new legal tender law on 7 January 1867, eight days before the 100-cash began to circulate. The exchange rate between the two coins was fixed by a law that penalized nonpar exchange. Since ever-normal cash was undervalued, it was taken out of circulation. As more and more 100-cash was poured into the economy, prices rose rapidly. The public tried to escape these negative consequences and to substitute the stable ever-normal cash for 100-cash. In response, the Taewongun introduced a much stricter legal tender law on 23 February 1868, requiring that henceforth all payments, including taxes to government agencies throughout the country, be made exclusively in 100-cash. Kim’s (2004) report is exceedingly confusing, but it seems that in the end the government withdrew 100-cash because of its inflationary effect (and the major part of the returned ever-normal cash seems to have come from people’s hoards). The total supply of 100-cash dramatically increased to 16 million yang in a short period of only 6 months, until 16 June 1867.
This amounted to approximately three times the total volume of ever-normal cash coined from 1807 to the 1850s. As a result, the price of rice soared from 7-8 yang per sok in December 1866 to 44-45 yang per sok about two years later, an increase of approximately 600%, for an estimated monthly inflation rate of 7.3-7.5%. Kim (2004) argues that this story confirms the conventional view that legal tender laws cause bad money to drive out good, but also confirms Rolnick & Weber’s law. This is curious, because there is nothing in the details reported by Kim that would confirm Rolnick & Weber’s (1986) theory: that the small-denomination coin was driven out of circulation says nothing about Rolnick & Weber’s law, because the legal tender laws actually favored the large-denomination coin.
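As a back-of-the-envelope check of these figures (assuming the "about two years" corresponds to 24 months of steady compounding, which is my assumption, not Kim's), the implied monthly rate can be computed from the rice prices:

```python
# Rice prices from Kim (2004) as quoted above: 7-8 yang/sok in December 1866,
# 44-45 yang/sok about two years (here assumed to be 24 months) later.
def monthly_rate(p0: float, p1: float, months: int) -> float:
    """Constant monthly inflation rate implied by a price move from p0 to p1."""
    return (p1 / p0) ** (1 / months) - 1

low = monthly_rate(8.0, 44.0, 24)   # most conservative price pairing
high = monthly_rate(7.0, 45.0, 24)  # least conservative price pairing

print(f"{low:.3f} to {high:.3f}")  # roughly 0.074 to 0.081 per month
```

That range sits right around the 7.3-7.5% monthly rate Kim reports, so the quoted figures are at least mutually consistent.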

Historical evidence of anti-Gresham’s Law

Great Britain (Birmingham)

Selgin (2008) tells the story of the Birmingham button makers in Great Britain. The regal coins issued by the Royal Mint could not satisfy the private demand for coins, but manufacturers and other businessmen succeeded in supplying the necessary tokens. The Royal Mint could not rival the private mints, whose coins were of such high quality that they have become historical documents. In fact, higher quality makes counterfeiting more difficult, and so the Royal Mint’s coins were counterfeited on a much greater scale.

There were multiple reasons why the private sector made coins of good quality. According to Selgin (2008, p. 137):

It did so, first of all, because nice coins were good publicity. At a time when there was no national press and when advertisements still consisted of mere notices, tokens “were one of the few media where persuasive – even aggressive – advertising could flourish” (Mathias 1962, 36). Although every token was good for some sort of publicity, the treatment of tokens as advertising platforms is most obvious in some tokens issued by retailers.

The story tells us that fakes could easily be detected and were not widespread. Counter-intuitively, competition makes counterfeiting more difficult: a slight difference in coinage by false coiners is easily detected, thanks to the presence of multiple private mints, each with its own techniques and designs (Selgin, 2008, pp. 141-142). The emergence of a collectors’ market, surely owing to the high quality of the coins, helped a great deal in the detection of false coiners. This is even more remarkable considering that counterfeiting was legal (Selgin, 2008, p. 144).

China (Han Dynasty)

Chen & Lai (2012) tell the story of coinage under China’s Han Dynasty. The free coinage policy under Emperor Wen (179-157 BC) produced coins of greater quality than the central coinage policy of Emperor Wu (140-88 BC).

What were the ingredients of this success? In China, a money-weighing law (MWL) required that all circulating coins be checked using an official money scale; otherwise the user would be punished with ten days of forced labour. The MWL guaranteed that all circulating money would be scrutinised, and it was widely applied throughout the empire. With the use of the scale, when buyers tendered bad money to purchase commodities, the seller would ask for more money to compensate for the underweight coins (according to the quality of the coins as judged by the money scale).

The MWL was an important device for supporting Emperor Wen’s free coinage policy. In the first step, the government provided a standard form of money with a specified metal content (fineness). In the second step they encouraged people to mint coins, and these had to meet the minimum requirements as specified. In the third step they enforced the use of the standard money scale as a public arbitrator, to distinguish good money from bad.

As the number of competitors increased, coiners had to improve the quality of their coins in order to sell more, and excess profits decreased until equilibrium was reached. Ultimately, confidence in private coinage was the consequence of the market mechanism.

Table 1 illustrates the quality of some ancient Chinese coins. Column (a) shows the officially claimed weight of the coin when minted, where the basic unit is zhu (0.651 grams). Column (b) shows the average weight of the coin as found in an archaeological site. Column (c) shows the ratio of the actual to the claimed weights, which reveals the degree of debasement: below 100% means the quality of the coin is degraded, over 100% means its quality is higher than the official standard. Column (d) shows the copper content of the coin (fineness), as reported from laboratory analysis. Column (e) then compares the copper content of each standardised gram in various coins. The usefulness of this comparison is evident from the indexes in column (f). We use the copper content of each standardised gram of the first coin in column (e) as the base (0.43g=100) against which to compare the quality of other coins.

The coins minted under Emperor Wen and Emperor Jing (179–141 BC) have the highest quality (index=205). Column (e) shows that the copper content of the coins was obviously lower under the central coinage regime. The four-zhu (sizhu) coin minted during the reigns of Emperors Wen and Jing (179–141 BC) contains 0.88g of copper for each standardised gram, while the same four-zhu coin minted during the reign of Emperor Wu (140–88 BC) contains only 0.73g of copper for each standardised gram. There are more sizhu (four-zhu) samples from archaeological sites. First, we have 430 sizhu coins minted during the reigns of Emperors Wen and Jing (179–141 BC, free coinage), with a total weight of 1,223.15g and an average weight per coin of 2.84 g. Second, we have 75 sizhu coins minted under Emperor Wu (140–88 BC, central coinage), with a total weight of 174.3g and an average weight per coin of 2.32 g.
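The arithmetic behind these index and average-weight figures (all taken from the passage above) is straightforward to reproduce:

```python
# Quality index from Table 1, column (f): copper content per standardised gram,
# indexed against the first coin's 0.43 g (= 100).
BASE_COPPER_PER_GRAM = 0.43

def quality_index(copper_per_gram: float) -> float:
    return copper_per_gram / BASE_COPPER_PER_GRAM * 100

wen_jing_index = quality_index(0.88)  # sizhu under Emperors Wen and Jing: ~205
wu_index = quality_index(0.73)        # same coin under Emperor Wu: ~170

# Average weight per excavated sizhu coin, from the totals quoted above.
avg_wen_jing = 1223.15 / 430  # free coinage: ~2.84 g per coin
avg_wu = 174.3 / 75           # central coinage: ~2.32 g per coin

print(round(wen_jing_index), round(wu_index))
print(round(avg_wen_jing, 2), round(avg_wu, 2))
```

The index of 205 for the Wen/Jing coins matches the figure cited in the text, which confirms how column (f) is constructed from column (e).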

So much for the anti-Gresham episode. In the Qin Empire (221-206 BC), by contrast, the emperor forced people to accept all types of money in circulation, and bad money drove out good money. The Empire was then unable to issue a sufficient quantity of good-quality new coins in the short term. Interestingly, this episode resembles what happened to the Royal Mint in Great Britain (Selgin, 2008). The state stipulated that all existing coins were legal and usable if they satisfied four conditions: they were not seriously damaged, they were not made with lead, the inscription was identifiable, and the diameter was greater than 1.8 centimetres. Anyone who refused the money was punished, and private minting was now illegal.

Finally, in 144 BC Emperor Jing adopted the central coinage policy, ending the first and only free-coinage golden age in Chinese history. The sizhu became unstable in weight and fineness, and ultimately the quality of the coins deteriorated. A total of 10,436 five-zhu coins were found. The average weight of these five-zhu coins decreased from 3.35g (minted under Emperor Wu, 140–88 BC) to 3.26g (minted under Emperor Zhao, 87–75 BC), and then to 3.07g (minted under Emperor Xuan, 74–50 BC, and Emperor Ping, 1–5 AD). The average weight of the five-zhu coins continued to decrease, reaching 2.86g in the Eastern Han Dynasty. The downgrade in quality, according to the authors, was mainly due to the pressure of state finance.

New Orleans (U.S.A.)

Pecquet & Thies (2010) report the story. During the first year of occupation by Federal troops in 1862, the Union commander repudiated the city’s Confederate currency. The money supply of New Orleans was thus destroyed by the repudiation of Confederate currency and by the diminished usefulness as money of the notes and deposits of the city’s weaker banks and of the notes of the stronger ones. As the money supply declined, the city government of New Orleans faced a significant deficit.

In 1863, the city government issued a large amount of municipal scrip, in denominations ranging from change notes to 20-dollar bills, which had no specified maturity date and were not redeemable in any form of money. But they were acceptable for city taxes and thus could be said to be tax-backed. Through 1864, the cumulative amount of municipal scrip issued was arguably small relative to the city’s need for a medium of exchange. Even though the only backing for these notes was their receivability by the city for taxes, they passed at par in retail transactions, and were the main hand-to-hand currency and the standard of value in retail transactions through early 1866. In larger transactions and with brokers, they exchanged at no more than a small discount. By the end of May 1863, the New Orleans Bee reported that city treasury notes circulated at par while Greenbacks commanded small premiums varying from 1–2% to 2.5–3.5%. By the end of 1863, the paper reported that, although both currencies traded at par in retail transactions, brokers required a one-half percent premium for legal tender notes.

But beginning in 1865, several factors undermined municipal scrip, including (1) the revival of banking, (2) growing concerns about overissue, and (3) certain actions by both city and federal authorities. The revival of banking might be said to have begun in 1863, with the return of the Louisiana State Bank to its management early in the year and the incorporation of the First National Bank of New Orleans later in the year. Through 1864, however, uncertainties plagued the state-chartered banks of the city. These uncertainties involved the possible recovery of the specie taken from them by the Confederates, the enforcement of the banks’ claims against planters, and the value of the state and municipal bonds held by the banks. With the end of the war, these uncertainties were generally resolved in favor of the state banks, resulting in a rise in the value of bank notes and deposits to par for all but one of the city banks.

Large-denomination scrip becomes uncurrent. The city’s net emissions of scrip proceeded at a rapid pace: from 1864 to 1867, the quantity of city notes in circulation increased from $2.3 to $4.0 million. During 1867, state scrip became uncurrent within New Orleans. Around the same time, the banks refused to accept New Orleans municipal scrip as deposits, and broker discounts on New Orleans municipal notes increased to about 5%. The continuing emissions by the city to deal with its own financial troubles soon renewed the depreciation of the city scrip.

Small-denomination scrip also becomes uncurrent. By late 1867, New Orleans city officials wanted to end completely the acceptability of municipal scrip for taxes. At some point, General Hancock himself undermined the city notes by rejecting the tax-backing of state notes. Thereafter, Louisiana state notes could be accepted only for tax arrears or redeemed by some other mechanism, such as a swap for state bonds. Immediately, the value of Louisiana state scrip plunged in the currency markets, from discounts of 30% to discounts of 50% against the U.S. dollar. At this point, the market value of small-denomination city scrip fell to a point where retailers balked at receiving it because of the substantial cut they were taking from brokers. Overnight, the city’s retail stores refused to accept small-denomination city scrip in trade.

So, in the end, the municipal notes were accepted as long as their value was stable, but were dismissed as soon as their value plummeted owing to issuance of notes in excess of demand. This exemplifies how people choose their money when legal tender laws and punishments are not enforced.

Conclusion

The historical evidence is scarce, but it leads to the same observation: good money drives out bad money. We have seen that there are many ways by which the market can sort out the good from the bad.

Good money drives out bad - a note on free coinage and Gresham's law in the Chinese Han Dynasty (Chen & Lai 2012), Table 1

Get (not so easily) introduced to R
https://menghublog.wordpress.com/2014/12/13/get-not-so-easily-introduced-to-r/
Sat, 13 Dec 2014 06:26:54 +0000

I dislike R, unlike some other software I use, such as SPSS and Stata. It is extremely error-prone. But it is free, and it can do almost everything (e.g., a few things Stata cannot do and a lot of things SPSS/AMOS cannot do). As always, I will update this page whenever I learn something new.

This is what I think about R. But R doesn’t understand my language.

I don’t care. This software is ugly. But I will try to make it as simple as I can.

Before going any further

I recommend downloading http://openpsych.net/datasets/GSSsubset.7z. That is the subset of the General Social Survey data I will be using for the analyses below, so if you want to follow along, you should get it. And do not forget to run the entire syntax displayed in Create and modify variables.

Essential things to know first

R seems to work badly when you load .sav or .dta files, and perhaps other formats as well. Of course, there are packages such as foreign or memisc, but with some of the packages I used for my statistical analyses they just did not work. So I highly recommend using .csv files, and nothing else. R also reads .txt files well, but a plain .txt file has no separate rows or columns and is difficult to convert into an Excel file.

R has no menus from which you can run your statistical analyses; you have to use syntax, which is why it is error-prone. In other software, when you use the menus, the output is given along with the associated syntax, which makes things very easy. But R is R.

R is case sensitive. If your variable is named “race” and you type “Race” it will not work.

R is pretty stupid when it comes to missing data: it is the only software here that does not handle missing data automatically, and it forces us to deal with them explicitly.

The files must be loaded first. On Windows, you must use \\ instead of \ in the file path or it will fail. The function read.csv(), as you guessed, reads a .csv file; you must then save the result into an object, whatever you call it. Let's say "entiredata". When importing a .csv file, you do not need data.frame(), because R automatically stores the result as a data frame.
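A minimal, self-contained sketch (a throwaway temporary file stands in for the real GSS .csv; with your own data you would pass the full path instead):

```r
# Write a tiny csv to a temporary file, then read it back
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(age = c(25, 71), wordsum = c(6, 8)), tmp, row.names = FALSE)
entiredata <- read.csv(tmp)  # read.csv() already returns a data frame
is.data.frame(entiredata)
```

On Windows the call would look like read.csv("C:\\mydata\\GSSsubset.csv") (hypothetical path); note the doubled backslashes.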

A lot of specific functions require specific packages. We install them once with install.packages() and load them with library(); the command require() does the same thing. You only need to install a package once, but you must load it each time you open the RGui (the R console).

install.packages('psych')
library(psych)

Useful functions (to get your feet wet)

Sys.setenv(LANG = "en") # set the R environment language to English
help(table) # request the help page for the table() function
View(d) # open the data window of the object "d"
head(d, n=10) # display the first ten rows of all variables
tail(d, n=10) # display the last ten rows of all variables
require(dplyr) # provides select() to keep the variables you wish
dk <- select(d, worda, wordb, wordc, wordd, worde, wordf, wordg, wordh, wordi, wordj, wordsum, bw1) # keep the variables you want (useful when you have too many variables)
dk = dk[complete.cases(dk),] # keep only cases with no missing data on any variable in dk
d$bw1 <- NULL # drop variables individually
d <- subset(entiredata, age<70 & age>=18) # retain only cases having age<70 and age>=18
xtabs(~degree+sex+race, data=d, subset=wordsum<11) # acts as a "do if": keeps non-missing wordsum, since all valid values are below 11
(56 + 31 + 56 + 8 + 32) / 5 # returns 36.6, but I usually do this kind of thing in Excel or Kingsoft Spreadsheets

Create and modify variables

Please run the entire syntax in your R console. Otherwise, you will never be able to work through the regressions, structural equation models, factor analyses, and item response analyses that I have prepared. Do not forget to check your directory path and make sure your file has been converted into .csv.

The following code attaches value labels to your categorical variables. Remember that once you do this, you can no longer use the variable for the creation of another variable; e.g., you may need the numeric bw1/race to compute Y-hat after a regression.
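For instance (a toy sketch; the coding 1 = white, 2 = black is an assumption here, so check it against the real codebook):

```r
d <- data.frame(bw1 = c(1, 2, 2, 1, 1))
d$race <- factor(d$bw1, levels = c(1, 2), labels = c("white", "black"))
table(d$race)  # the labels now appear instead of the numeric codes
```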

describeBy() is particularly useful. You get N, mean, 10% trimmed mean, median, standard deviation, standard error of the mean, skewness, kurtosis, min, max, and finally “mad” which is a robust estimator of the standard deviation that is computed by a transformation of the median absolute deviation (mad) from the sample median. describe() does the same thing but without groupings.

require(psych) # for describe() and describeBy()
Meng_Hu_summons_the_undead = table(d$wordsum, d$degree, d$bw1) # cross-tabulate the selected variables (N per cell)
addmargins(Meng_Hu_summons_the_undead) # N by cell plus N sums by rows and columns
addmargins(prop.table(Meng_Hu_summons_the_undead, margin=2)) # expressed as proportions (column-wise, margin=2), with sums by rows and columns
with(d, table(degree, race, sex)) # another way of doing cross-tabulations (N per cell)
xtabs(~degree+bw1+gender, data=d, na.action=na.omit) # same thing
xtabs(~degree+bw1+gender, data=d, subset=wordsum<11) # drops missing wordsum, since all valid values are below 11
describeBy(d$degree, d$race) # summary stats of degree by race
describeBy(d$degree) # it also works with a single variable
by(d$wordsum, INDICES=d$educ, FUN=mean, na.rm=TRUE) # mean of wordsum by education level
mean(d$wordsum, trim=0.10, na.rm=TRUE) # 10% trimmed mean for checking outliers; na.rm also removes those stupid NAs
mean(d$wordsum, trim=0.05, na.rm=TRUE) # 5% trimmed mean for checking outliers
shapiro.test(d$logincome) # this will not work: shapiro.test() refuses samples larger than 5000

The histogram gives the distribution of wordsum by race. The boxplot displays wordsum mean score by race and by gender.
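The plots themselves were lost when the post was exported; here is a minimal sketch that draws plots of this kind on toy data (the real analysis uses the GSS variables wordsum, bw1, and gender):

```r
set.seed(1)
d <- data.frame(wordsum = sample(0:10, 200, replace = TRUE),
                bw1 = sample(c("white", "black"), 200, replace = TRUE),
                gender = sample(c("male", "female"), 200, replace = TRUE))
hist(d$wordsum, breaks = -0.5:10.5, main = "Wordsum", xlab = "wordsum")
boxplot(wordsum ~ bw1 + gender, data = d)  # wordsum by race and gender
```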

Using cor() without specifying use="" will give you NA whenever data are missing. cor() needs to be used with use="complete.obs" (listwise deletion) or use="pairwise.complete.obs"; otherwise, it will not work if there are missing data. cor() can be applied to the entire data set stored in the object "d", but be careful if you have too many variables.
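A self-contained illustration with toy data (the variable names only mimic the GSS ones):

```r
set.seed(1)
d <- data.frame(wordsum = sample(0:10, 100, replace = TRUE),
                educ = sample(8:20, 100, replace = TRUE))
d$educ[1:5] <- NA                      # introduce some missing data
cor(d)["wordsum", "educ"]              # NA: the default use="everything" propagates NAs
cor(d, use = "pairwise.complete.obs")  # a complete correlation matrix
```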

rcorr() (from the Hmisc package) produces three matrices: one for the correlations, one for the sample sizes, and one for the asymptotic p-values. It uses pairwise deletion, so it handles missing values automatically.
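A sketch, assuming the Hmisc package is installed (rcorr() expects a matrix, hence cbind()):

```r
library(Hmisc)
set.seed(1)
m <- cbind(wordsum = sample(0:10, 100, replace = TRUE),
           educ = sample(8:20, 100, replace = TRUE))
out <- rcorr(m)  # a list: $r (correlations), $n (sample sizes), $P (p-values)
out$r
```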

cor.test() gives confidence intervals (only for Pearson) and lets you specify the alternative hypothesis (against the null), which may be "two.sided", "less" (if the predicted correlation is negative), or "greater" (if the predicted correlation is positive).
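For example (toy data; the one-sided test assumes we predicted a positive correlation):

```r
set.seed(1)
x <- rnorm(50)
y <- x + rnorm(50)
ct <- cor.test(x, y, alternative = "greater")  # H1: correlation > 0
ct$conf.int   # confidence interval (Pearson only)
ct$p.value
```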

Two formulas are possible: lm(d$y ~ d$x1 + d$x2) or lm(y ~ x1 + x2, data=d). You can use sampling weights, and you can use a subset of the data, e.g., an analysis on males only by using subset=d$gender<1. The inconvenience is that fitted() will probably not work then.
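A self-contained sketch of both forms (toy data; the coding of gender as 0/1 with males below 1 is an assumption mirroring the blog's subset):

```r
set.seed(1)
d <- data.frame(x1 = rnorm(100), x2 = rnorm(100),
                gender = rbinom(100, 1, 0.5))
d$y <- 1 + 2 * d$x1 - d$x2 + rnorm(100)
fit_a <- lm(d$y ~ d$x1 + d$x2)                           # first form
fit_b <- lm(y ~ x1 + x2, data = d, subset = gender < 1)  # males only
coef(fit_b)
```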

anova() can give you the significance of the difference in model fit between two nested regressions, but remember that significance tests are affected by sample size. I would largely prefer to employ the Cohen's f² measure of effect size suggested by Selya et al. (2012).
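Cohen's f² for the contribution of an added predictor is (R2_full - R2_reduced) / (1 - R2_full); anova() does not report it, but it is one line to compute (toy data below):

```r
set.seed(1)
d <- data.frame(x1 = rnorm(100), x2 = rnorm(100))
d$y <- 1 + 2 * d$x1 + 0.5 * d$x2 + rnorm(100)
reduced <- lm(y ~ x1, data = d)
full <- lm(y ~ x1 + x2, data = d)
anova(reduced, full)  # significance of the difference in fit
f2 <- (summary(full)$r.squared - summary(reduced)$r.squared) /
  (1 - summary(full)$r.squared)
f2
```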

Errors-in-Variables (EIV) regression

This is a linear regression which corrects for measurement error in each independent variable individually (although I recommend SEM whenever it is possible). There is no R package available, but Culpepper & Aguinis (2011) provide a handmade program. The results are identical to those obtained from the eivreg function in Stata; the intercept, R², and confidence intervals seem to be missing, however.

Relative importance analysis

This is a technique which is claimed to be able to disentangle the tangle of correlations between the independent variables. According to its proponents, relative importance analysis can tell which independent variable is the strongest predictor, while classical regression cannot.

Tobit regression

I know of two functions: censReg() and vglm(), from the censReg and VGAM packages. censReg displays numbers not available in VGAM, so the two are complementary; however, censReg does not (yet) allow sampling weights. Tobit requires you to specify the left (lower) and/or right (upper) censoring value of the dependent variable. Here, wordsum has values 0-10, so I use 10 as the upper limit.
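A sketch of the censReg() call, assuming the censReg package is installed (the toy data mimic a 0-10 score censored at both ends; the real call would use wordsum and the GSS predictors):

```r
library(censReg)
set.seed(1)
d <- data.frame(educ = rnorm(200, mean = 13, sd = 3))
d$wordsum <- pmin(pmax(round(2 + 0.4 * d$educ + rnorm(200)), 0), 10)
fit <- censReg(wordsum ~ educ, left = 0, right = 10, data = d)
summary(fit)
```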

Please don't try to extract the residuals in the usual way, even though you can. R is not Stata; R is pretty stupid, while Stata knows when it is doing something wrong. In tobit, Y is treated not as an observed variable but as a latent one, so the residuals cannot be extracted in the usual way. You must use the method elaborated by Cameron & Trivedi (2009, pp. 535-538).

Logistic regression

LR evaluates how the probability of scoring 1 (versus 0) on a binary Y variable varies across the values of the group variables or continuous variables of interest. Usually, people use socio-economic variables. But one can also perform a Differential Item Functioning (DIF) test with LR (Oliveri et al., 2012). The theory is that an unbiased item's responses across groups must produce identical curves for all groups when conditioned on the total test score and, eventually, on the interaction of group membership with total score.

anova() shows whether the model fit changes significantly as variables are sequentially added, and drop1() shows whether the model fit changes significantly when a given variable is dropped from the full model (i.e., the one with all X variables). Both functions can be used with F, LRT, Rao, or Cp instead of Chisq. See help(anova).
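A self-contained sketch (toy binary data; in a DIF setting, y would be the item response and the predictors would be the total score and group membership):

```r
set.seed(1)
d <- data.frame(x1 = rnorm(200), x2 = rnorm(200))
d$y <- rbinom(200, 1, plogis(0.5 * d$x1 - 0.3 * d$x2))
fit <- glm(y ~ x1 + x2, data = d, family = binomial)
anova(fit, test = "Chisq")  # fit change as variables are added sequentially
drop1(fit, test = "Chisq")  # fit change when each variable is dropped from the full model
```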

Multilevel (linear) regression

I use the package lme4 for LME models. REML is the default option; set REML=FALSE if you want ML estimation. If you estimate a random intercept model, you write (1 | cohort21); if you estimate a random slope model, you write (bw1 | cohort21), where bw1 is the slope. You can specify multiple random slopes. If you want to add another level, say, survey year, you write (bw1 | cohort21) + (1 | year). In this way, with age as a fixed effect, you can completely isolate the variation of Y due solely to age, period, and cohort effects: the APC model of Yang & Land (2008).
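A sketch of the APC-style call, assuming lme4 is installed (the toy variables only mimic the GSS names cohort21 and year):

```r
library(lme4)
set.seed(1)
d <- data.frame(wordsum = rnorm(500, 6, 2),
                age = sample(18:69, 500, replace = TRUE),
                bw1 = rbinom(500, 1, 0.2),
                cohort21 = sample(1:21, 500, replace = TRUE),
                year = sample(1972:2012, 500, replace = TRUE))
# Random intercept and slope by cohort, random intercept by survey year
fit <- lmer(wordsum ~ age + (bw1 | cohort21) + (1 | year), data = d, REML = FALSE)
summary(fit)
```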

The package MuMIn is also needed if you want to compute the R²GLMM of Johnson (2014) for random intercept and random slope models. Johnson himself has a self-made syntax (see below) which produces results similar to MuMIn's. The package arm is needed if you want the standard errors of the random coefficients.

Below, I show how to use difR. It is tedious, because the items are selected not by their variable names but by the order in which they are listed in the data window. So let's create a subset of our data d in the object dk. We get a dk data frame with 12 variables. Let's look at the data window.

In the data window, worda to wordj appear to be listed in columns 2:11 and bw1 in column 13. This is false. If you try to select column 13, R will shoot you this message: "Error in `[.data.frame`(dk, , 13) : undefined columns selected". Indeed, the row.names column does not count as a real variable column. Thus, worda:wordj are in columns 1:10, wordsum in column 11, and bw1 in column 12. Given this, we do:
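The call itself did not survive the export. Here is a hedged reconstruction with toy data: difStd() is the difR routine whose arguments match those described below, though the original post may have used another difR function.

```r
library(difR)
set.seed(1)
# Toy 0/1 item responses standing in for worda:wordj, plus the group variable bw1
dk <- as.data.frame(matrix(rbinom(200 * 10, 1, 0.6), ncol = 10))
names(dk) <- paste0("word", letters[1:10])
dk$wordsum <- rowSums(dk[, 1:10])
dk$bw1 <- rbinom(200, 1, 0.3)
# Items by position (columns 1:10), group in column 12, as explained above
res <- difStd(Data = dk[, 1:10], group = dk[, 12], focal.name = 1,
              purify = TRUE, nrIter = 10, thrSTD = 0.10)
res
```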

Notice that column 11 is not even specified in the syntax. The reason, I think, is that the total score variable wordsum is used as the matching (scaling) criterion. Here is a description of the arguments used:

anchor=c() specifies the item(s) included in your anchor set.
correct asks: should the continuity correction be used? (default is TRUE).
purify asks: should the method be used iteratively to purify the set of anchor items? (default is FALSE).
nrIter sets the maximal number of iterations in the item purification process (default is 10).
thrSTD is the threshold (cut-score) for the standardized P-DIF statistic (default is 0.10).
thrTID is the threshold for detecting DIF items with the TID method (default is 1.5).
save.output will save your output into a new file in your computer's folder.

PCA, FA, CFA and MGCFA

Factor Analysis or Principal Component Analysis should be carried out first. We get the pattern of factor loadings to be used in CFA/MGCFA, and we determine the number of factors to extract. In most applications of MGCFA, I don't see people using parallel analysis to examine how many factors to extract (although it should be noted that it is sensitive to sample size). Sometimes they use the scree plot, but eyeballing this tedious plot is no easy task. And sometimes they use Kaiser's eigenvalue-greater-than-one rule, which is the worst of all, since the cut-off is arbitrary and there is no clear delimitation (e.g., a factor with an eigenvalue of 1.01 is considered major while another of 0.99 is considered trivial). A recommended reading is Courtney (2013). Another approach is to look at the theory within the field of study for indications of how many factors to expect, given their interpretability. That may be a better approach because it lets the science drive the statistics rather than the statistics drive the science.
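A sketch of parallel analysis with the psych package (toy data; with the real items you would pass the worda:wordj columns):

```r
library(psych)
set.seed(1)
X <- data.frame(matrix(rnorm(300 * 6), ncol = 6))
pa <- fa.parallel(X, fa = "both")  # compares observed vs simulated eigenvalues
pa$nfact  # suggested number of factors
```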

When this step is done, we combine the two covariance matrices (group1 + group2) and assess the model fit under equality constraints on the pattern of loadings, the factor loadings, the means (intercepts), and, eventually, the residuals. These are known as configural invariance, metric invariance, scalar invariance, and residual invariance.
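A sketch of this invariance sequence in lavaan (an assumption: the blog's actual models follow Dolan 2000, whereas this uses lavaan's built-in HolzingerSwineford1939 data with school as the grouping variable):

```r
library(lavaan)
model <- 'visual =~ x1 + x2 + x3'
fit_configural <- cfa(model, data = HolzingerSwineford1939, group = "school")
fit_metric <- cfa(model, data = HolzingerSwineford1939, group = "school",
                  group.equal = "loadings")
fit_scalar <- cfa(model, data = HolzingerSwineford1939, group = "school",
                  group.equal = c("loadings", "intercepts"))
anova(fit_configural, fit_metric, fit_scalar)  # compare the successive constraints
```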

The following example is taken from the input data analyzed by Dolan (2000). This is an attempt at replication, so you should think about reading Dolan's article before going further. The purpose is to test a model with and without a second-order latent factor.