Stochastic Trend

Sunday, November 20, 2016

I've been asked to make some brief comments at the ANU Energy Change Institute's 2016 Energy Update on the 2016 World Energy Outlook just published by the IEA. It's a huge report, but I'll focus on the global projections for energy use and GHG emissions. I think that the IEA are still over-optimistic about the potential for energy intensity improvements and underestimate the future contribution of non-fossil energy. Under the "Current Policies" scenario they expect fossil fuels to supply 79% of total energy in 2040 vs. 81% today. The current rapid growth of renewables under existing policies makes me skeptical about that. The projected decline in world energy intensity is also more rapid than in recent decades.

Three main scenarios used throughout the report are summarized in the following Figure:

The "New Policies Scenario" includes policies from NDCs where a policy to implement the pledge appears to actually exist. The "450 Scenario" is one where policies that would actually limit warming to 2 degrees C are implemented. Clearly, decarbonization is minimal under the Current Policies Scenario and not that great under the New Policies Scenario. But the improvement in energy intensity is very large under all scenarios and does the vast majority of the work in reducing CO2 emissions. How plausible is this huge reduction in energy intensity? Here, I plot the historical global trend in energy intensity and the growth rates projected under the Current and New Policies Scenarios:

The Current Policies Scenario projects an increase in the rate of reduction in energy intensity relative to the 1990-2015 mean. This is possible; the rate of change might accelerate, but I am skeptical. Just looking at the data, we see that in the last few business cycles energy intensity rose, or fell only slowly, in the aftermath of recessions compared with the later parts of booms. We seem likely to go through more cycles like these. Another issue is that the Chinese economy might have grown more slowly in the last couple of years than the government admitted. This would have exaggerated the global decline in energy intensity, though probably not by a lot. The main reason for my skepticism is that energy efficiency improvements do not translate one-for-one into reductions in energy intensity. The rebound effect, which we are researching under our ARC DP16 grant, means that improvements in energy efficiency lead to increases in the use of "energy services" such as heating, lighting, and transport, so that energy use does not fall as much as it would if the full efficiency improvement flowed through to energy consumption. At the micro-economic level this is simply because the efficiency improvement makes these energy services cheaper. At the macro level things are more complicated. I suspect that the IEA's model, which is driven by exogenous assumptions about things like the rate of economic growth, underestimates the economy-wide rebound effect.
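The micro-economic logic can be made concrete with some simple arithmetic. This is an illustrative sketch only; the efficiency gain and demand elasticity below are assumptions I've picked for the example, not estimates from our project or the IEA:

```python
# Illustrative rebound arithmetic with made-up numbers (the efficiency gain
# and demand elasticity are assumptions for this sketch, not estimates).

efficiency_gain = 0.20       # engineering improvement in energy efficiency
price_elasticity = -0.4      # assumed demand elasticity for energy services

# The efficiency gain cuts the effective price of energy services by 20%,
# so demand for those services rises by roughly elasticity * price change.
service_demand_change = price_elasticity * (-efficiency_gain)  # +8%

potential_savings = efficiency_gain            # if service demand were unchanged
actual_savings = 1 - (1 + service_demand_change) * (1 - efficiency_gain)
rebound = 1 - actual_savings / potential_savings  # share of savings "taken back"

print(f"potential savings {potential_savings:.0%}, "
      f"actual savings {actual_savings:.1%}, rebound {rebound:.0%}")
```

With these assumed numbers, about a third of the engineering savings never show up in energy consumption, which is why efficiency improvements don't map one-for-one into intensity declines.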

Wednesday, October 26, 2016

Most studies of global climate change using econometric methods have ignored the role of the ocean. Though these studies sometimes produce plausible estimates of the climate sensitivity, they universally produce implausible estimates of the rate of adjustment of surface temperature to long-run equilibrium. For example, Kaufmann and Stern (2002) find that the rate of adjustment of temperature to changes in radiative forcing is around 50% per annum even though they estimate an average global climate sensitivity of 2.03K. Similarly, Kaufmann et al. (2006) estimate a climate sensitivity of 1.8K, while the adjustment coefficient implies that more than 50% of the disequilibrium between forcing and temperature is eliminated each year. Furthermore, the autoregressive coefficient in the carbon dioxide equation of 0.832 implies an unreasonably high rate of removal of CO2 from the atmosphere. The methane rate of removal is also very high.

Simple AR(1) models in I(1) variables of this type assume that temperature adjusts exponentially towards the long-run equilibrium. The estimated adjustment rate tends toward that of the fastest-adjusting process in the system if, as is the case here, that process is the most prominent in the data. Schlesinger et al. (no date) illustrate these points with a very simple first-order autoregressive model of global temperature and radiative forcing. They show that such a model approximates a model with a simple mixed-layer ocean, and the parameter estimates can be used to infer the depth of such an ocean. The models that they estimate imply ocean depths of 38.7-185.7 meters. Clearly, an improved time series model needs to simulate a deeper ocean component.
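A back-of-envelope version of that depth inference can be sketched as follows. All the numbers here are illustrative assumptions (a 50%/yr adjustment rate as in the studies cited above, a sensitivity of 2 K), not the actual estimates of Schlesinger et al.; the point is that a fast adjustment rate implies an implausibly shallow effective ocean:

```python
import math

# Back-of-envelope inference of a mixed-layer ocean depth from an AR(1)
# adjustment rate. All numbers are illustrative assumptions, not estimates
# from Schlesinger et al. or Kaufmann and Stern.

ar_coefficient = 0.5        # ~50% of disequilibrium removed per year
climate_sensitivity = 2.0   # K per doubling of CO2 (assumed)
f2x = 3.7                   # W/m^2 forcing from doubled CO2
seconds_per_year = 3.156e7
rho_cp = 4.1e6              # volumetric heat capacity of seawater, J/(m^3 K)

feedback = f2x / climate_sensitivity        # W/(m^2 K)
tau_years = -1 / math.log(ar_coefficient)   # e-folding adjustment time, years
heat_capacity = feedback * tau_years * seconds_per_year  # J/(m^2 K)
depth = heat_capacity / rho_cp              # implied mixed-layer depth, m

print(f"e-folding time {tau_years:.1f} years -> implied ocean depth {depth:.0f} m")
```

An adjustment rate of 50% per year implies a mixed layer of only around 20 meters under these assumptions, far shallower than the real ocean heat sink.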
Stern (2006) used a state space model inspired by multicointegration. The estimated climate sensitivity for the preferred model is 4.4K, which is much higher than previous time series estimates, and temperature responds much more slowly to increased forcing. However, this model only used data on the top 300m of the ocean and the estimated increase in heat content in the pre-observational period seems too large.

Pretis (2015) estimates an I(1) VAR for surface temperature and the heat content of the top 700m of the ocean using observed data for 1955-2011. The climate sensitivity is 1.67K for the preferred model, but 2.16K for a model that excludes the level of volcanic forcing from the radiative forcing aggregate, entering it only in first differences. With two cointegrating vectors it is not possible to "read off" the rate of adjustment of surface temperature to increased forcing, and Pretis does not simulate impulse or transient response functions.

Monday, October 24, 2016

Estimates of the climate sensitivity have been the focus of ongoing debate, with widely differing estimates (Armour, 2016) and notable differences between observation-based and model-based estimates. The consensus in the IPCC 5th Assessment Report (Bindoff et al., 2013) is that the equilibrium climate sensitivity (ECS) falls in the range of 1.5-4.5 K with more than 66% probability, and the transient climate response (TCR) in the range 1-2.5 K with more than 66% probability. Armour (2016) notes that the range of ECS supported by recent observations is 1-4 K with a best estimate of around 2 K, while the TCR is estimated at 0.9-2.0 K. This suggests that climate model based estimates are too sensitive.

Richardson et al. (2016) note that sea surface temperature measurements measure water rather than air temperature, which has warmed faster. Additionally, the most poorly measured regions on Earth, such as the Arctic, have also warmed the most. Richardson et al. (2016) process the CMIP5 model output in the same way as the HADCRUT4 temperature series is constructed – using seawater temperatures and under-sampling some regions. They infer an observation-based best estimate for TCR of 1.66 K, with a 5–95% range of 1.0–3.3 K, consistent with the climate models considered in the IPCC 5th Assessment Report.

Marvel et al. (2016) argue that the efficacy of other forcings is typically less than that of greenhouse gases, so that total radiative forcing is less than standard calculations estimate. When single-forcing experiment results are used to estimate these efficacies, and TCR and ECS are then estimated from observed twentieth-century warming, both estimates are revised upward, to 1.7 K and 2.6-3.0 K respectively, depending on the feedbacks included. Armour (2016) highlights the joint (multiplicative) importance of the Richardson et al. (2016) and Marvel et al. (2016) studies, which together should raise observational ECS by 60%, reconciling the discrepancy between observation-based and model-based estimates.

Historical record high global temperatures occurred in 2015 and are expected in 2016. Nevertheless, the period between 1998 and 2014, when surface temperatures increased much more slowly than in the previous quarter century, has been the subject of intense scrutiny. As the search for the missing pieces of the puzzle began, a number of potential culprits surfaced.

Among the suggested candidates were an increase in anthropogenic sulfur emissions (Kaufmann et al., 2011), declining solar irradiance (Tollefson, 2014; Trenberth, 2015; Kaufmann et al., 2011), and an increase in volcanic aerosols (Andersson et al., 2016) over the examined period, which also coincided with a negative phase of the Pacific Decadal Oscillation (PDO). Similarly, Fyfe et al. (2016) mention anthropogenic sulfate aerosols as contributing factors to the earlier hiatus period from the 1950s to the 1970s. Smith et al. (2016) recently suggested that anthropogenic aerosol emissions might be a driver of the negative PDO. This, however, contrasts with the findings of Kosaka and Xie (2013), who attribute the hiatus with high probability to internal variability rather than forcing.

Karl et al. (2015) argued that the apparent hiatus was due to mis-measurement of surface temperature data. They corrected the temperature data for several biases, finding the resulting warming trends for 1950-1999 and 2000-2014 to be "virtually indistinguishable". However, their approach was critiqued by, among others, Fyfe et al. (2016), who argue that the starting and ending dates of the observation period matter significantly, as the 1950-1970 period also included a big hiatus.

The majority of recent studies agree, however, that the exchange of heat between the atmosphere and the oceans is a key player in explaining the surface warming slowdown. Nonetheless, the mechanisms by which oceans absorb and then release heat were not well understood until recently, when this process was found to be closely linked to the decadal oscillations of the oceans. Decadal ocean variability, in particular the Pacific Decadal Oscillation (PDO), but also the variability of the Atlantic and Indian Oceans, seems to play a key part in explaining atmosphere-ocean interactions (Kosaka and Xie, 2013; Meehl et al., 2011). According to Meehl et al. (2011), hiatuses might be relatively common climate occurrences, in which enhanced heat uptake by the ocean is linked to La Niña-like conditions. By contrast, the positive phase of the PDO favors El Niño conditions and injects heat into the atmosphere (Tollefson, 2014). Stronger trade winds during La Niña episodes drive warm surface water westwards across the Pacific and then down into the lower layers of the ocean, while cold water upwells in the eastern Pacific (Trenberth and Fasullo, 2012). It is possible that extreme La Niña events, such as that in 1998, may tip the ocean into a cool phase of the PDO.

While the heat uptake and content of the world ocean is a key factor in the Earth's energy balance, observations of ocean heat content are sparse. Systematic annual observations for the upper 700m only reach back to 1955, and for the upper 2000m only to 2005. Pentadal ocean heat estimates for the upper 2000m (Levitus et al., 2012) are available since the mid-1950s. Due to the lack of systematic observations, the pentadal estimates (Levitus et al., 2000) used composites of all available historical temperature observations for the respective 5-year periods. Therefore, the farther back we go in time, the larger the uncertainty surrounding ocean heat uptake and the larger potential biases might be.

Estimates for 1955-2010 (Levitus et al., 2012) show a rate of heat uptake of 0.39 Wm-2 for the upper 2000 meters of the world ocean, but the uptake has varied over time: half of the heat accumulated since 1865 accumulated after 1997 (Gleckler et al., 2016). Balmaseda et al. (2013) estimate that heat uptake in the 2000s was 0.84 Wm-2 for the entire ocean, with 0.21 Wm-2 of that being stored below 700m, but that uptake in the 1990s was negative (-0.18 Wm-2), though other sources find a lower but positive rate of uptake in that period. The vast majority of warming is concentrated in the top 2000m of the ocean (Purkey and Johnson, 2010). Johnson et al. (2016) estimate net ocean heat uptake in the top 1800m of the ocean of 0.71 Wm-2 from 2005 to 2015, and 0.07 Wm-2 below 1800m. However, during the recent hiatus period, the upper layers of the ocean did not show enough warming to account for the imbalance in the energy system (Balmaseda et al., 2013). This "missing energy" was actually stored in the deep oceans (Trenberth and Fasullo, 2012). Estimates of deep ocean heat fluxes are very limited. Kouketsu et al. (2011) calculate world ocean temperature changes for the 1990s and 2000s for waters below 3000m, estimating heat changes there to be around 0.05 Wm-2. Purkey and Johnson (2010) estimate the heat uptake below 4000m to be 0.027 Wm-2.
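To get a sense of the magnitudes involved, a flux in Wm-2 can be converted into total joules by multiplying by surface area and time. A rough sketch, assuming (as is one common convention, though studies differ) that the flux is normalized by the Earth's entire surface area:

```python
# Rough conversion of a heat uptake rate in W/m^2 into total joules,
# assuming the flux is normalized by the Earth's entire surface area
# (conventions differ between studies; this is a sketch, not a replication).

uptake = 0.39              # W/m^2, upper 2000 m, 1955-2010 (Levitus et al., 2012)
earth_area = 5.1e14        # m^2, total surface area of the Earth
years = 2010 - 1955
seconds = years * 3.156e7

total_joules = uptake * earth_area * seconds
print(f"implied heat gain over {years} years: {total_joules:.2e} J")  # ~3.5e23 J
```

So a seemingly tiny flux of a few tenths of a Wm-2, sustained over decades, amounts to heat gains of order 10^23 joules.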

We only used data on global temperature and radiative forcing and the most basic estimator possible to produce this prediction. It's in the right ballpark in terms of the increase in heat content and even some of the wiggles match up (the levels are "arbitrary"). Diagnostic statistics look fairly good too. I think we can only improve on this prediction using more sophisticated estimators. Watch this space :)

* NOAA assign the middle year of each 5 year window as the date of the data. We assign the last year of each 5 year window instead.

Friday, September 2, 2016

This graph shows the value of electricity divided by GDP for 130 countries in 2013 plotted against GDP per capita. I used the 2015 price of electricity reported by the World Bank Doing Business report, IEA data on electricity use in 2013 and GDP data from the Penn World Table. Cost share is in inverted commas because GDP isn't gross output and electricity is used for consumption as well as production. The fitted curve is a quadratic.
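The calculation behind each point is simply price times quantity over GDP. A minimal sketch, with an invented country (these figures are not from the actual dataset):

```python
# Sketch of the "cost share" calculation behind each point in the graph:
# electricity price times quantity used, divided by GDP. The country figures
# below are invented for illustration and are not from the actual dataset.

def electricity_cost_share(price_usd_per_kwh, use_kwh, gdp_usd):
    """Electricity expenditure as a fraction of GDP."""
    return price_usd_per_kwh * use_kwh / gdp_usd

# Hypothetical country: 10 TWh used at 15 US cents/kWh, GDP of $50 billion.
share = electricity_cost_share(0.15, 10e9, 50e9)
print(f"electricity 'cost share': {share:.1%}")  # 3.0%
```

This makes clear why both very low prices (as in subsidizing countries) and very low use (as in the poorest countries) can produce low cost shares.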

There is a general trend to lower cost shares at higher income levels. But the electricity cost share is very low in some poor countries, like Ethiopia, simply because they don't use much electricity. It is also low in many oil producing countries, such as Kuwait, which subsidize electricity. In Kuwait a kWh costs 0.7 U.S. cents; by contrast, in Jamaica a kWh costs 41.6 U.S. cents. The highest cost share is in Macedonia.

I think we should expect total energy cost shares to decline more strongly with income. This is because poor countries rely more heavily on energy sources other than electricity, while in rich countries electricity makes up a larger share of total energy use. This matches the longitudinal data we have for Sweden and Britain.

The changes to the temperature trend in the period of the "hiatus" are really small and hard to see in the context of the century scale temperature trend (Panel A). Most of the effect of corrections is in the 19th Century (Panel B). But even there, all the corrections make little difference to the overall trend.

The next graph shows 3 different estimates of global temperature since 1955:

The HADCRUT data has been criticized for not covering warming in the Arctic very well. The GISS series shows more warming as a result of covering it better. But really the two series are not that different in the overall signal they provide. The Berkeley Earth series is somewhere in between. Berkeley Earth was funded by Koch and others to investigate whether the official agencies had distorted the record through their use of corrections. The result turned out almost the same as NASA's (GISS). So, there definitely doesn't seem to be any conspiracy to distort the data.

Graham Lloyd also mentions the paper published by Nerilie Abram et al. in Nature this week, that argues that "industrial era warming" began earlier than previously thought. Here is a key figure from their paper:

The brown graph is the reconstructed land surface temperature anomaly and the blue graph the sea surface anomaly. Their argument is based on the current warming trend starting in the first half of the 19th Century. But I don't see anything in the paper that actually associates this with anthropogenic forcing. So I tend to somewhat agree with the quote from Michael Mann: "the Abram team misinterprets the cooling of the early 1800s from two giant volcanic eruptions as a cooler baseline instead of something unusual. That makes it look like human-forced warming started earlier than it did instead of climate naturally recovering from volcanoes putting cooling particles in the air". The paper compares the onset of warming in simulations to the onset of warming in the data and there is almost no correlation between the model results and the data (Panel A):

Yes, greenhouse gases were increasing already (CO2 chart is shown in the bottom right hand corner of the Figure) but it's likely that they only contributed a small part to the warming in that period and much of it is a bounceback from the volcanic eruptions, which had suppressed temperature.

The idea that people have been affecting the climate for a long time was first introduced by Ruddiman in a 2003 paper. I think that Ruddiman was likely right about this. My guess is that what we are seeing in the early 19th Century is mostly still the Ruddiman effect of increased human population, land-clearing, farming etc. Industrial CO2 emissions were very low: 54 million tonnes of carbon a year in 1850.

I'm working on a new climate change paper with Zsuzsanna Csereklyei and Stephan Bruns for the conference on climate econometrics in Aarhus at the end of October. We have got all the data together and we've reviewed the literature and so now comes the modeling phase. Watch this space :)

Wednesday, August 10, 2016

I don't like looking at my published papers because I hate finding mistakes. Today I saw that there is a missing coefficient in Table 2 of my recent paper with Donglan Zha "Economic growth and particulate pollution concentrations in China". In the column for Equation (2) for PM 2.5 the coefficient for the interaction between growth and the level of GDP per capita is missing. The table should look like this:

I checked my correspondence with the journal production team. They made lots of mistakes in rendering the tables and I went through more than one round of trying to get them to fix them. But the version I eventually OK-ed had this missing coefficient. At least the working paper version has the correct table.

Monday, July 25, 2016

I got a request for the data in our 1997 paper in Nature on climate change. I didn't think I'd be able to send the actual data we used, as I used to follow the practice of continually updating the datasets I most used rather than keeping an archival copy of the data actually used in a paper. But I found a version from February 1997, which was the month we submitted the final version of the paper. I got the RATS code to read the file and with a few tweaks it was producing the results that are in the paper. These are the results for observational data in the paper, not those using data from the Hadley climate model. I have now put the files up on my website. In the process I found this website - zamzar.com - that can convert .wks to .xls files. Apparently, recent versions of Excel can't read the .wks Lotus 1-2-3 files that were a standard format 20 or more years ago. For those that don't know, Lotus 1-2-3 was the most popular spreadsheet program before Microsoft introduced Excel. I used it in the late 80s and early 90s when I was in grad school.

Introduction
The environmental Kuznets curve (EKC) is a hypothesized relationship between various indicators of environmental degradation and countries’ gross domestic product (GDP) per capita. In the early stages of economic growth environmental impacts and pollution increase, but beyond some level of GDP per capita (which will vary for different environmental impacts) economic growth leads to environmental improvement. This implies that environmental impacts or emissions per capita are an inverted U-shaped function of GDP per capita, whose parameters can be statistically estimated. Figure 1 shows a very early example of an EKC. A vast number of studies have estimated such curves for a wide variety of environmental impacts ranging from threatened species to nitrogen fertilizers, though atmospheric pollutants such as sulfur dioxide and carbon dioxide have been most commonly investigated. The name Kuznets refers to the similar relationship between income inequality and economic development proposed by Nobel Laureate Simon Kuznets and known as the Kuznets curve.
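As a minimal sketch of how such a curve's parameters might be estimated, here is a quadratic-in-logs fit on synthetic data. The $8,000 turning point and all coefficients are invented for illustration and do not come from any particular study:

```python
import numpy as np

# Minimal sketch of estimating an inverted-U EKC as a quadratic in log income,
# on synthetic data. The $8,000 turning point and all coefficients are invented.

rng = np.random.default_rng(42)
log_gdp = rng.uniform(6, 11, 500)          # log GDP per capita
c_true = -0.1
turn_true = np.log(8000)
b_true = -2 * c_true * turn_true           # places the peak at $8,000
log_emissions = (1.0 + b_true * log_gdp + c_true * log_gdp**2
                 + rng.normal(0, 0.3, 500))

# Fit the quadratic and recover the implied turning point, exp(-b / 2c)
c_hat, b_hat, a_hat = np.polyfit(log_gdp, log_emissions, 2)
turning_point = np.exp(-b_hat / (2 * c_hat))
print(f"estimated turning point: ${turning_point:,.0f} per capita")
```

A negative estimated quadratic coefficient with an in-sample turning point is the signature result that EKC studies look for, though as discussed below such findings often turn out to be fragile.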

The EKC has been the dominant approach among economists to modeling ambient pollution concentrations and aggregate emissions since Grossman and Krueger (1991) introduced it in an analysis of the potential environmental effects of the North American Free Trade Agreement. The EKC also featured prominently in the 1992 World Development Report published by the World Bank and has since become very popular in policy and academic circles and is even found in introductory economics textbooks.

Critique
Despite this, the EKC was criticized almost from the start on empirical and policy grounds, and debate continues. It is undoubtedly true that some dimensions of environmental quality have improved in developed countries as they have become richer. City air and rivers in these countries have become cleaner since the mid-20th Century and in some countries forests have expanded. Emissions of some pollutants such as sulfur dioxide have clearly declined in most developed countries in recent decades. But there is less evidence that other pollutants such as carbon dioxide ultimately decline as a result of economic growth. There is also evidence that emerging countries take action to reduce severe pollution. For example, Japan cut sulfur dioxide emissions in the early 1970s following a rapid increase in pollution when its income was still below that of the developed countries and China has also acted to reduce sulfur emissions in recent years.

As further studies were conducted and better data accumulated, many of the econometric studies that supported the EKC were found to be statistically fragile. Figure 2 presents much higher quality data with a much more comprehensive coverage of countries than that used in Figure 1. In both 1971 and 2005 sulfur emissions tended to be higher in richer countries and the curve seems to have shifted down and to the right. A cluster of mostly European countries had succeeded in sharply cutting emissions by 2005 but other wealthy countries reduced their emissions by much less.

Initially, many understood the EKC to imply that environmental problems might be due to a lack of sufficient economic development rather than the reverse, as was conventionally thought, and some argued that the best way for developing countries to improve their environment was to get rich. This alarmed others, as while this might address some issues like deforestation or local air pollution, it would likely exacerbate other environmental problems such as climate change.

Explanations
The existence of an EKC can be explained either in terms of deep determinants such as technology and preferences or in terms of scale, composition, and technique effects, also known as “proximate factors”. Scale refers to the effect of an increase in the size of the economy, holding the other effects constant, and would be expected to increase environmental impacts. The composition and technique effects must outweigh this scale effect for pollution to fall in a growing economy. The composition effect refers to the economy’s mix of different industries and products, which differ in pollution intensities. Finally the technique effect refers to the remaining change in pollution intensity. This will include contributions from changes in the input mix – e.g. substituting natural gas for coal; changes in productivity that result in less use, everything else constant, of polluting inputs per unit of output; and pollution control technologies that result in less pollutant being emitted per unit of input.
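The scale, composition, and technique effects can be illustrated with a toy two-sector decomposition. All the figures below are invented, and this particular chained decomposition (varying one factor at a time) is just one convention among several:

```python
# Toy two-sector illustration of the scale, composition, and technique
# effects. All figures are invented; this chained decomposition (varying one
# factor at a time) is one convention among several.

y0, y1 = 100.0, 150.0                        # total output, before and after
shares0, shares1 = [0.4, 0.6], [0.3, 0.7]    # output shares: industry, services
intens0, intens1 = [2.0, 0.5], [1.5, 0.4]    # pollution per unit of output

p0 = y0 * sum(s * i for s, i in zip(shares0, intens0))
p1 = y1 * sum(s * i for s, i in zip(shares1, intens1))

scale = (y1 - y0) * sum(s * i for s, i in zip(shares0, intens0))
composition = y1 * sum((s1 - s0) * i for s0, s1, i in zip(shares0, shares1, intens0))
technique = y1 * sum(s * (i1 - i0) for s, i0, i1 in zip(shares1, intens0, intens1))

# The three effects add up exactly to the total change in pollution
print(f"total change {p1 - p0:+.1f} = scale {scale:+.1f} "
      f"+ composition {composition:+.1f} + technique {technique:+.1f}")
```

In this example a large positive scale effect is almost exactly offset by the composition and technique effects, which is the pattern the EKC posits for economies past the turning point.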

Over the course of economic development the mix of energy sources and economic outputs tends to evolve in predictable ways. Economies start out mostly agricultural and the share of industry in economic activity first rises and then falls as the share of agriculture declines and the share of services increases. We might expect the impacts associated with agriculture, such as deforestation, to decline, and naively expect the impacts associated with industry such as pollution would first rise and then fall. However, the absolute size of industry rarely does decline and it is improvement in productivity in industry, a shift to cleaner energy sources, such as natural gas and hydro-electricity, and pollution control that eventually reduce some industrial emissions.

Static theoretical economic models of deep determinants, which do not try to also model the economic growth process, can be summarized in terms of two parameters: the elasticity of substitution between dirty and clean inputs, or between pollution control and pollution, which summarizes how difficult it is to cut pollution; and the elasticity of marginal utility, which summarizes how hard it is to increase consumer well-being with more consumption. It is usually assumed that these consumer preferences are translated into policy action. Pollution is then more likely to increase as the economy expands the harder it is to substitute other inputs for polluting ones and the easier it is to increase consumer well-being with more consumption. If these parameters are constant, then pollution either rises or falls monotonically with economic growth; only if they change over time will pollution first rise and then fall. The various theoretical models can be classified as ones where the EKC is driven by changes in the elasticity of substitution as the economy grows, or ones where it is primarily driven by changes in the elasticity of marginal utility.

Dynamic models, which model the economic growth process alongside changes in pollution, are harder to classify. The best known is the Green Solow Model developed by Brock and Taylor (2010), which explains changes in pollution as the result of the competing effects of economic growth and a constant rate of improvement in pollution control. Fast-growing middle-income countries, such as China, then have rising pollution, while slower-growing developed economies have falling pollution. An alternative model developed by Ordás Criado et al. (2011) also suggests that pollution rises faster in faster-growing economies, but adds convergence, so that countries with higher levels of pollution tend to reduce pollution faster than countries with low levels of pollution.
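The Green Solow intuition can be reduced to a toy comparison of growth rates; the rates below are made up for illustration, and the actual model is of course far richer than this:

```python
# Toy version of the Green Solow intuition: emissions growth is roughly
# output growth minus a constant rate of improvement in pollution control.
# All rates below are invented for illustration.

abatement_rate = 0.03   # assumed constant decline in emissions per unit output

def emissions_growth(output_growth, abatement=abatement_rate):
    """Approximate growth rate of emissions."""
    return output_growth - abatement

fast = emissions_growth(0.07)   # fast-growing middle-income economy: rising
slow = emissions_growth(0.02)   # slow-growing developed economy: falling
print(f"fast grower {fast:+.0%}/yr, slow grower {slow:+.0%}/yr")
```

Whether pollution rises or falls is simply a race between the growth rate and the abatement rate, which is why the same model predicts rising pollution in China and falling pollution in slow-growing rich economies.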

Recent Empirical Research and Conclusion
Recent empirical research builds on these dynamic models, painting a subtler picture than early EKC studies did. We can distinguish between the effect of economic growth on the environment and the effect of the level of GDP per capita, irrespective of whether an economy is growing, on reducing environmental impacts. Economic growth usually increases environmental impacts, but the size of this effect varies across impacts and often declines as countries get richer. However, richer countries are often likely to make more rapid progress in reducing environmental impacts. Finally, there is often convergence among countries, so that countries with relatively high levels of impacts reduce them faster or increase them more slowly. These combined effects explain more of the variation in pollution emissions or concentrations than either the classic EKC model or models that assume that only convergence or only growth effects matter. Therefore, while being rich means a country might do more to clean up its environment, getting rich is likely to be environmentally damaging, and the simplistic policy prescriptions that some early proponents of the EKC put forward should be disregarded.

Thursday, July 21, 2016

Just finished writing a survey of the environmental Kuznets curve (EKC) for the Oxford Research Encyclopedia of Environmental Economics. Though I updated all sections, of course, there is quite a bit of overlap with my previous reviews. But there is a mostly new review of the empirical evidence, reviewing the literature and presenting original graphs in the spirit of IPCC reports :) I came up with this new graph of the EKC for sulfur emissions:

The graph plots the growth rate from 1971 to 2005 of per capita sulfur emissions in the sample used in the Anjum et al. (2014) paper against GDP per capita in 1971. There is a correlation of -0.32 between the growth rates and initial log GDP per capita. This shows that emissions did tend to decline or grow more slowly in richer countries but the relationship is very weak - only 10% of the variation in growth rates is explained by initial GDP per capita. Emissions grew in many wealthier countries and fell in many poorer ones, though GDP per capita also fell in a few of the poorest of those. So, this does not provide strong support for the EKC being the best or only explanation of either the distribution of emissions across countries or the evolution of emissions within countries over time. On the other hand, we shouldn't be restricted to a single explanation of the data and the EKC can be treated as one possible explanation as in Anjum et al. (2014). In that paper, we find that when we consider other explanations such as convergence the EKC effect is statistically significant but the turning point is out of sample - growth has less effect on emissions in richer countries but it still has a positive effect.
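The link between the quoted correlation and variance explained is just that, in a bivariate regression, R-squared equals the squared correlation, so r = -0.32 implies about 10%. A synthetic check (the data-generating numbers are invented; this is not the Anjum et al. dataset):

```python
import numpy as np

# In a bivariate regression R^2 is the squared correlation, so r = -0.32
# implies about 10% of variance explained. Check on synthetic data (all
# numbers invented; this is not the Anjum et al. dataset):

rng = np.random.default_rng(7)
n = 400
log_gdp = rng.normal(8.0, 1.5, n)                       # initial log GDP p.c.
growth = 0.5 - 0.1 * log_gdp + rng.normal(0, 0.45, n)   # weak negative link

r = np.corrcoef(log_gdp, growth)[0, 1]

# R^2 from the actual regression matches the squared correlation
slope, intercept = np.polyfit(log_gdp, growth, 1)
resid = growth - (intercept + slope * log_gdp)
r2 = 1 - resid.var() / growth.var()

print(f"r = {r:.2f}, r^2 = {r**2:.2f}, regression R^2 = {r2:.2f}")
```

A weak negative relationship of this kind is visible in a scatter plot but leaves most of the cross-country variation unexplained, which is the point made above.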

The graph below compares the growth rates of sulfur emissions with the initial level of emissions intensity. The negative correlation is much stronger here: -0.67 for the log of emissions intensity. This relationship is one of the key motivations for pursuing a convergence approach to modelling emissions. Note that the tight cluster of mostly European countries that cut emissions the most appears to have had both high income and high emissions intensity at the beginning of the period.

Tuesday, July 12, 2016

I wrote a long comment on this blogpost by Ludo Waltman but it got eaten by their system, so I'm rewriting it in expanded form as a blogpost of my own. Waltman argues, I think, that those who reject the use of journal impact factors to evaluate individual papers, such as Lariviere et al., must then accept that there are no legitimate uses for impact factors at all. I don't think this is true.

The impact factor was first used by Eugene Garfield to decide which additional journals to add to the Science Citation Index he created. Similarly, librarians can use impact factors to decide which journals to subscribe to or unsubscribe from, and publishers and editors can use such metrics to track the impact of their journals. These are all sensible uses of the impact factor that I think no-one would disagree with. Of course, we can argue about whether the mean number of citations that articles in a journal receive is the best metric, and I think that standard errors - as I suggested in my Journal of Economic Literature article - or the complete distribution, as suggested by Lariviere et al., should be provided alongside it.

I actually think that impact factors or similar metrics are useful to assess very recently published articles, as I show in my PLoS One paper, before they manage to accrue many citations. Also, impact factors seem to be a proxy for journal acceptance rates or selectivity, which we only have limited data on. But ruling these out as legitimate uses doesn't mean rejecting the use of such metrics entirely.

I disagree with the comment by David Colquhoun that no working scientists look at journal impact factors when assessing individual papers or scientists. Maybe this is the case in his corner of the research universe but it definitely is not the case in my corner. Most economists pay much, much more attention to where a paper was published than how many citations it has received. And researchers in the other fields I interact with also pay a lot of attention to journal reputations, though they usually also pay more attention to citations as well. Of course, I think that economists should pay much more attention to citations too.

Wednesday, June 15, 2016

Recently, Stephan Bruns published a paper with John Ioannidis in PLoS ONE critiquing the p-curve. I've blogged about the p-curve previously. Their argument is that the p-curve cannot distinguish "true effects" from "null effects" in the presence of omitted variables bias. Simonsohn et al., the originators of the p-curve, have responded in their blog, which I have added to the blogroll here. They say, of course, the p-curve cannot distinguish between causal effects and other effects but it can distinguish between "false positives", which are non-replicable effects and "replicable effects", which include both "confounded effects" (correlation but not causation) and "causal effects". Bruns and Ioannidis have responded to this comment too.

In my previous blogpost on the p-curve, I showed that the Granger causality tests we meta-analysed in our Energy Journal paper in 2014 form a right-skewed p-curve. This would mean that there was a "true effect" according to the p-curve methodology. However, our meta-regression analysis where we regressed the test statistics on the square root of degrees of freedom in the underlying regressions showed no "genuine effect". Now I understand what is going on. The large number of highly significant results in the Granger causality meta-dataset is generated by "overfitting bias". This result is "replicable". If we fit VAR models to more such short time series we will again get large numbers of significant results. However, regression analysis shows that this result is bogus as the p-values are not negatively correlated with degrees of freedom. Therefore, the power trace meta-regression is a superior method to the p-curve. In addition, we can modify this regression model to account for omitted variables bias by adding dummy variables and interaction terms (as we do in our paper). This can help to identify a causal effect. Of course, if no researchers actually estimate the true causal model then this method too cannot identify the causal effect. But there are always limits to our ability to be sure of causality. Meta-regression can help rule out some cases of confounded effects.
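To make this concrete, here is a minimal sketch of the logic of the power-trace meta-regression - my own toy simulation, not the exact specification in our Energy Journal paper. If there is a genuine effect, reported t-statistics should grow with the square root of degrees of freedom; if significance is generated by overfitting or selection, they shouldn't:

```python
import numpy as np

rng = np.random.default_rng(42)

def simulate_t_stats(n_studies, effect, rng):
    """Draw degrees of freedom and t-statistics for a set of studies.
    Under a genuine effect, E[t] grows with sqrt(df); under a null it doesn't."""
    df = rng.integers(20, 500, size=n_studies)
    t = effect * np.sqrt(df) + rng.standard_normal(n_studies)
    return df, t

def power_trace_slope(df, t):
    """OLS slope from regressing t-statistics on sqrt(df), with an intercept."""
    X = np.column_stack([np.ones(len(df)), np.sqrt(df)])
    beta, *_ = np.linalg.lstsq(X, t, rcond=None)
    return beta[1]

df_g, t_g = simulate_t_stats(500, 0.3, rng)   # genuine effect
df_0, t_0 = simulate_t_stats(500, 0.0, rng)   # no genuine effect
print(power_trace_slope(df_g, t_g))           # clearly positive
print(power_trace_slope(df_0, t_0))           # close to zero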

The bottom line is that you should use meta-regression analysis rather than the p-curve.

* In the case of the unit root spurious regressions mentioned in Bruns and Ioannidis' response, things are a bit more complicated. In the case of a bivariate spurious regression where there is a drift in the same direction in both variables, it is likely that Stanley's FAT-PET and similar methods will show that there is a true effect. Even though there is no relationship at all between the two variables, the nature of the data-generating process for each means that they will be correlated. Where there is no drift, or the direction of drift varies randomly, there should be equal numbers of positive and negative t-statistics in the underlying studies and no relationship between the value of the t-statistic and degrees of freedom, though there is a relationship between the absolute value of the t-statistic and degrees of freedom. Here meta-regression does better than the p-curve. I'm not sure whether the meta-regression model in our Energy Journal paper might be fooled by Granger causality tests in levels of unrelated unit root variables. These would likely be spuriously significant, but the significance might not rise strongly with sample size.

Wednesday, June 1, 2016

It's the first official day of winter today here in Australia, though it has felt wintry here in Canberra for about a week already. The 1st Semester finished last Friday and as I didn't teach I don't have any exams or papers to grade and the flow of admin stuff and meetings seems to have sharply declined. So, most of this week I can just dedicate to catching up and getting on with my research. It almost feels like I am on vacation :) Looking at my diary, the pace will begin to pick up again from next week.

I'm working on two main things this week. One is the Energy for Economic Growth Project that has now been funded by the UK Department for International Development. I mentioned our brainstorming meeting last July in Oxford in my 2015 Annual Report. I am the theme leader for Theme 1 in the first year of the project. In the middle of this month we have a virtual workshop for the theme to discuss the outlines for our proposed papers. I am coauthoring a survey paper with Paul Burke and Stephan Bruns on the macro-economic evidence as part of Theme 1. There are two other papers in the theme: one by Catherine Wolfram and Ted Miguel on the micro-economic evidence and one by Neil McCulloch on the binding constraints approach to the problem.

The other is my paper with Jack Pezzey on the Industrial Revolution, which we have presented at various conferences and seminars over the last couple of years. I'm ploughing through the math and tidying the presentation up. It's slow going but I think I can see the light at the end of the tunnel! This paper was supposed to be a key element in the ARC Discovery Projects grant that started in 2012.

In the meantime, work has started on our 2016 Discovery Projects grant. Zsuzsanna Csereklyei has now started work at Crawford as a research fellow funded by the grant. She has been scoping the potential sources of data for tracing the diffusion of energy efficient innovations and processing the first potential data source that we have identified. It is hard to find good data sources that are usable for our purpose.

There is a lot of change in the air at ANU as we have a new vice-chancellor on board since the beginning of the year and now a new director for the Crawford School has been appointed and will start later this year. We are also working out again how the various economics units at ANU relate to each other... I originally agreed to be director of the Crawford economic program for a year. That will certainly continue now to the end of this year. It's not clear whether I'll need to continue in the role longer than that.

Finally, here is a list of all papers published so far this year or now in press. I can't remember how many of them I mentioned on the blog, though I probably mentioned all on Twitter:

Monday, May 23, 2016

"I want to apply the DOLS methodology... I have read several books and research works about DOLS but none of them explain clearly how to test cointegration in this case.... I asked some professors about this issue and one of them told me that I should apply the Johansen cointegration test."

It's quite easy to find papers that do this: first test for cointegration using the Johansen procedure, report only the cointegration test statistics, and, if they can be used to reject the null hypothesis of non-cointegration, then use some other method such as dynamic ordinary least squares (DOLS) to estimate a static single-equation regression model. These researchers aren't actually interested in the complete vector autoregression (VAR) system, which is OK. I've reviewed quite a lot of papers that use this approach.

If your model has more than two variables (one dependent variable and one explanatory variable) then this is a very bad idea. The cointegration test statistics from the Johansen procedure (if they reject the null) say nothing about the cointegration properties of your single equation regression model.

The following simple example shows why. Imagine we have three variables, X1, X2, and X3, with the following "data generation process":

X1(t) = beta1 * X2(t) + epsilon1(t)
X2(t) = X2(t-1) + epsilon2(t)
X3(t) = X3(t-1) + epsilon3(t)

where epsilon1 is a stationary stochastic process and epsilon2 and epsilon3 are simply white noise. Variables X2 and X3 follow simple random walks. Variable X1 cointegrates with X2. But X3 is a random walk that has nothing to do with the other two variables. If you estimate a VAR with these variables and do the Johansen cointegration test, you should expect to find that there is one cointegrating vector. But the following regression:

X1(t) = gamma1 + gamma2 * X2(t) + gamma3 * X3(t) + u(t)

will not cointegrate. It is a spurious regression because it includes X3, which is an unrelated random walk. We cannot rely on finding that the VAR "cointegrates" to assume that this regression also cointegrates. Only X1 and X2 cointegrate in this example. Of course, it is possible that X1, X2, and X3 are jointly cointegrated, but as this example shows, that doesn't have to be the case.

How can we avoid this? The cointegrating vector in this case is [1, -beta1]. We could test within the Johansen procedure whether we can restrict the cointegrating vector to exclude a coefficient for X3. Unlike gamma3 in the static regression, this restricted coefficient is expected to be zero if X3 does not belong in the cointegrating relationship. We can and should also test the residuals of the static regression for stationarity - if they are non-stationary, the regression does not cointegrate.
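A quick simulation of this data generation process illustrates the point. This is just a sketch - I set beta1 = 1 and use a crude no-constant Dickey-Fuller t-statistic in place of a proper cointegration test with appropriate critical values:

```python
import numpy as np

rng = np.random.default_rng(0)
T, beta1 = 2000, 1.0

eps1 = rng.standard_normal(T)              # stationary process
X2 = np.cumsum(rng.standard_normal(T))     # random walk
X3 = np.cumsum(rng.standard_normal(T))     # unrelated random walk
X1 = beta1 * X2 + eps1                     # X1 cointegrates with X2

def df_tstat(e):
    """t-statistic from the no-constant Dickey-Fuller regression
    delta e_t = rho * e_{t-1} + u_t. Strongly negative => e looks stationary."""
    y, x = np.diff(e), e[:-1]
    rho = (x @ y) / (x @ x)
    u = y - rho * x
    se = np.sqrt((u @ u) / (len(y) - 1) / (x @ x))
    return rho / se

print(df_tstat(X1 - beta1 * X2))  # very negative: the X1-X2 relation is stationary
print(df_tstat(X3))               # near zero: X3 is a stochastic trend
```

Finding one cointegrating vector in the three-variable VAR is driven entirely by the X1-X2 relationship and says nothing about X3.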

Friday, May 13, 2016

My previous post discussed Doug Keenan's climate contest. I wondered how accurate we could actually expect to be in such a situation. I assume that the temperature series is a simple random walk, possibly with a constant drift term. We want to see how accurately we can determine whether there is a drift term in the random walk or not.

So, again just using Excel, I created 1000 series of 134 observations each distributed as Normal(mu, 0.11), where mu is the drift term. For 250 series I set mu to 0.01, for 250 series I set it to -0.01 and for 500 to 0. I then compute the usual t-test for the significance of the sample mean for each series.

Only 127 t-tests were significant at the 5% level and 201 at the 10% level. Using a 10% significance level, statistical power - correct rejection of the incorrect null hypothesis of no drift - is 29%. Using a 5% significance level, power is 20%. There is no distortion of the actual "size" of the test - the number of incorrect rejections of the true null.

So, combining this information, if you use this method and a 10% significance level, you will get about 595 correct classifications of whether a random walk has a drift or not - 29% of the 500 drift series (145) are correctly flagged, and 90% of the 500 no-drift series (450) are correctly left unflagged - which is far below the 900 required to win the contest.
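My simulation was done in Excel, but the same exercise is easy to sketch in Python. The counts will differ slightly from those above because of the random draws:

```python
import numpy as np

rng = np.random.default_rng(1)
n_series, n_obs, sd = 1000, 134, 0.11
# Drift terms: 250 series at +0.01, 250 at -0.01, 500 with no drift
mus = np.array([0.01] * 250 + [-0.01] * 250 + [0.0] * 500)

# Simulate the first differences of each series and t-test their means
x = rng.normal(loc=mus[:, None], scale=sd, size=(n_series, n_obs))
t = x.mean(axis=1) / (x.std(axis=1, ddof=1) / np.sqrt(n_obs))

crit10 = 1.656            # two-sided 10% critical value, t(133), approximately
reject = np.abs(t) > crit10
power = reject[mus != 0].mean()     # correct detections of a drift
size = reject[mus == 0].mean()      # false alarms; should be near 0.10
correct = reject[mus != 0].sum() + (~reject[mus == 0]).sum()
print(power, size, correct)  # power around 0.3, correct classifications near 600
```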

Of course, it seems that Keenan's data is a bit more complicated than this and may or may not have any relevance to the actual nature of climate data or the nature of the climate change problem.

You can download my data here. The first column gives the drift term used for each series, and the first row labels the years and the statistics columns.

My colleague Robert Kaufmann got an e-mail from Doug Keenan inviting him to participate in his "climate change contest" without the usual $10 submission fee. I hadn't heard about this contest and went to the site to investigate. Keenan has produced 1000 time series of 135 observations each that are somehow derived from random numbers, and has added a trend of plus 1 or minus 1 per 100 observations to some of them. The series have been calibrated so that they could potentially reproduce in some way the observed global temperature time series from 1880 to the present without an added trend. The task of the contestant is to determine for each series whether it has an added trend or not. If any submission made by 30 November this year gets 90% or more of these right, it will win $100,000.

Keenan's idea is that no-one can validly detect with 90% accuracy whether there is a trend in temperature or not. Therefore, the IPCC's claim that temperature has definitely increased over the last century and it is very likely that this is due to human activity must be wrong.

I downloaded the data. Looking at some of the series it's pretty clear that they are some sort of random walk (stochastic trend). It is not simply a series of random numbers (white noise) with a linear trend added. I haven't bothered to write a program to test this. Assuming that they are simple random walks, I tested in Excel whether the mean of first difference was different to zero for each of the thousand series. Only 8 of the series have a mean first difference that is significantly different to zero at the 5% level using the standard calculation of the standard error of the mean, which assumes that the first differences are white noise. If they were normally distributed white noise and none of the original series had an added trend then we would expect about 50 of the means of the first differences to be significantly different to zero by the definition of statistical significance. So, something else seems to be going on here. I expect that statistical power to detect a non-zero drift term of 0.01 or -0.01 when the standard deviation of the first differences is 0.11 is in any case rather low. Perhaps we could use structural time series methods, but statistical power of 90% at a significance level of 10% is a lot to ask for in this situation. I created my own dataset to see how many series one could expect to correctly classify - statistical power using a simple data generating process and a simple test was 29% for a 10% significance level test. This means that we can only correctly classify 595 of the 1000 series.

The real question to ask is whether Keenan's thought experiment makes sense. I would argue that it doesn't. His argument is that if temperature follows some kind of integrated process then it is very hard to determine whether it has a drift component or will sooner or later just stochastically trend down again. Therefore, we can't know if temperature has statistically significantly increased or not. But theory and climate models predict that global temperature should be stationary if radiative forcing is constant. If we detect a random walk or a close to random walk signal in the temperature data then something else is happening. Research can then try to determine if it is likely to be due to anthropogenic factors or not. It is possible that we make a type 1 error - falsely rejecting the null hypothesis - but we can determine how likely that is. So, in my opinion, Keenan's contest is another case of mathiness in climate econometrics.

Saturday, April 16, 2016

After a very long process, our original paper on the growth rates approach was rejected by JEEM about a month ago. I think the referees struggled to see what it added to more conventional approaches. A new referee in the 2nd round hadn't even read the important Vollebergh et al. paper, so it's not surprising they missed how we were trying to build on it. That, together with discussions with my coauthor Paul Burke and preparing a guest lecture for CRWF8000 on the environmental Kuznets curve, got me thinking about a clearer way to present what we are trying to do. It really is true that teaching can improve your research!

Vollebergh et al. divide the total variation in emissions in environmental Kuznets curve models into time and income effects:

where G is GDP per capita, E is emissions per capita, and i indexes countries and t time. They point out that the standard fixed effects panel estimation of the EKC imposes very strong restrictions on the first term:

Each country has a constant "country effect" that doesn't vary over time while all countries share a common "time effect" that varies over time. They think that the latter is unreasonable. Their solution is to find pairs of similar countries and assume that just those two countries each share a common time effect.

In my paper on "Between estimates of the emissions-income elasticity" I solved this problem by allowing the time effect to take any arbitrary path in any country by simply not modeling the time effect at all and extracting it as a residual. The downside of the between estimator is that it is more vulnerable to omitted variables bias than other estimators.

We introduced the growth rates approach to deal with several issues in EKC models, one of which is this time effects problem. The growth rates approach partitions the variation in the growth rate of emissions like this:

where "hats" indicate proportional growth rates, and X is a vector of exogenous variables including the constant. The time effect is the expected emissions growth rate in each country when economic growth is zero. This is a clear definition. The formulation allows us to model the time effect in each individual country i as a function of a set of country characteristics, including the country's emissions intensity, legal origin, level of income, fossil fuel endowment, etc. I don't think this is that clear in the papers I've written so far. We focused more on testing alternative emissions growth models and, in particular, on comparing the EKC to the Green Solow and other convergence models.
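Since the equations appear as images, here is a rough LaTeX sketch of the three formulations - my reconstruction of the notation, not the exact specifications in Vollebergh et al. or our papers:

```latex
% Vollebergh et al.: emissions split into a time effect and an income effect
\ln E_{it} = T_{it} + F(G_{it})
% The standard fixed effects panel EKC restricts the time effect to
T_{it} = \alpha_i + \tau_t
% The growth rates approach instead partitions the growth rate of emissions
\hat{E}_{i} = X_i'\delta + \beta \hat{G}_{i} + \varepsilon_i
```

In the last line the time effect for country i is X_i'delta - the expected emissions growth rate when the growth rate of GDP per capita is zero.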

So what do these time effects look like? Here are the time effects for the most general model for the CDIAC CO2 data plotted against GDP per capita in 1971:

Yes, I also computed standard errors for these, but it's a lot of hassle to do a chart with confidence intervals and a continuous variable on the X-axis in Excel.... There is a slight tendency for the time effect to decline with increased income but there is a big variation across countries at the same income level. And here are the results for SO2:

These are fairly similar, but more negative, as would be expected. Clearly the time effects story is not a simple one, and it has largely been ignored in the EKC literature.

Thursday, April 7, 2016

Myles Allen has an interesting new paper in Nature Climate Change: "Drivers of Peak Warming in a Consumption-Maximizing World", which has attracted media attention. The article in The Australian is framed as: "If we spend money now on renewable energy we won't be able to afford carbon sequestration later". This didn't sound right to me, as I'm an "all of the above" kind of guy when it comes to climate policy, and the less carbon there is in the air that needs scrubbing in the future, the less it would seem to cost to scrub it.

I haven't done a thorough read of the mathematics in Allen's paper, and this isn't going to be a proper critique of his article. I just wanted to understand where the journalist got this idea from.

Allen uses a very simple cost-benefit framework where there is "backstop technology" - a technology that can remove carbon dioxide from the atmosphere at constant cost. The key assumption I think is that the "social cost of carbon" depends linearly on the level of income per capita. The following graph illustrates the main result:

If economic growth is rapid, then the social cost of carbon will rise much faster than if economic growth is slow. Therefore, it will pay off earlier to employ the backstop technology. This means that, paradoxically, peak warming will be less than under slower economic growth.
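A toy numerical illustration of this crossing logic - my own made-up numbers, not Allen's actual model: income, and hence the social cost of carbon, grows exponentially, and the backstop is deployed once the social cost of carbon reaches the constant backstop cost.

```python
import math

def years_to_backstop(scc0, growth, backstop_cost):
    """Years until scc0 * (1 + growth)^t reaches backstop_cost."""
    return math.log(backstop_cost / scc0) / math.log(1.0 + growth)

# Pre-backstop emissions accumulate at a constant (arbitrary) rate,
# so earlier deployment means less cumulative carbon and lower peak warming
emissions_per_year = 10.0
for g in (0.01, 0.03):  # slow vs. fast economic growth
    t = years_to_backstop(scc0=40.0, growth=g, backstop_cost=400.0)
    print(g, round(t, 1), round(emissions_per_year * t, 1))
```

With 3% growth the backstop cost is reached after about 78 years rather than about 231 years at 1% growth, so far less carbon accumulates beforehand - the paradox in the figure.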

It is a long leap from this to arguing that we shouldn't be investing in renewable energy. Allen's model allows for an efficient level of abatement until the marginal cost of abatement hits the backstop cost. Also, the model has no feedback from abatement cost to the rate of economic growth, which is exogenous. Almost all economic research, including my own, finds that the growth costs of climate mitigation are very small, at least until extreme levels of abatement are reached. So, the model is an interesting thought exercise about CCS but doesn't have as strong policy implications as the media suggests.

Particulate pollution, especially PM2.5, is thought to be the form of pollution with the most serious human health impacts. It is estimated that PM2.5 exposure causes 3.1 million deaths a year globally, and any level above zero is deemed unsafe - that is, there is no non-zero threshold below which negative health effects do not occur. Black carbon is an important fraction of PM2.5 pollution that may contribute significantly to anthropogenic radiative forcing and, therefore, there may be significant co-benefits to reducing its concentration. In our paper, we use recently developed population-weighted estimates of national average concentrations of PM2.5 pollution that are available from the World Bank Development Indicators. These combine satellite and ground-based observations.

Though the environmental Kuznets curve (EKC) was originally developed to model the ambient concentrations of pollutants, most subsequent applications focused on pollution emissions. Yet, previous research suggests that it is more likely that economic growth could eventually reduce the concentrations of local pollutants than emissions. We examine the role of income, convergence, and time related factors in explaining changes in PM2.5 pollution in a global panel of 158 countries between 1990 and 2010. We find that economic growth has positive but relatively small effects, time effects are also small but larger in wealthier and formerly centrally planned economies, and, for our main dataset, convergence effects are small and not statistically significant.

Crucially, when we control for other relevant variables, there is no environmental Kuznets curve even for this particulate pollution concentration data, if what we mean by that is that environmental impacts decline with increasing income once a given in-sample level of income - the turning point - is passed.

The following graph shows the relationship between the average growth rates over 20 years of particulate pollution concentrations and per capita GDP:

The two big circles are of course China and India where both GDP and particulate pollution grew strongly. We can see that there is a positive relationship between these two growth rates, especially when we focus on the larger countries. The main econometric estimate in the paper shows that a 1% increase in the rate of economic growth is associated with a 0.2% increase in the growth rate of particulate pollution. This is much weaker than the effects we found for emissions of carbon and sulfur dioxides. The estimated income turning point is $66k with a large standard error. On the other hand, when we estimate a model without the control variables, we obtain a turning point of only $3.3k with a standard error of only $1.2k. To check the robustness of this result, we estimate models with other data sets and time periods. These yield quite similar results.

We conclude that growth has smaller effects on the concentrations of particulate pollution than it does on emissions of carbon or sulfur. However, the EKC model does not appear to apply here either, casting further doubt on its general usefulness.

Thursday, March 3, 2016

The Stern Review of the REF (Research Excellence Framework) is the latest British government review of research assessment in the UK, following on from the Metric Tide assessment. I have just made a submission to the enquiry. My main comment in response to the first question (1. What changes to existing processes could more efficiently or more accurately assess the outputs, impacts and contexts of research in order to allocate QR? Should the definition of impact be broadened or refined? Is there scope for more or different use of metrics in any areas?) follows:

"I think that there is substantial scope for using bibliometrics in the conduct of the REF. In Australia the Australian Research Council uses metrics to assess natural science disciplines and psychology. Research that I have conducted with my coauthor, Stephan Bruns, shows that this approach could be extended to economics and probably political science and perhaps other social sciences. We have written a working paper presenting our results that is currently under review by Scientometrics.

The paper shows that university rankings in economics based on long-run citation counts can be easily predicted using early citations. The rank correlation between universities' cumulative citations received over ten years for economics articles published in 2003 and 2004 and citations received in 2003 to 2004 alone is 0.91 in the UK and 0.82 in Australia. We compare these citation-based university rankings with the rankings of the 2008 Research Assessment Exercise in the UK and the 2010 Excellence in Research assessment in Australia. Rank correlations are quite strong but there are differences between rankings based on this type of peer review and rankings based on citation counts. However, if assessors are willing to consider citation analysis to assess some disciplines as is the case for the natural sciences and psychology in Australia there seems no reason to not include economics in this set.

Previously, I published a paper in PLoS One showing that the predictability of citations at the article level is similar in economics and political science. This supports the view that metrics-based research assessment can cover both economics and political science in addition to the natural sciences and psychology.

I believe the REF review should seriously consider these findings in producing recommendations for a lighter touch future REF."

I also made briefer responses to some of their other questions. In particular:

5. How might the REF be further refined or used by Government to incentivise constructive and creative behaviours such as promoting interdisciplinary research, collaboration between universities, and/or collaboration between universities and other public or private sector bodies?

"A major issue with the REF and the ERA in Australia is the pigeon-holing of research into disciplines, which might not match well the nature of the research conducted. This clearly will discourage publication in interdisciplinary venues that may not be as respected by mainstream reviewers. The situation is less acute in Australia, where a single output can be allocated across different assessment disciplines, but I still think that assessment by pure disciplinary panels discourages interdisciplinary work in Australia. So, I imagine this is exacerbated in the UK."

7. In your view how does the REF process influence the development of academic disciplines or impact upon other areas of scholarly activity relative to other factors? What changes would create or sustain positive influences in the future?

Johnston et al. (2014) show that the total number of economics students has increased in the UK more rapidly than the total number of all students, but the number of departments offering economics degrees has declined, particularly in post-1992 universities. Also, the number of universities submitting to the REF under economics has declined sharply, with only 3 post-1992 universities submitting in the latest round. This suggests that the REF has driven a concentration of economics research in the more elite universities in the UK.

Thursday, February 25, 2016

I have a new working paper, coauthored with Donglan Zha, who is visiting the Crawford School, which will be published in a special issue of Environmental Economics and Policy Studies. Our paper tries to explain recent changes in PM 2.5 and PM 10 particulate pollution in 50 Chinese cities using new measures of ambient air quality that the Chinese government has published only since the beginning of 2013. These data are not comparable to earlier official statistics and, we believe, are more reliable. We use our recently developed model that relates the rate of change of pollution to the growth of the economy and other factors, as well as estimating the traditional environmental Kuznets curve (EKC) model.

Though the environmental Kuznets curve (EKC) was originally developed to model the ambient concentrations of pollutants, most subsequent applications have focused on pollution emissions. Yet, it would seem more likely that economic growth could eventually reduce the concentrations of local pollutants than emissions. This is the first application of our new model to such concentration data.

The data show that there isn't much correlation between the growth rate of GDP between 2013 and 2014 and the growth rate of PM 2.5 pollution over the same period:

What is obvious is that pollution fell sharply from 2013 to 2014, as almost all the data points have negative pollution growth. We have to be really cautious in interpreting a two-year sample. Subsequent events suggest that this trend did not continue in 2015.

In fact, the simple linear relationship between these variables is negative, though statistically insignificant. The traditional EKC model and its growth rate equivalent both have a U-shaped curve - the effect of growth is negative at lower income per capita levels and positive at high ones. But the (imprecisely estimated, so not statistically significant) turning point for PM 2.5 is way out of sample at more than RMB 400k.* So, growth has a negative effect on pollution in the relevant range. When we add the initial levels of income per capita and pollution concentrations to the growth rates regression equation, the turning point is in-sample and statistically significant. The initial level of pollution has a negative and highly statistically significant effect. So, there is "beta convergence" - cities with initially high pollution concentrations reduced their pollution faster than cleaner cities did.

So what does all this mean? These results are very different than those we found for emissions of CO2, total GHGs, and sulfur dioxide. In all those cases, we found that growth had a positive and quite large effect on emissions. In some cases, the effect was close to 1:1. Of course, we should be cautious about interpreting this small Chinese data set. But our soon to be released research on global PM 2.5 concentrations, will again show that the effect of growth is smaller for these data than it is for the key pollution emissions data. This confirms early research that suggested that pollution concentrations turn down before emissions do, though it doesn't seem to support the traditional EKC interpretation of the data.

"Mills assumes that past fluctuations in temperature are purely random and of unknown causes and ignores greenhouse gases, or the sun, or volcanic eruptions, or any other specific factor that might drive climate change. He then fits simple statistical models based on this assumption to the data. Not surprisingly, if you assume that there isn't any specific factor driving the climate, your best forecast for the future is for not much change because you don't know what random shocks will show up to change the climate in the future. A more sensible approach is to test which of the various proposed drivers might actually have an effect and how large that effect has been. There are a lot of refereed academic papers that do just that including some I published myself. It's pretty easy to show that greenhouse gases have an effect on the climate, it's quite big (but fairly uncertain how big), and if emissions continue on a business as usual path there will be a lot of increase in temperature."

More technically: Mills fits univariate ARIMA models to HADCRUT, RSS global lower troposphere series (only available since 1980) and Central England Temperature series. These include models with no deterministic component (an ARIMA(0,1,3) model of HADCRUT) and a model with a deterministic trend with breakpoints chosen based on "eyeballing" the temperature graph. None of these models predicts any future warming, because there is no trend in the trendless model and because the "hiatus" means there is no recent trend in the segmented trend model. Of course, a model with just a single linear deterministic trend fitted to HADCRUT data would forecast a lot of warming in the 21st Century, though with a very wide forecast error envelope. But that model isn't estimated, for some reason...
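A minimal sketch of why the model choice drives the forecast, using synthetic data rather than the HADCRUT series Mills used:

```python
import numpy as np

rng = np.random.default_rng(7)
t = np.arange(136)                         # "years" 1880-2015
# Synthetic temperature anomaly: a small warming trend plus noise
temp = 0.007 * t + 0.1 * rng.standard_normal(136)

h = 50                                     # forecast horizon in years
# Trendless random-walk-style forecast: flat at the last observed value
rw_forecast = np.full(h, temp[-1])
# Single linear deterministic trend: extrapolates continued warming
slope, intercept = np.polyfit(t, temp, 1)
trend_forecast = intercept + slope * (t[-1] + 1 + np.arange(h))

print(rw_forecast[-1] - rw_forecast[0])        # 0: no further warming forecast
print(trend_forecast[-1] - trend_forecast[0])  # continued warming over the horizon
```

Mills's ARIMA(0,1,3) forecast is not exactly flat for the first few steps because of the MA terms, but it converges to a constant level, so the long-run story is the same.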

This is a prime case of "mathiness" I think - lots of math that will look sophisticated to many people used to build a model on silly assumptions with equally silly conclusions.