Occasionally, the Nobel Committee gives a prize which is unexpected, surprising, yet deft in how it points out underappreciated research. This year, they did no such thing. Both William Nordhaus and Paul Romer have been running favorites for years in my Nobel betting pool with friends at the Federal Reserve. The surprise, if anything, is that the prize went to both men together: Nordhaus is best known for his environmental economics, and Romer for his theory of “endogenous” growth.

On reflection, the connection between their work is obvious. But it is the connection that makes clear how inaccurate many of today’s headlines – “an economic prize for climate change” – really are. It is not the climate that both winners build on, but rather a more fundamental question: economic growth. Why are some places and times rich and others poor? And what is the impact of these differences? Adam Smith’s “The Wealth of Nations” is formally titled “An Inquiry into the Nature and Causes of the Wealth of Nations”, so these are certainly not new questions in economics. Yet the Classical economists did not have the same conception of economic growth that we have; they largely lived in a world of cycles, of ebbs and flows, with income per capita facing the constraint of agricultural land. Schumpeter, who certainly cared about growth, notes that Smith’s discussion of the “different progress of opulence in different nations” is “dry and uninspired”, perhaps only a “starting point of a sort of economic sociology that was never written.”

As each generation became richer than the one before it – at least in a handful of Western countries and Japan – economists began to search more deeply for the reason. Marx saw capital accumulation as the driver. Schumpeter certainly saw innovation (though not invention, as he always made clear) as important, though he had no formal theory. It was two models that appeared during and soon after World War II – that of Harrod-Domar, and Solow-Swan-Tinbergen – which began to make real progress. In Harrod-Domar, economic output is a function of capital Y=f(K), nothing is produced without capital f(0)=0, the economy is constant returns to scale in capital df/dK=c, and the change in capital over time depends on what is saved from output minus what depreciates dK/dt=sY-zK, where z is the rate of depreciation. Put those assumptions together and you will see that the growth rate of output is (1/Y)(dY/dt)=sc-z. Since c and z are fixed, the only way to grow is to crank up the savings rate, Soviet style. And no doubt, capital deepening has worked in many places.
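The Harrod-Domar arithmetic is easy to verify numerically. A minimal sketch (all parameter values are invented for illustration):

```python
import math

# Harrod-Domar toy model (all parameter values invented): Y = c*K, dK/dt = s*Y - z*K.
# The implied growth rate of output is (1/Y)(dY/dt) = s*c - z, so with c and z
# fixed, only a higher savings rate s can raise growth.
c, s, z = 0.3, 0.2, 0.04      # capital productivity, savings rate, depreciation
K, dt = 100.0, 0.01
path = []
for _ in range(5000):         # 50 "years" in steps of dt = 0.01
    Y = c * K
    path.append(Y)
    K += (s * Y - z * K) * dt
implied_growth = s * c - z
numerical_growth = math.log(path[-1] / path[0]) / 50
print(round(implied_growth, 4), round(numerical_growth, 4))
```

The simulated growth rate matches the analytic s·c − z, and raising s is the only lever.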

Solow-type models push further. They let the economy be a function of “technology” A(t), the capital stock K(t), and labor L(t), where output Y(t)=K(t)^a*(A(t)L(t))^(1-a) – that is, production is constant returns to scale in capital and labor. Solow assumes capital depends on savings and depreciation as in Harrod-Domar, that labor grows at a constant rate n, and that “technology” grows at constant rate g. Solving this model in per-effective-worker terms, with k=K/(AL) and y=Y/(AL), gives the accumulation equation dk/dt=sy-(n+z+g)k, and on the balanced growth path output is exactly proportional to capital. You can therefore take the model to the data: we observe the growth of labor and capital, and Solow shows that there is not enough growth in those factors to explain U.S. growth. Instead, growth seems to be largely driven by change in A(t), what Abramovitz called “the measure of our ignorance” but which we often call “technology” or “total factor productivity”.
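The growth-accounting step can be sketched as follows: under Y = K^a (AL)^(1-a), whatever output growth is not explained by factor growth is attributed to A. A minimal sketch with made-up growth rates:

```python
# Solow growth accounting (all growth rates below are invented for illustration):
# with Y = K^a (A L)^(1-a), output growth decomposes as
#   gY = a*gK + (1-a)*(gA + gL),  so  gA = (gY - a*gK - (1-a)*gL) / (1-a).
a = 0.33                          # capital share (illustrative)
gY, gK, gL = 0.03, 0.025, 0.01    # growth of output, capital, labor (invented)
gA = (gY - a * gK - (1 - a) * gL) / (1 - a)
explained_by_factors = (a * gK + (1 - a) * gL) / gY
print(round(gA, 4), round(explained_by_factors, 3))
```

With these invented numbers, roughly half of output growth is left unexplained by capital and labor – the residual attributed to A(t).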

Well, who can see that fact, as well as the massive corporate R&D facilities of the post-war era throwing out inventions like the transistor, and not think: surely the factors that drive A(t) are endogenous, meaning “from within”, to the profit-maximizing choices of firms? If firms produce technology, what stops other firms from replicating these ideas, a classic positive externality which would lead the rate of technology in a free market to be too low? And who can see the low level of convergence of poor country incomes to rich, and not think: there must be some barrier to the spread of A(t) around the world, since otherwise the return to capital must be extraordinary in places with access to great technology, really cheap labor, and little existing capital to combine with it. And another question: if technology – productivity itself! – is endogenous, then ought we consider not just the positive externality that spills over to other firms, but also the negative externality of pollution, especially climate change, that new technologies both induce and help fix? Finally, if we know how to incentivize new technology, and how growth harms the environment, what is the best way to mitigate the great environmental problem of our day, climate change, without stopping the wondrous increase in living standards growth keeps providing? It is precisely for helping answer these questions that Romer and Nordhaus won the Nobel.

Romer and Endogenous Growth

Let us start with Paul Romer. You know you have knocked your Ph.D. thesis out of the park when the great economics journalist David Warsh writes an entire book hailing your work as solving the oldest puzzle in economics. The two early Romer papers, published in 1986 and 1990, have each been cited more than 25,000 times, which is an absolutely extraordinary number by the standards of economics.

Romer’s achievement was writing a model where inventors spend money to produce inventions with increasing returns to scale, other firms use those inventions to produce goods, and a competitive Arrow-Debreu equilibrium still exists. If we had such a model, we could investigate what policies a government might wish to pursue if it wanted to induce firms to produce growth-enhancing inventions.

Let’s be more specific. First, innovation is increasing returns to scale because ideas are nonrival. If I double the amount of labor and capital, holding technology fixed, I double output, but if I double technology, labor, and capital, I more than double output. That is, give one person a hammer, and they can build, say, one staircase a day. Give two people two hammers, and they can build two staircases by just performing exactly the same tasks. But give two people two hammers, and teach them a more efficient way to combine nail and wood, and they will be able to build more than two staircases. Second, if capital and labor are constant returns to scale and are paid their marginal product in a competitive equilibrium, then there is no output left to pay inventors anything for their ideas. That is, it is not hard to model nonrival ideas in partial equilibrium, and indeed the realization that a single invention improves productivity for all is an old one: as Thomas Jefferson wrote in 1813, “[h]e who receives an idea from me, receives instruction himself without lessening mine; as he who lights his taper at mine, receives light without darkening me.” The difficulty is figuring out how to get these positive spillovers yet still have “prices” or some sort of rent for the invention. Otherwise, why would anyone pursue costly invention?
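The hammer-and-staircase point – constant returns in rival inputs, increasing returns once nonrival ideas are included – can be checked directly. A minimal sketch, with an illustrative Cobb-Douglas production function:

```python
# Cobb-Douglas with a nonrival "ideas" term A (illustrative functional form):
# F(A, K, L) = A * K^a * L^(1-a). Doubling only the rival inputs K and L
# doubles output (constant returns); doubling A as well more than doubles it.
def F(A, K, L, a=0.33):
    return A * K**a * L**(1 - a)

base = F(1.0, 4.0, 9.0)
rival_doubled = F(1.0, 8.0, 18.0) / base    # constant returns in K and L alone
ideas_doubled = F(2.0, 8.0, 18.0) / base    # increasing returns once A doubles too
print(rival_doubled, ideas_doubled)
```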

We also need to ensure that growth is not too fast. There is a stock of existing technology in the world. I use that technology to create new innovations which grow the economy. With more people over time and more innovations over time, you might expect the growth rate to be higher in bigger and more technologically advanced societies. It is, in part, as Michael Kremer points out in his One Million B.C. paper. Nonetheless, the rate of growth is not asymptotically increasing by any stretch (see, e.g., Ben Jones on this point). Indeed, growth is nearly constant, abstracting from the business cycle, in the United States, despite large growth in population and the stock of existing technology.

Romer’s first attempt at endogenous growth was based on his thesis and published in the JPE in 1986. Here, he adds “learning by doing” to Solow: technology is a function of the capital stock A(t)=bK(t). As each firm uses capital, they generate learning which spills over to other firms. Even if population is constant, with appropriate assumptions on production functions and capital depreciation, capital, output, and technology grow over time. There is a problem here, however, and one that is common to any model based on learning-by-doing which partially spills over to other firms. As Dasgupta and Stiglitz point out, if there is learning-by-doing which only partially spills over, the industry is a natural monopoly. And even if it starts competitively, as I learn more than you, dynamically I can produce more efficiently, lower my prices, and take market share from you. A decentralized competitive equilibrium with endogenous technological growth is unsustainable!

Back to the drawing board, then. We want firms to intentionally produce technology in a competitive market as they would other goods. We want technology to be nonrival. And we want technology production to lead to growth. Learning-by-doing allows technology to spill over, but would simply lead to a monopoly producer. Pure constant-returns-to-scale competitive production, where technology is just an input like capital produced with a “nonconvexity” – only the initial inventor pays the fixed cost of invention – means that there is no output left to pay for invention once other factors get their marginal product. A natural idea, well known to Arrow 1962 and others, emerges: we need some source of market power for inventors.

Romer’s insight is that inventions are nonrival, yes, but they are also partially excludable, via secrecy, patents, or other means. In his blockbuster 1990 JPE Endogenous Technological Change, he lets inventions be given an infinite patent, but also be partially substitutable by other inventions, constraining price (this is just a Spence-style monopolistic competition model). The more inventions there are, the more efficiently final goods can be made. Future researchers can use present technology as an input to their invention for free. Invention is thus partially excludable in the sense that my exact invention is “protected” from competition, but also spills over to other researchers by making it easier for them to invent other things. Inventions are therefore neither public nor private goods, and also not “club goods” (nonrival but excludable) since inventors cannot exclude future inventors from using their good idea to motivate more invention. Since there is free entry into invention, the infinite stream of monopoly rents from inventions is exactly equal to their opportunity cost.
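The free-entry condition at the end of that paragraph is just a present-value calculation. A minimal sketch (the rent flow and interest rate are invented):

```python
# Free entry into invention (Romer 1990 flavor; numbers invented): an invention
# protected by an infinite patent pays a constant flow rent pi each period.
# Discounted at interest rate r, that stream is worth pi / r, and free entry
# drives the cost of invention up to exactly this present value.
r, pi = 0.05, 10.0
pv_closed_form = pi / r                                        # pi / r
pv_truncated = sum(pi / (1 + r) ** t for t in range(1, 2001))  # finite-horizon check
print(round(pv_closed_form, 2), round(pv_truncated, 2))
```

The truncated sum converges to the closed form pi/r, which in equilibrium equals the opportunity cost of inventing.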

From the perspective of final goods producers, there are just technologies I can license as inputs, which I then use in a constant returns to scale way to produce goods, as in Solow. Every factor is paid its marginal product, but inventions are sold for more than their marginal cost due to monopolistic excludability from secrecy or patents. The model is general equilibrium, and gives a ton of insight about policy: for instance, if you subsidize capital goods, do you get more or less growth? In Romer (1986), where all growth is learning-by-doing, cheaper capital means more learning means more growth. In Romer (1990), capital subsidies can be counterproductive!

There are some issues to be worked out: the Romer models still have “scale effects”, implying that growth rates should rise with population and the stock of technology, whereas growth in the modern world has been roughly constant despite both increasing (see Chad Jones’ 1995 and 1999 papers). The neo-Schumpeterian models of Aghion-Howitt and Grossman-Helpman add the important idea that new inventions don’t just add to the stock of knowledge, but also make old inventions less valuable. And really critically, the idea that institutions and not just economic fundamentals affect growth – meaning laws, culture, and so on – is a massive field of research at present. But it was Romer who first cracked the nut of how to model invention in general equilibrium, and I am unaware of any later model which solves this problem in a more satisfying way.
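Jones’ semi-endogenous fix for the scale effect can be sketched numerically: if ideas production is dA/dt = δL^λA^φ with φ < 1, the long-run growth rate of A is λn/(1−φ), pinned down by population growth n rather than the level of population. A toy simulation (all parameters invented):

```python
import math

# Semi-endogenous ideas production (Jones 1995 flavor; parameters invented):
# dA/dt = delta * L**lam * A**phi with phi < 1. Long-run growth of A converges
# to lam * n / (1 - phi), driven by population growth n, not population level.
delta, lam, phi, n = 1.0, 1.0, 0.5, 0.02
L, A = 1.0, 1.0
dt = 0.01
A_at_1900 = None
for step in range(200_000):        # simulate 2000 "years" in steps of 0.01
    if step == 190_000:            # record A at year 1900
        A_at_1900 = A
    A += delta * L**lam * A**phi * dt
    L *= math.exp(n * dt)
g_numeric = math.log(A / A_at_1900) / 100   # average growth over the last century
g_theory = lam * n / (1 - phi)
print(round(g_theory, 4), round(g_numeric, 4))
```

The simulated growth of A settles at λn/(1−φ) regardless of δ or the level of L – no scale effect.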

Nordhaus and the Economic Solution to Pollution

So we have, with Romer, a general equilibrium model for thinking about why people produce new technology. The connection with Nordhaus comes in a problem that is both caused by, and potentially solved by, growth. In 2018, even an ignoramus knows the terms “climate change” and “global warming”. This was not at all the case when William Nordhaus began thinking about how the economy and the environment interrelate in the early 1970s.

Growth was a fairly unobjectionable policy goal in 1960: indeed, a greater capability of making goods, and of making war, seemed a necessity for both the Free and Soviet worlds. But by the early 1970s, environmental concerns arose. The Club of Rome warned that we were going to run out of resources if we continued to use them so unsustainably: resources are of course finite, and there are therefore “limits to growth”. Beyond just running out of resources, growth could also be harmful because of negative externalities on the environment, particularly the newfangled idea of global warming an MIT report warned about in 1970.

Nordhaus treated those ideas both seriously and skeptically. In a 1974 AER P&P, he notes that technological progress or adequate factor substitution allow us to avoid “limits to growth”. To put it simply, whales are limited in supply, and hence whale oil is as well, yet we light many more rooms than we did in 1870 due to new technologies and substitutes for whale oil. Despite this skepticism, Nordhaus does show concern for the externalities of growth on global warming, giving a back-of-the-envelope calculation that along a projected Solow-type growth path, the amount of carbon in the atmosphere will reach a dangerous 487ppm by 2030, surprisingly close to our current estimates. In a contemporaneous essay with Tobin, and in a review of an environmentalist’s “system dynamics” predictions of future economic collapse, Nordhaus reaches a similar conclusion: substitutable factors mean that running out of resources is not a huge concern, but rather the exact opposite, that we will have access to and use too many polluting resources, should worry us. That is tremendous foresight for someone writing in 1974!

Before turning back to climate change, can we celebrate again the success of economics against the Club of Rome ridiculousness? There were widespread predictions, from very serious people, that growth would not just slow but reverse by the end of the 1980s due to “unsustainable” resource use. Instead, GDP per capita has nearly doubled since 1990, with the most critical change coming for the very poorest. There would have been no greater disaster for the twentieth century than had we attempted to slow the progress and diffusion of technology, in agriculture, manufacturing and services alike, in order to follow the nonsense economics being promulgated by prominent biologists and environmental scientists.

Now, being wrong once is no guarantee of being wrong again, and the environmentalists appear quite right about climate change. So it is again a feather in the cap of Nordhaus to both be skeptical of economic nonsense, and also sound the alarm about true environmental problems where economics has something to contribute. As Nordhaus writes, “to dismiss today’s ecological concerns out of hand would be reckless. Because boys have mistakenly cried ‘wolf’ in the past does not mean that the woods are safe.”

Just as we can refute Club of Rome worries with serious economics, so too can we study climate change. The economy affects the climate, and the climate affects the economy. What we need is an integrated model to assess how economic activity, including growth, affects CO2 production and therefore climate change, allowing us to back out the appropriate Pigouvian carbon tax. This is precisely what Nordhaus did with his two celebrated “Integrated Assessment Models”, which built on his earlier simplified models (e.g., 1975’s Can We Control Carbon Dioxide?). These models have Solow-type endogenous savings, and make precise the tradeoffs of lower economic growth against lower climate change, as well as making clear the critical importance of the social discount rate and the micro-estimates of the cost of adjustment to climate change.

The latter goes well beyond the science of climate change holding the world constant: the Netherlands, in a climate sense, should be underwater, but they use dikes to restrain the ocean. Likewise, the cost of adjusting to an increase in temperature is something to be estimated empirically. Nordhaus takes climate change very seriously, but he is much less concerned about the need for immediate action than the famous Stern report, which takes fairly extreme positions about the discount rate (in Stern, people 1,000 generations in the future are weighted the same as we are) and the costs of adjustment.

Consider the following “optimal path” for carbon from Nordhaus’ most recent run of the model, where the blue line is his optimum.

Note that he permits much more carbon than Stern or a policy which mandates temperatures stay below a 2.5 C rise forever. The reason is that the costs to growth in the short term are high: the world is still very poor in many places! There was a vitriolic debate following the Stern report about who was correct: whether the appropriate social discount rate is zero or something higher is a quasi-philosophical debate going back to Ramsey (1928). But you can see here how important the calibration is.
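The discount-rate dispute is easy to make concrete. A toy calculation (the damage figure and rates are invented; they merely stand in for Stern-like and Nordhaus-like discounting):

```python
# Why the social discount rate dominates the Stern/Nordhaus disagreement
# (damage figure and rates are invented, standing in for Stern-like near-zero
# and Nordhaus-like higher discounting):
def present_value(damage, years, rho):
    return damage / (1 + rho) ** years

damage, years = 1000.0, 100    # $1000 of climate damage a century from now
pv_low = present_value(damage, years, 0.001)    # near-zero rate: counts almost fully
pv_high = present_value(damage, years, 0.045)   # higher rate: damage nearly vanishes
print(round(pv_low, 1), round(pv_high, 1))
```

The same future damage is worth roughly seventy times more under near-zero discounting, which is why the two camps recommend such different carbon paths.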

There are other minor points of disagreement between Nordhaus and Stern, and my sense is that there has been some, though not full, convergence of their beliefs about optimal policy. But there is no disagreement whatsoever between the economic and environmental community that the appropriate way to estimate the optimal response to climate change is via an explicit model incorporating some sort of endogeneity of economic reaction to climate policy. The power of the model is that we can be extremely clear about what points of disagreement remain, and we can examine the sensitivity of optimal policy to factors like climate “tipping points”.

There is one other issue: in Nordhaus’ IAMs, and in Stern, you limit climate change by imposing cap and trade or carbon taxes. But the harms from carbon cross borders. How do you stop free riding? Nordhaus, in a 2015 AER, shows theoretically that there is no way to generate optimal climate abatement without sanctions for non-participants, but that relatively small trade penalties work quite well. This is precisely what Emmanuel Macron is currently proposing!

Let’s wrap up by linking Nordhaus even more tightly back to Romer. It should be noted that Nordhaus was very interested in the idea of pure endogenous growth, as distinct from any environmental concerns, from the very start of his career. His thesis was on the topic (leading to a proto-endogenous growth paper in the AER P&P in 1969), and he wrote a skeptical piece in the QJE in 1973 about the then-leading theories of what factors induce certain types of innovation (objections which I think have been fixed by Acemoglu 2002). Like Romer, Nordhaus has long worried that inventors do not receive enough of the return to their invention, and that we measure innovation poorly – see his classic NBER chapter on inventions in lighting, and his attempt to estimate how much of society’s output goes to innovators.

The connection between the very frontier of endogenous growth models, and environmental IAMs, has not gone unnoticed by other scholars. Nordhaus’ IAMs tend to have limited incorporation of endogenous innovation in dirty or clean sectors. But a fantastic paper by Acemoglu, Aghion, Bursztyn, and Hemous combines endogenous technical change with Nordhaus-type climate modeling to suggest a middle ground between Stern and Nordhaus: use subsidies to get green energy close to the technological frontier, then use taxes once their distortion is relatively limited because a good green substitute exists. Indeed, since this paper first started floating around 8 or so years ago, massive subsidies to green energy sources like solar by many countries have indeed made the “cost” of stopping climate change much lower than if we’d relied solely on taxes, since production of very low cost solar, and mass market electric cars, is now in fact economically viable.

It may indeed be possible to solve climate change – what Stern called “the greatest market failure” man has ever seen – by changing the incentives for green innovation, rather than just by making economic growth more expensive by taxing carbon. Going beyond just solving the problem of climate change, to solving it in a way that minimizes economic harm, is a hell of an accomplishment, and more than worthy of the Nobel prizes Romer and Nordhaus won for showing us this path!

Some Further Reading

In my PhD class on innovation, the handout I give on the very first day introduces Romer’s work and why non-mathematical models of endogenous innovation mislead. Paul Romer himself has a nice essay on climate optimism, and the extent to which endogenous invention matters for how we stop global warming. On why anyone signs climate change abatement agreements, instead of just free riding, see the clever incomplete contracts insight of Battaglini and Harstad. Romer has also been greatly interested in the policy of “high-growth” places, pushing the idea of Charter Cities. Charter Cities involve Hong Kong-like exclaves of a developing country where the institutions and legal systems are farmed out to a more stable nation. Totally reasonable, but in fact quite controversial: a charter city proposal in Madagascar led to a coup, and I can easily imagine that the Charter City controversy delayed Romer’s well-deserved Nobel laurel. The New York Times points out that Nordhaus’ brother helped write the Clean Air Act of 1970. Finally, as is always true with the Nobel, the official scientific summary is lucid and deep in its exploration of the two winners’ work.


Many economists of innovation are hostile to patents as they currently stand: they do not seem to be important drivers of R&D in most industries, the market power they lead to generates substantial deadweight loss, the legal costs around enforcing patents are incredible, and the effect on downstream innovation can be particularly harmful. The argument for patents seems most clear cut in industries where the invention requires large upfront fixed costs of R&D that are paid only by the first inventor, where the invention is clearly delineated, where novelty is easy to understand, and where alternative means of inducing innovation (such as market power in complementary markets, or a large first mover advantage) do not exist. The canonical example of an industry of this type is pharma.

Duncan Gilchrist points out that the market power a patentholder obtains also affects the rents of partial substitutes which might be invented later. Imagine there is a blockbuster statin on patent. If I invent a related drug, the high price of the existing patented drug means I can charge a fairly high price too. If the blockbuster drug were off patent, though, my competitors would be generics whose low price would limit how much I can charge. In other words, the “effective” patent strength in terms of the markup I can charge depends on whether alternatives to my new drug are on patent or are generic. Therefore, the profits I will earn from my drug will be lower when alternative generics exist, and hence my incentive to pay a fixed cost to create the new drug will also be lower.

What does this mean for welfare? A pure “me-too” imitation drug, which generates very little social value compared to the existing patented drug, will never enter if its class is going to see generics in a few years anyway; profits will be competed down to zero. That same drug might find it worthwhile to pay a fixed cost of invention and earn duopoly profits if the existing on patent alternative had many years of patent protection remaining. On the other hand, a drug so much better than existing drugs that even at the pure monopoly price most consumers would prefer it to the existing alternative priced at marginal cost will be developed no matter what, since it faces no de facto restriction on its markup from whether the alternatives in its drug class are generics or otherwise. Therefore, longer patent protection from existing drugs increases entry of drugs in the same class, but mainly those that are only a bit better than existing drugs. This may be better or worse for welfare: there is a wasteful cost of entering with a drug only slightly better than what exists (the private return includes the business stealing, while social welfare doesn’t), but there are also lower prices and perhaps some benefit from variety.
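Gilchrist’s entry logic can be sketched in a few lines (all numbers invented): a “me-too” drug earns duopoly-level profits only while the incumbent in its class remains on patent, so entry pays only when enough patent life remains to cover the fixed cost of development:

```python
# Toy version of the entry calculus (all numbers invented): a "me-too" entrant
# earns duopoly-level flow profits while the incumbent drug in its class is on
# patent, and roughly zero once generics arrive. It enters only if the
# discounted profit stream covers the fixed cost of development.
def enters(years_on_patent, duopoly_profit=8.0, fixed_cost=40.0, r=0.05):
    pv = sum(duopoly_profit / (1 + r) ** t for t in range(1, years_on_patent + 1))
    return pv > fixed_cost

print(enters(3), enters(12))   # little patent life left vs. plenty left
```

With these invented parameters, three remaining years of incumbent protection deters entry while twelve induce it – the de facto patent strength channel in the paper.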

One caveat that really should have been addressed in the model: changes in de facto patent length for the first drug in class also affect the entry decision of that drug. Longer patent protection may actually cause shorter effective monopoly by inducing entry of imitators! This paper is mainly empirical, so no need for a full Aghion-Howitt ’92 model of creative destruction, but it is at least worth noting that the welfare implications of changes in patent protection are somewhat misstated because of this omission.

Empirically, Gilchrist shows clearly that the beginning of new clinical trials for drugs falls rapidly as the first drug in their class has less time remaining on patent: fear of competition with generic partial substitutes dulls the incentive to innovate. The results are clear in straightforward scatterplots, but there is also an IV, to help confirm the causal interpretation, using the gap between the eventual drug’s first potentially-defensive patent filing and the beginning of clinical trials, a gap driven by randomness in things like unexpected delays in in-house laboratory progress. Using the fact that particularly promising drugs get priority FDA review, Gilchrist also shows that these priority review entrants do not seem to be worried at all about competition from generic substitutes: it is the “me-too” type of drugs for whom alternatives going off patent is most damaging to profits.

Final published version in AEJ: Applied 8(4) (No RePEc IDEAS version). Gilchrist is a rare example of a well published young economist working in the private sector; he has a JPE on social learning and a Management Science on behavioral labor in addition to the present paper, but works at robo-investor Wealthfront. In my now six year dataset of the economics job market (which I should discuss again at some point), roughly 2% of “job market stars” wind up outside academia. Budish, Roin and Williams used the similar idea of investigating the effect of patents on innovation by taking advantage of the differing effective patent length drugs for various maladies get as a result of differences in the length of clinical trials following the patent grant. Empirical work on the effect of patent rules is, of course, very difficult since de jure patent strength is very similar in essentially every developed country and every industry; taking advantage of differences in de facto strength is surely a trick that will be applied more broadly.


(One quick PSA before I get to today’s paper: if you happen, by chance, to be a graduate student in the social sciences in Toronto, you are more than welcome to attend my PhD seminar in innovation and entrepreneurship at the Rotman school which begins on Wednesday, the 7th. I’ve put together a really wild reading list, so hopefully we’ll get some very productive discussions out of the course. The only prerequisite is that you know some basic game theory, and my number one goal is forcing the economists to read sociology, the sociologists to write formal theory, and the whole lot to understand how many modern topics in innovation have historical antecedents. Think of it as a high-variance cross-disciplinary educational lottery ticket! If interested, email me at kevin.bryanATrotman.utoronto.ca for more details.)

Back to Aghion et al. Let’s kick off 2015 with one of the nicer pieces to come out of the ridiculously productive decade or so of theoretical work on growth put together by Philippe Aghion and his coauthors; I wish I could capture the famous alacrity of Aghion’s live presentation of his work, but I fear that’s impossible to do in writing! This paper is based around writing a useful theory to speak to two of the oldest questions in the economics of innovation: is more competition in product markets good or bad for R&D, and is there something strange about giving a firm IP (literally a grant of market power meant to spur innovation via excess rents) at the same time as we enforce antitrust (generally a restriction on market power meant to reduce excess rents)?

Aghion et al come to a few very surprising conclusions. First, the Schumpeterian idea that firms with market power do more R&D is misleading because it ignores the “escape the competition” effect whereby firms have high incentive to innovate when there is a large market that can be captured by doing so. Second, maximizing that “escape the competition” motive may involve making it not too easy to catch up to market technological leaders (by IP or other means). These two theoretical results imply that antitrust (making sure there are a lot of firms competing in a given market, spurring new innovation to take market share from rivals) and IP policy (ensuring that R&D actually needs to be performed in order to gain a lead) are in a sense complements! The fundamental theoretical driver is that the incentive to innovate depends not only on the rents of an innovation, but on the incremental rents of an innovation; if innovators include firms that are already active in an industry, policy that makes your current technological state less valuable (because you are in a more competitive market, say) or policy that makes jumping to a better technological state more valuable both increase the size of the incremental rent, and hence the incentive to perform R&D.

Here are the key aspects of a simplified version of the model. An industry is a duopoly where consumers spend exactly 1 dollar per period. The duopolists produce partially substitutable goods, where the more similar the goods the more “product market competition” there is. Each of the duopolists produces their good at a firm-specific cost, and competes in Bertrand with their duopoly rival. At the minimal amount of product market competition, each firm earns constant profit regardless of their cost or their rival’s cost. Firms can invest in R&D which gives some flow probability of lowering their unit cost. Technological laggards sometimes catch up to the unit cost of leaders with exogenous probability; lower IP protection (or more prevalent spillovers) means this probability is higher. We’ll look only at the steady-state stochastic distribution of technological leadership and lags that arises when there are infinitely many duopolistic industries.

In a model with these features, you always want at least a little competition, essentially for Arrow (1962) reasons: the size of the market is small when market power is large because total unit sales are low, hence the benefit of reducing unit costs is low, hence no one will bother to do any innovation in the limit. More competition can also be good because it increases the probability that two firms are at similar technological levels, in which case each wants to double down on research intensity to gain a lead. At very high levels of competition, the old Schumpeterian story might bind again: goods are so substitutable that R&D to increase rents is pointless since almost all rents are competed away, especially if IP is weak so that rival firms catch up to your unit cost quickly no matter how much R&D you do. What of the optimal level of IP? It’s always best to ensure IP is not too strong, or that spillovers are not too weak, because the benefit of increased R&D effort when firms are at similar technological levels following the spillover exceeds the lost incentive to gain a lead in the first place when IP is not perfectly strong. When markets are really competitive, however, the Schumpeterian insight that some rents need to exist militates in favor of somewhat stronger IP than in less competitive product markets.
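The incremental-rent logic driving the escape-competition effect can be illustrated with invented profit numbers: what matters for R&D incentives is the jump in profit from gaining a technological lead, not the level of profit, and that jump is largest when neck-and-neck profits have been competed away:

```python
# Invented profit levels for a duopolist, indexed by competition intensity and
# technological position. The R&D incentive is the incremental rent from moving
# from "level" (neck-and-neck) to "leader", not the profit level itself.
profit = {
    "low_competition":  {"laggard": 4.0, "level": 5.0, "leader": 6.0},
    "high_competition": {"laggard": 0.5, "level": 1.0, "leader": 5.0},
}
incremental = {comp: p["leader"] - p["level"] for comp, p in profit.items()}
print(incremental)   # escape-competition motive is stronger under high competition
```

Under low competition, neck-and-neck rents are already comfortable and the gain from a lead is small; under high competition, level profits are competed away, so the jump to leadership is what pays for R&D.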

Final working paper (RePEc IDEAS), published in 2001 in the Review of Economic Studies. This paper is the more detailed one theoretically, but if all of the insight sounds familiar, you may already know the hugely influential follow-up paper by Aghion, Bloom, Blundell, Griffith and Howitt, “Competition and Innovation: An Inverted U Relationship”, published in the QJE in 2005. That paper gives some empirical evidence for the idea that innovation is maximized at intermediate levels of product market competition; the Schumpeterian “we need some rents” motive and the “firms innovate to escape competition” motive both play a role. I am actually not a huge fan of that paper – as an empirical matter, much cost-reducing innovation in many industries will never show up in patent statistics (principally for reasons that Eric von Hippel made clear in The Sources of Innovation, which is freely downloadable!). But this is a discussion for another day! One more related paper we have previously discussed is Goettler and Gordon’s 2012 structural work on processor chip innovation at AMD and Intel, which has a very similar within-industry motivation.


Who benefits from innovation? The trivial answer would be that everyone weakly benefits, but since innovation can change firms’ incentives to offer different varieties of a product, heterogeneous tastes among buyers may imply that some types of innovation make large groups of people worse off. Consider computers, a rapidly evolving technology. If Lenovo introduces a laptop with a faster processor, it may wish to discontinue production of a slower laptop, because offering both types flattens the demand curve for each, and hence lowers the profit-maximizing markup that can be charged for the better machine. This effect, combined with a fixed cost of maintaining a product line, may push firms to offer too little variety in equilibrium.

As an empirical matter, however, things may well go the other direction. Spence’s famous product selection paper suggests that firms may produce too much variety, because they don’t take into account that part of the profit they earn from a new product is just cannibalization of other firms’ existing product lines. Is it possible to separate these forces out in the data? Note that this question has two features that essentially require a structural setup: the variable of interest is “welfare”, a purely theoretical concept, and many of the relevant numbers, like product-line fixed costs, are unobservable to the econometrician and hence must be backed out from other data via theory.

There are some nice IO tricks to get this done. Using a near-universe of laptop sales in the early 2000s, Eizenberg estimates heterogeneous household demand using standard BLP-style methods. Supply is tougher. He assumes that firms first observe a fixed-cost shock for each product line, then pick their product mix each quarter, then observe consumer demand, and finally play Nash-Bertrand differentiated-product pricing. The problem is that the pricing game often has multiple equilibria (e.g., with two symmetric firms, one may offer a high-end product and the other a low-end one, or vice versa). Since the pricing-game equilibria are used to back out fixed costs, we are in a bit of a bind. Rather than selecting equilibria by some ad hoc criterion (how would you even do so in the symmetric case just mentioned?), Eizenberg cleverly partially identifies fixed costs as those consistent with any possible pricing-game equilibrium, using bounds in the style of Pakes, Porter, Ho and Ishii. This means that welfare effects are also only partially identified.
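The bounds logic can be sketched with hypothetical numbers (mine, not Eizenberg’s estimates): a revealed choice to offer a product gives a one-sided inequality on its fixed cost, a withdrawal gives the opposite inequality, and multiplicity of pricing equilibria forces us to take the worst case over equilibria.

```python
# Hypothetical numbers illustrating the moment-inequality logic: a product
# that was offered must have fixed cost F no larger than its incremental
# variable profit, and a withdrawn product must have had F at least as large
# as its would-be incremental profit. With several candidate pricing
# equilibria we do not know which profit level applies, so F is only bounded
# by the worst case over equilibria.
profit_if_offered = [120.0, 150.0]    # incremental profit, by candidate equilibrium
profit_if_withdrawn = [90.0, 110.0]   # same object for a withdrawn product

upper = max(profit_if_offered)        # F <= 150 under the loosest equilibrium
lower = min(profit_if_withdrawn)      # F >= 90
print(f"fixed cost only partially identified: [{lower}, {upper}]")
```

The interval, rather than a point estimate, is then what propagates into the welfare counterfactuals.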

Throwing this model at the PC data shows that the mean consumer in the early 2000s wasn’t willing to pay any premium for a laptop, but there was enormous heterogeneity in willingness to pay, both for laptops and for faster processor speeds on those laptops. Every year, the willingness to pay for a given computer fell by $257 – technology was evolving rapidly, and lots of substitute computers were constantly coming onto the market.

Eizenberg uses these estimates to investigate a particularly interesting counterfactual: what was the effect of the introduction of the lighter Pentium M mobile processor? As the Pentium M was introduced, older Pentium III based laptops were, over time, no longer offered by the major notebook makers. The M raised predicted notebook sales by 5.8 to 23.8%, raised mean notebook price by $43 to $86, and lowered Pentium III share in the notebook market from 16-23% down to 7.7%. Here’s what’s especially interesting, though: total consumer surplus is higher with the M available, but all of the extra consumer surplus accrues to the 20% least price-sensitive buyers (as should be intuitive, since only those with high willingness to pay are buying cutting-edge notebooks). What if a social planner had forced firms to keep offering the Pentium III models after the M was introduced? The net effect on consumer plus producer surplus may actually have been positive, and the benefits would have accrued especially to buyers at the bottom end of the market!

Now, as a policy matter, we are (of course) not going to force firms to offer money-losing legacy products. But this result is worth keeping in mind anyway: because firms are concerned about pricing pressure, they may not be offering a socially optimal variety of products, and this may limit the “trickle-down” benefits of high tech products.

This paper, by Heidi Williams (who surely you know already) and Bhaven Sampat (who is perhaps best known for his almost-sociological work on the Bayh-Dole Act with Mowery), made quite a stir at the NBER last week. Heidi’s job market paper a few years ago, on the effect of openness in the Human Genome Project as compared to Celera, is often cited as an “anti-patent” paper. Essentially, she found that portions of the human genome sequenced by the HGP, which placed their sequences in the public domain, were much more likely to be studied by scientists and used in tests than portions sequenced by Celera, who initially required fairly burdensome contractual steps to be followed. This result was very much in line with research done by Fiona Murray, Jeff Furman, Scott Stern and others which also found that minor differences in openness or accessibility can have substantial impacts on follow-on use (I have a paper with Yasin Ozcan showing a similar result). Since the cumulative nature of research is thought to be critical, and since patents are a common method of “restricting openness”, you might imagine that Heidi and the rest of these economists were arguing that patents were harmful for innovation.

That may in fact be the case, but note something strange: essentially none of the earlier papers on open science are specifically about patents; rather, they are about openness. Indeed, on the theory side, Suzanne Scotchmer has a pair of very well-known papers arguing that patents effectively incentivize cumulative innovation provided there are no transaction costs to licensing, no spillovers from sequential research, no incentive for early researchers to limit licenses in order to protect their existing business (consider the case of Armstrong and FM radio), and provided potential follow-on innovators can be identified before they sink costs. That is a lot of conditions, but it’s not hard to imagine industries where inventions are clearly demarcated, where holders of basic patents are better off licensing than sitting on the patent (perhaps because potential licensees are not also competitors), and where patentholders are better off not bothering academics who technically infringe on their patents.

What industry might have such characteristics? Sampat and Williams look at gene patents. Incredibly, about 30 percent of human genes have sequences that are claimed under a patent in the United States. Are “patented genes” still used by scientists and developers of medical diagnostics after the patent grant, or is the patent enough of a burden to openness to restrict such use? What is interesting about this case is that the patentholder generally wants people to build on their patent. If academics find some interesting genotype-phenotype links based on their sequence, or if another firm develops a disease test based on the sequence, there are more rents for the patentholder to garner. In surveys, it seems that most academics simply ignore patents of this type, and most gene patentholders don’t interfere in research. Anecdotally, licenses between the sequence patentholder and follow-on innovators are frequent.

In general, however, it is really hard to know whether patents have any effect on anything; there is very little variation over time and space in patent strength. Sampat and Williams take advantage of two quasi-experiments. First, they compare applied-for-but-rejected gene patents to applied-for-but-granted ones. At least for gene patents, there is very little difference in measurables across the two groups before the patent office decision. This is clearly not true for patents as a whole – rejected patents are almost surely of worse quality – but gene patents tend to come from scientifically competent firms rather than backyard hobbyists, and tend to have fairly straightforward claims. Why are any rejected, then? The authors’ second trick is to look directly at patent examiner “leniency”. It turns out that some examiners have much higher rejection rates than others, despite roughly random assignment of patents within a technology class. Much of the difference in rejection probability is driven by this random assignment of examiners, which justifies the rejected-vs-granted comparison and also suggests an instrumental variable for further investigation.
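The leniency idea can be illustrated with simulated data (my own construction, not the paper’s data or code): when applications are randomly assigned to examiners who differ only in strictness, an examiner’s leave-one-out grant rate predicts the outcome of any single application.

```python
import random

# Simulated sketch of examiner leniency: patents are randomly assigned to
# examiners with differing grant propensities, so an examiner's grant rate
# on *other* applications predicts the outcome of any one application.
random.seed(0)
strictness = [0.30 + 0.05 * e for e in range(8)]   # examiner grant propensities
apps = []                                          # (examiner, granted) pairs
for _ in range(4000):
    e = random.randrange(8)                        # random assignment
    apps.append((e, random.random() < strictness[e]))

# per-examiner application counts and grant counts
totals = {e: [0, 0] for e in range(8)}
for e, granted in apps:
    totals[e][0] += 1
    totals[e][1] += granted

def leniency(i):
    """Leave-one-out grant rate of the examiner handling application i."""
    e, granted = apps[i]
    n_e, g_e = totals[e]
    return (g_e - granted) / (n_e - 1)

# applications handled by lenient examiners are granted far more often
lenient = [g for i, (e, g) in enumerate(apps) if leniency(i) > 0.45]
strict = [g for i, (e, g) in enumerate(apps) if leniency(i) <= 0.45]
print(sum(lenient) / len(lenient), sum(strict) / len(strict))
```

The leave-one-out construction matters: including the application’s own outcome in the examiner’s rate would mechanically correlate instrument and outcome.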

With either technique, patent status generates essentially no difference in the use of genes by scientific researchers and diagnostic test developers. Don’t interpret this result as overturning Heidi’s earlier genome paper, though! There is now a ton of evidence that minor impediments to openness are harmful to cumulative innovation. What Sampat and Williams tell us is that we need to be careful in how we think about “openness”. Patents can be open if the patentholder has no incentive to restrict further use, if downstream innovators are easy to locate, and if there is no uncertainty about the validity or scope of a patent. Indeed, in these cases the patentholder will want to make it as easy as possible for follow-on innovators to build on the patent. On the other hand, patentholders are legally allowed to put all sorts of anti-openness burdens on the use of their patented invention by anyone, including purely academic researchers. In many industries such restrictions are in the patentholder’s interest, and hence patents serve to limit openness; this is especially true where private sector product development generates spillovers. Theory as in Scotchmer-Green has proven quite correct in this regard.

One final comment: all of these types of quasi-experimental methods are always a bit weak when it comes to the extensive margin. It may very well be that individual patents do not restrict follow-on work on that patent when licenses can be granted, while at the same time the IP system as a whole limits work in an entire technological area. Think of something like sampling in music. Because all music labels have large teams of lawyers who want every sample to be “cleared”, hip-hop musicians stopped using sampled beats to the extent they had in the 1980s. If you investigated whether a particular sample was less likely to be used conditional on its copyright status, you might well find no effect, as the legal burden of chatting with the lawyers and figuring out who owns what may be enough of a limit to openness that musicians give up samples altogether. Likewise, in the complete absence of gene patents, you might imagine that firms would change their behavior toward research based on sequenced genes because the entire area is more open; this is true even if the particular gene sequence they want to investigate was unpatented in the first place, since having to spend time investigating the legal status of a sequence is a burden in and of itself.


Patents may either promote or hinder cumulative invention. On the one hand, a patentholder can use his patent to ensure that downstream innovators face limited competition and thus have enough rents to make developing their product worthwhile. On the other hand, holdup and other licensing difficulties have been shown in many theoretical models to make patents counterproductive. Galasso and Schankerman use patent invalidation trials to try to separate out these effects, and the broad strokes of the theory appear to hold up: on average, patents do limit follow-on invention, but this limitation appears to result solely from patents held by large firms and built on by small firms, in technologically complex areas without concentrated market power.

The authors use a clever IV to generate this result. The patent trials they look at involve three judges, selected at random. Looking at the other cases each judge has tried, we can estimate a given judge’s proclivity to strike down a patent, and thus predict the probability that a given panel will strike down a given patent. That is, the judges’ proclivity to strike down patents is a nice IV for whether the patent is actually struck down. In the second stage of the IV, they investigate how this predicted probability of invalidation, along with covariates and the pre-trial citation path, affects post-trial citations. And the impact is large: on average, citations increase 50% following an invalidation (and indeed, the Poisson IV estimate mentioned in a footnote, which seems more justified econometrically to me, is even larger).
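The mechanics of this IV can be sketched on simulated data (my own toy, not the authors’ data): unobserved patent quality drives both invalidation and citations, biasing OLS, while the randomly assigned panel’s proclivity shifts invalidation without directly affecting citations.

```python
import numpy as np

# Simulated sketch of a judge-leniency IV. Unobserved quality q confounds
# OLS; the panel's invalidation proclivity z is a valid instrument because
# panels are randomly assigned.
rng = np.random.default_rng(1)
n = 20000
z = rng.uniform(0.2, 0.8, n)                    # panel's invalidation proclivity
q = rng.normal(0.0, 1.0, n)                     # unobserved quality (confounder)
invalidated = (rng.uniform(0, 1, n) < z + 0.1 * q).astype(float)
log_citations = 0.5 * invalidated + 0.8 * q + rng.normal(0.0, 1.0, n)

def two_sls(y, d, z):
    """Manual two-stage least squares with a single instrument."""
    Z = np.column_stack([np.ones(len(z)), z])
    d_hat = Z @ np.linalg.lstsq(Z, d, rcond=None)[0]      # first stage
    X = np.column_stack([np.ones(len(z)), d_hat])
    return np.linalg.lstsq(X, y, rcond=None)[0][1]        # second-stage slope

ols_slope = np.polyfit(invalidated, log_citations, 1)[0]
iv_slope = two_sls(log_citations, invalidated, z)
print(ols_slope, iv_slope)  # OLS badly biased upward; 2SLS recovers roughly 0.5
```

The true causal effect here is 0.5 by construction; OLS overstates it because quality raises both invalidation and citations, and the instrument removes that bias.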

There is, however, substantial heterogeneity. Estimating a marginal treatment effect (using a trick of Heckman and Vytlacil’s) suggests the biggest impact of invalidation is on patents whose unobservables make them less likely to be overturned. To investigate this heterogeneity further, the authors run their regressions again, including measures of technology-class concentration (what percentage of patents in a given subclass come from the top few patentees) and industry complexity (using the Levin survey). They also measure how many patents the patentee involved in the trial received in the years around the trial, as well as the number of patents received by those citing the patentee. The harmful effect of patents on future citations appears limited to technology classes with relatively low concentration, complex technology classes, large firms holding the invalidated patent, and small firms doing the citing. These characteristics all match well with the types of technologies theory links to patent thickets, holdup potential, or high licensing costs.

In the usual internal-validity/external-validity way, I don’t know how broadly these results generalize: even using the judges as an IV, we are still deriving treatment effects conditional on the patent being challenged in court and actually reaching a panel decision on invalidation; it seems reasonable to believe that the mere fact a patent is being challenged is evidence that licensing is problematic, and the fact that no settlement was reached before trial even more so. The social welfare impact is also not clear to me: theory suggests that even when patents are socially optimal for cumulative invention, the primary patentholder will limit licensing to a small number of firms in order to protect their rents, so using forward citations as a measure of cumulative invention gives us no way to separate socially optimal from socially harmful limits. But this is at least some evidence that patents do not democratize invention, and that result fits squarely into a growing literature on the dangers of even small restrictions on open science.


When we talk about strategic equilibrium, we can talk in a very formal sense: many refinements with well-known epistemic conditions have been proposed, the nature of uncertainty in such equilibria has been completely described, the problems of sequential decisionmaking are properly handled, and so on. So when we analyze history with these tools, we have a useful way to describe how changes in parameters altered the equilibrium incentives of various agents. Path dependence – the idea that past realizations of history matter, perhaps through small events, as in Brian Arthur’s work – is a widespread idea with no comparable formal apparatus. A typical explanation given for it is increasing returns. If I buy a car in 1900, I make you more likely to buy a car in 1901 by, at the margin, lowering the production cost through increasing returns to scale, and by lowering the operating cost through strengthening the incentive for gas station operators to enter.

This is quite informal, though; worse, the explanation of increasing returns is neither necessary nor sufficient for history-dependence. How can this be? First, consider that “history-dependence” may mean (at least) six different things. History can affect either the path of history or its long-run outcome. For example, any historical process satisfying the assumptions of the ergodic theorem can be history-dependent along a path yet still converge to the same state (in the network diffusion paper discussed here last week, a simple property of the network structure tells me whether an epidemic will eventually diffuse fully, but the exact path of that diffusion clearly depends on something much more complicated). We may believe, for instance, that the early pattern of railroads affected the path of settlement of the West without believing that this pattern had much consequence for the 2010 distribution of population in California. Next, history-dependence in the long run or the short run can depend either on a state variable (from a pre-defined set of states), on the ordered set of past realizations, or on the unordered set of past realizations (the latter two called path and phat dependence, respectively, since phat dependence does not depend on order). History matters in elections due to incumbent bias, but that history-dependence can basically be summed up by a single state variable denoting who the current incumbent is, omitting the rest of history’s outcomes. Phat dependence is likely in simple technology diffusion: I adopt a technology as a function of which of my contacts have adopted it, regardless of the order in which they adopted. Path dependence comes up, for example, in models of learning following Aumann and Geanakoplos/Polemarchakis, where consensus among a group can fail if agents do not observe the times at which messages were sent between third parties.

Now consider increasing returns. For which of these types of history-dependence are increasing returns necessary or sufficient? It turns out the answer is: for none of them! Take again the car example, but assume there are three types of cars in 1900: steam, electric and gasoline. For the same reasons that gas-powered cars exhibit increasing returns, steam and electric cars do as well. But the network effect for gas-powered cars is relatively stronger. Page thinks of this as a biased Polya process. I begin with five balls – 3 G, 1 S and 1 E – in an urn, and draw one at random. If I draw an S or an E, I return it to the urn with another ball of the same type (thus making future draws of that type more common, hence increasing returns). If I draw a G, I return it to the urn along with 2t more G balls, where t is the time, which increments by 1 after each draw. This process converges to a share of G balls arbitrarily close to one, even though S and E balls also exhibit increasing returns.
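The biased urn described above is easy to simulate; a short run makes the lock-in visible (the 2,000-draw horizon is my choice, but any long run gives the same picture).

```python
import random

# Simulation of the biased Polya urn from the text: every technology has
# increasing returns, but G's feedback (add 2t balls) is stronger than
# S's or E's (add 1 ball), so G takes over anyway.
random.seed(42)
counts = {"G": 3, "S": 1, "E": 1}
for t in range(1, 2001):          # t increments by 1 after each draw
    total = sum(counts.values())
    r = random.uniform(0, total)
    if r < counts["G"]:
        counts["G"] += 2 * t      # stronger increasing returns for G
    elif r < counts["G"] + counts["S"]:
        counts["S"] += 1          # ordinary Polya reinforcement
    else:
        counts["E"] += 1
share_G = counts["G"] / sum(counts.values())
print(share_G)  # arbitrarily close to 1 as draws accumulate
```

S and E each grow whenever drawn, so all three technologies have increasing returns; only the relative strength of G’s feedback delivers the lock-in.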

What about the necessary condition? Surely increasing returns are necessary for any type of history-dependence? Not really. All I need is some reason for past events to increase the likelihood of future actions of some type, in any convoluted way I choose. One simple mechanism is complementarities. If A and B are complements (adopting A makes B more valuable, and vice versa), while C and D are also complements, then we can have the following situation. An early adoption of A makes B more valuable, increasing the probability of adopting B the next period, which itself makes future A more valuable, increasing the probability of adopting A the following period, and so on. Such reasoning is often implicit in the rhetoric linking a market-based middle class to a democratic political process: some event causes a private sector to emerge, which increases pressure for democratic politics, which increases protection of capitalist firms, and so on. As another example, consider the famous QWERTY keyboard, the best-known example of path dependence we have. Increasing returns – that is, the fact that owning a QWERTY keyboard makes that keyboard more valuable to both myself and others due to standardization – is not sufficient to kill off the Dvorak or other keyboards. This is simple to see: the fact that QWERTY has increasing returns doesn’t mean that the diffusion of something like DVD players is history-dependent. Rather, it is the combination of increasing returns for QWERTY and a negative externality imposed on Dvorak that leads to history-dependence for Dvorak. If preferences between QWERTY and Dvorak are Leontief, and valuations for both have increasing returns, then I simply buy whichever keyboard I value more – purchases of QWERTY by others thus lead to QWERTY lock-in by lowering the demand curve for Dvorak, not merely by raising the demand curve for QWERTY. 
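The complementarities mechanism can be shown in a deterministic toy (my own construction, not Page’s): adopting A raises only B’s value and vice versa (likewise C and D), so no good has increasing returns in itself, yet the first adoption decides which pair locks in.

```python
# History-dependence from complementarities alone: each good's value grows
# only with adoptions of its COMPLEMENT, never with adoptions of itself.
def run(first_choice, periods=20):
    counts = {"A": 0, "B": 0, "C": 0, "D": 0}
    partner = {"A": "B", "B": "A", "C": "D", "D": "C"}
    counts[first_choice] += 1
    for _ in range(periods - 1):
        # adopt whichever good is currently most valuable
        value = {x: 1 + counts[partner[x]] for x in counts}
        counts[max(value, key=value.get)] += 1
    return counts

print(run("A"))  # A and B alternate forever; C and D are never adopted
print(run("C"))  # the mirror image: C and D lock in
```

The long-run outcome depends entirely on the first draw, even though every good’s own adoption leaves its own value unchanged.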
(And yes, if you are like me and were once told to never refer to effects mediated by the market as “externalities”, you should quibble with the vocabulary here, but the point remains the same.)

All in all interesting, and sufficient evidence that we need a better formal theory and taxonomy of history-dependence than we currently have.

Final version in the QJPS (No IDEAS version). The essay is written in a very qualitative/verbal manner, but more because of the audience than the author. Page graduated here at MEDS, initially teaching at Caltech, and his CV lists quite an all-star cast of theorist advisers: Myerson, Matt Jackson, Satterthwaite and Stanley Reiter!


The relation between competition and innovation is theoretically ambiguous. On the one hand, as Schumpeter pointed out, having market power allows you to recover rents from new product sales, so you might expect monopolies to innovate more. On the other hand, innovation is costly, so without competitive pressure, you may simply rest on your laurels and keep selling your old product.

Goettler and Gordon, in a recent JPE, use the Intel/AMD microprocessor competition to investigate this issue. Innovation is easy to measure here – we simply look at the processor speed at each firm’s frontier, and avoid any messy issues about the difference between patented inventions and “actual” inventions. We can also track, over more than a decade, the price and speed differences between each firm’s top chips, and each firm’s responses. The market is also, for all practical purposes, a duopoly with very little attempted entry. Computers possess another interesting property: they are durable goods, so past products compete with future sales. You may wish to keep prices high when you have market power this period in order not to cannibalize future sales, if you expect a good innovation to appear next period for which you can charge even higher prices. Many sectors of the economy involve durable goods, of course.

The authors estimate consumer preferences in a structural model with spillovers (it is harder to push the frontier than to catch up). They find that if Intel had a monopoly, innovation would have been 4% faster, but consumer surplus would have been 4% lower due to the higher prices charged by Intel – the standard Schumpeterian tradeoff. Consumer surplus turns out to be maximized in a world where Intel has substantial market power, though not monopoly power. The reason is that a monopolist in a durable-goods market still needs to innovate because it competes with its own old products, whereas duopolists can only earn rents sufficient to cover R&D costs if the two firms are selling different technologies. There are a number of interesting comparative statics as well. If spillovers are nonexistent, the two firms race until one has a sufficiently large technological lead, at which point the other firm gives up and no more innovation takes place; if spillovers are large, the returns to each firm from doing R&D are low. In both cases, a monopolist in a durable-goods market innovates more. If spillovers are of an intermediate level, duopolists innovate more. As the authors note, “such variation might be one reason cross-industry studies have difficulty identifying robust relationships.”

The estimation involves some technical difficulties which may interest Pakes-style IO readers. I am not an IO guy myself, so perhaps a reader can comment on the more general style of this sort of paper. While I find the theory interesting, and am impressed by the difficulty of the empirical estimation, what exactly is the value of this sort of estimation? We know the important qualitative tradeoffs from theory. The style of estimation here can really only be done ex post – the methods could not be used, for example, to identify contemporaneously whether anticompetitive behavior in a particular durable-goods industry is harmful to social welfare. I don’t mean to single this paper out, as this comment applies to a huge number of IO articles.


Carl Shapiro, in addition to being a bigshot in the academic study of invention, is also a member of Obama’s Council of Economic Advisers. I’m not sure how much of a role he had in advising on the Leahy-Smith patent reform act that was passed last year, but many of the reforms seem to come directly from this NBER Working Paper, so I imagine his role was a big one.

Most academic economists working on IP-related issues think, for a variety of reasons, that IP is currently far stronger than the optimal level. Indeed, many would prefer a world with no patents and copyrights at all to the current system. But let’s take the simplest possible reform: if the private benefit granted by a patent exceeds the social value created by the invention, we ought to limit the strength of the patent. You might wonder: how is it even possible for the patentholder to gain more than the social value of his invention? A standard monopolist with a patent still creates consumer surplus and some deadweight loss – that is, social value not captured by the inventor – unless the monopolist can perfectly price discriminate. Shapiro, drawing on a number of earlier papers, gives three nice examples where the return to the patentholder exceeds social value. Unless otherwise noted, we assume there is zero deadweight loss created by the patent; if there is deadweight loss, the case for weakening the patent is even stronger.

First, we know from Loury (1979) and Tandon (1983) that if a patent gives the first firm to invent the full social value of its invention, there will be too much effort expended trying to win that prize: when each firm decides whether to expend more effort on R&D, it does not take into account that its increased effort lowers the other firm’s probability of winning. Tandon shows that this “patent race” effect is particularly strong for inventions that are relatively cheap to produce, such as those that are close to obvious. One way to partially fix this problem is to allow a second firm that independently invents at roughly the same time as the first to sell the product without needing a license. That is, if a product is easy to invent, and two firms expend a lot of effort in an attempt to win the patent race, the second firm’s effort is not a total social waste, since it may lead to a second independent invention, turning the eventual monopoly (with high deadweight loss) into a duopoly (with lower deadweight loss). Many economists and legal scholars have proposed such an independent-inventor exception, but Congress has thus far shown no interest in taking up the idea. This is perhaps no surprise: Congress refused to pass the Public Domain Enhancement Act a few years back, an IP-related law that is as big a free lunch as you will ever see.

Second, probabilistic patents are often not challenged. Imagine a patent that, if challenged in court, has a 30% chance of being upheld as valid; many such weak patents exist. Assume that it is totally free to challenge the patent, meaning there are no legal or transaction costs. Shapiro gives the following example, drawing on a paper of his with Joseph Farrell. Let a patent with probability .3 of being upheld when challenged be licensed to an oligopolistic downstream industry. The patent adds $10 of value to each downstream firm’s product, so if the license royalty is greater than $3, the patentholder is earning more than the expected value of his patent. Imagine a royalty of $6. If I challenge the patent in court and my rival does not, then when I win the challenge, my rival and I are both able to use the invention without paying any license fee; our costs are the same, and hence winning the challenge does not earn me any more profit, thanks to competition with my rival. If I lose the challenge, then my rival pays a royalty of only $6, whereas I will have to pay $10 for each infringing unit, and hence I will be at a disadvantage in the downstream market. Therefore, neither firm will challenge the patent in equilibrium, and the inventor will earn more than his true social contribution.
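The no-challenge logic can be reduced to a one-line expected-value calculation. The numbers come from the example above; delta, the downstream profit I lose when I alone am at a cost disadvantage, is an assumed placeholder, and the conclusion holds for any delta > 0.

```python
# Arithmetic behind the no-challenge equilibrium, using the text's numbers.
p_upheld, royalty, damages = 0.3, 6.0, 10.0

# the royalty already exceeds the patent's expected value (6 > 0.3 * 10 = 3)
assert royalty > p_upheld * damages

def gain_from_challenging(delta):
    """Expected change in my profit if I alone challenge the patent.
    Win (prob 0.7): both firms use the invention for free, so competition
    leaves my position relative to my rival, hence my profit, unchanged.
    Lose (prob 0.3): I pay $10 per unit in damages while my rival keeps
    paying the $6 royalty, and that cost disadvantage costs me delta."""
    return (1 - p_upheld) * 0.0 + p_upheld * (-delta)

for delta in (1.0, 5.0, 20.0):
    print(delta, gain_from_challenging(delta))  # always negative: never challenge
```

The asymmetry is the whole story: the upside of a successful challenge is shared with the rival, while the downside of a failed challenge is borne alone.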

Third, holdup, particularly in the form of the “patent ambush,” can lead to excess returns. Imagine I can sell my product with noninfringing design A at a price of 100 dollars, or with infringing design B, for which I will need to license a previous invention, at a price of 120 dollars. The patent thus increases the value of my product by $20. If I Nash bargain with the inventor before making any investments, we split the gains from using his invention in my product, so I pay $10 for the license and earn $110 per unit by producing design B. Things are very different if I first make investments and only then learn about the patent. Imagine A and B each require 40 dollars of fixed design cost per unit. If I don’t know about the patent, I will design product B and plan to earn 80 dollars per unit. The patentholder will then come to me and tell me I need a license or he will sue for infringement. Once the fixed cost of B is sunk, the surplus from obtaining a license is 20+40=60 dollars, since not obtaining a license means I must produce A, which costs another 40 dollars and sells for 20 dollars less than design B. So the Nash bargaining outcome is that I pay 30 dollars for the license and produce B. That is, the patentholder can use holdup to extract extra rents after I have made specific investments.
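The holdup arithmetic above fits in a few lines (all numbers are from the text):

```python
# Ex-ante vs. ex-post Nash bargaining over the license.
price_A, price_B = 100.0, 120.0   # per-unit revenue from the two designs
design_cost = 40.0                # fixed design cost of either A or B

# Ex ante (patent known before designing): the license is worth only the
# $20 value difference between B and A, split 50/50 by Nash bargaining.
surplus_ex_ante = price_B - price_A
license_ex_ante = surplus_ex_ante / 2

# Ex post (B already designed, its cost sunk): refusing a license now means
# designing A from scratch, so the avoided duplicate design cost is also
# part of the bargaining surplus, and the patentholder captures half of it.
surplus_ex_post = (price_B - price_A) + design_cost
license_ex_post = surplus_ex_post / 2

print(license_ex_ante, license_ex_post)  # 10.0 30.0
```

The $20 gap between the two license fees is precisely the rent extracted by ambushing after the specific investment is sunk.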

One way to fix the last two problems is to allow informal post-grant challenges to patents, perhaps by third parties. This makes weak patents in important industries less likely to cause hold-up after specific investment, and also limits the ability of third parties to take advantage of the reluctance of licensees to challenge once license terms have been established. The new patent reform does vastly increase the scope for post-grant review.

What’s too bad about the 2011 patent reform is that the types of examples provided by Shapiro above are only the most clean-cut, overwhelmingly obvious ways to improve the efficiency of the patent system. They don’t even pretend to approach what would be necessary for an optimal IP regime. Aside from a handful of congressmen (Zoe Lofgren and Ron Wyden on the Democratic side, and Jason Chaffetz on the Republican side, among them), Congress is filled with IP maximalists. For the sake of social welfare, that’s too bad.


This post continues a series of notes on the main theoretical models of innovation. The first post covered the patent race literature. Here I’ll cover the sequential innovation literature most associated with Suzanne Scotchmer, particularly in her 1991 JEP and her 1995 RAND with Jerry Green.

Let there be two inventions instead of one, where the second builds upon the first. Let invention 1 cost c1 and invention 2 cost c2, with firm 1 having the ability to create invention 1, and firm 2 invention 2. If only invention 1 exists, the inventing firm earns v1 (where v1 is a function of patent length T). If both inventions 1 and 2 exist and compete for sales in a market, then they earn v1c and v2c, where c stands for “compete”. If both inventions exist but are sold by a monopolist, they earn v12>=v1c+v2c. With probability p, invention 2 will infringe on invention 1, and hence inventor 2 will need a license to sell product 2.

With one invention, it’s intuitive that the length of the patent should be just long enough to allow the inventor to cover the cost of that invention. This logic does not hold when inventions build on each other. Invention 1 makes invention 2 possible, so it seems we should give some of the social surplus created by invention 2 to the inventor of 1. But doing so makes it impossible to give all of the surplus created by invention 2 to inventor 2. This is a standard problem in the theory of complementary goods: if a left shoe by itself has social value 0, and a right shoe by itself has social value 0, but the two together have value 1, then the “marginal value” created by each shoe is 1. Summing the marginal values gives us 2, but the total social value of the pair of shoes is only 1. This wedge between the partial equilibrium concept of marginal value and our intuition about general equilibrium actually comes up quite a bit: willingness-to-pay, by definition, is only meaningful in a partial equilibrium sense, despite frequent misuse to the contrary.

So how should we efficiently assign patent rights with sequential inventions? First assume there is no possibility to form an ex-ante license between the two firms, though of course firms can sell inventions to each other once the product is invented. Also assume that profits are divided using Nash bargaining when firms sell a patent to each other: each firm garners its threat point, representing what it earns in the absence of an agreement, plus half of the gains from trade. Consider our logic from the one invention case, where we get incentives correct by making patent length just long enough to cover costs: v12(T)=c1+c2, where v12 is the revenue earned by having both products 1 and 2 in the same monopoly firm given patent length T, c1 is the cost of developing product 1, and c2 is the cost to develop product 2. Setting v12(T) just equal to c1+c2 will, in general, provide insufficient incentives for both products to be developed. That is, making patent length long enough that inventor 1 can afford to cover her costs, and the costs of inventor 2, while making precisely zero profit, is insufficient for inducing the invention of both 1 and 2. Why? One reason is that once 2 is invented, the development costs of 2 are sunk. Therefore, once 2 is invented, the licensing agreement will not take into account inventor 2’s costs. Inventor 2, knowing this, may be reluctant to invest in product 2 in the first place.
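A toy calculation makes the sunk-cost problem concrete; the specific numbers are my own illustration, not Scotchmer and Green’s:

```python
# Illustrative check (numbers assumed, not from the paper) that setting
# v12(T) = c1 + c2 fails to induce both inventions under ex-post licensing.

c1, c2 = 10, 10
v12 = c1 + c2       # patent length T chosen so joint revenue just covers costs
v1 = 12             # firm 1's revenue from selling product 1 alone
# Take p = 1: product 2 always infringes, so absent a license firm 2 sells nothing.

# Ex-post Nash bargaining, after c2 is sunk: threat points are (v1, 0),
# so firm 2 receives half the gains from trade, which ignore its sunk cost.
firm2_license_revenue = (v12 - v1) / 2              # = 4

# Anticipating this, firm 2's expected profit from sinking c2 is negative:
firm2_expected_profit = firm2_license_revenue - c2  # = -6
# Firm 2 declines to invest even though v12 covers total costs c1 + c2.
```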

How might I fix this? Allow ex-ante joint ventures. That is, let firm 1 and firm 2 form a joint venture before the costs of creating invention 2 are sunk. If ex-ante joint ventures are allowed, the optimal patent breadth is p=1: the second invention always infringes. The reason is simply that broader patents diminish the bargaining power of firm 2 at the stage in the game where the joint venture is created: 2 knows that, if no joint venture is formed, he will be required to get an ex-post license after inventing. The Nash bargaining share given to 2 in an ex-post license is always higher if there is a chance that 2 does not infringe, because 2’s Nash threat point is higher. Therefore, the share of monopoly profit that must be given to 2 in an ex-ante joint venture is higher, because this share is determined in Nash bargaining by the “threat” 2 has of not signing the joint venture agreement, developing product 2, and then signing an ex-post license agreement. Since the total surplus when both products are sold by a monopoly is a fixed amount, giving more profit to 2 means giving less profit to 1. By construction, this increased profit for 2 does not change the probability that firm 2 invests in invention 2; rather, the distortion is that less profit to 1 means less incentive for firm 1 to invest in invention 1. So optimal patent breadth is always p=1: follow-up inventions should always infringe.
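The redistribution logic can be sketched numerically. The revenue figures below are made up for illustration, and I assume equal-weight Nash bargaining at each stage, with the two firms merging from their duopoly threat points when invention 2 does not infringe:

```python
# Sketch of the breadth argument with assumed revenue numbers (not from the
# paper): firm 1's ex-ante joint-venture payoff rises with patent breadth p.

v12, v1, v1c, v2c = 30, 12, 10, 8   # monopoly, standalone, and duopoly revenues
c2 = 5                               # firm 2's development cost

def firm1_ex_ante_payoff(p):
    # Threat points if no ex-ante joint venture is signed: firm 2 invents
    # anyway, then either ex-post licenses (prob p, infringement) or the
    # firms merge from the duopoly threat points (prob 1-p, no infringement).
    lic1 = v1 + (v12 - v1) / 2               # firm 1's ex-post license payoff
    lic2 = (v12 - v1) / 2                    # firm 2's ex-post license payoff
    m1 = v1c + (v12 - v1c - v2c) / 2         # firm 1's payoff, no infringement
    m2 = v2c + (v12 - v1c - v2c) / 2         # firm 2's payoff, no infringement
    t1 = p * lic1 + (1 - p) * m1
    t2 = p * lic2 + (1 - p) * m2 - c2
    # Ex-ante joint venture: Nash bargaining over the joint surplus v12 - c2.
    gains = (v12 - c2) - t1 - t2
    return t1 + gains / 2

# firm1_ex_ante_payoff is 16 at p=0, 18.5 at p=0.5, and 21 at p=1:
# broader patents shift ex-ante surplus from firm 2 to firm 1.
```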

The intuition above has been modified in many papers. Scotchmer and Green themselves note that if the value v2c of the second invention is stochastic, and is only realized after firm 2 invests, less than perfectly broad patents can be optimal. Bessen and Maskin’s 2009 RAND, discussed previously on this site, notes that imperfect information across firms about research costs can make patents strictly worse than no patents, because with patents I will only offer joint ventures that are acceptable to low-cost researchers, even when social welfare maximization would require both low- and high-cost researchers to work on the next invention. A coauthor here at Northwestern and I have a result, which I’ll write up here at some point, that broad patents are not optimal when we allow for multiple paths toward future inventions. Without giving away the whole plot, the basic point is that broad patents cause distortions early on – as firms race inefficiently to get the broad patent – whereas narrow patents cause distortions later – as firms inefficiently try to invent around the patent. The second problem can be fixed with licenses granted by the patentholder, but the first cannot, as the distortion occurs before there is anything to license.

The main papers discussed here are Scotchmer and Green’s 1995 RAND (Final RAND copy, IDEAS) and Scotchmer’s 1991 JEP (Final JEP copy, IDEAS). Despite the dates, the JEP was written after the RAND’s original working paper.