Richard Taylor (taylorr@maths.ox.ac.uk)
Mathematics Institute
Oxford University
24-29 St. Giles
Oxford, OX1 3LB
United Kingdom

May 24, 2000

The authors would like to give special thanks to N. Boston, K. Buzzard, and B. Conrad for providing so much valuable feedback on earlier versions of this paper. They are also grateful to A. Agboola, M. Bertolini, B. Edixhoven, J. Fearnley, R. Gross, L. Guo, F. Jarvis, H. Kisilevsky, E. Liverance, J. Manoharmayum, K. Ribet, D. Rohrlich, M. Rosen, R. Schoof, J.-P. Serre, C. Skinner, D. Thakur, J. Tilouine, J. Tunnell, A. Van der Poorten, and L. Washington for their helpful comments.

Darmon thanks the members of CICMA and of the Quebec-Vermont Number Theory Seminar for many stimulating conversations on the topics of this paper, particularly in the Spring of 1995. For the same reason Diamond is grateful to the participants in an informal seminar at Columbia University in 1993-94, and Taylor thanks those attending the Oxford Number Theory Seminar in the Fall of 1995.

Parts of this paper were written while the authors held positions at other institutions: Darmon at Princeton University, Diamond at the Institute for Advanced Study, and Taylor at Cambridge University. During some of the period, Diamond enjoyed the hospitality of Princeton University, and Taylor that of Harvard University and MIT. The writing of this paper was also supported by research grants from NSERC (Darmon), NSF # DMS 9304580 (Diamond) and by an advanced fellowship from EPSRC (Taylor).

This article owes everything to the ideas of Wiles, and the arguments presented here are fundamentally his [W3], though they include both the work [TW] and several simplifications to the original arguments, most notably that of Faltings. In the hope of increasing clarity, we have not always stated theorems in the greatest known generality, concentrating instead on what is needed for the proof of the Shimura-Taniyama conjecture for semi-stable elliptic curves. This article can serve as an introduction to the fundamental papers [W3] and [TW], which the reader is encouraged to consult for a different, and often more in-depth, perspective on the topics considered. Another useful, more advanced reference is the article [Di2], which strengthens the methods of [W3] and [TW] to prove that every elliptic curve that is semistable at 3 and 5 is modular.

Introduction

Fermat's Last Theorem

Fermat's Last Theorem states that the equation

$$x^n + y^n = z^n, \qquad xyz \neq 0$$

has no integer solutions when $n$ is greater than or equal to 3. Around 1630, Pierre de Fermat claimed that he had found a "truly wonderful" proof of this theorem, but that the margin of his copy of Diophantus' Arithmetica was too small to contain it:

Cubum autem in duos cubos, aut quadrato quadratum in duos quadrato quadratos, et generaliter nullam in infinitum ultra quadratum potestatem in duos ejusdem nominis fas est dividere; cujus rei demonstrationem mirabile sane detexi. Hanc marginis exiguitas non caperet.

Among the many challenges that Fermat left for posterity, this was to prove the most vexing. A tantalizingly simple problem about whole numbers, it stood unsolved for more than 350 years, until in 1994 Andrew Wiles finally laid it to rest.

Prehistory: The only case of Fermat's Last Theorem for which Fermat actually wrote down a proof is the case $n = 4$. To do this, Fermat introduced the idea of infinite descent, which is still one of the main tools in the study of Diophantine equations, and was to play a central role in the proof of Fermat's Last Theorem 350 years later. To prove his Last Theorem for exponent 4, Fermat showed something slightly stronger, namely that the equation $x^4 + y^4 = z^2$ has no solutions in relatively prime integers with $xyz \neq 0$. Solutions to such an equation correspond to rational points on the elliptic curve $v^2 = u^3 - 4u$. Since every integer $n \geq 3$ is divisible either by an odd prime or by 4, the result of Fermat allowed one to reduce the study of Fermat's equation to the case where $n = \ell$ is an odd prime.

In 1753, Leonhard Euler wrote down a proof of Fermat's Last Theorem for the exponent $\ell = 3$, by performing what in modern language we would call a 3-descent on the curve $x^3 + y^3 = 1$, which is also an elliptic curve. Euler's argument (which seems to have contained a gap) is explained in [Edw], ch. 2, and [Dic1], p. 545.

It took mathematicians almost 100 years after Euler's achievement to handle the case $\ell = 5$; this was settled, more or less simultaneously, by Gustav

Peter Lejeune Dirichlet [Dir] and Adrien Marie Legendre [Leg] in 1825. Their elementary arguments are quite involved. (Cf. [Edw], sec. 3.3.)

In 1839, Fermat's equation for exponent 7 also yielded to elementary methods, through the heroic efforts of Gabriel Lamé. Lamé's proof was even more intricate than the proof for exponent 5, and suggested that to go further, new theoretical insights would be needed.

The work of Sophie Germain: Around 1820, in a letter to Gauss, Sophie Germain proved that if $\ell$ is a prime and $q = 2\ell + 1$ is also prime, then Fermat's equation $x^\ell + y^\ell = z^\ell$ with exponent $\ell$ has no solutions $(x, y, z)$ with $xyz \not\equiv 0 \pmod{\ell}$. Germain's theorem was the first really general proposition on Fermat's Last Theorem, unlike the previous results which considered the Fermat equation one exponent at a time.

The case where the solution $(x, y, z)$ to $x^\ell + y^\ell = z^\ell$ satisfies $xyz \not\equiv 0 \pmod{\ell}$ was called the first case of Fermat's Last Theorem, and the case where $\ell$ divides $xyz$, the second case. It was realized at that time that the first case was generally easier to handle: Germain's theorem was extended, using similar ideas, to cases where $k\ell + 1$ is prime and $k$ is small, and this led to a proof that there were no first case solutions to Fermat's equation with prime exponents $\ell \leq 100$, which in 1830 represented a significant advance. The division between first and second case remained fundamental in much of the later work on the subject. In 1977, Terjanian [Te] proved that if the equation $x^{2\ell} + y^{2\ell} = z^{2\ell}$ has a solution $(x, y, z)$, then $2\ell$ divides either $x$ or $y$, i.e., the first case of Fermat's Last Theorem is true for even exponents. His simple and elegant proof used only techniques that were available to Germain and her contemporaries.

The work of Kummer: The work of Ernst Eduard Kummer marked the beginning of a new era in the study of Fermat's Last Theorem. For the first time, sophisticated concepts of algebraic number theory and the theory of L-functions were brought to bear on a question that had until then been addressed only with elementary methods. While he fell short of providing a complete solution, Kummer made substantial progress. He showed how Fermat's Last Theorem is intimately tied to deep questions on class numbers of cyclotomic fields which are still an active subject of research. Kummer's approach relied on the factorization

$$(x + y)(x + \zeta_\ell y) \cdots (x + \zeta_\ell^{\ell-1} y) = z^\ell$$

of Fermat's equation over the ring $\mathbb{Z}[\zeta_\ell]$ generated by the $\ell$th roots of unity. One observes that the greatest common divisor of any two factors in the product on the left divides the element $(1 - \zeta_\ell)$, which is an element of norm $\ell$.

Since the product of these numbers is a perfect $\ell$-th power, one is tempted to conclude that $(x + y), \ldots, (x + \zeta_\ell^{\ell-1} y)$ are each $\ell$-th powers in the ring $\mathbb{Z}[\zeta_\ell]$ up to units in this ring, and up to powers of $(1 - \zeta_\ell)$. Such an inference would be valid if one were to replace $\mathbb{Z}[\zeta_\ell]$ by $\mathbb{Z}$, and is a direct consequence of unique factorization of integers into products of primes. We say that a ring $R$ has property $UF$ if every non-zero element of $R$ is uniquely a product of primes, up to units. Mathematicians such as Lamé made attempts at proving Fermat's Last Theorem based on the mistaken assumption that the rings $\mathbb{Z}[\zeta_\ell]$ had property $UF$. Legend even has it that Kummer fell into this trap, although this story now has been discredited; see for example [Edw], sec. 4.1. In fact, property $UF$ is far from being satisfied in general: one now knows that the rings $\mathbb{Z}[\zeta_\ell]$ have property $UF$ only for $\ell < 23$ (cf. [Wa], ch. 1).

It turns out that the full force of property $UF$ is not really needed in the applications to Fermat's Last Theorem. Say that a ring $R$ has property $UF_\ell$ if the following inference is valid:

$$ab = z^\ell \text{ and } \gcd(a, b) = 1 \;\Rightarrow\; a \text{ and } b \text{ are } \ell\text{th powers up to units of } R.$$

If a ring $R$ has property $UF$, then it also has property $UF_\ell$, but the converse need not be true. Kummer showed that Fermat's Last Theorem was true for exponent $\ell$ if $\mathbb{Z}[\zeta_\ell]$ satisfied the property $UF_\ell$ (cf. [Wa]). The proof is far from trivial, because of difficulties arising from the units in $\mathbb{Z}[\zeta_\ell]$ as well as from the possible failure of property $UF$. (A number of Kummer's contemporaries, such as Cauchy and Lamé, seem to have overlooked both of these difficulties in their attempts to prove Fermat's Last Theorem.)

Kummer then launched a systematic study of the property $UF_\ell$ for the rings $\mathbb{Z}[\zeta_\ell]$. He showed that even if $\mathbb{Z}[\zeta_\ell]$ failed to have unique factorization, it still possessed unique factorization into prime ideals. He defined the ideal class group as the quotient of the group of fractional ideals by its subgroup consisting of principal ideals, and was able to establish the finiteness of this class group. The order of the class group of $\mathbb{Z}[\zeta_\ell]$, denoted $h_\ell$, could be taken as a measure of the failure of the ring $\mathbb{Z}[\zeta_\ell]$ to satisfy $UF$. It was rather straightforward to show that if $\ell$ did not divide $h_\ell$, then $\mathbb{Z}[\zeta_\ell]$ satisfied the property $UF_\ell$. In this case, one called $\ell$ a regular prime. Kummer thus showed that Fermat's Last Theorem is true for exponent $\ell$ if $\ell$ is a regular prime.

He did not stop here. For it remained to give an efficient means of computing $h_\ell$, or at least an efficient way of checking when $\ell$ divides $h_\ell$. The class number $h_\ell$ can be factorized as a product

$$h_\ell = h_\ell^+ h_\ell^-,$$

where $h_\ell^+$ is the class number of the real subfield $\mathbb{Q}(\zeta_\ell)^+$, and $h_\ell^-$ is defined as $h_\ell / h_\ell^+$. Essentially because of the units in $\mathbb{Q}(\zeta_\ell)^+$, the factor $h_\ell^+$ is somewhat difficult to compute, while, because the units in $\mathbb{Q}(\zeta_\ell)^+$ generate the group of units in $\mathbb{Q}(\zeta_\ell)$ up to finite index, the term $h_\ell^-$ can be expressed in a simple closed form. Kummer showed that if $\ell$ divides $h_\ell^+$, then $\ell$ divides $h_\ell^-$. Hence, $\ell$ divides $h_\ell$ if and only if $\ell$ divides $h_\ell^-$. This allowed one to avoid the difficulties inherent in the calculation of $h_\ell^+$. Kummer then gave an elegant formula for $h_\ell^-$ by considering the Bernoulli numbers $B_n$, which are rational numbers defined by the formula

$$\frac{x}{e^x - 1} = \sum_{n \geq 0} \frac{B_n}{n!} x^n.$$

He produced an explicit formula for the class number $h_\ell^-$, and concluded that if $\ell$ does not divide the numerator of $B_{2i}$, for $1 \leq i \leq (\ell - 3)/2$, then $\ell$ is regular, and conversely.

The conceptual explanation for Kummer's formula for $h_\ell^-$ lies in the work of Dirichlet on the analytic class number formula, where it is shown that $h_\ell^-$ can be expressed as a product of special values of certain (abelian) L-series

$$L(s, \chi) = \sum_{n=1}^{\infty} \chi(n) n^{-s}$$

associated to odd Dirichlet characters. Such special values in turn can be expressed in terms of certain generalized Bernoulli numbers $B_{1,\chi}$, which are related to the Bernoulli numbers $B_i$ via congruences mod $\ell$. (For more details, see [Wa].)

These considerations led Kummer to initiate a deep study relating congruence properties of special values of L-functions and of class numbers, which was to emerge as a central concern of modern algebraic number theory, and was to reappear, in a surprisingly different guise, at the heart of Wiles' strategy for proving the Shimura-Taniyama conjecture.

Later developments: Kummer's work had multiple ramifications, and led to a very active line of enquiry pursued by many people. His formulae relating Bernoulli numbers to class numbers of cyclotomic fields were refined by Kenneth Ribet [R1], Barry Mazur and Andrew Wiles [MW], using new methods from the theory of modular curves which also play a central role in Wiles' more recent work. (Later Francisco Thaine [Th] reproved some of the results of Mazur and Wiles using techniques inspired directly from a reading of Kummer.) In a development more directly related to Fermat's Last Theorem, Wieferich proved that if $\ell^2$ does not divide $2^{\ell-1} - 1$, then the first case of Fermat's Last Theorem is true for exponent $\ell$. (Cf. [Ri], lecture VIII.)
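Both numerical criteria mentioned above, Kummer's Bernoulli-number test for regularity and Wieferich's congruence, are easy to try on small primes. The following is a minimal sketch, not taken from the paper: the function names are illustrative, the Bernoulli numbers are computed with the standard recurrence $\sum_{k=0}^{m} \binom{m+1}{k} B_k = 0$ (an assumption of this sketch, consistent with the generating function $x/(e^x - 1)$), and the search bounds 70 and 4000 are arbitrary.

```python
from fractions import Fraction
from math import comb


def bernoulli_numbers(n_max):
    """Return [B_0, ..., B_n_max] with x/(e^x - 1) = sum B_n x^n / n!."""
    B = [Fraction(0)] * (n_max + 1)
    B[0] = Fraction(1)
    for m in range(1, n_max + 1):
        # Standard recurrence: sum_{k=0}^{m} C(m+1, k) * B_k = 0 for m >= 1.
        acc = sum(comb(m + 1, k) * B[k] for k in range(m))
        B[m] = -acc / (m + 1)
    return B


def is_prime(n):
    """Trial division; adequate for the small ranges used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def is_regular(ell):
    """Kummer's criterion for an odd prime ell: ell is regular exactly when
    it divides no numerator of B_{2i} for 1 <= i <= (ell - 3) / 2."""
    B = bernoulli_numbers(max(ell - 3, 0))
    return all(B[2 * i].numerator % ell != 0
               for i in range(1, (ell - 3) // 2 + 1))


def passes_wieferich_test(ell):
    """True when ell^2 does not divide 2^(ell - 1) - 1."""
    return pow(2, ell - 1, ell * ell) != 1


irregular = [p for p in range(3, 70) if is_prime(p) and not is_regular(p)]
wieferich_failures = [p for p in range(3, 4000)
                      if is_prime(p) and not passes_wieferich_test(p)]
print(irregular)            # [37, 59, 67]
print(wieferich_failures)   # [1093, 3511]
```

Exact rational arithmetic (`Fraction`) is used so that "numerator" means the numerator in lowest terms, as the criterion requires. The approach is only practical for small exponents; it is meant to illustrate the statements, not the large-scale computations cited in the text.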

There were many other refinements of similar criteria for Fermat's Last Theorem to be true. Computer calculations based on these criteria led to a verification that Fermat's Last Theorem is true for all odd prime exponents less than four million [BCEM], and that the first case is true for all $\ell \leq 8.858 \cdot 10^{20}$ [Su].

The condition that $\ell$ is a regular prime seems to hold heuristically for about 61% of the primes. (See the discussion on p. 63, and also p. 108, of [Wa], for example.) In spite of the convincing numerical evidence, it is still not known if there are infinitely many regular primes. Ironically, it is not too difficult to show that there are infinitely many irregular primes. (Cf. [Wa].)

Thus the methods introduced by Kummer, after leading to very strong results in the direction of Fermat's Last Theorem, seemed to become mired in difficulties, and ultimately fell short of solving Fermat's conundrum.¹

¹ However, W. McCallum has recently introduced a technique, based on the method of Chabauty and Coleman, which suggests new directions for approaching Fermat's Last Theorem via the cyclotomic theory. An application of McCallum's method to showing the second case of Fermat's Last Theorem for regular primes is explained in [Mc].

Faltings' proof of the Mordell conjecture: In 1985, Gerd Faltings [Fa] proved the very general statement (which had previously been conjectured by Mordell) that any equation in two variables corresponding to a curve of genus strictly greater than one had (at most) finitely many rational solutions. In the context of Fermat's Last Theorem, this led to the proof that for each exponent $n \geq 3$, the Fermat equation $x^n + y^n = z^n$ has at most finitely many integer solutions (up to the obvious rescaling). Andrew Granville [Gra] and Roger Heath-Brown [HB] remarked that Faltings' result implies Fermat's Last Theorem for a set of exponents of density one.

However, Fermat's Last Theorem was still not known to be true for an infinite set of prime exponents. In fact, the theorem of Faltings seemed ill-equipped for dealing with the finer questions raised by Fermat in his margin, namely of finding a complete list of rational points on all of the Fermat curves $x^n + y^n = 1$ simultaneously, and showing that there are no solutions on these curves when $n \geq 3$ except the obvious ones.

Mazur's work on Diophantine properties of modular curves: Although it was not realized at the time, the chain of ideas that was to lead to a proof of Fermat's Last Theorem had already been set in motion by Barry Mazur in the mid seventies. The modular curves $X_0(\ell)$ and $X_1(\ell)$ introduced in sections 1.2 and 1.5 give rise to another naturally occurring infinite family of Diophantine equations. These equations have certain systematic rational solutions corresponding to the cusps that are defined over $\mathbb{Q}$, and are analogous to the so-called "trivial solutions" of Fermat's equation. Replacing Fermat curves by modular curves, one could ask for a complete list of all the rational points on the curves $X_0(\ell)$ and $X_1(\ell)$. This problem is perhaps even more compelling than Fermat's Last Theorem: rational points on modular curves correspond to objects with natural geometric and arithmetic interest, namely, elliptic curves with cyclic subgroups or points of order $\ell$. In [Maz1] and [Maz2], B. Mazur gave essentially a complete answer to the analogue of Fermat's Last Theorem for modular curves. More precisely, he showed that if $\ell \neq 2, 3, 5$ and $7$ (i.e., $X_1(\ell)$ has genus $> 0$), then the curve $X_1(\ell)$ has no rational points other than the "trivial" ones, namely cusps. He proved analogous results for the curves $X_0(\ell)$ in [Maz2], which implied, in particular, that an elliptic curve over $\mathbb{Q}$ with square-free conductor has no rational cyclic subgroup of order $\ell$ over $\mathbb{Q}$ if $\ell$ is a prime which is strictly greater than 7. This result appeared a full ten years before Faltings' proof of the Mordell conjecture.

Frey's strategy: In 1986, Gerhard Frey had the insight that these constructions might provide a precise link between Fermat's Last Theorem and deep questions in the theory of elliptic curves, most notably the Shimura-Taniyama conjecture. Given a solution $a^\ell + b^\ell = c^\ell$ to the Fermat equation of prime degree $\ell$, we may assume without loss of generality that $a^\ell \equiv -1 \pmod{4}$ and that $b^\ell \equiv 0 \pmod{32}$. Frey considered (following Hellegouarch, [He], p. 262; cf. also Kubert-Lang [KL], ch. 8, §2) the elliptic curve

$$E : y^2 = x(x - a^\ell)(x + b^\ell).$$

This curve is semistable, i.e., it has square-free conductor. Let $E[\ell]$ denote the

group of points of order $\ell$ on $E$ defined over some (fixed) algebraic closure $\bar{\mathbb{Q}}$ of $\mathbb{Q}$, and let $L$ denote the smallest number field over which these points are defined. This extension appears as a natural generalization of the cyclotomic fields $\mathbb{Q}(\zeta_\ell)$ studied by Kummer. What singles out the field $L$ for special attention is that it has very little ramification: using Tate's analytic description of $E$ at the primes dividing $abc$, it could be shown that $L$ was ramified only at 2 and $\ell$, and that the ramification of $L$ at these two primes was rather restricted. (See theorem 2.15 of section 2.2 for a precise statement.) Moreover, the results of Mazur on the curve $X_0(\ell)$ could be used to show that $L$ is large, in the following precise sense. The space $E[\ell]$ is a vector space of dimension 2 over the finite field $\mathbb{F}_\ell$, and the action of $\mathrm{Gal}(L/\mathbb{Q})$ on it gives an injective representation

$$\bar\rho_{E,\ell} : \mathrm{Gal}(L/\mathbb{Q}) \hookrightarrow \mathrm{GL}_2(\mathbb{F}_\ell).$$

Mazur's results in [Maz1] and [Maz2] imply that $\bar\rho_{E,\ell}$ is irreducible if $\ell > 7$ (using the fact that $E$ is semi-stable). In fact, combined with earlier results of Serre [Se6], Mazur's results imply that for $\ell > 7$, the representation $\bar\rho_{E,\ell}$ is surjective, so that $\mathrm{Gal}(L/\mathbb{Q})$ is actually isomorphic to $\mathrm{GL}_2(\mathbb{F}_\ell)$ in this case.

Serre's conjectures: In [Se7], Jean-Pierre Serre made a careful study of mod $\ell$ Galois representations $\bar\rho : G_{\mathbb{Q}} \to \mathrm{GL}_2(\mathbb{F}_\ell)$ (and, more generally, of representations into $\mathrm{GL}_2(k)$, where $k$ is any finite field). He was able to make very precise conjectures (see section 3.2) relating these representations to modular forms mod $\ell$. In the context of the representations $\bar\rho_{E,\ell}$ that occur in Frey's construction, Serre's conjecture predicted that they arose from modular forms (mod $\ell$) of weight two and level two. Such modular forms, which correspond to differentials on the modular curve $X_0(2)$, do not exist because $X_0(2)$ has genus 0. Thus Serre's conjecture implied Fermat's Last Theorem. The link between fields with Galois groups contained in $\mathrm{GL}_2(\mathbb{F}_\ell)$ and modular forms mod $\ell$ still appears to be very deep, and Serre's conjecture remains a tantalizing open problem.

Ribet's work: lowering the level: The conjecture of Shimura and Taniyama (cf. section 1.8) provides a direct link between elliptic curves and modular forms. It predicts that the representation $\bar\rho_{E,\ell}$ obtained from the $\ell$-division points of the Frey curve arises from a modular form of weight 2, albeit a form whose level is quite large. (It is the product of all the primes dividing $abc$, where $a^\ell + b^\ell = c^\ell$ is the putative solution to Fermat's equation.) Ribet [R5] proved that, if this were the case, then $\bar\rho_{E,\ell}$ would also be associated with a modular form mod $\ell$ of weight 2 and level 2, in the way predicted by Serre's conjecture.
This deep result allowed him to reduce Fermat's Last Theorem to the Shimura-Taniyama conjecture.

Wiles' work: proof of the Shimura-Taniyama conjecture: In [W3] Wiles proves the Shimura-Taniyama conjecture for semi-stable elliptic curves, providing the final missing step and proving Fermat's Last Theorem. After more than 350 years, the saga of Fermat's Last Theorem has come to a spectacular end.

The relation between Wiles' work and Fermat's Last Theorem has been very well documented (see, for example, [R8], and the references contained therein). Hence this article will focus primarily on the breakthrough of Wiles [W3] and Taylor-Wiles [TW] which leads to the proof of the Shimura-Taniyama conjecture for semi-stable elliptic curves.

From elliptic curves to $\ell$-adic representations: Wiles' opening gambit for proving the Shimura-Taniyama conjecture is to view it as part of the more general problem of relating two-dimensional Galois representations and modular forms. The Shimura-Taniyama conjecture states that if $E$ is an elliptic curve over $\mathbb{Q}$, then $E$ is modular. One of several equivalent definitions of modularity is that for some integer $N$ there is an eigenform $f = \sum a_n q^n$ of weight two on $\Gamma_0(N)$ such that

$$\#E(\mathbb{F}_p) = p + 1 - a_p$$

for all but finitely many primes $p$. (By an eigenform, here we mean a cusp form which is a normalized eigenform for the Hecke operators; see section 1 for definitions.)

This conjecture acquires a more Galois theoretic flavour when one considers the two dimensional $\ell$-adic representation

$$\rho_{E,\ell} : G_{\mathbb{Q}} \to \mathrm{GL}_2(\mathbb{Z}_\ell)$$

obtained from the action of $G_{\mathbb{Q}}$ on the $\ell$-adic Tate module of $E$: $T_\ell E = \varprojlim E[\ell^n](\bar{\mathbb{Q}})$. An $\ell$-adic representation $\rho$ of $G_{\mathbb{Q}}$ is said to arise from an eigenform $f = \sum a_n q^n$ with integer coefficients $a_n$ if

$$\mathrm{tr}(\rho(\mathrm{Frob}_p)) = a_p$$

for all but finitely many primes $p$ at which $\rho$ is unramified. Here $\mathrm{Frob}_p$ is a Frobenius element at $p$ (see section 2), and its image under $\rho$ is a well-defined conjugacy class.

A direct computation shows that $\#E(\mathbb{F}_p) = p + 1 - \mathrm{tr}(\rho_{E,\ell}(\mathrm{Frob}_p))$ for all primes $p$ at which $\rho_{E,\ell}$ is unramified, so that $E$ is modular (in the sense defined above) if and only if for some $\ell$, $\rho_{E,\ell}$ arises from an eigenform. In fact the Shimura-Taniyama conjecture can be generalized to a conjecture that every $\ell$-adic representation, satisfying suitable local conditions, arises from a modular form. Such a conjecture was proposed by Fontaine and Mazur [FM].
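The point-count definition of modularity can be checked by hand on a small example. The sketch below is an illustration under stated assumptions, none of which come from the text above: the curve $y^2 + y = x^3 - x^2$ (of conductor 11) and the weight-two eigenform $f = q \prod_{n \geq 1} (1 - q^n)^2 (1 - q^{11n})^2$ on $\Gamma_0(11)$ are a classical pair chosen for this example, and the function names are hypothetical. It counts points naively over $\mathbb{F}_p$ and compares $p + 1 - \#E(\mathbb{F}_p)$ with the coefficient $a_p$ of $f$.

```python
def eta_product_coeffs(n_terms):
    """Coefficients a[0..n_terms] of f = q * prod_{n>=1} (1-q^n)^2 (1-q^(11n))^2,
    where a[k] is the coefficient of q^k (so a[0] = 0 and a[1] = 1)."""
    # series[k] is the coefficient of q^k in the infinite product (before the shift by q).
    series = [0] * n_terms
    series[0] = 1

    def multiply_by_one_minus_q_pow(s, m):
        # In-place truncated multiplication by (1 - q^m); descending k keeps old values.
        for k in range(len(s) - 1, m - 1, -1):
            s[k] -= s[k - m]

    for n in range(1, n_terms):
        for _ in range(2):
            multiply_by_one_minus_q_pow(series, n)
            multiply_by_one_minus_q_pow(series, 11 * n)
    return [0] + series  # shift by the leading factor q


def count_points(p):
    """Number of points of E: y^2 + y = x^3 - x^2 over F_p, including infinity."""
    count = 1  # the point at infinity
    for x in range(p):
        rhs = (x ** 3 - x ** 2) % p
        for y in range(p):
            if (y * y + y - rhs) % p == 0:
                count += 1
    return count


# Primes of good reduction for this curve (every prime except 11), up to 29.
good_primes = [2, 3, 5, 7, 13, 17, 19, 23, 29]
coeffs = eta_product_coeffs(30)
for p in good_primes:
    # The two numbers on each row should agree if E is modular in the sense above.
    print(p, p + 1 - count_points(p), coeffs[p])
```

The brute-force count takes on the order of $p^2$ steps, which is fine for a handful of small primes; it is meant only to make the identity $\#E(\mathbb{F}_p) = p + 1 - a_p$ concrete.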

Galois groups and modular forms

Viewed in this way, the Shimura-Taniyama conjecture becomes part of a much larger picture: the emerging, partly conjectural and partly proven correspondence between certain modular forms and two dimensional representations of $G_{\mathbb{Q}}$. This correspondence, which encompasses the Serre conjectures, the Fontaine-Mazur conjecture, and the Langlands program for $\mathrm{GL}_2$, represents a first step toward a higher dimensional, non-abelian generalization of class field theory.

Two-dimensional representations of $G_{\mathbb{Q}}$: In the first part of this century, class field theory gave a complete description of $G_{\mathbb{Q}}^{ab}$, the maximal (continuous) abelian quotient of $G_{\mathbb{Q}}$. In fact the Kronecker-Weber theorem asserts that $G_{\mathbb{Q}}^{ab} \cong \prod_p \mathbb{Z}_p^\times$, and one obtains a complete description of all one-dimensional representations of $G_{\mathbb{Q}}$. In the second half of this century much attention has focused on attempts to understand the whole group $G_{\mathbb{Q}}$, or more precisely to describe all its representations. Although there has been a fair degree of success in using modular forms to construct representations of $G_{\mathbb{Q}}$, less is known about how exhaustive these constructions are. The major results in the latter direction are the work of Langlands [Ll2] and the recent work of Wiles ([W3] completed by [TW]). Both concern two-dimensional representations of $G_{\mathbb{Q}}$ and give significant evidence that these representations are parametrised (in a very precise sense) by certain modular forms. The purpose of this article is to describe both the proven and conjectural parts of this theory, give a fairly detailed exposition of Wiles' recent contribution and explain the application to Fermat's Last Theorem. To make this description somewhat more precise let us distinguish three types of representation.

Artin representations and the Langlands-Tunnell theorem: Continuous representations $\rho : G_{\mathbb{Q}} \to \mathrm{GL}_2(\mathbb{C})$ are called (two-dimensional) Artin representations. Such representations necessarily have finite image, and are therefore semi-simple. We restrict our attention to those which are irreducible. They are conjectured to be in bijection (in a precise way) with certain newforms (a special class of eigenforms). Those which are odd (i.e. the determinant of complex conjugation is $-1$) should correspond to weight 1 holomorphic newforms. Those which are even should correspond to certain non-holomorphic (Maass) newforms. Two partial but deep results are known.

(a) (Deligne-Serre) If $f$ is a holomorphic weight one newform then the corresponding Artin representation can be constructed ([DS]).

(b) (Langlands-Tunnell) If $\rho$ is a two dimensional Artin representation with soluble image then the corresponding modular form exists ([Ll2] and [Tu]).

The proof of the latter result is analytic in nature, invoking the trace formula and the theory of L-functions.

$\ell$-adic representations and the Fontaine-Mazur conjecture: By an $\ell$-adic representation we shall mean any continuous representation $\rho : G_{\mathbb{Q}} \to \mathrm{GL}_2(K)$ which is unramified outside a finite set of primes and where $K$ is a finite extension of $\mathbb{Q}_\ell$ (generalizing slightly the notion of $\ell$-adic representation

that was introduced before). Given a holomorphic newform $f$ one can attach to $f$ a system of $\ell$-adic representations, following Eichler, Shimura, Deligne and Serre. These $\ell$-adic representations are called modular. The Fontaine-Mazur conjecture (see [FM]) predicts that if $\rho$ is an odd, irreducible, $\ell$-adic representation whose restriction to the decomposition group at $\ell$ is well enough behaved, then $\rho$ is modular. (The restriction on the behaviour of the representation on the decomposition group at $\ell$ is essential in this conjecture; it is not true that all odd, irreducible two dimensional $\ell$-adic representations are modular.) Before Wiles' work almost nothing was known about this conjecture, except that certain very special cases could be deduced from the work of Hecke, Langlands and Tunnell.

Mod $\ell$ representations and Serre's conjecture: A mod $\ell$ representation is a continuous representation $\bar\rho : G_{\mathbb{Q}} \to \mathrm{GL}_2(\bar{\mathbb{F}}_\ell)$. For example if $E/\mathbb{Q}$ is an elliptic curve then the action of $G_{\mathbb{Q}}$ on the $\ell$-division points of $E$ gives rise to a mod $\ell$ representation $\bar\rho_{E,\ell}$ which is just the reduction modulo $\ell$ of $\rho_{E,\ell}$. One can use the work of Eichler, Shimura, Deligne and Serre to associate to each mod $\ell$ eigenform a mod $\ell$ representation of $G_{\mathbb{Q}}$. The mod $\ell$ representations which arise in this way are called modular. Serre has conjectured [Se7] that every odd (absolutely) irreducible mod $\ell$ representation is modular and should arise from a mod $\ell$ eigenform with certain very specific properties. This conjecture can be thought of as having two parts.

The first asserts that every odd irreducible mod $\ell$ representation is modular. About this very little is known. It is known for $\bar\rho : G_{\mathbb{Q}} \to \mathrm{GL}_2(\mathbb{F}_2)$ by work of Hecke. It is also known for $\bar\rho : G_{\mathbb{Q}} \to \mathrm{GL}_2(\mathbb{F}_3)$. This latter result is an application of the Langlands-Tunnell theorem using the two accidents that there is a section to the homomorphism $\mathrm{GL}_2(\mathbb{Z}[\sqrt{-2}]) \twoheadrightarrow \mathrm{GL}_2(\mathbb{F}_3)$ and that $\mathrm{GL}_2(\mathbb{F}_3)$ is soluble. Partial results for $\bar\rho : G_{\mathbb{Q}} \to \mathrm{GL}_2(\mathbb{F}_5)$ follow from Wiles' work.

Given a mod $\ell$ representation arising from a mod $\ell$ eigenform, the second part of Serre's conjecture predicts the minimal weight and level for that mod $\ell$ eigenform. Here the situation is much better. There has been a lot of work over the last decade (including ideas from Mazur, Ribet, Carayol and Gross) and the problem is nearly completely resolved (see [Di1]). As was pointed out earlier, Ribet's contribution [R5] implies that, if one can show that the Galois representation $\bar\rho_{E,\ell}$ arising from the (semi-stable) Frey curve attached to a solution of Fermat's equation with exponent $\ell$ is modular, then one can show that this representation does not exist (because it would be modular of weight two and level two), and hence one can deduce Fermat's Last Theorem.

However we have seen that to show $\bar\rho_{E,\ell}$ is modular it suffices to show that

for some $\ell_0$, the $\ell_0$-adic representation $\rho_{E,\ell_0}$ is modular. In particular it suffices to verify that either $\rho_{E,3}$ or $\rho_{E,5}$ is modular. Hence the Shimura-Taniyama conjecture can be reduced to (part of) the Fontaine-Mazur conjecture for $\ell = 3$ and $5$. We have seen that for these primes part of Serre's conjecture is known, so it turns out it suffices to prove results of the form "Serre's conjecture for $\ell$ implies the Fontaine-Mazur conjecture for $\ell$". This is the direction of Wiles' work, although nothing quite this general has been proven yet.

Deformation theory: Thus the problem Wiles faces is to show that if $\rho$ is an odd $\ell$-adic representation which has irreducible modular reduction $\bar\rho$ and which is sufficiently well behaved when restricted to the decomposition group at $\ell$, then $\rho$ is modular. In fact he only proves a weakened version of such a result, but one which is sufficient to conclude that all semistable elliptic curves are modular.

Wiles approaches the problem by putting it in a more general setting. On the one hand he considers lifts of $\bar\rho$ to representations over complete noetherian local $\mathbb{Z}_\ell$-algebras $R$. For each finite set of primes $\Sigma$, one can consider lifts of type $\Sigma$; these are lifts which are well-behaved on a decomposition group at $\ell$, and whose ramification at primes not in $\Sigma$ is rather restricted. In particular, such a lift is unramified outside $\Sigma \cup S$ where $S$ is the set of ramified primes of $\bar\rho$. A method of Mazur (see [Maz3]) can then be used to show that if $\bar\rho$ is absolutely irreducible, then there is a representation

$$\rho_\Sigma^{\mathrm{univ}} : G_{\mathbb{Q}} \to \mathrm{GL}_2(R_\Sigma)$$

which is universal in the following sense. If $\rho : G_{\mathbb{Q}} \to \mathrm{GL}_2(R)$ is a lift of $\bar\rho$ of type $\Sigma$, then there is a unique local homomorphism $R_\Sigma \to R$ such that $\rho$ is equivalent to the pushforward of $\rho_\Sigma^{\mathrm{univ}}$. Thus the equivalence classes of type $\Sigma$ lifts to $\mathrm{GL}_2(R)$ can be identified with $\mathrm{Hom}(R_\Sigma, R)$. The local ring $R_\Sigma$ is called the universal deformation ring for representations of type $\Sigma$.

On the other hand Wiles constructs a candidate for a universal modular lifting of type $\Sigma$,

$$\rho_\Sigma^{\mathrm{mod}} : G_{\mathbb{Q}} \to \mathrm{GL}_2(\mathbb{T}_\Sigma).$$

The ring $\mathbb{T}_\Sigma$ is constructed from the algebra of Hecke operators acting on a certain space of modular forms. The universal property of $R_\Sigma$ gives a map $R_\Sigma \to \mathbb{T}_\Sigma$. The problem thus becomes: to show that this map is an isomorphism.² In fact, it can be shown to be a surjection without great difficulty, and the real challenge is to prove injectivity, i.e., to show, in essence, that $R_\Sigma$ is not larger than $\mathbb{T}_\Sigma$.

² Maps of this kind were already considered in [Maz3] and [BM], and it is conjectured in [MT] that these maps are isomorphisms in certain cases, though not in exactly the situations considered by Wiles.

By an ingenious piece of commutative algebra, Wiles found a numerical criterion for this map to be an isomorphism, and for the ring $\mathbb{T}_\Sigma$ to be a local complete intersection. This numerical criterion seems to be very close to a special case of the Bloch-Kato conjecture [BK]. Wiles further showed (by combining arguments from Galois cohomology and from the theory of congruences between modular forms) that this numerical criterion was satisfied if the minimal version $\mathbb{T}_\emptyset$ of this Hecke algebra (obtained by taking $\Sigma = \emptyset$, i.e., allowing the least possible amount of ramification in the deformations) was a complete intersection. Finally in [TW] it was proved that $\mathbb{T}_\emptyset$ is a complete intersection.

Outline of the paper

Chapter 1 recalls some basic notions from the classical theory of elliptic curves and modular forms, such as modular forms and modular curves over $\mathbb{C}$ and $\mathbb{Q}$, Hecke operators and $q$-expansions, and Eichler-Shimura theory. The Shimura-Taniyama conjecture is stated precisely in section 1.8.

Chapter 2 introduces the basic theory of representations of $G_{\mathbb{Q}}$. We describe Mazur's deformation theory and begin our study of the universal deformation rings using techniques from Galois cohomology and from the theory of finite flat group schemes. We also recall some basic properties of elliptic curves, both to explain Frey's argument precisely and to illustrate the uses of $\ell$-adic representations.

Chapter 3 explains how to associate Galois representations to modular forms. We then describe what was known and conjectured about associating modular forms to Galois representations before Wiles' work. After introducing the universal modular lifts of certain mod $\ell$ representations, we give the proof of Wiles' main theorems, taking for granted certain results of a more technical nature that are proved in the last two chapters.

Chapter 4 explains how to prove the necessary results concerning the structure of Hecke algebras: the generalization by Taylor and Wiles of a result of de Shalit, and the generalization by Wiles of a result of Ribet.

Chapter 5 establishes the fundamental results from commutative algebra discovered by Wiles, following modifications of the approach of Wiles and Taylor-Wiles proposed by Faltings and Lenstra.

Elliptic curves and modular forms

Elliptic curves

We begin with a brief review of elliptic curves. A general reference for the results discussed in this section is [Si1] and [Si2].

An elliptic curve $E$ over a field $F$ is a proper smooth curve over $F$ of genus one with a distinguished $F$-rational point. If $E/F$ is an elliptic curve and if $\omega$ is a non-zero holomorphic differential on $E/F$ then $E$ can be realised in the projective plane by an equation (called a Weierstrass equation) of the form

$$(W)\qquad Y^2Z + a_1XYZ + a_3YZ^2 = X^3 + a_2X^2Z + a_4XZ^2 + a_6Z^3$$

such that the distinguished point is $(0:1:0)$ (sometimes denoted $\infty$ because it is the point at infinity of the affine model obtained by setting $Z = 1$) and $\omega$ is the pull-back of $dx/(2y + a_1x + a_3)$ in the affine model. We define the following quantities associated to $(W)$:

$$b_2 = a_1^2 + 4a_2, \qquad b_4 = 2a_4 + a_1a_3, \qquad b_6 = a_3^2 + 4a_6,$$
$$b_8 = a_1^2a_6 + 4a_2a_6 - a_1a_3a_4 + a_2a_3^2 - a_4^2,$$
$$\Delta = 9b_2b_4b_6 - b_2^2b_8 - 8b_4^3 - 27b_6^2, \qquad j = (b_2^2 - 24b_4)^3/\Delta.$$

One can check that the equation $(W)$ defines an elliptic curve if and only if $\Delta$ is nonzero. One can also check that such equations define elliptic curves which are isomorphic over $\bar F$ if and only if they give the same quantity $j$. Thus $j$ only depends on $E$ so we will denote it $j_E$. The quantity $\Delta$ depends only on the pair $(E, \omega)$ so we shall denote it $\Delta(E, \omega)$. If $u$ belongs to $F^\times$ then $u^{12}\Delta(E, u\omega) = \Delta(E, \omega)$.

An elliptic curve $E/F$ has a natural structure of a commutative algebraic group with the distinguished $F$-rational point as the identity element.

An algebraic map between two elliptic curves which sends the distinguished point of one to the distinguished point of the other is automatically a morphism

of algebraic groups. A map between elliptic curves which has finite kernel (and hence, is generically surjective) is called an isogeny.

Elliptic curves over $\mathbf{C}$: If $F = \mathbf{C}$, then the curve $E$ is isomorphic as a complex analytic manifold to the complex torus $\mathbf{C}/\Lambda$, where $\Lambda$ is a lattice in $\mathbf{C}$, i.e., a discrete $\mathbf{Z}$-submodule of $\mathbf{C}$ of rank 2. The group law on $E(\mathbf{C})$ corresponds to the usual addition in $\mathbf{C}/\Lambda$. In terms of $\Lambda$, an affine equation for $E$ in $\mathbf{A}^2(\mathbf{C})$ is given by

$$y^2 = 4x^3 + g_2x + g_3,$$

where

$$g_2 = -60\sum_{\lambda\in\Lambda-\{0\}}\frac{1}{\lambda^4}, \qquad g_3 = -140\sum_{\lambda\in\Lambda-\{0\}}\frac{1}{\lambda^6}.$$

In terms of this equation, the map from $\mathbf{C}/\Lambda$ to $E(\mathbf{C})$ sends $z$ to $(x, y) = (\wp(z), \wp'(z))$, where $\wp(z)$ is the Weierstrass $\wp$-function associated to the lattice $\Lambda$. (Cf. [Si1], ch. VI.) The inverse map is given by integrating the holomorphic differential $\omega$, i.e., sending $P \in E(\mathbf{C})$ to the image of $\int_\gamma \omega$ in $\mathbf{C}/\Lambda$, where $\gamma$ is any path on $E(\mathbf{C})$ from $\infty$ to $P$, and $\Lambda$ is the lattice of periods $\int_\gamma \omega$, where $\gamma$ ranges over the integral homology $H_1(E(\mathbf{C}), \mathbf{Z})$. Replacing $\omega$ by $u\omega$ changes $\Lambda$ to $u\Lambda$, so that $\Lambda$ is determined by $E$ only up to homotheties. We scale $\Lambda$ so that one of its $\mathbf{Z}$-generators is 1, and another, $\tau$, has strictly positive imaginary part. This gives the analytic isomorphism:

$$E(\mathbf{C}) \simeq \mathbf{C}/\langle 1, \tau\rangle.$$

The complex number $\tau$ in the complex upper half plane $\mathcal{H}$ is well defined, modulo the natural action of $SL_2(\mathbf{Z})$ on $\mathcal{H}$ by Möbius transformations. (Thus the set of isomorphism classes of elliptic curves over $\mathbf{C}$ can be identified with the quotient $\mathcal{H}/SL_2(\mathbf{Z})$.)

The map $z \mapsto e^{2\pi i z}$ identifies $\mathbf{C}/\langle 1, \tau\rangle$ with $\mathbf{C}^\times/q^{\mathbf{Z}}$, where $q = e^{2\pi i\tau}$ is the multiplicative Tate period. The analytic isomorphism

$$E(\mathbf{C}) \simeq \mathbf{C}^\times/q^{\mathbf{Z}}$$

has the virtue of generalizing to the $p$-adic setting in certain cases, as we will see shortly.

Note that $|q| < 1$. The invariant $j$ can be expressed in terms of $q$ by a convergent power series with integer coefficients:

$$j = q^{-1} + 744 + 196884q + \cdots. \qquad (1.1.1)$$
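The first coefficients of the expansion (1.1.1) can be checked by machine. The sketch below is only an illustration and relies on two standard identities that are not stated in the text above, so they should be read as assumptions: $j = E_4^3/\Delta$, with $E_4 = 1 + 240\sum_{n\ge 1}\sigma_3(n)q^n$ and $\Delta = q\prod_{n\ge 1}(1-q^n)^{24}$. The function names are hypothetical.

```python
def sigma3(n):
    """Sum of the cubes of the positive divisors of n."""
    return sum(d ** 3 for d in range(1, n + 1) if n % d == 0)


def mul(a, b, prec):
    """Product of two power series (coefficient lists), truncated to prec terms."""
    c = [0] * prec
    for i, x in enumerate(a[:prec]):
        for k, y in enumerate(b[:prec - i]):
            c[i + k] += x * y
    return c


def j_coefficients(prec):
    """Return [c_{-1}, c_0, c_1, ...] with j = sum c_n q^n (assumes j = E4^3 / Delta)."""
    e4 = [1] + [240 * sigma3(n) for n in range(1, prec)]
    numerator = mul(mul(e4, e4, prec), e4, prec)
    # Delta / q = prod_{n >= 1} (1 - q^n)^24, truncated to prec terms.
    delta_over_q = [1] + [0] * (prec - 1)
    for n in range(1, prec):
        factor = [0] * prec
        factor[0], factor[n] = 1, -1
        for _ in range(24):
            delta_over_q = mul(delta_over_q, factor, prec)
    # Invert the unit power series Delta / q term by term.
    inverse = [1] + [0] * (prec - 1)
    for k in range(1, prec):
        inverse[k] = -sum(delta_over_q[i] * inverse[k - i] for i in range(1, k + 1))
    return mul(numerator, inverse, prec)


print(j_coefficients(3))  # coefficients of q^-1, q^0, q^1
```

Integer arithmetic is used throughout, so the integrality of the coefficients claimed in the text is visible directly in the output.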

The following basic facts are a direct consequence of the analytic theory:

Proposition 1.1 The subgroup $E[n](\mathbf{C})$ of points of order $n$ on $E(\mathbf{C})$ is isomorphic (non-canonically) to $\mathbf{Z}/n\mathbf{Z} \times \mathbf{Z}/n\mathbf{Z}$. More generally, if $F$ is any field of characteristic zero, the subgroup $E[n](F)$ is contained in $\mathbf{Z}/n\mathbf{Z} \times \mathbf{Z}/n\mathbf{Z}$.

Proof: The analytic theory shows that $E(\mathbf{C})$ is isomorphic as an abstract group to a product of two circle groups, and the first statement follows. The second statement follows from the Lefschetz principle (cf. [Si1], ch. VI, §6). □

Proposition 1.2 The endomorphism ring $\mathrm{End}_{\mathbf{C}}(E)$ of an elliptic curve over $\mathbf{C}$ is isomorphic either to $\mathbf{Z}$ or to an order in a quadratic imaginary field. The same is true if one replaces $\mathbf{C}$ by any field of characteristic 0.

Proof: An endomorphism of $E(\mathbf{C}) \simeq \mathbf{C}/\Lambda$ induces multiplication by a complex number $\alpha$ on the tangent space. Hence $\mathrm{End}_{\mathbf{C}}(E)$ is isomorphic to the ring of $\alpha \in \mathbf{C}$ satisfying $\alpha\Lambda \subset \Lambda$. Such a ring is isomorphic either to $\mathbf{Z}$ or to a quadratic imaginary order. The corresponding statement for fields of characteristic 0 follows as in the proof of proposition 1.1. □

If $\mathrm{End}_{\mathbf{C}}(E) \otimes \mathbf{Q}$ is a quadratic imaginary field, we say that $E$ has complex multiplication.

Remark 1.3 It follows from the arithmetic theory of complex multiplication (cf. [Si2], ch. 1) that any elliptic curve $E$ with complex multiplication is defined over an abelian extension of the quadratic imaginary field $K = \mathrm{End}_{\mathbf{C}}(E) \otimes \mathbf{Q}$. If $E$ is defined over $\mathbf{Q}$, then $K$ has class number one. There are only finitely many elliptic curves over $\mathbf{Q}$ with complex multiplication, up to twists (i.e., $\mathbf{C}$-isomorphism).

Elliptic curves over $\mathbf{Q}_p$: Now suppose that $E$ is an elliptic curve defined over the $p$-adic field $\mathbf{Q}_p$. There is an equation $(W^{\min})$

$$Y^2Z + a_1XYZ + a_3YZ^2 = X^3 + a_2X^2Z + a_4XZ^2 + a_6Z^3$$

for $E$ with the property $a_i \in \mathbf{Z}_p$ for all $i$ and $|\Delta|$ is minimal amongst all such equations for $E$. Although $(W^{\min})$ is not unique, the associated discriminant depends only on $E$ and is denoted $\Delta_E^{\min}$. Moreover the reduction of $(W^{\min})$ modulo the uniformizer $p$ defines a projective curve $\bar E$, which is independent of the particular minimal equation chosen. If $(W)$ is any equation for $E$ with coefficients in $\mathbf{Z}_p$ and with discriminant $\Delta$, then $\Delta_E^{\min}$ divides $\Delta$.

If $\bar E$ is a smooth curve we say that $E$ has good reduction at $p$. If $\bar E$ has a unique singular point which is a node we say that $E$ has multiplicative reduction at $p$. Otherwise $\bar E$ has a unique singular point which is a cusp and we say that $E$ has additive reduction at $p$. If $E$ has good or multiplicative reduction we say that it has semi-stable reduction at $p$, or simply that $E$ is semi-stable.

If $(W)$ defines a smooth curve mod $p$ then $E$ has good reduction at $p$ and $(W)$ is a minimal equation. If $\Delta \equiv 0 \bmod p$ but $b_2^2 \not\equiv 24b_4 \bmod p$, then modulo $p$ the equation $(W)$ defines a curve with a node. In this case $E$ has multiplicative reduction at $p$ and $(W)$ is a minimal equation.

Curves with good reduction: In that case $p$ does not divide $\Delta_E^{\min}$, and the reduction $\bar E$ is an elliptic curve over $\mathbf{F}_p$.

If $q$ is any power of $p$, and $\mathbf{F}_q$ is the field with $q$ elements, we define the integer $N_q$ to be the number of solutions to the equation $(W^{\min})$ in the projective plane $\mathbf{P}^2(\mathbf{F}_q)$. Thus $N_q$ is the order of the finite group $\bar E(\mathbf{F}_q)$. We define the integer $a_q$ by the formula

$$a_q = q + 1 - N_q.$$

The integers $a_q$ are completely determined by $a_p$: more precisely, we have

$$(1 - a_pp^{-s} + p^{1-2s})^{-1} = 1 + a_pp^{-s} + a_{p^2}p^{-2s} + a_{p^3}p^{-3s} + \cdots. \qquad (1.1.2)$$

We call the expression on the left the (local) $L$-function associated to $E$ over $\mathbf{Q}_p$, and denote it by $L(E/\mathbf{Q}_p, s)$. Concerning the size of $a_p$ we have the following fundamental result of Hasse, whose proof can be found in [Si1], ch. V, §1:

Theorem 1.4 $|a_p| \le 2\sqrt{p}$.
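As an illustration of the definition $a_p = p + 1 - N_p$ and of Theorem 1.4, the following sketch counts points by brute force. The choice of curve, $y^2 + y = x^3 - x^2$ (discriminant $-11$, hence good reduction at every $p \ne 11$), and the function names are assumptions made for this example only; the method is naive enumeration, not an efficient point-counting algorithm.

```python
def count_points(a1, a2, a3, a4, a6, p):
    """Number of solutions of (W) in the projective plane over F_p, by enumeration."""
    affine = sum(
        1
        for x in range(p)
        for y in range(p)
        if (y * y + a1 * x * y + a3 * y - (x ** 3 + a2 * x * x + a4 * x + a6)) % p == 0
    )
    return affine + 1  # the only point with Z = 0 is (0 : 1 : 0)


def a_p(coeffs, p):
    """The integer a_p = p + 1 - N_p attached to the equation with these coefficients."""
    return p + 1 - count_points(*coeffs, p)


CURVE = (0, -1, 1, 0, 0)  # (a1, a2, a3, a4, a6) for y^2 + y = x^3 - x^2

for p in (2, 3, 5, 7, 13):
    ap = a_p(CURVE, p)
    assert ap * ap <= 4 * p  # Hasse bound |a_p| <= 2 sqrt(p)
    kind = "supersingular" if ap % p == 0 else "ordinary"
    print(p, ap, kind)
```

Comparing $a_p^2$ with $4p$ keeps the check in exact integer arithmetic and avoids floating-point square roots.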

A further division among curves of good reduction plays a significant role in our later discussion. We say that $E$ has (good) ordinary reduction if $p$ does not divide $a_p$, and that it has supersingular reduction if $p$ divides $a_p$.

When $E$ has good reduction at $p$, we define its local conductor at $p$ to be $m_p(E) = 0$.

Curves of multiplicative reduction: Elliptic curves over $\mathbf{Q}_p$ which have multiplicative reduction at $p$ can be understood by using the $p$-adic analytic description discovered by Tate. More precisely, we can formally invert the power series (1.1.1) expressing $j$ in terms of $q$, to obtain a power series for $q$ in $j^{-1}$, having integer coefficients:

$$q = j^{-1} + 744j^{-2} + 750420j^{-3} + 872769632j^{-4} + \cdots. \qquad (1.1.3)$$
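The formal inversion leading to (1.1.3) can be reproduced with a few lines of integer power-series arithmetic. This is a sketch under assumptions: the list `QJ` holds the coefficients of $q\,j = 1 + 744q + 196884q^2 + \cdots$, where the first three come from (1.1.1) and the fourth, 21493760, is the standard next coefficient and is not given in the text; the function names are hypothetical.

```python
QJ = [1, 744, 196884, 21493760]  # q*j = 1 + 744 q + 196884 q^2 + 21493760 q^3 + ...


def mul(a, b, prec):
    """Product of two power series (coefficient lists), truncated to prec terms."""
    c = [0] * prec
    for i, x in enumerate(a[:prec]):
        for k, y in enumerate(b[:prec - i]):
            c[i + k] += x * y
    return c


def revert(t, prec):
    """Given t(q) = q + t_2 q^2 + ..., return b with q = sum_n b[n] t^n (n < prec)."""
    b = [0] * prec
    partial = [0] * prec              # sum of b[m] * t^m found so far, as a series in q
    t_pow = [1] + [0] * (prec - 1)    # t^0
    for n in range(1, prec):
        t_pow = mul(t_pow, t, prec)   # t^n = q^n + higher order terms
        target = 1 if n == 1 else 0   # coefficient of q^n in the series "q"
        b[n] = target - partial[n]
        partial = [x + b[n] * y for x, y in zip(partial, t_pow)]
    return b


def tate_period_coefficients(qj):
    """Coefficients [b_1, b_2, ...] of q as a power series in 1/j."""
    n = len(qj)
    u = [1] + [0] * (n - 1)           # u = 1 / (q*j), computed term by term
    for k in range(1, n):
        u[k] = -sum(qj[i] * u[k - i] for i in range(1, k + 1))
    t = [0] + u                       # 1/j = q * u(q)
    return revert(t, n + 1)[1:]


print(tate_period_coefficients(QJ))
```

The reversion is done degree by degree: since $t^n = q^n + \cdots$, the coefficient $b_n$ is forced by the coefficient of $q^n$ in the terms already fixed.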

If $E$ has multiplicative reduction, then $j \in \mathbf{Q}_p$ is non-integral, and hence the power series (1.1.3) converges, yielding a well-defined value of $q$ in $p\mathbf{Z}_p$. This is called Tate's $p$-adic period associated to $E$ over $\mathbf{Q}_p$. Note that we have $v_p(q) = -v_p(j) = v_p(\Delta_E^{\min})$.

We say that $E$ has split (resp. non-split) multiplicative reduction at $p$ if the two tangent lines to the node on $\bar E(\mathbf{F}_p)$ have slopes defined over $\mathbf{F}_p$ (resp. $\mathbf{F}_{p^2}$).

Proposition 1.5 (Tate) There is a $p$-adic analytic isomorphism

$$\Phi : \bar{\mathbf{Q}}_p^\times/q^{\mathbf{Z}} \longrightarrow E(\bar{\mathbf{Q}}_p),$$

which has the property that

$$\sigma(\Phi(x)) = \Phi(\sigma(x)^{\delta(\sigma)}), \qquad \forall \sigma \in G_{\mathbf{Q}_p},$$

where $\delta : G_{\mathbf{Q}_p} \longrightarrow \pm 1$ is

• the trivial character, if $E$ has split multiplicative reduction;

• the unique unramified quadratic character of $G_{\mathbf{Q}_p}$, if $E$ has non-split multiplicative reduction.

The proof of this proposition is explained in [Si2], ch. V, for example.

We define the $L$-function $L(E/\mathbf{Q}_p, s)$ to be

$$L(E/\mathbf{Q}_p, s) = \begin{cases} (1 - p^{-s})^{-1} & \text{if } E \text{ has split reduction,} \\ (1 + p^{-s})^{-1} & \text{if } E \text{ has non-split reduction.} \end{cases} \qquad (1.1.4)$$

In both cases the conductor $m_p(E)$ is defined to be 1.
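The case distinction above can be explored numerically. The sketch below classifies the reduction mod $p$ of a given integral Weierstrass equation using $\Delta$ and $c_4 = b_2^2 - 24b_4$ (the standard Weierstrass invariants, as in [Si1]), and then counts all points of the reduced cubic, singular point included. It rests on a fact assumed here rather than proved in the text: a nodal cubic over $\mathbf{F}_p$ has $p$ points when the node is split and $p + 2$ when it is non-split, so that $p + 1$ minus the count is $+1$ or $-1$, matching the two local factors in (1.1.4). When the reduced equation has a cusp the equation may fail to be minimal, so nothing is concluded about $E$ in that case. All function names are hypothetical.

```python
def invariants(a1, a2, a3, a4, a6):
    """Return (Delta, c4) for the Weierstrass equation with these coefficients."""
    b2 = a1 * a1 + 4 * a2
    b4 = 2 * a4 + a1 * a3
    b6 = a3 * a3 + 4 * a6
    b8 = a1 * a1 * a6 + 4 * a2 * a6 - a1 * a3 * a4 + a2 * a3 * a3 - a4 * a4
    delta = -b2 * b2 * b8 - 8 * b4 ** 3 - 27 * b6 * b6 + 9 * b2 * b4 * b6
    return delta, b2 * b2 - 24 * b4


def count_points(a1, a2, a3, a4, a6, p):
    """All projective solutions mod p (including a singular point, if any)."""
    affine = sum(
        1
        for x in range(p)
        for y in range(p)
        if (y * y + a1 * x * y + a3 * y - (x ** 3 + a2 * x * x + a4 * x + a6)) % p == 0
    )
    return affine + 1  # plus the point (0 : 1 : 0)


def local_data(coeffs, p):
    """Return (description of the reduced equation, p + 1 - number of points)."""
    delta, c4 = invariants(*coeffs)
    ap = p + 1 - count_points(*coeffs, p)
    if delta % p != 0:
        return "good", ap
    if c4 % p != 0:  # node: the equation is minimal and E is multiplicative at p
        return ("split multiplicative" if ap == 1 else "non-split multiplicative"), ap
    return "cusp (equation possibly non-minimal)", ap


print(local_data((0, -1, 1, 0, 0), 11))   # y^2 + y = x^3 - x^2 at p = 11
print(local_data((0, -1, 0, 0, 3), 3))    # y^2 = x^3 - x^2 + 3 at p = 3
```

In both multiplicative cases the second entry of the result is the coefficient $a_p$ for which $L(E/\mathbf{Q}_p, s) = (1 - a_pp^{-s})^{-1}$.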

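Curves of additive reduction: If $E$ has additive reduction at $p$, we simply define $L(E/\mathbf{Q}_p, s) = 1$. The conductor $m_p(E)$ is defined to be 2, if $p > 3$. When $p = 2$ or 3, it is determined by a somewhat more complicated recipe, given in [Ta].

Elliptic curves over $\mathbf{Q}$: Let $E$ be an elliptic curve defined over $\mathbf{Q}$. In particular $E$ may be viewed as a curve over $\mathbf{Q}_p$ for every $p$, and we define its (global) conductor by

$$N_E = \prod_p p^{m_p(E)}.$$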

The curve $E$ is said to be semi-stable if it is semi-stable over all $p$-adic fields $\mathbf{Q}_p$. Note that $E$ is semi-stable if and only if its conductor $N_E$ is square-free.

Using the fact that $\mathbf{Q}$ has class number 1, one can show that $E$ has a global minimal Weierstrass model $(W^{\min})$ which gives the equation of a minimal Weierstrass model over each $\mathbf{Q}_p$. The associated discriminant, denoted $\Delta_E^{\min}$, depends only on $E$. The associated differential, denoted $\omega_E^{\text{Néron}}$, is called the Néron differential. It is well-defined up to sign.

The following, known as the Mordell-Weil theorem, is the fundamental result about the structure of the group of rational points $E(\mathbf{Q})$. (Cf. for example [Si1].)

Theorem 1.6 The group $E(\mathbf{Q})$ is a finitely generated abelian group. Hence

$$E(\mathbf{Q}) \simeq T \oplus \mathbf{Z}^r,$$

where $T$ is the (finite) torsion subgroup of $E(\mathbf{Q})$, and $r \ge 0$ is the rank of $E$ over $\mathbf{Q}$.

Concerning the possible structure of $T$, there is the following deep result of Mazur, a variant of which also plays a crucial role in the proof of Fermat's Last Theorem:

Theorem 1.7 If $E/\mathbf{Q}$ is an elliptic curve, then its torsion subgroup is isomorphic to one of the following possibilities:

$$\mathbf{Z}/n\mathbf{Z}, \quad 1 \le n \le 10, \ n = 12; \qquad \mathbf{Z}/2n\mathbf{Z} \times \mathbf{Z}/2\mathbf{Z}, \quad 1 \le n \le 4.$$

The proof is given in [Maz1] (see also [Maz2]). Thanks to this result, the structure of the torsion subgroup $T$ is well understood. (Recently, the techniques of Mazur have been extended by Kamienny [Kam] and Merel [Mer] to prove uniform boundedness results on the torsion of elliptic curves over general number fields.)

Much more mysterious is the behaviour of the rank $r$. It is not known if $r$ can be arbitrarily large, although results of Mestre [Mes] and Nagao [Na] show that it is greater than or equal to 13 for infinitely many elliptic curves over $\mathbf{Q}$. It turns out that many of the deep results on $E(\mathbf{Q})$ and on $r$ are based on the relation with $L$-functions.

We define the global $L$-function of the complex variable $s$ by:

$$L(E/\mathbf{Q}, s) = \prod_p L(E/\mathbf{Q}_p, s). \qquad (1.1.6)$$

Exercise 1.8 Using theorem 1.4, show that the infinite product defining the $L$-function $L(E/\mathbf{Q}, s)$ converges absolutely on the right half plane $\mathrm{Real}(s) > 3/2$.

Conjecture 1.9 (Birch-Swinnerton-Dyer) The $L$-function $L(E/\mathbf{Q}, s)$ has an analytic continuation to the entire complex plane, and in particular is analytic at $s = 1$. Furthermore:

$$\mathrm{ord}_{s=1} L(E/\mathbf{Q}, s) = r.$$

There is also a more precise form of this conjecture, which expresses the leading coefficient of $L(E/\mathbf{Q}, s)$ at $s = 1$ in terms of certain arithmetic invariants of $E/\mathbf{Q}$. For more details, see [Si1], conj. 16.5.

As we will explain in more detail in section 1.8, the analytic continuation of $L(E/\mathbf{Q}, s)$ now follows from the work of Wiles and Taylor-Wiles and a strengthening by Diamond [Di2], for a very large class of elliptic curves over $\mathbf{Q}$, which includes all the semi-stable ones.

Abelian varieties: Elliptic curves admit higher-dimensional analogues, called abelian varieties, which also play a role in our discussion. Analytically, the set of complex points on an abelian variety is isomorphic to a quotient $\mathbf{C}^g/\Lambda$, where $\Lambda$ is a lattice in $\mathbf{C}^g$ of rank $2g$, satisfying the so-called Riemann period relations. A good introduction to the basic theory of abelian varieties can be found in [CS] and [We1].

1.2 Modular curves and modular forms over C

Modular curves: The group $SL_2(\mathbf{Z})$ of two by two integer matrices of determinant one acts by fractional linear (Möbius) transformations on the complex upper half plane

$$\mathcal{H} = \{z \in \mathbf{C} \mid \mathrm{Im}(z) > 0\},$$

equipped with its standard complex analytic structure. The principal congruence group $\Gamma(N)$ of level $N$ is the subgroup of matrices in $SL_2(\mathbf{Z})$ which reduce to the identity matrix modulo the positive integer $N$. A subgroup $\Gamma$ of $SL_2(\mathbf{Z})$ is called a congruence group if it contains $\Gamma(N)$ for some $N$. The level of $\Gamma$ is the smallest $N$ for which this is true. The most important examples of congruence groups are:

• The group $\Gamma_0(N)$ consisting of all matrices that reduce modulo $N$ to an upper triangular matrix.

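• The group $\Gamma_1(N)$ consisting of all matrices that reduce modulo $N$ to a matrix of the form $\begin{pmatrix} 1 & * \\ 0 & 1 \end{pmatrix}$.

The group $\Gamma_1(N)$ is a normal subgroup of $\Gamma_0(N)$, and the quotient $\Gamma_0(N)/\Gamma_1(N)$ is identified with $(\mathbf{Z}/N\mathbf{Z})^\times$ via

$$\begin{pmatrix} a & b \\ c & d \end{pmatrix} \mapsto d \bmod N.$$

For any subgroup $H$ of $(\mathbf{Z}/N\mathbf{Z})^\times$, we denote by $\Gamma_H(N)$ the group of matrices in $\Gamma_0(N)$ whose image in $\Gamma_0(N)/\Gamma_1(N)$ belongs to $H$.

If $\Gamma$ is a congruence subgroup of $SL_2(\mathbf{Z})$, define $Y_\Gamma$ to be the quotient of the upper half plane $\mathcal{H}$ by the action of $\Gamma$. One equips $Y_\Gamma$ with the analytic structure coming from the projection map $\pi : \mathcal{H} \longrightarrow Y_\Gamma$. (More precisely, if $y = \pi(\tau)$, and $G_\tau$ is the stabilizer of $\tau$ in $\Gamma$, then the local ring $\mathcal{O}_{Y_\Gamma, y}$ is identified with the local ring of germs of holomorphic functions at $\tau$ which are invariant under the action of $G_\tau$.) This makes $Y_\Gamma$ into a connected complex analytic manifold of dimension one, i.e., a Riemann surface. If $\Gamma$ is $\Gamma_0(N)$ (resp. $\Gamma_1(N)$, or $\Gamma(N)$), we will also denote $Y_\Gamma$ by $Y_0(N)$ (resp. $Y_1(N)$, or $Y(N)$). One compactifies $Y_\Gamma$ by adjoining a finite set of cusps which correspond to orbits of $\mathbf{P}^1(\mathbf{Q}) = \mathbf{Q} \cup \{\infty\}$ under $\Gamma$. Call $X_\Gamma$ the corresponding compact Riemann surface. (For more details, notably on the definition of the analytic structure on $X_\Gamma$ at the cusps, see for example [Kn], p. 311, or [Shi2], ch. 1.) It follows from the definition of this analytic structure that the field $K_\Gamma$ of meromorphic functions on $X_\Gamma$ is equal to the set of meromorphic functions on $\mathcal{H}$ satisfying

• (Transformation property): $f(\gamma\tau) = f(\tau)$, for all $\gamma \in \Gamma$;

• (Behaviour at the cusps): For all $\gamma \in SL_2(\mathbf{Z})$, the function $f(\gamma\tau)$ has a Puiseux series expansion $\sum_{n \ge -m} a_n q^{n/h}$ in fractional powers of $q = e^{2\pi i\tau}$.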

Use this to compute the genus of $X_0(p)$, $X_1(p)$, and $X(p)$ for $p$ prime. For details, see [Shi2], sec. 1.6 or [Ogg].

3. For $\Gamma = \Gamma(2)$, show that $X_\Gamma$ is isomorphic to $\mathbf{P}^1$, and that $Y_\Gamma$ is isomorphic to $\mathbf{P}^1 - \{0, 1, \infty\}$. Show that $\Gamma/\langle \pm 1\rangle$ is the free group on the two generators $g_1 = \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}$ and $g_2 = \begin{pmatrix} 1 & 0 \\ 2 & 1 \end{pmatrix}$.

4. Define a homomorphism $\Gamma(2) \longrightarrow \mathbf{Z}/n\mathbf{Z} \times \mathbf{Z}/n\mathbf{Z}$, by sending $g_1$ to $(1, 0)$ and $g_2$ to $(0, 1)$, and let $\Gamma$ denote its kernel. Show that $\Gamma$ is not in general a congruence subgroup and that the curve $Y_\Gamma := \mathcal{H}/\Gamma$ is birationally isomorphic to the Fermat curve of degree $n$ with affine equation $x^n + y^n = 1$.

Moduli interpretations: The points in $Y_\Gamma = \mathcal{H}/\Gamma$ can be interpreted as elliptic curves over $\mathbf{C}$ with some extra level $N$ structure. More precisely,

• If $\Gamma = \Gamma_0(N)$, then the $\Gamma$-orbit of $\tau \in \mathcal{H}$ corresponds to the complex torus $E = \mathbf{C}/\langle 1, \tau\rangle$ with the distinguished cyclic subgroup of order $N$ generated by $\frac{1}{N}$. Thus, points on $Y_0(N)$ parametrize isomorphism classes of pairs $(E, C)$ where $E$ is an elliptic curve over $\mathbf{C}$ and $C$ is a cyclic subgroup of $E$ of order $N$.

• If $\Gamma = \Gamma_1(N)$, then the $\Gamma$-orbit of $\tau$ corresponds to the complex torus $E = \mathbf{C}/\langle 1, \tau\rangle$ with the distinguished point of order $N$ given by $\frac{1}{N}$. Hence, points on $Y_1(N)$ parametrize isomorphism classes of pairs $(E, P)$ where now $P$ is a point on $E$ of exact order $N$.

Remark 1.10 One checks that the above rules set up a bijection between points on $Y_\Gamma$ and elliptic curves with the appropriate structures, and that the

projection $Y_1(N) \longrightarrow Y_0(N)$ sending $\Gamma_1(N)\tau$ to $\Gamma_0(N)\tau$ becomes the forgetful map sending $(E, P)$ to $(E, \langle P\rangle)$.

Remark 1.11 (This remark will be used in section 1.3 when discussing Hecke operators.) Define an $n$-isogeny of $\Gamma$-structures to be an $n$-isogeny of the underlying elliptic curves which sends one $\Gamma$-structure to the other. If $p$ is a prime not dividing $N$, then there are exactly $p + 1$ distinct $p$-isogenies from $(\mathbf{C}/\langle \tau, 1\rangle, \frac{1}{N})$, whose images are the pairs:

$$\left(\mathbf{C}/\left\langle \frac{\tau + i}{p}, 1\right\rangle, \frac{1}{N}\right) \ (i = 0, \ldots, p - 1), \qquad \left(\mathbf{C}/\langle p\tau, 1\rangle, \frac{p}{N}\right).$$

If $p$ divides $N$, then there are only $p$ distinct $p$-isogenies from $(\mathbf{C}/\langle \tau, 1\rangle, \frac{1}{N})$, since $(\mathbf{C}/\langle p\tau, 1\rangle, \frac{p}{N})$ is not a $\Gamma_1(N)$-structure (the point $p/N$ not being of exact order $N$ on the complex torus $\mathbf{C}/\langle p\tau, 1\rangle$).

Modular forms: Let $k$ be an even positive integer. A modular form of weight $k$ on $\Gamma$ is a holomorphic function $f$ on $\mathcal{H}$ satisfying:

• (Transformation property): $f(\gamma\tau) = (c\tau + d)^k f(\tau)$, for all $\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma$.

• (Behaviour at the cusps): For all $\gamma \in SL_2(\mathbf{Z})$, the function $(c\tau + d)^{-k} f(\gamma\tau)$ has a Puiseux series expansion $\sum_{n \ge 0} a_n q^{n/h}$ in fractional powers of $q = e^{2\pi i\tau}$. We call $\sum a_n q^{n/h}$ the Fourier expansion of $f$ at the cusp $\gamma^{-1}(i\infty)$.

A modular form which satisfies the stronger property that the constant coefficient of its Fourier expansion at each cusp vanishes is called a cusp form. We denote by $M_k(\Gamma)$ the complex vector space of modular forms of weight $k$ on $\Gamma$, and by $S_k(\Gamma)$ the space of cusp forms on $\Gamma$. (For examples, see [DI], sec. 2.2 and the references therein, especially [Shi2], ch. 2.)

This article is mainly concerned with modular forms of weight 2, and hence we will focus our attention on these from now on. A pleasant feature of the case $k = 2$ is that the cusp forms in $S_2(\Gamma)$ admit a direct geometric interpretation as holomorphic differentials on the curve $X_\Gamma$.

Lemma 1.12 The map $f(\tau) \mapsto \omega_f := 2\pi i f(\tau)\,d\tau$ is an isomorphism between the space $S_2(\Gamma)$ and the space $\Omega^1(X_\Gamma)$ of holomorphic differentials on the curve $X_\Gamma$.

Sketch of proof: One checks that the transformation property satisfied by $f(\tau)$ under $\Gamma$ causes the expression $f(\tau)\,d\tau$ to be $\Gamma$-invariant, and that the condition of vanishing at the cusps translates into holomorphicity of $f(\tau)\,d\tau$. (Note, for example, that $2\pi i\,d\tau = dq/q$, so that $\omega_f$ is holomorphic at $i\infty$ precisely when $f(q)$ vanishes at $q = 0$.) □

As a corollary, we find:

Corollary 1.13 The space $S_2(\Gamma)$ is finite-dimensional, and its dimension is equal to the genus $g$ of $X_\Gamma$.

Proof: This follows directly from the Riemann-Roch theorem, cf. [Ki], sec. 6.3. □

To narrow still further the focus of our interest, we will be mostly concerned with the cases $\Gamma = \Gamma_0(N)$ and $\Gamma_1(N)$. A slightly more general framework is sometimes convenient, so we suppose from now on that $\Gamma$ satisfies

$$\Gamma_1(N) \subset \Gamma \subset \Gamma_0(N).$$

Such a group $\Gamma$ is necessarily of the form $\Gamma_H(N)$ for some subgroup $H$ of $(\mathbf{Z}/N\mathbf{Z})^\times$. Because the transformation $\tau \mapsto \tau + 1$ belongs to $\Gamma$, the forms in $S_2(\Gamma)$ are periodic functions on $\mathcal{H}$ of period 1, and hence their Fourier expansions at $i\infty$ are of the form

$$f(\tau) = \sum_{n > 0} a_n q^n, \qquad q = e^{2\pi i\tau}, \quad a_n \in \mathbf{C}.$$

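The Petersson inner product: The spaces $S_2(\Gamma)$ are also equipped with a natural Hermitian inner product given by

$$\langle f, g\rangle = \frac{i}{8\pi^2}\int_{X_\Gamma} \omega_f \wedge \bar\omega_g = \int_{\mathcal{H}/\Gamma} f(\tau)\overline{g(\tau)}\,dx\,dy,$$

where $\tau = x + iy$. This is called the Petersson inner product.

The diamond operators: Suppose now that $\Gamma = \Gamma_1(N)$ and let $d$ be an element of $(\mathbf{Z}/N\mathbf{Z})^\times$. The map $\langle d\rangle$ which sends an elliptic curve with $\Gamma$-structure $(E, P)$ to the pair $(E, dP)$ gives an automorphism of $Y_\Gamma$ which extends to $X_\Gamma$. It is called the diamond operator. For $\tau$ in $\mathcal{H}$ and $\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ in $\Gamma_0(N)$, we have

$$\langle d\rangle(\Gamma\tau) = \Gamma\gamma(\tau).$$

In geometric terms, the diamond operators are the Galois automorphisms of the natural (branched) covering $X_1(N) \longrightarrow X_0(N)$ whose Galois group is isomorphic to $\Gamma_0(N)/\langle \pm\Gamma_1(N)\rangle = (\mathbf{Z}/N\mathbf{Z})^\times/\langle \pm 1\rangle$. Given an even Dirichlet character $\chi : (\mathbf{Z}/N\mathbf{Z})^\times \longrightarrow \mathbf{C}^\times$, say that $f$ is a modular form of level $N$ and character $\chi$ if it belongs to the $\chi$-eigenspace in $S_2(\Gamma_1(N))$ under this action. Let $S_2(N, \chi)$ denote the space of all such forms. Thus a function $f$ in $S_2(N, \chi)$ is a cusp form on $\Gamma_1(N)$ which satisfies the stronger transformation property:

$$f\left(\frac{a\tau + b}{c\tau + d}\right) = \chi(d)(c\tau + d)^2 f(\tau), \qquad \text{for all } \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma_0(N).$$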

It is a complex vector space of dimension $g = \mathrm{genus}(X_\Gamma)$. The integral homology $\Lambda = H_1(X_\Gamma, \mathbf{Z})$ maps naturally to $V$ by sending a homology cycle $c$ to the functional $\phi_c$ defined by $\phi_c(f) = \int_c \omega_f$. The image of $\Lambda$ is a lattice in $V$, i.e., a $\mathbf{Z}$-module of rank $2g$ which is discrete (cf. [Mu1], cor. 3.8). Fix a base point $\tau_0 \in \mathcal{H}$, and define the Abel-Jacobi map $\Phi_{AJ} : X_\Gamma(\mathbf{C}) \longrightarrow V/\Lambda$ by $\Phi_{AJ}(P)(f) = \int_{\tau_0}^P \omega_f$. Note that this is well-defined, i.e., it does not depend on the choice of path on $X_\Gamma$ from $\tau_0$ to $P$, up to elements in $\Lambda$.

We extend the map $\Phi_{AJ}$ by linearity to the group $\mathrm{Div}(X_\Gamma)$ of divisors on $X_\Gamma$, and observe that the restriction of $\Phi_{AJ}$ to the group $\mathrm{Div}^0(X_\Gamma)$ of degree 0 divisors does not depend on the choice of base-point $\tau_0$. Moreover we have the Abel-Jacobi theorem:

where $\gamma_i(\tau) = \frac{\tau + i}{p}$, and $\gamma_\infty(\tau) = \langle p\rangle p\tau$ represent the $p + 1$ curves with $\Gamma$-structure that are images of $(\mathbf{C}/\langle \tau, 1\rangle, \frac{1}{N})$ by a $p$-isogeny, and the $\gamma_i^*$ are the pull-back maps on differential forms on $\mathcal{H}$. (An isogeny of elliptic curves with $\Gamma$-structure is simply an isogeny between the underlying curves which sends one $\Gamma$-structure to the other.) Such a description makes it evident that $T_p(f)$ belongs to $S_2(\Gamma)$, if $f$ does. In terms of the Fourier expansion of $f = \sum a_n q^n$, the formula for the operator $T_p$ on $S_2(N, \chi)$ is given by:

$$T_p(f) = \sum_{p \mid n} a_n q^{n/p} + p\chi(p)\sum a_n q^{pn}.$$

If $p$ divides $N$, then we define the Hecke operator $U_p$ analogously, by summing again over all the cyclic $p$-isogenies of $\Gamma$-structures. Since there are only $p$ of them, the formula becomes simpler:

$$U_p(f) = \frac{1}{p}\sum_{i=0}^{p-1} f\left(\frac{\tau + i}{p}\right) = \sum_{p \mid n} a_n q^{n/p}.$$
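The $q$-expansion formulas for $T_p$ and $U_p$ translate directly into operations on lists of Fourier coefficients. The sketch below applies them to the level 11 form $f = (\eta(\tau)\eta(11\tau))^2$ of Exercise 1.23, computed from the product $q\prod_{n \ge 1}(1 - q^n)^2(1 - q^{11n})^2$; this product formula is the standard expansion of the Dedekind eta function and is an assumption not spelled out in the text. The character is taken to be trivial, and all function names are hypothetical.

```python
def eta_product_level_11(prec):
    """Coefficients a_0, ..., a_{prec-1} of q * prod (1 - q^n)^2 (1 - q^(11n))^2."""
    series = [0] * prec
    series[1] = 1  # the leading factor q (requires prec >= 2)
    for n in range(1, prec):
        for m in (n, 11 * n):
            for _ in range(2):
                # Multiply in place by (1 - q^m); descending k keeps old values intact.
                for k in range(prec - 1, m - 1, -1):
                    series[k] -= series[k - m]
    return series


def hecke_T(a, p, chi_p=1):
    """T_p on coefficients: b_n = a_{pn} + p * chi(p) * a_{n/p} (second term only if p | n)."""
    bound = (len(a) - 1) // p  # b_n is determined only while p*n is a known index
    return [
        a[p * n] + (p * chi_p * a[n // p] if n % p == 0 else 0)
        for n in range(bound + 1)
    ]


def hecke_U(a, p):
    """U_p on coefficients: b_n = a_{pn}."""
    bound = (len(a) - 1) // p
    return [a[p * n] for n in range(bound + 1)]


f = eta_product_level_11(40)
print(f[:14])
print(hecke_T(f, 2)[:8])   # compare with a_2 * f
print(hecke_U(f, 11))      # compare with a_11 * f
```

Because the input is truncated at $q^{39}$, the output of $T_p$ or $U_p$ is only determined up to $q^{\lfloor 39/p\rfloor}$, which is why the functions return shorter lists.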

The reader is invited to check that the Hecke operators of the form $T_p$ or $U_q$ commute with each other, and also that they commute with the diamond operators introduced in the previous section.

We extend the definition of the Hecke operators to operators $T_{p^n}$, with $n > 1$, by the inductive formulae

$$T_{p^{n+1}} = T_pT_{p^n} - \langle p\rangle pT_{p^{n-1}}, \qquad \text{if } (p, N) = 1,$$

and $T_{p^n} = U_p^n$ otherwise. We then define the operator $T_n$, where $n = \prod p_i^{e_i}$ is written as a product of powers of distinct primes $p_i$, by

$$T_n = \prod_i T_{p_i^{e_i}}.$$

This definition makes the Hecke operators multiplicative, i.e., $T_mT_n = T_{mn}$ if $(m, n) = 1$. (A more conceptual definition of the Hecke operator $T_n$ is that $T_n(f)$ is obtained by summing the pullback of $\omega_f$ over the maps describing all the cyclic $n$-isogenies of $\Gamma$-structures.) The relations among the different Hecke operators can be stated succinctly by saying that they obey the following (formal) identity:

$$\sum_n T_n n^{-s} = \prod_{p \nmid N}(1 - T_pp^{-s} + \langle p\rangle p^{1-2s})^{-1}\prod_{p \mid N}(1 - U_pp^{-s})^{-1}. \qquad (1.3.1)$$
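For a normalized eigenform the same relations hold for the eigenvalues, so all the $a_n$ can be generated from the $a_p$. The following sketch does this for a form with character $\chi$ and level $N$; the sample data (level 11, trivial character, and the listed prime eigenvalues of the form $(\eta(\tau)\eta(11\tau))^2$) are assumptions chosen for illustration, and the function names are hypothetical.

```python
def smallest_prime_factor(n):
    """Smallest prime dividing n (n >= 2)."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n


def eigenvalues_up_to(bound, prime_eigenvalue, chi, level):
    """Return [0, a_1, ..., a_bound] from the a_p, via the recursion and multiplicativity."""
    a = [0] * (bound + 1)
    a[1] = 1
    for n in range(2, bound + 1):
        p = smallest_prime_factor(n)
        pe, m = 1, n
        while m % p == 0:          # write n = p^e * m with p not dividing m
            m //= p
            pe *= p
        if m > 1:
            a[n] = a[pe] * a[m]                      # T_{mn} = T_m T_n for (m, n) = 1
        elif pe == p:
            a[n] = prime_eigenvalue[p]
        elif level % p == 0:
            a[n] = a[p] * a[pe // p]                 # T_{p^n} = U_p^n when p | N
        else:
            a[n] = a[p] * a[pe // p] - chi(p) * p * a[pe // (p * p)]
    return a


# Assumed sample data: prime eigenvalues of the level 11 form, trivial character.
PRIMES_11 = {2: -2, 3: -1, 5: 1, 7: -2, 11: 1, 13: 4}
print(eigenvalues_up_to(13, PRIMES_11, lambda p: 1, 11))
```

The dictionary of prime eigenvalues must contain every prime up to the chosen bound; a missing prime raises a `KeyError` rather than silently producing a wrong value.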

The reader can consult [DI], sec. 3 and the references therein (especially [Shi2], ch. 3 or [Kn]) for more details and different points of view on Hecke operators. Let $\mathbf{T}$ be the subring of $\mathrm{End}_{\mathbf{C}}(S_2(\Gamma))$ generated over $\mathbf{C}$ by all the Hecke operators $T_p$ for $p \nmid N$, $U_q$ for $q \mid N$, and $\langle d\rangle$ acting on $S_2(\Gamma)$.

Definition 1.16 A modular form $f$ is an eigenform if it is a simultaneous eigenvector for all the Hecke operators in $\mathbf{T}$, i.e., if there exists a $\mathbf{C}$-algebra homomorphism $\lambda : \mathbf{T} \longrightarrow \mathbf{C}$ such that $Tf = \lambda(T)f$, for all $T \in \mathbf{T}$.

A direct calculation shows that the coefficients $a_n$ of an eigenform $f$ can be recovered from the homomorphism $\lambda$ by the formula:

$$a_n(f) = a_1(f)\lambda(T_n).$$

It follows that the first Fourier coefficient $a_1$ of a non-zero eigenform is always non-zero, and that the non-trivial eigenspaces for $\mathbf{T}$ are all one-dimensional:

Proposition 1.17 Given a non-zero algebra homomorphism $\lambda : \mathbf{T} \longrightarrow \mathbf{C}$, there is exactly one eigenform $f$ up to scaling, which satisfies $Tf = \lambda(T)f$, for all $T \in \mathbf{T}$.

Sketch of proof: The proof of the existence of $f$ is an exercise in commutative algebra (localize $S_2(\Gamma)$ at the kernel of $\lambda$), and the uniqueness is clear from the formula above. □

We call an eigenform satisfying $a_1 = 1$ a normalized eigenform.

Atkin-Lehner theory: It is natural to ask whether $S_2(\Gamma)$ can be decomposed into a basis consisting of distinct normalized eigenforms. Unfortunately, this is not always possible, as the following exercise illustrates.

Exercise 1.18 Suppose that $p^3$ divides $N$ exactly. Let $\mathbf{T}'$ be the algebra of Hecke operators (generated by the operators $T_q$ with $q \nmid N/p^3$, and $U_q$ with $q \mid N/p^3$) acting on $S_2(N/p^3)$. Let $f = \sum_{n=1}^{\infty} a_n q^n$ be a $\mathbf{T}'$-eigenform of level $N/p^3$ in $S_2(N/p^3)$. Show that the space $S_f$ spanned by the forms $f(\tau)$, $f(p\tau)$, $f(p^2\tau)$, and $f(p^3\tau)$ is contained in $S_2(N)$, and is stable for the action of the Hecke operators $T_q$, $q \nmid N$, and $U_q$, $q \mid N$. Show that $S_f$ has no basis of simultaneous eigenforms for the Hecke algebra $\mathbf{T}$ of level $N$, so that the action of $\mathbf{T}$ on $S_f$ is not semi-simple.

Let $\mathbf{T}^0$ denote the subalgebra of $\mathbf{T}$ generated only by the good Hecke operators $T_q$ with $q \nmid N$, and $\langle d\rangle$.

Proposition 1.19 If $q$ does not divide $N$, the adjoint of the Hecke operator $T_q$ with respect to the Petersson scalar product is the operator $\langle q\rangle^{-1}T_q$, and the adjoint of $\langle q\rangle$ is $\langle q\rangle^{-1}$. In particular, the Hecke operators commute with their adjoints.

Proof: See [Kn], th. 9.18 and 8.22, or [Ogg]. □

Proposition 1.19 implies, by the spectral theorem for commuting operators that commute with their adjoints:

Proposition 1.20 The algebra $\mathbf{T}^0$ is semi-simple (i.e., it is isomorphic to a product $\mathbf{C} \times \cdots \times \mathbf{C}$ of a certain number of copies of $\mathbf{C}$), and there is a basis of $S_2(\Gamma)$ consisting of simultaneous eigenvectors for the operators $T_q$.

Thus, $\mathbf{T}^0$ has the merit of being semi-simple, while $\mathbf{T}$ is not in general. The cost of replacing $\mathbf{T}$ by $\mathbf{T}^0$, however, is that one loses multiplicity one, i.e., the eigenspaces for $\mathbf{T}^0$ need not be one-dimensional. For example, the space $S_f$ defined in the previous exercise is contained in a single eigenspace for $\mathbf{T}^0$.

The theory of Atkin-Lehner [AL] gives essentially a complete understanding of the structure of the algebra $\mathbf{T}$, and the structure of the space of eigenforms. To motivate the main result, observe that the problem in the exercise above seems to be caused by forms of level $N$ that are coming from forms of lower level $N/p^3$ by a straightforward operation, and are therefore not genuinely of level $N$. They are the analogues, in the context of modular forms, of non-primitive Dirichlet characters.

Definition 1.21 We define the old subspace of $S_2(\Gamma)$ to be the space spanned by those functions which are of the form $g(az)$, where $g$ is in $S_2(\Gamma_1(M))$ for some $M < N$ and $aM$ divides $N$. We define the new subspace of $S_2(\Gamma)$ to be the orthogonal complement of the old subspace with respect to the Petersson scalar product. A normalized eigenform in the new subspace is called a newform of level $N$.

The following result is the main consequence of the theory of Atkin-Lehner. It gives a complete answer to the question of what is the structure of the algebra $\mathbf{T}$ acting on $S_2(\Gamma)$.

Theorem 1.22 If $f$ is in the new subspace of $S_2(\Gamma)$ and is an eigenvector for all the operators in $\mathbf{T}^0$, then it is also an eigenform for $\mathbf{T}$, and hence is unique up to scaling. More generally, if $f$ is a newform of level $N_f \mid N$, then the space $S_f$ defined by

$$S_f = \{g \in S_2(\Gamma) \text{ such that } Tg = \lambda_f(T)g, \text{ for all } T \in \mathbf{T}^0\}$$

is stable under the action of all the Hecke operators in $\mathbf{T}$. It is spanned by the linearly independent forms $f(az)$ where $a$ ranges over the divisors of $N/N_f$. Furthermore, we have

$$S_2(\Gamma) = \oplus_f S_f,$$

where the sum is taken over all newforms $f$ of some level $N_f$ dividing $N$.

See [AL] for the proof in the case $\Gamma = \Gamma_0(N)$, and [La2], ch. VIII for the general case. (See also [DI], sec. 6 for an overview.)

Exercise 1.23 Consider the case where $\Gamma = \Gamma_0(22)$. Show that $X_0(22)$ is of genus 2, and hence that $S_2(22)$ has dimension 2. Show that $S_2(22)$ is equal to $S_f$, where $f = (\eta(\tau)\eta(11\tau))^2$ is a newform of level 11, so that in particular there are no newforms of level 22 on $\Gamma$. Show that $\mathbf{T}^0$ is isomorphic to $\mathbf{C}$ in this case, and that $\mathbf{T}$ is isomorphic to the semisimple algebra $\mathbf{C} \times \mathbf{C}$.
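The genus computation asked for in Exercise 1.23 can be checked numerically. The sketch below uses the classical genus formula for $X_0(N)$, which is not stated in this text and is therefore an assumption here (it can be found in [Shi2], sec. 1.6): $g = 1 + \mu/12 - \nu_2/4 - \nu_3/3 - \nu_\infty/2$, where $\mu$ is the index of $\Gamma_0(N)$ in $SL_2(\mathbf{Z})$ modulo $\pm 1$, $\nu_2$ and $\nu_3$ count elliptic points of order 2 and 3, and $\nu_\infty$ counts cusps. The function names are hypothetical.

```python
from fractions import Fraction
from math import gcd


def prime_factors(n):
    """Sorted list of the distinct primes dividing n."""
    primes, d = [], 2
    while d * d <= n:
        if n % d == 0:
            primes.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        primes.append(n)
    return primes


def euler_phi(n):
    """Euler's totient function, by direct enumeration."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


def genus_X0(N):
    """Genus of X_0(N) from the classical formula (assumed, see the lead-in)."""
    primes = prime_factors(N)
    mu = Fraction(N)
    for p in primes:
        mu *= Fraction(p + 1, p)
    nu2 = 0 if N % 4 == 0 else 1
    nu3 = 0 if N % 9 == 0 else 1
    for p in primes:
        # Kronecker symbols (-1/p) and (-3/p), with value 0 at p = 2 and p = 3 respectively.
        nu2 *= 1 + (0 if p == 2 else (1 if p % 4 == 1 else -1))
        nu3 *= 1 + (0 if p == 3 else (1 if p % 3 == 1 else -1))
    nu_inf = sum(euler_phi(gcd(d, N // d)) for d in range(1, N + 1) if N % d == 0)
    g = 1 + mu / 12 - Fraction(nu2, 4) - Fraction(nu3, 3) - Fraction(nu_inf, 2)
    assert g.denominator == 1
    return int(g)


print(genus_X0(11), genus_X0(22))
```

By Corollary 1.13 the values returned are also the dimensions of the corresponding spaces of weight 2 cusp forms on $\Gamma_0(N)$.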

Action on homology and Jacobians: Note that the Hecke operators act on $V = S_2(\Gamma)^\vee$ by duality. One checks (cf. [Kn], props. 11.23, 11.24) that the sublattice $\Lambda$ of $V$ is stable under the action of all the Hecke operators $T_n$, and of the diamond operators $\langle d\rangle$. Therefore the operators $T_n$ and $\langle d\rangle$ give rise to endomorphisms of the torus $V/\Lambda$, and hence the Jacobian variety $J_\Gamma$, in a natural way. The involution $\tau \mapsto -\bar\tau$ gives rise to an involution on $X_\Gamma(\mathbf{C})$ (which is complex conjugation on the model of $X_\Gamma$ over $\mathbf{R}$ deduced from the $\mathbf{Q}$-model defined in section 1.5). Since complex conjugation is continuous it also acts on the integral homology $\Lambda = H_1(X_\Gamma(\mathbf{C}), \mathbf{Z})$. Let $\Lambda^+$ and $\Lambda^-$ be the sublattices of $\Lambda$ on which complex conjugation acts by $+1$ and $-1$. These are sublattices of $\Lambda$ of rank $g$ which are stable under the Hecke operators, since complex conjugation commutes with the Hecke action.

A more algebraic description of the action of $T_p$ on $J_\Gamma$ is given via the notion of an algebraic correspondence. A correspondence on a curve $X$ is a divisor $C$ on $X \times X$ which is taken modulo divisors of the form $\{P\} \times X$ and $X \times \{Q\}$. Let $\pi_1$ and $\pi_2$ denote the projections of $X \times X$ onto each factor. Then the correspondence $C$ induces a map on divisors of $X$, by setting

$$C(D) = \pi_2(\pi_1^{-1}(D) \cdot C).$$

(For the definition of the intersection $D_1 \cdot D_2$ of two divisors, see [We1].) The map $C$ preserves divisors of degree 0, and sends principal divisors on $X$ to principal divisors. It gives a well defined algebraic endomorphism of the Jacobian variety $\mathrm{Jac}(X)$. Given a correspondence $C$, its transpose $C^\vee$ is defined to be the divisor of $X \times X$ obtained by interchanging the two factors of $X \times X$. One can define a natural notion of composition for correspondences, and the set of correspondences forms a ring. The general theory of correspondences and the proofs of the above facts are given in [We1], particularly the second chapter.

The Hecke correspondence $T_n$ is defined as the closure in $X_\Gamma \times X_\Gamma$ of the locus of points $(A, B)$ in $Y_\Gamma \times Y_\Gamma$ such that there is a degree $n$ cyclic isogeny of elliptic curves with $\Gamma$-structure from $A$ to $B$. For example, if $p$ is a prime not dividing $N$, then $T_p$ is an algebraic curve in $X_1(N) \times X_1(N)$ which is birational to $X_{\Gamma_1(N) \cap \Gamma_0(p)}$. The induced map on divisors in this case satisfies

$$T_p((E, P)) = \sum (E/C, P \bmod C)$$

where the sum runs over the subgroups $C$ of $E$ having order $p$. Note also that if $(A, B)$ belongs to $T_p$, then the isogeny dual to $A \longrightarrow B$ gives a $p$-isogeny from $B$ to $\langle p\rangle A$, so that

$$T_p^\vee = \langle p\rangle^{-1}T_p.$$

1.4 The L-function associated to a cusp form

For this section, let $f$ in $S_2(\Gamma_1(N))$ be a cusp form with Fourier expansion at $i\infty$ given by $\sum_n a_n q^n$. One has the following estimate for the size of the Fourier coefficients $a_n$:

Theorem 1.24 The coefficients $a_n \in \mathbf{C}$ satisfy the inequality

$$|a_n| \le c(f)\sigma_0(n)\sqrt{n},$$

where $c(f)$ is a constant depending only on $f$, and $\sigma_0(n)$ denotes the number of positive divisors of $n$.

Sketch of proof: This follows from proposition 1.51 of section 1.7 which relates the $p$-th Fourier coefficients of eigenforms, for $p$ a prime not dividing the level of $\Gamma$, to the number of points on certain abelian varieties over the finite field $\mathbf{F}_p$. The estimates of Hasse and Weil for the number of points on abelian varieties over finite fields (stated in theorem 1.4 of section 1.1 for the special case of elliptic curves; see [We1], §IV for the general case) thus translate into asymptotic bounds for the Fourier coefficients of these eigenforms. We note that the cruder estimate $|a_n| = O(n)$, which is enough for the purposes of this section, can be derived by a more elementary, purely analytic argument; cf. [Ogg], ch. IV, prop. 16. □

The $L$-function associated to $f$ is defined by the formula:

$$L(f, s) = \sum a_n n^{-s}.$$

As in exercise 1.8, one can show that the infinite sum defining $L(f, s)$ converges absolutely in the right half-plane $\mathrm{Re}(s) > \frac{3}{2}$. A much better insight is gained into the function $L(f, s)$ by noting that it is essentially the Mellin transform of the modular form $f$. More precisely, if we set $\Lambda(f, s) = N^{s/2}(2\pi)^{-s}\Gamma(s)L(f, s)$, then we have

$$\Lambda(f, s) = N^{s/2}\int_0^\infty f(iy)y^s\,dy/y. \qquad (1.4.1)$$

Exercise 1.25 Check the formula above.

This integral representation for $L(f, s)$ gives the analytic continuation of $L(f, s)$ to the entire complex plane. The modular invariance of $f$ translates into a functional equation for $L(f, s)$: more precisely, let $w_N$ be the Atkin-Lehner involution defined by $w_N(\tau) = -1/N\tau$. The reader may check that $w_N$ induces an involution of $X_\Gamma$, and hence of $\Omega^1(X_\Gamma) = S_2(\Gamma)$. One finds that $L(f, s)$ satisfies the functional equation:

$$\Lambda(f, s) = -\Lambda(w_N(f), 2 - s).$$

For a proof of this, see [Ogg], ch. V, lemma 1. Eigenforms for $\mathbf{T}$ in $S_2(N, \chi)$ have a great importance in the theory because their associated $L$-functions have an Euler product expansion, in addition to an analytic continuation and functional equation:

Theorem 1.26 If $f = \sum a_n q^n$ is a normalized eigenform in $S_2(N, \chi)$ for all the Hecke operators, then the $L$-function $L(f, s) = \sum a_n n^{-s}$ has the Euler product expansion

$$\prod_{p \nmid N}(1 - a_pp^{-s} + \chi(p)p^{1-2s})^{-1}\prod_{p \mid N}(1 - a_pp^{-s})^{-1}.$$

Proof: This follows directly from equation (1.3.1) of section 1.3. □

If $f$ is a newform of level $N$, then it is also an eigenform for $w_N$, so that the functional equation may be viewed as relating $L(f, s)$ and $L(f, 2 - s)$. We can also state the following more precise version of theorem 1.24 (see lemma 3.2 of [Hi2] for example for parts (b), (c) and (d)).

Theorem 1.27 Suppose that $f$ is a newform of level $N_f$ and let $N_\chi$ denote the conductor of its character $\chi$.

(a) If $p$ does not divide $N_f$ then $|a_p| \le 2\sqrt{p}$.

(b) If $p \| N_f$ and $p$ does not divide $N_\chi$ then $a_p^2 = \chi_0(p)$, where $\chi_0$ is the primitive character associated to $\chi$.

(c) If $p$ divides $N_f$ and $p$ does not divide $N_f/N_\chi$ then $|a_p| = \sqrt{p}$.

(d) If $p^2$ divides $N_f$ and $p$ divides $N_f/N_\chi$ then $a_p = 0$.
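As a numerical illustration of theorems 1.26 and 1.27, the following sketch expands the product $q \prod_{n \ge 1} (1-q^n)^2 (1-q^{11n})^2$ and checks parts (a) and (b) together with the multiplicativity forced by the Euler product. That this product is the unique newform of level 11 with trivial character is a standard fact assumed here, not something established in the text above; the function names are hypothetical.

```python
from math import isqrt


def eta_product_coeffs(level, nmax):
    """Return [a_0, ..., a_nmax] for q * prod_{n>=1} (1 - q^n)^2 (1 - q^(level*n))^2."""
    series = [0] * nmax          # coefficients of q^0 .. q^(nmax-1) of the product
    series[0] = 1
    for step in (1, level):
        for n in range(step, nmax, step):
            for _ in range(2):   # each factor occurs squared
                # multiply in place by (1 - q^n), highest degree first
                for k in range(nmax - 1, n - 1, -1):
                    series[k] -= series[k - n]
    return [0] + series          # the leading factor q shifts every index by one


def primes_up_to(bound):
    """Sieve of Eratosthenes: all primes p <= bound."""
    if bound < 2:
        return []
    sieve = [True] * (bound + 1)
    sieve[0] = sieve[1] = False
    for i in range(2, isqrt(bound) + 1):
        if sieve[i]:
            for j in range(i * i, bound + 1, i):
                sieve[j] = False
    return [n for n, is_prime in enumerate(sieve) if is_prime]


a = eta_product_coeffs(11, 300)   # assumed: the newform of level 11

for p in primes_up_to(300):
    if p != 11:
        assert a[p] ** 2 <= 4 * p      # theorem 1.27(a): |a_p| <= 2 sqrt(p)
assert a[11] ** 2 == 1                 # theorem 1.27(b) with trivial character
assert a[6] == a[2] * a[3]             # multiplicativity from the Euler product
assert a[4] == a[2] ** 2 - 2           # local factor at p = 2, which does not divide N
assert a[121] == a[11] ** 2            # local factor at p = 11, which divides N

print([a[p] for p in primes_up_to(31)])
```

The checks use exact integer arithmetic, so the bound in (a) is tested as $a_p^2 \le 4p$ rather than with floating-point square roots.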

1.5 Modular curves and modular forms over Q

Modular curves: For $\Gamma$ between $\Gamma_0(N)$ and $\Gamma_1(N)$, the modular curve $X_\Gamma$ has a model over Q. We describe such a model in the case of $\Gamma = \Gamma_0(N)$; the construction for general $\Gamma$ follows from similar considerations.

The key remark here is that, as was noted in section 1.2, the complex points on the curve $Y_0(N)$ have a natural interpretation as moduli of elliptic curves together with a cyclic subgroup of order $N$, given by sending the point $\tau \in \mathcal{H}/\Gamma_0(N)$ to the pair $(\mathbf{C}/\langle 1, \tau \rangle, \langle 1/N \rangle)$.

Consider the universal elliptic curve

$$E_j : y^2 + xy = x^3 - \frac{36}{j-1728}\, x - \frac{1}{j-1728}.$$

It is an elliptic curve over the function field $\mathbf{Q}(j)$, with $j$-invariant $j$. Let $d$ be the order of $\mathbf{P}^1(\mathbf{Z}/N\mathbf{Z})$, and let $C_1, \ldots, C_d$ denote the set of all cyclic subgroups of $E_j$ of order $N$, defined over $\overline{\mathbf{Q}(j)}$, an algebraic closure of $\mathbf{Q}(j)$. Fix one of these subgroups, $C$. The Galois group $\mathrm{Gal}(\overline{\mathbf{Q}(j)}/\mathbf{Q}(j))$ permutes the $C_i$ in a natural way. Let $F_N$ be the smallest extension of $\mathbf{Q}(j)$ (viewed as embedded in $\overline{\mathbf{Q}(j)}$) with the property that $\sigma(C) = C$ for all $\sigma \in \mathrm{Gal}(\overline{\mathbf{Q}(j)}/F_N)$. It can be seen (cf. [Shi2], thm. 6.6) that the Galois action on the $C_i$ is transitive, so that $F_N/\mathbf{Q}(j)$ is of degree $d$, and that it is a regular extension, i.e., $F_N \cap \bar{\mathbf{Q}} = \mathbf{Q}$.

Geometrically, $F_N$ can be viewed as the function field of a curve $X_{/\mathbf{Q}}$ over Q, with the inclusion of $\mathbf{Q}(j)$ corresponding to a map from $X_{/\mathbf{Q}}$ to the projective $j$-line over Q. The pair $(E_j, C)$ is an elliptic curve over $\mathbf{Q}(j)$ with a subgroup of order $N$ defined over $F_N$. Using $(E_j, C)$, each complex point of $X$ gives an elliptic curve over C with a subgroup of order $N$, provided the point does not lie over $j = 0$, $1728$ or $\infty$. The resulting map to $X_0(N)$ extends to an isomorphism from $X$ to $X_0(N)$. The curve $X$ thus constructed, together with this identification, is the desired model of $X_0(N)$ over Q.

More concretely, the functions $j = j(\tau)$ and $j_N = j(N\tau)$ are related by a polynomial equation $\Phi_N(j, j_N) = 0$ with coefficients in Q, of bidegree $d$. The field $F_N$ is the function field of the affine curve $\Phi_N(X, Y) = 0$, and the mapping $\tau \mapsto (j(\tau), j(N\tau))$ gives a birational equivalence between $\mathcal{H}/\Gamma_0(N)$ and the complex curve defined by the equation $\Phi_N$. In practice it is not feasible to write down the polynomial $\Phi_N$, except for certain very small values of $N$. To study the models over Q of $X_0(N)$, more indirect methods are needed, which rely crucially on the moduli interpretation of $X_0(N)$. Similar remarks hold for $X_1(N)$.

Models over Z: The work of Igusa [Ig], Deligne-Rapoport [DR], Drinfeld [Dr], and Katz-Mazur [KM] uses the moduli-theoretic interpretation to describe a canonical proper model for $X_\Gamma$ over $\mathrm{Spec}\, \mathbf{Z}$. These models allow us to talk about the reduction of $X_\Gamma$ over finite fields $\mathbf{F}_p$, for $p$ prime. The curve has good reduction at primes $p$ not dividing $N$, with the non-cuspidal points of $X_{\Gamma/\mathbf{F}_p}$ corresponding to elliptic curves over $\bar{\mathbf{F}}_p$ with $\Gamma$-structure. The singular fibers at primes $p$ dividing $N$ can also be described precisely; an important special case (see [DR]) is that of $\Gamma_0(N)$ with $p$ exactly dividing $N$. For further discussion and references, see [DI], sec. 8, 9.
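The degree $d$ of $F_N/\mathbf{Q}(j)$ is the order of $\mathbf{P}^1(\mathbf{Z}/N\mathbf{Z})$, equivalently the number of cyclic subgroups of order $N$ in $(\mathbf{Z}/N\mathbf{Z})^2$. The sketch below counts these by brute force and compares the result with the closed formula $N \prod_{p \mid N} (1 + 1/p)$; that formula is a standard fact assumed here rather than taken from the text, and the function names are hypothetical.

```python
from math import gcd


def proj_line_order_bruteforce(N):
    """Order of P^1(Z/NZ): primitive pairs (a, b) mod N up to scaling by units."""
    units = sum(1 for u in range(1, N + 1) if gcd(u, N) == 1)
    primitive_pairs = sum(
        1
        for a in range(N)
        for b in range(N)
        if gcd(gcd(a, b), N) == 1   # (a, b) generates a cyclic subgroup of order N
    )
    # units act freely on primitive pairs, so every orbit has the same size
    return primitive_pairs // units


def proj_line_order_formula(N):
    """Assumed closed form: N * prod_{p | N} (1 + 1/p), in exact integer arithmetic."""
    d, n, p = N, N, 2
    while p * p <= n:
        if n % p == 0:
            d = d // p * (p + 1)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:                       # one prime factor larger than sqrt(N) may remain
        d = d // n * (n + 1)
    return d


for N in (11, 19, 26, 57):
    print(N, proj_line_order_bruteforce(N), proj_line_order_formula(N))
```

The rapid growth of $d$ with $N$ is one way to see why writing down $\Phi_N$ explicitly is feasible only for very small $N$.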

From now on, when we write $X_\Gamma$, $X_0(N)$, or $X_1(N)$, we will mean the curves over Q which are the models described above for the complex curves defined in section 1.2.

Remark 1.28 When considering $q$-expansions, it is more convenient to use a different set of models over Z for these complex curves. We define $X_1^{\mu}(N)$ in the case of $\Gamma_1(N)$ as a model over Z which parametrizes pairs $(E, i)$ where $i$ is an embedding of $\mu_N$ in the (generalized) elliptic curve $E$. (So $X_1^{\mu}(N)$ is the model denoted $X_\mu(N)$ in [DI], sec. 9.3, assuming $N > 4$.) For $\Gamma$ between $\Gamma_0(N)$ and $\Gamma_1(N)$, we define $X_\Gamma^{\mu}$ as the corresponding quotient of $X_1^{\mu}(N)$. This model has good reduction at primes $p$ not dividing $N$, but unlike the models mentioned above, its fibers at primes $p$ dividing $N$ are smooth and irreducible, but not proper. In the case of $\Gamma = \Gamma_0(N)$, the curve $X^{\mu}_{\Gamma,\mathbf{Q}}$ can be identified with $X_0(N)$. However, this is not the case in general: the cusp $\infty$ is a rational point of $X^{\mu}_{\Gamma,\mathbf{Q}}$ but not necessarily of $X_\Gamma$.

Jacobians: Weil's theory [We1] of the Jacobian shows that the Jacobians $J_\Gamma$ defined in section 1.2 as complex tori also admit models over Q. When we speak of $J_\Gamma$, $J_0(N)$ and $J_1(N)$ from now on, we will refer to these as abelian varieties defined over Q. Thus, the points in $J_\Gamma(K)$, for any Q-algebra $K$, are identified with the divisor classes on $X_\Gamma$ of degree 0, defined over $K$.

We let $J_{\Gamma/\mathbf{Z}}$ denote the Néron model of the Jacobian $J_\Gamma$ over $\mathrm{Spec}(\mathbf{Z})$. Using this model we define $J_{\Gamma/A}$ for arbitrary rings $A$. In particular we can consider $J_{\Gamma/\mathbf{F}_p}$, the reduction of the Jacobian in characteristic $p$, which is closely related to the reduction of the integral model of the curve $X_\Gamma$ mentioned above. In particular, if $p$ does not divide the level of $\Gamma$, then $J_{\Gamma/\mathbf{F}_p}$ can be identified with the Jacobian of $X_{\Gamma/\mathbf{F}_p}$. For a treatment of the case $\Gamma = \Gamma_0(N)$ with $p$ exactly dividing $N$, see the appendix of [Maz1]; for more general discussion and references, see [DI], sec. 10, especially sec. 10.3.

Hecke operators: The Hecke operators have a natural moduli interpretation, which was already touched upon in section 1.3. In particular, one finds that the operator $T_n$ arises from a correspondence which is defined over Q, and gives rise to an endomorphism of the Jacobian $J_\Gamma$ which is defined over Q. This in turn gives rise to an endomorphism of the Néron model $J_{\Gamma/\mathbf{Z}}$, and we can then consider the endomorphism $T_n$ on the reduction of the Jacobian in characteristic $p$. Recall that if $p$ is a prime not dividing $N$, we may identify this reduction with the Jacobian of $X_{\Gamma/\mathbf{F}_p}$. (Cf. [MW], ch. 2, sec. 1, prop. 2.) Furthermore, one can show that the moduli-theoretic interpretation of the Hecke operator remains valid in characteristic $p$; i.e., the endomorphism $T_n$ of $J_{\Gamma/\mathbf{F}_p}$ is induced by a map on divisors satisfying, for all ordinary elliptic curves $A$ with $\Gamma$-structure:

$$T_n(A) = \sum_i i(A),$$

where the sum is taken over all cyclic isogenies $i$ of degree $n$. (See [DI], sec. 10.2 for further discussion and references.)

This description allows one to analyse, for example, the Hecke operator $T_p$ over $\mathbf{F}_p$, when $(p, N) = 1$. Let us work with $\Gamma = \Gamma_1(N)$, to illustrate the idea. For a variety $X$ over $\bar{\mathbf{F}}_p$, let $\phi_X$ be the Frobenius morphism on $X$ defined by raising coordinates to the $p$th power. Thus if $(E, P)$ corresponds to a point of $X_1(N)_{/\mathbf{F}_p}$, then $\phi_E$ is an isogeny of degree $p$ from $(E, P)$ to the pair $(E^{\phi}, P^{\phi}) = \phi_{X_1(N)}(E, P)$. The graph of $\phi_{X_1(N)}$ in $(X_1(N) \times X_1(N))_{/\mathbf{F}_p}$ is a correspondence of degree $p$, which we call $F$. Let $F'$ be the transpose of this correspondence. The endomorphism $F$ of $J_\Gamma$ induced by $F$ is the Frobenius endomorphism $\phi_J$, and the endomorphism $F'$ is the dual endomorphism (in the sense of duality of abelian varieties). Now consider the divisor

$$F'((E, P)) = (E_1, P_1) + \cdots + (E_p, P_p),$$

where the $(E_i, P_i)$ are elliptic curves with $\Gamma$-structure in characteristic $p$. Since $\phi_{E_i}$ is an isogeny of degree $p$ from $(E_i, P_i)$ to $(E, P)$, we also have the dual isogeny from $(E, P)$ to $(E_i, pP_i)$. If $E$ is ordinary at $p$, then the points $(E^{\phi}, P^{\phi}), (E_1, pP_1), \ldots, (E_p, pP_p)$ are a complete list of the distinct curves with $\Gamma$-structure which are $p$-isogenous to $(E, P)$. Hence one has the equality of divisors on $X_1(N)_{/\mathbf{F}_p}$:

$$T_p((E, P)) = (E^{\phi}, P^{\phi}) + (E_1, pP_1) + \cdots + (E_p, pP_p) = (F + \langle p \rangle F')((E, P)).$$

Since the ordinary points are dense on $X_1(N)_{/\mathbf{F}_p}$, we deduce that $T_p = F + \langle p \rangle F'$ as endomorphisms of $J_1(N)_{/\mathbf{F}_p}$. This equation, known as the Eichler-Shimura congruence relation, plays a central role in the theory. (For more details, see [DI], sec. 10.2, 10.3.)

Theorem 1.29 If $p \nmid N$ then the endomorphism $T_p$ of $J_{\Gamma/\mathbf{F}_p}$ satisfies

$$T_p = F + \langle p \rangle F'.$$

Remark 1.30 This was proved by Eichler [Ei] to hold for all but finitely many $p$ in the case of $\Gamma_0(N)$, and by Shimura ([Shi1], see also [Shi2], ch. 7) in the case of $\Gamma_1(N)$. The fact that it holds for all $p$ not dividing $N$ follows from work of Igusa [Ig].

Modular forms: In the same way that modular curves have models over Q and over Z, the Fourier coefficients of modular forms also have natural rationality and integrality properties. We start by sketching the proof of:

Theorem 1.31 The space $S_2(\Gamma)$ has a basis consisting of modular forms with integer Fourier coefficients.

Proof: The Hecke operators act on the integral homology $\Lambda^+$ in a way that is compatible with the action on $S_2(\Gamma)$ and respects the natural (Poincaré) duality between these two spaces. Hence, if $\{\lambda_n\}_{n \in \mathbf{N}}$ is a system of eigenvalues for the $T_n$, then the $\lambda_n$ are algebraic integers in some finite extension $K$ of Q, and the system $\{\sigma\lambda_n\}_{n \in \mathbf{N}}$ is a system of eigenvalues for the $T_n$ for any Galois automorphism $\sigma$ of $\bar{\mathbf{Q}}/\mathbf{Q}$. Hence, we have shown:

Proposition 1.32 If $f \in S_2(M, \chi)$ is a newform of some level $M$ dividing $N$, then its Fourier coefficients lie in a finite extension $K$ of Q. Moreover, if $\sigma \in \mathrm{Gal}(\bar{\mathbf{Q}}/\mathbf{Q})$ is any Galois automorphism, then the Fourier series $f^{\sigma}$ obtained by applying $\sigma$ to the Fourier coefficients is a newform in $S_2(M, \chi^{\sigma})$.

The explicit description of $S_2(\Gamma)$ given in section 1.3 implies that $S_2(\Gamma)$ is spanned by forms having Fourier coefficients which are algebraic integers in some finite (Galois) extension $K$ of Q, and that the space of forms with Fourier coefficients in $K$ is stable under the natural action of $\mathrm{Gal}(K/\mathbf{Q})$ on Fourier expansions. An application of Hilbert's theorem 90 shows that $S_2(\Gamma)$ has a basis consisting of forms with rational Fourier expansions, and the integrality of the Fourier coefficients of eigenforms yields the integrality statement of theorem 1.31. □

We define $S_2(\Gamma, \mathbf{Z})$ to be the space of modular forms with integral Fourier coefficients in $S_2(\Gamma)$. Theorem 1.31 states that $S_2(\Gamma, \mathbf{Z}) \otimes \mathbf{C} = S_2(\Gamma)$. Given any ring $A$, we define

$$S_2(\Gamma, A) = S_2(\Gamma, \mathbf{Z}) \otimes A,$$

and define $S_2(N, A)$ and $S_2(N, \chi, A)$ (where now $\chi$ is a character with values in $A^{\times}$) in the obvious way. If $A$ is contained in C, the $q$-expansion principle below allows us to identify $S_2(\Gamma, A)$ with the set of modular forms in $S_2(\Gamma)$ with Fourier coefficients in $A$.

The q-expansion principle: Because the modular curve $X_0(N)$ has a model over Q, the space of modular forms $S_2(N) = \Omega^1(X_0(N))$ has a natural rational structure, given by considering the differential forms on $X_0(N)$ defined over Q. The fundamental $q$-expansion principle (see [DR], ch. 7, or [Kat], sec. 1.6) says that these algebraic structures are the same as those obtained analytically by considering $q$-expansions at $\infty$. More generally, using the model $X_\Gamma^{\mu}$, we

invertible in $A$, then the induced map $S_2(\Gamma, A) \to \Omega^1(X^{\mu}_{\Gamma,A})$ is an isomorphism. Furthermore, if $A$ is a subring of C, then this isomorphism identifies $S_2(\Gamma, A)$ with the set of modular forms in $S_2(\Gamma)$ having coefficients in $A$.

1.6 The Hecke algebra

It follows directly from the formulas for the Hecke operators acting on $q$-expansions that the $T_n$ leave $S_2(\Gamma_0(N), \mathbf{Z})$ stable, as well as the subspace of $S_2(N, \chi)$ with coefficients in $\mathbf{Z}[\chi]$. Using the $q$-expansion principle (theorem 1.33), one can also show ([DI], sec. 12.4) that the diamond operators preserve the spaces of cusp forms on $\Gamma$ with integral Fourier expansions, and hence that the space $S_2(\Gamma, \mathbf{Z})$ is preserved by all the Hecke operators.

We define $\mathbf{T}_{\mathbf{Z}}$ to be the ring generated over Z by the Hecke operators $T_n$ and $\langle d \rangle$ acting on the space $S_2(\Gamma, \mathbf{Z})$. More generally, if $A$ is any ring, we define $\mathbf{T}_A$ to be the $A$-algebra $\mathbf{T}_{\mathbf{Z}} \otimes A$. This Hecke ring acts on the space $S_2(\Gamma, A)$ in a natural way. Before studying the structure of the Hecke rings $\mathbf{T}_A$ as we vary the rings $A$, we note the following general result (cf. [Shi2], ch. 3):

Lemma 1.34 The space $S_2(\Gamma, A)^{\vee} = \mathrm{Hom}_A(S_2(\Gamma, A), A)$ is a free $\mathbf{T}_A$-module of rank one.

Sketch of proof: One checks that the pairing $\mathbf{T}_{\mathbf{Z}} \times S_2(\Gamma, \mathbf{Z}) \to \mathbf{Z}$ defined by $(T, f) \mapsto a_1(Tf)$ sets up a perfect, $\mathbf{T}_{\mathbf{Z}}$-equivariant duality between $\mathbf{T}_{\mathbf{Z}}$ and $S_2(\Gamma, \mathbf{Z})$. The result for arbitrary $A$ follows. □

Hecke rings over C: If $A = \mathbf{C}$, then the structure of the ring $\mathbf{T} = \mathbf{T}_{\mathbf{C}}$ is completely described by theorem 1.22. More precisely, if $\mathbf{T}_{\mathbf{C},f}$ denotes the image of the Hecke algebra acting on the space $S_f$ defined in section 1.3, then

$$\mathbf{T}_{\mathbf{C}} = \oplus_f \mathbf{T}_{\mathbf{C},f},$$

where the direct sum ranges over all distinct newforms $f$ of some level $N_f$ dividing $N$. Furthermore, the algebra $\mathbf{T}_{\mathbf{C},f}$ can be described explicitly. In particular, if $f$ is a newform of level $N$ then $\mathbf{T}_{\mathbf{C},f}$ is isomorphic to C, but if $N_f$ is not equal to $N$ then the ring $\mathbf{T}_{\mathbf{C},f}$ need not be a semi-simple algebra over C.

Lemma 1.34 in the case $A = \mathbf{C}$ says that $V^{\vee} = S_2(\Gamma)^{\vee}$ is a free $\mathbf{T}_{\mathbf{C}}$-module of rank one, but we also have:

$\mathrm{Gal}(\bar{\mathbf{Q}}/\mathbf{Q})$) of a normalized newform $f$ of some level $N_f$ dividing $N$, and let $K_f$ be the field extension of Q generated by the Fourier coefficients of $f$. The space $\oplus_{g \in [f]} S_g$ is a vector space of dimension $[K_f : \mathbf{Q}]\, \sigma_0(N/N_f)$, which is spanned by modular forms with rational Fourier coefficients. Let $S_{[f]}$ be the Q-subspace of forms in $\oplus_{g \in [f]} S_g$ with rational Fourier coefficients. The space $S_{[f]}$ is stable under the action of $\mathbf{T}_{\mathbf{Q}}$, and letting $\mathbf{T}_{\mathbf{Q},[f]}$ be the image of $\mathbf{T}_{\mathbf{Q}}$ acting on $S_{[f]}$, we obtain the direct sum decomposition

$$\mathbf{T}_{\mathbf{Q}} = \oplus_{[f]} \mathbf{T}_{\mathbf{Q},[f]},$$

where the sum is taken over the distinct $G_{\mathbf{Q}}$-orbits of normalized newforms $f$ of some level $N_f$ dividing $N$. If $N_f$ is equal to $N$, then the algebra $\mathbf{T}_{\mathbf{Q},[f]}$ is isomorphic to the field $K_f$. If $N_f$ is a proper divisor of $N$, then, as in the complex case, the algebra $\mathbf{T}_{\mathbf{Q},[f]}$ is a (not necessarily semi-simple) algebra over Q of rank $\sigma_0(N/N_f)[K_f : \mathbf{Q}]$. The nature of the fields $K_f$, and in general the structure of $\mathbf{T}_{\mathbf{Q}}$, is very poorly understood at this stage; for example, one does not know how to characterize the number fields that occur as a $K_f$ for some $f$ (but they are all known to be totally real or CM fields).

The ring $\mathbf{T}_{\mathbf{Q}}$ acts naturally on the rational homology $H_1(X_\Gamma, \mathbf{Q}) = \Lambda \otimes \mathbf{Q}$, and we have

Lemma 1.37 The module $\Lambda \otimes \mathbf{Q}$ is free of rank two over $\mathbf{T}_{\mathbf{Q}}$.

Sketch of proof: The modules $\Lambda^+ \otimes \mathbf{C} \simeq V^{\vee}$ and $\Lambda^- \otimes \mathbf{C} \simeq V^{\vee}$ are free of rank one over $\mathbf{T}_{\mathbf{C}}$, by lemma 1.34. This implies that $\Lambda^+ \otimes \mathbf{Q}$ and $\Lambda^- \otimes \mathbf{Q}$ are both free of rank one over $\mathbf{T}_{\mathbf{Q}}$. □


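Hecke rings over Z: The ring $\mathbf{T}_{\mathbf{Z}}$ is a certain (not necessarily maximal) order in $\mathbf{T}_{\mathbf{Q}}$. One still has an injection

$$\mathbf{T}_{\mathbf{Z}} \hookrightarrow \oplus_{[f]} \mathbf{T}_{\mathbf{Z},[f]},$$

where now $\mathbf{T}_{\mathbf{Z},[f]}$ denotes the ring generated over Z by the Hecke operators acting on $S_{[f]}$. Of course the structure of $\mathbf{T}_{\mathbf{Z}}$ is even more mysterious than that of $\mathbf{T}_{\mathbf{Q}}$! The ring $\mathbf{T}_{\mathbf{Z}}$ acts naturally on $\Lambda$, but it is not the case in general that $\Lambda$ is free of rank two over $\mathbf{T}_{\mathbf{Z}}$, i.e., that the integral analogue of lemma 1.37 is true. (See remark 1.42 below.)

Hecke rings over $\mathbf{Q}_\ell$: The study of the algebras $\mathbf{T}_{\mathbf{Z}_\ell}$ and $\mathbf{T}_{\mathbf{Q}_\ell}$ arises naturally because of the Hecke action on the Tate module $T_\ell(J_\Gamma)$,

$$T_\ell(J_\Gamma) := \varprojlim\, (J_\Gamma)[\ell^n],$$

where the inverse limit is taken with respect to the multiplication by $\ell$ maps. The action of $\mathbf{T}_{\mathbf{Z}_\ell}$ on $T_\ell(J_\Gamma)$ is compatible with that of $G_{\mathbf{Q}}$, and it is this pair of actions on the Tate module which is used to associate two-dimensional Galois representations to modular forms.

It will sometimes be more convenient to consider the ring $\mathbf{T}_{\mathbf{Q}_\ell}$ and its action on

$$\mathcal{V} = T_\ell(J_\Gamma) \otimes_{\mathbf{Z}_\ell} \mathbf{Q}_\ell.$$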

Decomposing $\mathbf{T}_{\mathbf{Q}_\ell}$ as $\prod_i R_i$ where each factor $R_i$ is a finite-dimensional local $\mathbf{Q}_\ell$-algebra, we obtain an isomorphism

$$R_i^{\vee} \oplus R_i^{\vee} \cong R_i \oplus R_i$$

for each $i$. At least one of the four maps $R_i^{\vee} \to R_i$ deduced from this isomorphism must be surjective, and by counting dimensions, we see that it must be injective as well. It follows that $\mathbf{T}_{\mathbf{Q}_\ell}^{\vee}$ is isomorphic to $\mathbf{T}_{\mathbf{Q}_\ell}$. □

Recall that for primes $p$ not dividing $N$, the Jacobian $J_\Gamma$ has good reduction mod $p$, and the Eichler-Shimura relation, theorem 1.29, states that on $J_{\Gamma/\mathbf{F}_p}$, we have

$$T_p = F + \langle p \rangle F'.$$

For primes $p$ not dividing $N\ell$, we may identify $T_\ell(J_\Gamma)$ with the $\ell$-adic Tate module of the reduction (see [ST]) and consider the Frobenius endomorphism $F$ on the free rank two $\mathbf{T}_{\mathbf{Q}_\ell}$-module $\mathcal{V}$. As a consequence of the Eichler-Shimura relation, we find:

Theorem 1.41 For $p$ not dividing $N\ell$, the characteristic polynomial of $F$ on the $\mathbf{T}_{\mathbf{Q}_\ell}$-module $\mathcal{V}$ is

$$X^2 - T_p X + \langle p \rangle p.$$

Proof: (We are grateful to Brian Conrad for showing us this argument.) Since $F F' = p$, it follows from the Eichler-Shimura relation that

$$F^2 - T_p F + \langle p \rangle p = 0.$$

To conclude that this is in fact the characteristic polynomial, it suffices to compute the trace of $F$. To do so, we use the $\mathbf{T}_{\mathbf{Q}_\ell}$-isomorphism

$$\mathcal{V} \to \mathcal{V}^{\vee}$$

defined by the modified pairing $\langle \cdot, w \cdot \rangle$. Under this modified pairing, $F$ is adjoint to $w F' w = \langle p \rangle F'$, so the trace of $F$ on $\mathcal{V}$ is the same as that of $\phi \mapsto \phi \circ (\langle p \rangle F')$ on $\mathcal{V}^{\vee}$. Choosing bases for $\mathcal{V}$ and $\mathbf{T}_{\mathbf{Q}_\ell}^{\vee}$, one sees that this is the same as the trace of $\langle p \rangle F'$. Hence

$$2\, \mathrm{tr}\, F = \mathrm{tr}\, F + \mathrm{tr}(\langle p \rangle F') = \mathrm{tr}(T_p) = 2 T_p.$$ □

Hecke rings over $\mathbf{Z}_\ell$: The ring $\mathbf{T}_{\mathbf{Z}_\ell}$ is free of finite rank over $\mathbf{Z}_\ell$. It therefore decomposes as

$$\mathbf{T}_{\mathbf{Z}_\ell} = \prod_{\mathfrak{m}} \mathbf{T}_{\mathfrak{m}},$$

where the product runs over the maximal ideals $\mathfrak{m}$ of $\mathbf{T}_{\mathbf{Z}_\ell}$, and $\mathbf{T}_{\mathfrak{m}}$ is the localization of $\mathbf{T}_{\mathbf{Z}_\ell}$ at $\mathfrak{m}$. For each $\mathfrak{m}$, $\mathbf{T}_{\mathfrak{m}}$ is a complete local $\mathbf{Z}_\ell$-algebra, free of finite rank as a $\mathbf{Z}_\ell$-module.

Remark 1.42 While the analogue of lemma 1.39 does not always hold for $\mathbf{T}_{\mathbf{Z}_\ell}$ (see [MRi], sec. 13) we shall see that it holds for certain localizations $\mathbf{T}_{\mathfrak{m}}$. Results of this type are much deeper than lemma 1.39 and were first obtained by Mazur [Maz1], sec. 14, 15. We shall return in chapter 4 to explain Mazur's result and its generalizations, which play a role in the arguments of [W3] and [TW].

Example 1.43 The curve $X_0(19)$ has genus 1, and $X_0(57)$ has genus 5. By consulting the tables in the Antwerp volume [Ant4] or Cremona's book [Cr], one finds that there is exactly one newform of level 19, and that there are three newforms of level 57, which all have rational Fourier coefficients. Their Fourier coefficients $a_p$, for the first few primes $p$, are listed in the following table:

[Table of $a_p$ for the newforms 19A, 57A, 57B, 57C.]

It sends U3 to (1 + 2, 1 2, 1, 1, 1) and U19 to (1, 1, 1, 1, 1).

where $\mathfrak{m}$ is the ideal generated by 3 and $T_n - a_n(57B)$ for all $n$.
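The genera quoted in example 1.43 can be checked numerically. The sketch below evaluates the classical genus formula $g = 1 + \mu/12 - \nu_2/4 - \nu_3/3 - \nu_\infty/2$ for $X_0(N)$; this formula, and the expressions used for the index $\mu$, the numbers $\nu_2, \nu_3$ of elliptic points and the number $\nu_\infty$ of cusps, are standard facts assumed here rather than results stated in this section. The function names are hypothetical.

```python
from math import gcd


def prime_factors(n):
    """Distinct prime factors of n, by trial division."""
    factors, p = [], 2
    while p * p <= n:
        if n % p == 0:
            factors.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        factors.append(n)
    return factors


def euler_phi(n):
    """Euler's totient, by direct count (adequate for small n)."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


def genus_X0(N):
    """Genus of X_0(N) from the assumed classical formula."""
    primes = prime_factors(N)
    mu = N                                   # index: N * prod (1 + 1/p)
    for p in primes:
        mu = mu // p * (p + 1)
    nu2 = 0 if N % 4 == 0 else 1             # elliptic points of order 2
    nu3 = 0 if N % 9 == 0 else 1             # elliptic points of order 3
    for p in primes:
        if p % 4 == 1:
            nu2 *= 2
        elif p % 4 == 3:
            nu2 = 0
        if p % 3 == 1:
            nu3 *= 2
        elif p % 3 == 2:
            nu3 = 0
    cusps = sum(euler_phi(gcd(d, N // d)) for d in range(1, N + 1) if N % d == 0)
    # 12 g = 12 + mu - 3 nu2 - 4 nu3 - 6 cusps, kept in integers
    return (12 + mu - 3 * nu2 - 4 * nu3 - 6 * cusps) // 12


for N in (19, 57):
    print(N, genus_X0(N))
```

Since $S_2(\Gamma_0(N))$ has dimension equal to the genus, the value for $N = 57$ is consistent with three newforms of level 57 together with the two images of the single newform of level 19.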

1.7 The Shimura construction

Let $f = \sum a_n q^n$ be an eigenform on $\Gamma$ with (not necessarily rational) Fourier coefficients, corresponding to a surjective algebra homomorphism $\lambda_f : \mathbf{T}_{\mathbf{Q}} \to K_f$, where $K_f$ is the field generated over Q by the Fourier coefficients of $f$. We briefly review in this section a construction of Shimura ([Shi2], ch. 7) which associates to $f$ (or rather, to the orbit $[f]$ of $f$ under $G_{\mathbf{Q}}$) an abelian variety $A_f$ defined over Q and of dimension $[K_f : \mathbf{Q}]$.

Let $I_f \subset \mathbf{T}_{\mathbf{Z}}$ be the ideal $\ker(\lambda_f) \cap \mathbf{T}_{\mathbf{Z}}$. The image $I_f(J_\Gamma)$ is a (connected) subabelian variety of $J_\Gamma$ which is stable under $\mathbf{T}_{\mathbf{Z}}$ and is defined over Q.

Definition 1.44 The abelian variety $A_f$ associated to $f$ is the quotient

$$A_f = J_\Gamma / I_f J_\Gamma.$$

From this definition one sees that $A_f$ is defined over Q and depends only on $[f]$, and that its endomorphism ring contains $\mathbf{T}_{\mathbf{Z}}/I_f$, which is isomorphic to an order in $K_f$.

Remark 1.45 Using theorem 1.22, one can show that $J_0(N)$ is isogenous to

$$\prod_{[f]} A_{[f]}^{\sigma_0(N/N_f)}.$$

We now describe the abelian variety $A_f$ as a complex torus. Let $V_f^{\vee}$ be the subspace of $V^{\vee} = S_2(\Gamma)^{\vee}$ on which $\mathbf{T}$ acts by $\lambda_f$. Theorem 1.22 and lemma 1.34 show that $V_f^{\vee}$ is a one-dimensional complex vector space. Let $\pi_f$ be the orthogonal projection of $V^{\vee}$ to $V_f^{\vee}$, relative to the Petersson scalar product. The projector $\pi_f$ belongs naturally to $\mathbf{T}_{K_f}$.

Let $[f]$ be the set of all eigenforms whose Fourier coefficients are Galois conjugate to those of $f$. The number of forms in $[f]$ is equal to the degree $d$ of $K_f$ over Q. Now, we set

$$V_{[f]}^{\vee} = \oplus_{g \in [f]} V_g^{\vee}, \qquad \pi_{[f]} = \sum_{g \in [f]} \pi_g.$$

Note that $\pi_{[f]}$ is simply the orthogonal projection of $V^{\vee}$ to $V_{[f]}^{\vee}$. Note also that $\pi_{[f]}$ belongs to the Hecke algebra $\mathbf{T}_{\mathbf{Q}}$.

Lemma 1.46 The abelian variety $A_f$ is isomorphic over C to the complex torus $V_{[f]}^{\vee}/\pi_{[f]}(\Lambda)$, with the map $\pi_{[f]} : V^{\vee}/\Lambda \to V_{[f]}^{\vee}/\pi_{[f]}(\Lambda)$ corresponding to the natural projection from $J_\Gamma$ to $A_f$.

In particular, one sees that $A_f$ is an abelian variety of dimension $d = [K_f : \mathbf{Q}]$. Hence if $f$ has rational Fourier coefficients, then the abelian variety $A_f$ is an elliptic curve. This elliptic curve is called the strong modular elliptic curve associated to $f$ if also $f$ is a newform of level $N$ and $\Gamma = \Gamma_0(N)$.

Example 1.47 If $\Gamma = \Gamma_0(26)$, one checks that the genus of $X_\Gamma$ is two, and that there are two distinct normalized eigenforms $f_1$ and $f_2$ in $S_2(26)$. From the tables in [Ant4] or [Cr], one sees that $f_1$ and $f_2$ have integral Fourier coefficients, whose values for the primes $\le 31$ are:

  p     2    3    5    7   11   13   17   19   23   29   31
  f_1  -1    1   -3   -1    6    1   -3    2    0    6   -4
  f_2   1   -3   -1    1   -2   -1   -3    6   -4    2    4

Hence the abelian varieties $A_{f_1}$ and $A_{f_2}$ are elliptic curves. The above table suggests (and this can be checked directly by looking at the equations for these curves given in [Ant4] or [Cr], or by using the discussion in [DO], lemma 2.1) that the Fourier coefficients of $f_1$ and $f_2$ are congruent modulo 2. The natural projection $J_0(26) \to A_{f_1} \oplus A_{f_2}$ is an isogeny whose kernel is isomorphic to $\mathbf{Z}/2\mathbf{Z} \times \mathbf{Z}/2\mathbf{Z}$, and $J_0(26)$ is not isomorphic to $A_{f_1} \oplus A_{f_2}$. More generally, one knows that a Jacobian can never decompose as a non-trivial direct product of two principally polarized abelian varieties, cf. [Maz1], prop. 10.6. (There are no non-zero homomorphisms from $A_{f_1}$ to $A_{f_2}$, and so a non-trivial product decomposition of $J$ would have to induce a decomposition as a product of principally polarized abelian varieties.)

Let $T_\ell(A_f)$ be the Tate module of the abelian variety $A_f$,

$$T_\ell(A_f) := \varprojlim\, (A_f)[\ell^n],$$

where the inverse limit is taken with respect to the multiplication by $\ell$ maps. This module is naturally a module for $\mathbf{T}_f \otimes \mathbf{Z}_\ell$, and $T_\ell(A_f) \otimes \mathbf{Q}_\ell$ is a module for $K_f \otimes \mathbf{Q}_\ell$ as well.

Lemma 1.48 The module $T_\ell(A_f) \otimes \mathbf{Q}_\ell$ is a free module of rank 2 over $K_f \otimes \mathbf{Q}_\ell$.

Proof: This follows directly from lemma 1.39.

Proposition 1.49 The algebra $\mathrm{End}_{\mathbf{Q}}(A_f) \otimes \mathbf{Q}$ is isomorphic to $K_f$. In particular $A_f$ is simple over Q.

The proof is given in [R2], cor. 4.2. The main ingredient is the irreducibility of the Galois representation attached to $A_f$ as in section 3.1. (Cf. theorem 3.1.)

Properties of $A_f$: good reduction:

Theorem 1.50 If $f$ is an eigenform of level $N$, and $p$ is a prime that does not divide $N$, then the abelian variety $A_f$ has good reduction at $p$.

Proof: This follows from the fact that $J_\Gamma$ has good reduction at such primes, which in turn is a consequence of the good reduction of $X_\Gamma$. □

If $p$ is a prime not dividing $N$, it then becomes natural to study the number $N_{f,p}$ of points on the abelian variety $A_f$ over the finite field $\mathbf{F}_p$. It is given by the following formula:

Proposition 1.51 The number of points $N_{f,p}$ is given by the formula

$$N_{f,p} = \mathrm{Norm}_{K_f/\mathbf{Q}}(\lambda_f(1 - a_p(f) + \langle p \rangle p)),$$

where $a_p(f) \in K_f$ is the $p$-th Fourier coefficient of $f$.

Proof: By Weil's theory [We1], this number is given by the determinant

$$N_{f,p} = \det(1 - F),$$

where $F$ is the Frobenius endomorphism acting on the $\ell$-adic Tate module $T_\ell(A_f)$. So theorem 1.41 implies

$$\det(1 - F) = \mathrm{Norm}_{K_f/\mathbf{Q}}(\lambda_f(1 - T_p + \langle p \rangle p)),$$

and proposition 1.51 follows. □

Defining the local Hasse-Weil L-function of $A_f$ over $\mathbf{F}_p$ by the formula

$$L(A_f/\mathbf{F}_p, s) = \det(1 - F p^{-s})^{-1},$$

the proof above gives the formula:

$$L(A_f/\mathbf{F}_p, s) = \prod_{\sigma} L_p(f^{\sigma}, s),$$

where the product is taken over all complex embeddings $\sigma : K_f \hookrightarrow \mathbf{C}$, and $L_p(f^{\sigma}, s)$ is the Euler factor at $p$ in the L-function that was associated to $f^{\sigma}$ in section 1.4.

In particular, if $A_f = E$ is an elliptic curve, i.e., $f$ has rational Fourier coefficients and $f$ is on $\Gamma_0(N)$, then the number of points on $E$ over $\mathbf{F}_p$ is given by the formula

$$\#E(\mathbf{F}_p) = p + 1 - a_p(f).$$
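The point-count formula can be tested by brute force in a small case. The sketch below assumes two standard facts not established in this section: that $q \prod_{n \ge 1} (1-q^n)^2 (1-q^{11n})^2$ is the newform of level 11, and that the curve $y^2 + y = x^3 - x^2$ (of conductor 11) lies in the associated isogeny class. The function names are hypothetical.

```python
def eta_product_coeffs(level, nmax):
    """Return [a_0, ..., a_nmax] for q * prod_{n>=1} (1 - q^n)^2 (1 - q^(level*n))^2."""
    series = [0] * nmax          # coefficients of q^0 .. q^(nmax-1) of the product
    series[0] = 1
    for step in (1, level):
        for n in range(step, nmax, step):
            for _ in range(2):   # each factor occurs squared
                # multiply in place by (1 - q^n), highest degree first
                for k in range(nmax - 1, n - 1, -1):
                    series[k] -= series[k - n]
    return [0] + series          # the leading factor q shifts every index by one


def count_points(p):
    """Number of points on y^2 + y = x^3 - x^2 over F_p, including the point at infinity."""
    count = 1                    # the point at infinity
    for x in range(p):
        rhs = (x ** 3 - x ** 2) % p
        for y in range(p):
            if (y * y + y - rhs) % p == 0:
                count += 1
    return count


a = eta_product_coeffs(11, 50)   # assumed: the newform of level 11

for p in (2, 3, 5, 7, 13, 17, 19, 23, 29, 31):   # primes of good reduction
    assert count_points(p) == p + 1 - a[p]
    print(p, count_points(p), a[p])
```

The prime 11 is omitted because the curve has bad reduction there, where the formula above does not apply as stated.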

any abelian variety $A$ over Q is its arithmetic conductor, which measures the amount of bad reduction of $A$. If we factor this conductor as a product of prime powers, $\prod_p p^{m_p}$, then the exponents $m_p$ are equal to 0 precisely when $A$ has good reduction at $p$. The definition of $m_p(A)$ was already given in section 1.1 when $A$ is an abelian variety of dimension 1, i.e., an elliptic curve. In general, the exponent $m_p$ coincides with $m_p(\rho_{A,\ell})$ (see section 2.1 below), where $\rho_{A,\ell}$ is the Galois representation on the $\ell$-adic Tate module of $A$.

Regarding the bad reduction of the abelian variety $A_f$, one has the following consequence of the results of Langlands, Deligne and Carayol [Ca3] discussed below in section 3.1.

Theorem 1.52 If $f$ is a newform of level $N$, then the conductor of the abelian variety $A_f$ is equal to $N^g$.

1.8 The Shimura-Taniyama conjecture

Let $E$ be any elliptic curve defined over Q, and let $N$ denote its arithmetic conductor, defined as in section 1.1. Then we have:

Proposition 1.53 The following are equivalent:

(a) The curve $E$ is isogenous over Q to $A_f$, for some newform $f$ on some congruence group $\Gamma$.

(b) The curve $E$ is isogenous over Q to $A_f$, for a newform $f$ on $\Gamma_0(N)$.

(c) There is a non-constant morphism defined over Q, from $X_0(N)$ to $E$.

Sketch of proof: The implication (b) $\Rightarrow$ (a) is immediate, and (a) $\Rightarrow$ (b) follows from the work of Carayol [Ca3] (cf. proposition 3.20 below). If (b) holds, then there is a surjective map $J_0(N) \to E$. Composing with the Abel-Jacobi map $X_0(N) \to J_0(N)$ gives a map to $E$ which is not constant, by the definition of the Jacobian, and hence (c) holds. Conversely, a non-constant map of curves $\pi : X_0(N) \to E$ induces, by Albanese (covariant) functoriality, a surjective map of Jacobians $\pi_* : J_0(N) \to E$, where we have identified $E$ with its own Jacobian in the natural way. Since $J_0(N)$ is isogenous to a product (possibly with multiplicities) of abelian varieties $A_f$ where $f$ runs over newforms of level $N_f$ dividing $N$ (cf. remark 1.45), it follows that there is a surjective map $A_f \to E$ for some $A_f$. Since $A_f$ is simple (proposition 1.49), this map is an isogeny, and part (b) follows. □

We call an elliptic curve over Q satisfying the equivalent properties above a modular elliptic curve. A startling conjecture that was first proposed by Taniyama in the 1950s and made more precise by Shimura predicts that the Shimura construction explained in the previous section is surjective:

Conjecture 1.54 All elliptic curves defined over Q are modular.

This conjecture is now known to be true for a very wide class of elliptic curves. The results of Wiles [W3] and Taylor-Wiles [TW] imply it is true for all semistable elliptic curves, and a strengthening of the method, [Di2], implies the conjecture for all elliptic curves which have semi-stable reduction at 3 and 5.

Remark 1.55 See [DI], sec. 13 for a more thorough list of equivalent forms of the conjecture.

Relation with L-functions: Define numbers $a_p = a_p(E)$ as in section 1.1. Recall the (global) Hasse-Weil L-function defined in section 1.1 by equations (1.1.2), (1.1.4), (1.1.5), and (1.1.6). The following proposition reveals some of the importance of the Shimura-Taniyama conjecture:

Proposition 1.56 If $E$ is modular, i.e., is associated to a newform $f$ by the Shimura construction, then

$$L(E/\mathbf{Q}, s) = L(f, s).$$

In particular, $L(E/\mathbf{Q}, s)$ has an analytic continuation to the entire complex plane, and a functional equation.


Sketch of proof: If $E$ is isogenous to an elliptic curve $A_f$ associated to a newform $f$ on $\Gamma_0(N)$ by the Shimura construction, then the two L-functions $L(A_f/\mathbf{Q}, s)$ and $L(E/\mathbf{Q}, s)$ are equal. On the other hand, formula (1.7.1) directly implies that $L(A_f/\mathbf{Q}, s)$ is equal to $L(f, s)$, at least up to finitely many Euler factors (corresponding precisely, by the work of Igusa, to the primes dividing $N$). The equality of all the Euler factors follows from the work of Deligne, Langlands, and Carayol (cf. [Ca1]). □

Knowing the analytic continuation and functional equation of the Hasse-Weil L-function of an elliptic curve is of great importance in the theory. For example, the conjecture of Birch and Swinnerton-Dyer (which was stated in a weak form in section 1.1, conjecture 1.9) relates arithmetic invariants of $E$, such as the rank of its Mordell-Weil group and the order of its Shafarevich-Tate group, to the behaviour at $s = 1$ of $L(E/\mathbf{Q}, s)$. (Note that $s = 1$ is outside the domain of absolute convergence of the infinite product used to define $L(E/\mathbf{Q}, s)$.)

One consequence of this conjecture is that $E(\mathbf{Q})$ is finite if $L(E/\mathbf{Q}, 1)$ is non-zero. This was proved by Coates and Wiles in [CW] for elliptic curves with complex multiplication, and by Kolyvagin [Kol] for modular elliptic curves. (Recently, a different proof for modular elliptic curves has been announced by K. Kato.) Thanks to the breakthroughs of [W3], [TW], and [Di2], the results of Kolyvagin and Kato are now unconditional for a very large class of elliptic curves.

The Shimura-Taniyama conjecture for abelian varieties: We may view conjecture 1.54 as asserting that the map

$$\left\{ \text{Newforms of weight 2 on } X_0(N) \text{ with rational Fourier coefficients} \right\} \longrightarrow \left\{ \text{Isogeny classes of elliptic curves over } \mathbf{Q} \text{ of conductor } N \right\}$$

provided by the Shimura construction is a bijection. It is natural to extend this result to all eigenforms, not just those having rational Fourier expansions. We say that an abelian variety defined over Q is modular if it is isogenous to an abelian variety of the form $A_f$ for some newform $f$ on $\Gamma_1(N)$. It would not be reasonable to expect that all abelian varieties over Q are modular: the abelian varieties arising from the Shimura construction are very special, in many respects. Most importantly, they have a very large ring of endomorphisms. Following Ribet [R7], we make the definition:

Definition 1.57 An abelian variety over Q of dimension $g$ is said to be of $\mathrm{GL}_2$-type if its endomorphism ring over Q contains an order in a field of degree $g$ over Q.

Then we have the following generalized Shimura-Taniyama conjecture:

Conjecture 1.58 Every simple abelian variety $A$ over Q which is of $\mathrm{GL}_2$-type is modular.

This generalization of the Shimura-Taniyama conjecture is far from being proved, and tackling it still seems to require some major new ideas, even after the ideas introduced by Wiles.

2 Galois theory

2.1 Galois representations

If $F$ is a perfect field, we will let $\bar{F}$ denote an algebraic closure of $F$ and

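$F$ contained in $\bar{F}$. Hence $G_F$ carries a natural topology. If $\ell$ is a prime different from the characteristic of $F$ we will let $\chi_\ell : G_F \to \mathbf{Z}_\ell^{\times}$ denote the $\ell$-adic cyclotomic character, i.e. if $\zeta$ is an $\ell$-power root of unity in $\bar{F}$ then $\sigma(\zeta) = \zeta^{\chi_\ell(\sigma)}$ for all $\sigma \in G_F$. We will simply write $\chi$ if the choice of $\ell$ is clear from the context.

The finite fields $\mathbf{F}_p$: The group $G_{\mathbf{F}_p}$ is isomorphic to $\hat{\mathbf{Z}}$, the profinite completion of Z. A natural topological generator is provided by the Frobenius element $\mathrm{Frob}_p$, the automorphism of $\bar{\mathbf{F}}_p$ which raises elements of $\bar{\mathbf{F}}_p$ to their $p$th power. If $\ell$ is different from $p$, then $\chi_\ell(\mathrm{Frob}_p) = p$.

The p-adic fields $\mathbf{Q}_p$: The $p$-adic valuation $v_p : \mathbf{Q}_p^{\times} \to \mathbf{Z}$ extends uniquely to a valuation $v_p : \bar{\mathbf{Q}}_p^{\times} \to \mathbf{Q}$. Let $\mathcal{O}_{\bar{\mathbf{Q}}_p}$ denote the ring of integers of $\bar{\mathbf{Q}}_p$ and $\mathfrak{m}_{\bar{\mathbf{Q}}_p}$ its maximal ideal. The residue field $\mathcal{O}_{\bar{\mathbf{Q}}_p}/\mathfrak{m}_{\bar{\mathbf{Q}}_p}$ is an algebraic closure of $\mathbf{F}_p$, which we will identify with $\bar{\mathbf{F}}_p$. The valuation $v_p$ is compatible with the action of $G_{\mathbf{Q}_p}$, and hence $\mathcal{O}_{\bar{\mathbf{Q}}_p}$ and $\mathfrak{m}_{\bar{\mathbf{Q}}_p}$ are stable under $G_{\mathbf{Q}_p}$. In particular $G_{\mathbf{Q}_p}$ acts on $\mathcal{O}_{\bar{\mathbf{Q}}_p}/\mathfrak{m}_{\bar{\mathbf{Q}}_p}$ and we obtain a map $\varrho : G_{\mathbf{Q}_p} \to G_{\mathbf{F}_p}$, which is in fact a surjection. We call the kernel the inertia group and denote it $I_p$. It has a unique maximal pro-$p$ subgroup: the wild inertia group, denoted $P_p$. There is a canonical isomorphism

$$I_p/P_p \cong \prod_{\ell \ne p} \mathbf{Z}_\ell(1),$$

where the (1) indicates that if $f \in G_{\mathbf{Q}_p}/P_p$ is a lifting of Frobenius and if $\sigma \in I_p/P_p$ then $f \sigma f^{-1} = \sigma^p$.

The inertia group $I_p$ is filtered by a series of subgroups called the higher ramification groups. More precisely, there are closed normal subgroups $I_p^u$ in $G_{\mathbf{Q}_p}$ for any $u \in [-1, \infty]$, with the following properties:

These groups are defined in section 3 of chapter IV of [Se2], where they are denoted $G^u_{\mathbf{Q}_p}$ (except that there, $G^{-1}_{\mathbf{Q}_p} = G_{\mathbf{Q}_p}$ whereas we have $I_p^{-1} = I_p$).

The local Kronecker-Weber theorem ([Se2], ch. 14, sec. 7, for example) asserts that the map

$$\varrho \times \chi_p : G^{\mathrm{ab}}_{\mathbf{Q}_p} \to G_{\mathbf{F}_p} \times \mathbf{Z}_p^{\times}$$

is an isomorphism. (We will use $G^{\mathrm{ab}}$ to denote the abelianisation of a profinite group $G$, i.e. the unique maximal abelian continuous image.) Under it $I_p$ goes to $\mathbf{Z}_p^{\times}$ and for $u > 0$, $I_p^u$ goes to $(1 + p^{\lceil u \rceil} \mathbf{Z}_p) \subset \mathbf{Z}_p^{\times}$, where $\lceil u \rceil$ denotes the least integer greater than or equal to $u$. If $\ell \ne p$ then $\chi_\ell$ is trivial on $I_p$ and takes $\mathrm{Frob}_p$ to $p \in \mathbf{Z}_\ell^{\times}$.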

The field Q: If $p$ is a rational prime, the usual $p$-adic valuation $v_p : \mathbf{Q}^{\times} \to \mathbf{Z}$ extends to a valuation $v : \bar{\mathbf{Q}}^{\times} \to \mathbf{Q}$. This extension is not unique, but all extensions are permuted transitively by $G_{\mathbf{Q}}$. Fix one such extension $v_p$ and let $G_p$ denote its stabiliser in $G_{\mathbf{Q}}$. Then $G_p$ acts on the completion $(\bar{\mathbf{Q}})_{v_p}$ and preserves the subfield of elements algebraic over $\mathbf{Q}_p$. This field is an algebraic closure of $\mathbf{Q}_p$, which we shall identify with $\bar{\mathbf{Q}}_p$. One can check that the resulting map $G_p \to G_{\mathbf{Q}_p}$ is an isomorphism. Thus we obtain subgroups $G_p \supset I_p \supset I_p^u$ for $u \in [-1, \infty]$ and a distinguished element $\mathrm{Frob}_p \in G_p/I_p$. These objects depend on the choice of $v_p$ and vary by conjugation in $G_{\mathbf{Q}}$ as this choice is varied. We call an algebraic extension $F/\mathbf{Q}$ unramified at $p$ if all conjugates of $I_p$ lie in $G_F \subset G_{\mathbf{Q}}$; otherwise we say that $F/\mathbf{Q}$ is ramified. If $F/\mathbf{Q}$ is Galois and unramified at $p$, then there is a well-defined conjugacy class $[\mathrm{Frob}_p] \subset \mathrm{Gal}(F/\mathbf{Q})$.

By replacing the $p$-adic completion by an Archimedean completion, one has a well-defined conjugacy class $[c]$ in $G_{\mathbf{Q}}$ consisting of those elements that arise as complex conjugation for some embedding $\bar{\mathbf{Q}} \hookrightarrow \mathbf{C}$. We will denote by $G_\infty$ the subgroup $\{1, c\}$ for one such element $c$.

We have the following fundamental results concerning the structure of $G_{\mathbf{Q}}$ (see for instance [La1]).


Theorem 2.1 If $F/\mathbb{Q}$ is a finite extension then $F$ is only ramified at finitely many primes (those dividing the discriminant of $F/\mathbb{Q}$).

Theorem 2.2 (Hermite-Minkowski) If $S$ is a finite set of primes and if $d \in \mathbb{Z}_{>0}$ then there are only finitely many extensions $F/\mathbb{Q}$ of degree $d$ which are unramified outside $S$ (contained in a fixed algebraic closure $\overline{\mathbb{Q}}$).

Theorem 2.3 (Chebotarev) If $F/\mathbb{Q}$ is a Galois extension unramified outside a finite set $S$ of primes then $\bigcup_{p \notin S} [\mathrm{Frob}_p]$ is dense in $\mathrm{Gal}(F/\mathbb{Q})$.

Theorem 2.4 (Kronecker-Weber) The product of the $p$-adic cyclotomic characters gives an isomorphism

$$\prod_p \epsilon_p : G^{\mathrm{ab}}_{\mathbb{Q}} \xrightarrow{\ \sim\ } \prod_p \mathbb{Z}_p^\times.$$
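As a small numerical illustration of theorem 2.3 in the cyclotomic case, the sketch below lists the residue classes $p \bmod m$ hit by small primes. It assumes the standard identification of $\mathrm{Gal}(\mathbb{Q}(\zeta_m)/\mathbb{Q})$ with $(\mathbb{Z}/m\mathbb{Z})^\times$ under which $\mathrm{Frob}_p$ corresponds to the class of $p$; the function names and the bounds are ours and purely illustrative.

```python
from math import gcd

def primes_up_to(n):
    """Sieve of Eratosthenes: all primes <= n (n >= 1)."""
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i::i] = [False] * len(sieve[i * i::i])
    return [i for i, is_prime in enumerate(sieve) if is_prime]

def frobenius_classes(m, bound):
    """Classes p mod m for primes p <= bound that do not divide m
    (these are the primes unramified in the m-th cyclotomic field)."""
    return {p % m for p in primes_up_to(bound) if m % p != 0}

def units_mod(m):
    """The unit group (Z/m)^x as a set of representatives, for m >= 2."""
    return {a for a in range(1, m) if gcd(a, m) == 1}

# Every unit class mod 12 and mod 30 is already hit by a small prime.
print(frobenius_classes(12, 100) == units_mod(12))
print(frobenius_classes(30, 200) == units_mod(30))
```

A finite computation of this kind only exhibits at least one prime in each class; the statement that there are infinitely many is the content of the theorem.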

We remark that the Chebotarev density theorem (theorem 2.3) applied to the extension of $\mathbb{Q}$ generated by the $m$th roots of unity implies Dirichlet's theorem that if $n$ and $m$ are coprime integers then there are infinitely many primes $p$ with $p \equiv n \bmod m$. We also remark that the Kronecker-Weber theorem is equivalent to the fact that the maximal abelian extension of $\mathbb{Q}$ is obtained by adjoining all roots of unity to $\mathbb{Q}$.

Representations: A $d$-dimensional representation of $G_{\mathbb{Q}}$ is a homomorphism

$$G_{\mathbb{Q}} \to GL_d(K),$$

where $K$ is any field. (In our later discussion, we will also consider representations with coefficients in a ring.) Often $K$ (and hence $GL_d(K)$) comes equipped with a natural topology, and it is then customary to restrict one's attention to continuous homomorphisms $G_{\mathbb{Q}} \to GL_d(K)$.

Since any one-dimensional representation has abelian image, theorem 2.4 allows one to give a complete description of the one-dimensional representations of $G_{\mathbb{Q}}$ together with the behaviour of these representations on the decomposition groups at all primes. The aim of this article is to discuss attempts to give a similar theory for the two-dimensional representations of $G_{\mathbb{Q}}$.

We will call a representation $\rho$ of $G_{\mathbb{Q}}$ unramified at $p$ if it is trivial on the inertia group $I_p$. Otherwise we say that it is ramified at $p$. If $\rho$ is unramified at $p$, then $\rho(\mathrm{Frob}_p)$ is well-defined (and its conjugacy class is independent of the choice of $v_p$).

If $\rho$ is a representation of a group $G$ and $i$ is a non-negative integer, then we let $\wedge^i \rho$ denote the representation of $G$ on the $i$th exterior power of the

underlying module of $\rho$. If $H$ is a subgroup of $G$, we will let $\rho^H$ (resp. $\rho_H$) denote the $H$-invariants (resp. $H$-coinvariants) of the underlying module of $\rho$. If $H$ is normal in $G$, then we shall also use $\rho^H$ and $\rho_H$ to denote the corresponding representation of $G/H$. In particular if $p$ is a prime and $\rho$ is a representation of $G_{\mathbb{Q}}$ or $G_{\mathbb{Q}_p}$, then $\rho^{I_p}(\mathrm{Frob}_p)$ and $\rho_{I_p}(\mathrm{Frob}_p)$ are well-defined.

We will be primarily interested in three types of representation of $G_{\mathbb{Q}}$.

• Artin representations, i.e. continuous representations $G_{\mathbb{Q}} \to GL_d(\mathbb{C})$. Since all compact totally disconnected subgroups of $GL_d(\mathbb{C})$ are finite, Artin representations have finite image. Hence they are semi-simple and are unramified at all but finitely many primes, by theorem 2.1.

• Mod $\ell$ representations, i.e. continuous representations $G_{\mathbb{Q}} \to GL_d(k)$, where $k$ is a finite field of characteristic $\ell$. These always have finite image and hence, like Artin representations, are unramified at all but finitely many primes.

• $\ell$-adic representations, i.e. continuous representations $G_{\mathbb{Q}} \to GL_d(K)$, where $K$ is a finite extension of $\mathbb{Q}_\ell$. We require that an $\ell$-adic representation be unramified at all but finitely many primes.

Remark 2.5
(a) Continuous representations $G_{\mathbb{Q}} \to GL_d(K)$, unlike those to $GL_d(\mathbb{C})$ or $GL_d(k)$, may be ramified at infinitely many primes. We shall not need to consider such representations. (One rarely if ever does.) However, the term "$\ell$-adic representation" is often used in the more general sense elsewhere in the literature.
(b) Note that the image of an $\ell$-adic representation can very well be infinite. For instance this is the case if $d = 1$ and the representation is the cyclotomic character.
(c) Note that in the case of mod $\ell$ and $\ell$-adic representations, the representations need not be semi-simple.

Proposition 2.6 Let $S$ be any finite set of primes.
(a) An Artin representation $\rho : G_{\mathbb{Q}} \to GL_d(\mathbb{C})$ is determined by the values of $\mathrm{tr}\, \rho(\mathrm{Frob}_p)$ on the primes $p \notin S$ at which $\rho$ is unramified.
(b) A semi-simple mod $\ell$ representation $\bar\rho : G_{\mathbb{Q}} \to GL_d(k)$ is determined by the values of $\mathrm{tr}\, \wedge^i \bar\rho(\mathrm{Frob}_p)$ ($i = 1, \ldots, d$) on the primes $p \notin S$ at which $\bar\rho$ is unramified.
If $\ell > d$ then it is determined by $\mathrm{tr}\, \bar\rho(\mathrm{Frob}_p)$ on the primes $p \notin S$ at which $\bar\rho$ is unramified.

(c) A semi-simple $\ell$-adic representation $\rho : G_{\mathbb{Q}} \to GL_d(K)$ is determined by the values of $\mathrm{tr}\, \rho(\mathrm{Frob}_p)$ on the primes $p \notin S$ at which $\rho$ is unramified.

Proof: Combining the Chebotarev density theorem (theorem 2.3) with the continuity of $\rho$ we see that we must show that $\mathrm{tr}\, \rho$ (resp. $\mathrm{tr}\, \wedge^i \rho$) determines $\rho$ up to conjugacy in the various settings. For the characteristic zero representations, see [Bour], ch. 8, sec. 12.1, prop. 3; for mod $\ell$ representations, this is the Brauer-Nesbitt theorem ([CR], (30.16)). □

Let $K$ denote a finite extension of $\mathbb{Q}_\ell$, let $\mathcal{O}$ be its ring of integers, $\lambda$ the maximal ideal of $\mathcal{O}$ and $k$ the residue field. If $\rho : G_{\mathbb{Q}} \to GL_d(K)$ is an $\ell$-adic representation, then the image of $\rho$ is compact, and hence $\rho$ can be conjugated to a homomorphism $G_{\mathbb{Q}} \to GL_d(\mathcal{O})$. Reducing modulo the maximal ideal $\lambda$ gives a residual representation $\bar\rho : G_{\mathbb{Q}} \to GL_d(k)$. This representation may depend on the particular $GL_d(K)$-conjugate of $\rho$ chosen, but its semi-simplification $\bar\rho^{\mathrm{ss}}$ (i.e. the unique semi-simple representation with the same Jordan-Hölder factors) is uniquely determined by $\rho$, by proposition 2.6 (b).

Note that the kernel of the reduction map $GL_d(\mathcal{O}) \to GL_d(k)$ is a pro-$\ell$-group. In particular we see that if $p \neq \ell$ then $\rho(P_p)$ is finite and $\bar\rho(P_p)$ is isomorphic to $\rho(P_p)$.

Suppose $\rho$ is a representation of $G_{\mathbb{Q}}$ of one of the three sorts above. If $p \neq \ell$ then we define the conductor, $m_p(\rho)$, of $\rho$ at $p$ by

$$m_p(\rho) = \int_{-1}^{\infty} \mathrm{codim}\, \rho^{I_p^u}\, du = \mathrm{codim}\, \rho^{I_p} + \int_{0}^{\infty} \mathrm{codim}\, \rho^{I_p^u}\, du.$$

This is well-defined as $\rho(P_p)$ is a finite group. If $\rho$ is an Artin representation this makes sense also for $p = \ell$. (It would even make sense for a mod $\ell$ representation if $p = \ell$, but in this case it does not seem to be a useful notion.) It is known that $m_p(\rho)$ is an integer (see chapter VI of [Se2]). Moreover it is easily seen that $m_p(\rho) = 0$ if and only if $\rho$ is unramified at $p$. We define the conductor $N(\rho)$ of $\rho$ to be

$$\prod_p p^{m_p(\rho)},$$

where the product is over all primes $p \neq \ell$ in the case of an $\ell$-adic or mod $\ell$ representation and over all primes in the case of an Artin representation. This makes sense because $\rho$ is unramified almost everywhere.

The following lemma is an exercise.

Lemma 2.7 Suppose that $\rho : G_{\mathbb{Q}} \to GL_d(K)$ is an $\ell$-adic representation and that $\bar\rho : G_{\mathbb{Q}} \to GL_d(k)$ is a reduction of $\rho$. Then for each prime $p \neq \ell$ we have

$$\int_0^\infty \mathrm{codim}\, \rho^{I_p^u}\, du = \int_0^\infty \mathrm{codim}\, \bar\rho^{I_p^u}\, du.$$

Thus

$$N(\rho) = N(\bar\rho) \prod_{p \neq \ell} p^{(\dim \bar\rho^{I_p} - \dim \rho^{I_p})}$$

and in particular $N(\bar\rho)$ divides $N(\rho)$.

We take this opportunity to introduce some important notation. Let $G$ denote a group and $R$ a ring. If $\rho : G \to GL_d(R)$ is a representation we shall let $M_\rho$ denote the underlying $R[G]$-module (so that $M_\rho \cong R^d$ as an $R$-module). If $M$ is an $R[G]$-module we shall let $\mathrm{End}(M)$ denote the module of $R$-linear endomorphisms of $M$. It is also an $R[G]$-module with the $G$-action being defined by

$$(g(\phi))(m) = g(\phi(g^{-1}m)).$$

If $M$ is a finitely generated free $R$-module we will let $\mathrm{End}^0(M)$ denote the sub-$R[G]$-module of $\mathrm{End}(M)$ consisting of endomorphisms of trace zero. We will use $\mathrm{ad}\,\rho$ to denote $\mathrm{End}(M_\rho)$ and $\mathrm{ad}^0 \rho$ to denote $\mathrm{End}^0(M_\rho)$. Note that if $d$ is invertible in $R$ then $\mathrm{ad}\,\rho = \mathrm{ad}^0 \rho \oplus R$ as $R[G]$-modules, where $R$ has the trivial action of $G$. Note also then that the kernel of $\mathrm{ad}^0 \rho$ is the same as the kernel of the composite map $G \to GL_d(R) \to PGL_d(R)$.
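The splitting of the adjoint module when $d$ is invertible can be made explicit. The following short sketch uses $x$ for an arbitrary $R$-linear endomorphism of the underlying module and $\mathrm{id}$ for the identity endomorphism; both names are ours, chosen for illustration.

```latex
\[
x \;=\; \Bigl(x - \tfrac{\operatorname{tr}(x)}{d}\,\mathrm{id}\Bigr) \;+\; \tfrac{\operatorname{tr}(x)}{d}\,\mathrm{id},
\qquad
\operatorname{tr}\Bigl(x - \tfrac{\operatorname{tr}(x)}{d}\,\mathrm{id}\Bigr)
  \;=\; \operatorname{tr}(x) - \tfrac{\operatorname{tr}(x)}{d}\cdot d \;=\; 0 .
\]
```

The first summand has trace zero and the second is a scalar. The group $G$ acts by conjugation, which preserves the trace and fixes scalars, so both summands are stable and the action on the scalars is trivial. The sum is direct because a scalar $c\cdot\mathrm{id}$ of trace zero satisfies $dc = 0$, hence $c = 0$ when $d$ is invertible.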

We also remark that if $R$ is an algebraically closed field of characteristic other than 2 and $\rho : G \to GL_2(R)$ is irreducible, then either $\mathrm{ad}^0 \rho$ is irreducible or there is a subgroup $H \subset G$ of index 2 and a character $\chi : H \to R^\times$ such that $\rho \cong \mathrm{Ind}_H^G \chi$. In the latter case $\mathrm{ad}^0 \rho \cong \delta \oplus \mathrm{Ind}_H^G(\chi/\chi')$, where $\delta$ is the nontrivial character of $G/H$ and $\chi'$ is the composite of $\chi$ and conjugation by an element of $G - H$. Moreover, either $\mathrm{Ind}_H^G(\chi/\chi')$ is irreducible or $\mathrm{ad}^0 \rho \cong \delta_1 \oplus \delta_2 \oplus \delta_3$ where the $\delta_i$ are distinct characters of $G$ of order 2. In the latter case, $\rho$ is induced from a character from each of the subgroups $H_i = \ker \delta_i$ of index 2 in $G$. We leave the verification of these facts as an exercise to the reader.

2.2 Representations associated to elliptic curves

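Perhaps the simplest examples of non-abelian $\ell$-adic representations arise from elliptic curves defined over $\mathbb{Q}$. We will let $E[n](\overline{\mathbb{Q}})$ denote the group of $n$-torsion points on $E(\overline{\mathbb{Q}})$. By proposition 1.1 there is a non-canonical isomorphism $E[n](\overline{\mathbb{Q}}) \cong (\mathbb{Z}/n\mathbb{Z})^2$. Furthermore, $E[n](\overline{\mathbb{Q}})$ carries a natural action of $G_{\mathbb{Q}}$ and so we get a representation (defined up to conjugation)

$$\bar\rho_{E,n} : G_{\mathbb{Q}} \to GL_2(\mathbb{Z}/n\mathbb{Z}).$$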

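If $\ell$ is a prime different from the characteristic of $\mathbb{Q}$ then we set $T_\ell E = \varprojlim E[\ell^n](\overline{\mathbb{Q}})$, which is non-canonically isomorphic to $\mathbb{Z}_\ell^2$. Again it has a natural continuous action of $G_{\mathbb{Q}}$, and so we get a representation (defined up to conjugation)

$$\rho_{E,\ell} : G_{\mathbb{Q}} \to GL_2(\mathbb{Z}_\ell).$$

Proposition 2.8
(a) The determinant of $\rho_{E,\ell}$ is $\epsilon_\ell$.
(b) The representation $\rho_{E,\ell}$ is absolutely irreducible for all $\ell$. For fixed $E$, $\bar\rho_{E,\ell}$ is absolutely irreducible for all but finitely many $\ell$.
(c) If $E$ does not have complex multiplication then $\rho_{E,\ell}$ (and hence $\bar\rho_{E,\ell}$) is surjective for all but finitely many $\ell$.

Proof: Part (a) follows from the existence of the non-degenerate alternating Galois-equivariant Weil pairing

$$T_\ell E \times T_\ell E \to \mathbb{Z}_\ell(1) := \varprojlim \mu_{\ell^n}.$$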
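The step from the Weil pairing to the determinant statement can be spelled out as follows. This is a sketch in which $P, Q$ denote an assumed $\mathbb{Z}_\ell$-basis of $T_\ell E$, $e$ denotes the pairing (written multiplicatively), and $a, b, c, d$ are the matrix entries of the action of $\sigma$ in that basis; all of these names are ours.

```latex
\begin{align*}
\sigma P &= aP + cQ, \qquad \sigma Q = bP + dQ, \\
e(\sigma P, \sigma Q) &= e(aP + cQ,\; bP + dQ) = e(P,Q)^{ad - bc}
  && \text{(bilinear and alternating)}, \\
e(\sigma P, \sigma Q) &= \sigma\bigl(e(P,Q)\bigr) = e(P,Q)^{\epsilon_\ell(\sigma)}
  && \text{(Galois-equivariance, action on } \mathbb{Z}_\ell(1)\text{)}.
\end{align*}
```

Since the pairing is non-degenerate, $e(P,Q)$ is a non-trivial element of the torsion-free module $\mathbb{Z}_\ell(1)$, so comparing exponents gives $ad - bc = \epsilon_\ell(\sigma)$.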

Part (b) is proved in [Se5], ch. IV. Part (c) is the main result of [Se6]. □

The following deep result, which is stronger than the second assertion of part (b) of proposition 2.8, is due to Mazur [Maz2], thms. 1 and 2. (Parts (b) and (c) can actually be deduced directly from theorem 1.7, using the fact that a semi-stable curve with a rational subgroup of order $\ell$ is necessarily isogenous to a curve with a rational point of order $\ell$.)

Theorem 2.9 Suppose that $E/\mathbb{Q}$ is an elliptic curve.
(a) If $\ell > 163$ is a prime then $\bar\rho_{E,\ell}$ is irreducible.
(b) If $E$ is semistable then $\bar\rho_{E,\ell}$ is irreducible for $\ell > 7$.
(c) If $E$ is semistable and $\bar\rho_{E,2}$ is trivial then $\bar\rho_{E,\ell}$ is irreducible for $\ell > 3$.

Remark 2.10 Combined with the results of Serre [Se6], Mazur's results imply that if $E$ is semistable everywhere then $\bar\rho_{E,\ell}$ is surjective for $\ell > 7$ ([Maz2], thm. 4).

Local behaviour of $\rho_{E,\ell}$ and $\bar\rho_{E,\ell}$:

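Proposition 2.11 Suppose $E$ has good reduction at $p$.

(b) For all $n \geq 1$ there is a finite flat group scheme $F_n/\mathbb{Z}_p$ such that $E[p^n](\overline{\mathbb{Q}}_p) \cong F_n(\overline{\mathbb{Q}}_p)$ as $G_p$-modules.

(c) • If $E$ has good ordinary reduction at $p$ (i.e., $a_p$ is not divisible by $p$) then $\bar{E}_p[p](\overline{\mathbb{F}}_p)$ has order $p$ and

$$\bar\rho_{E,p}|_{I_p} \sim \begin{pmatrix} \epsilon_p & * \\ 0 & 1 \end{pmatrix};$$

• If $E$ has supersingular reduction at $p$, then $\bar{E}_p[p](\overline{\mathbb{F}}_p)$ is trivial and $\bar\rho_{E,p}|_{G_p}$ is irreducible.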

Proofs: The proof of part (a) can be found in chapters V and VII of [Si1]. For part (b) one considers the finite flat group scheme $F_n = \mathcal{E}[p^n]$, where $\mathcal{E}$ is the model for $E$ over $\mathbb{Z}_p$ defined by $(W^{\min})$. (See [Sha] for an introduction to the theory of finite flat group schemes.) Part (c) follows from the results in chapters V and VII of [Si1] together with basic results on finite flat group schemes.

Proposition 2.12 Suppose that $E$ has multiplicative reduction at $p$. Let $q = q_{E,p} \in p\mathbb{Z}_p$ be the multiplicative Tate period attached to $E$, defined as in section 1.1. Let $\delta : G_p/I_p \to \{\pm 1\}$ be the unique non-trivial unramified quadratic character of $G_p$ if $E$ has non-split multiplicative reduction, and let $\delta$ be the trivial character if $E$ has split multiplicative reduction. Then

(a) $\rho_{E,\ell}|_{G_p} \sim \begin{pmatrix} \epsilon_\ell & * \\ 0 & 1 \end{pmatrix} \otimes \delta$, and if $\ell \neq p$ then $m_p(\rho_{E,\ell}) = 1$.

(d) There is a finite flat group scheme $F/\mathbb{Z}_p$ such that $E[p](\overline{\mathbb{Q}}_p) \cong F(\overline{\mathbb{Q}}_p)$ if and only if $p \mid v_p(\Delta_E^{\min}) = -v_p(j_E)$.

Sketch of proof: By Tate's proposition 1.5 of section 1.1, we have

$$E(\overline{\mathbb{Q}}_p) \cong (\overline{\mathbb{Q}}_p^\times/q^{\mathbb{Z}})(\delta)$$

as $G_p$-modules. (The $(\delta)$ indicates that $G_p$ acts on $(\overline{\mathbb{Q}}_p^\times/q^{\mathbb{Z}})(\delta)$ by $\sigma : x \mapsto \sigma(x)^{\delta(\sigma)}$.) The proposition follows directly from this, together with [Edi], prop. 8.2 for the last part. (For further details, see ch. VII, prop. 5.1 of [Si1], ch. IV, thm. 10.2 and ch. V of [Si2], and sec. 2.9 of [Se7].) □

Proposition 2.13 Suppose that $E$ has additive reduction at $p$. Then for any $\ell \neq p$ the conductor of $\rho_{E,\ell}|_{G_p}$ is at least 2, and if $p > 3$ then it is equal to 2.

Remark 2.14 For an elliptic curve $E$ with any type of reduction at $p$, the conductor of $\rho_{E,\ell}|_{G_p}$ (for $\ell \neq p$) coincides with the local conductor of $E$ at $p$ given by a formula of Ogg (proved by Saito [Sa] in the case $p = 2$). In particular the conductor of $\rho_{E,\ell}|_{G_p}$ is independent of $\ell \neq p$. For further discussion and references, see ch. IV, secs. 10 and 11 of [Si2].

The Frey curve: We now consider the elliptic curve

$$E_{A,B} : Y^2 Z = X(X - AZ)(X + BZ),$$

where $A$ and $B$ are non-zero coprime integers with $A + B \neq 0$. Set $C = A + B$. This equation has discriminant $\Delta = 16(ABC)^2$. If $p$ is an odd prime then this curve has good reduction at $p$ if $p \nmid (ABC)$ and has multiplicative reduction at $p$ if $p \mid ABC$. Thus $\Delta^{\min}_{E_{A,B}} = 2^{4-12n}(ABC)^2$ for some $n \in \mathbb{Z}_{\geq 0}$. If for instance $A \equiv -1 \bmod 4$ and $B \equiv 0 \bmod 16$ then $E_{A,B}$ has a minimal Weierstrass equation

$$Y^2 Z + XYZ = X^3 + \frac{B - A - 1}{4} X^2 Z - \frac{AB}{16} X Z^2.$$

considered by Hellegouarch [He] and Frey [Fr], where $a^\ell + b^\ell = c^\ell$ is a hypothetical non-trivial solution to Fermat's Last Theorem, and where we have supposed without loss of generality that $a$, $b$ and $c$ are pairwise coprime, that $b$ is even and that $a \equiv -1 \bmod 4$. Frey suggested that properties of this hypothetical elliptic curve $E$ could lead to a contradiction if the curve were also known to be modular. This suggestion was made precise by Serre [Se7], who observed that the following list of properties would yield the desired contradiction, when combined with his conjectures on Galois representations associated to modular forms (to be discussed in section 3.2).

Theorem 2.15 Suppose that $\ell > 3$ is prime. The representation $\bar\rho_{E,\ell}$ has the following properties.
(a) $\bar\rho_{E,\ell}$ is irreducible.
(b) $\bar\rho_{E,\ell}$ is unramified outside $\ell$ and 2.
(c) There is a finite flat group scheme $F/\mathbb{Z}_\ell$ such that $F(\overline{\mathbb{Q}}_\ell) \cong E[\ell](\overline{\mathbb{Q}}_\ell)$ as $G_\ell$-modules.
(d) $\#\bar\rho_{E,\ell}(I_2) = \ell$.

Proof: Part (a) follows from Mazur's theorem (theorem 2.9, (c)). To prove (b), (c), and (d), we note the key fact that $\Delta_E^{\min} = 2^{-8}(abc)^{2\ell}$ is a perfect $\ell$-th power, up to powers of 2. The fact that $\bar\rho_{E,\ell}$ is unramified at $p \neq \ell, 2$ (part (b)) follows from this, from proposition 2.11, (a) (for primes $p \neq \ell$ of good reduction for $E$) and from proposition 2.12 (c) (for primes $p \neq \ell, 2$ of multiplicative reduction for $E$). Part (c) is a consequence of proposition 2.12 (d), and part (d) follows from proposition 2.12 (c). □

Remark 2.16 Part (a) implies that the image of $\bar\rho_{E,\ell}$ is large, while parts (b) and (c) state that the image has very limited ramification. We remark that the conclusion of the theorem ensures that $m_p(\bar\rho_{E,\ell}) = 0$ for $p \neq 2, \ell$ and that $m_2(\bar\rho_{E,\ell}) = 1$.
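Returning to the curve $E_{A,B}$, the passage from the equation $Y^2Z = X(X - AZ)(X + BZ)$ to a Weierstrass equation with smaller discriminant can be checked by an explicit change of variables. The following is a sketch in affine coordinates ($Z = 1$); the substitution $x = 4x'$, $y = 8y' + 4x'$ is a standard choice that we supply for illustration and is not taken from the text.

```latex
\begin{align*}
y^2 &= x(x - A)(x + B) = x^3 + (B - A)\,x^2 - AB\,x, \\
(8y' + 4x')^2 &= 64\,x'^3 + 16(B - A)\,x'^2 - 4AB\,x', \\
y'^2 + x'y' &= x'^3 + \frac{B - A - 1}{4}\,x'^2 - \frac{AB}{16}\,x'.
\end{align*}
```

The coefficients on the last line are integers exactly when $A \equiv B - 1 \bmod 4$ and $16 \mid AB$, which holds if $A \equiv -1 \bmod 4$ and $B \equiv 0 \bmod 16$. A substitution of this shape divides the discriminant by $2^{12}$, so $16(ABC)^2$ becomes $2^{-8}(ABC)^2$; if $A$, $B$, $C$ are taken to be $a^\ell$, $b^\ell$, $c^\ell$ this is the value $2^{-8}(abc)^{2\ell}$ used in the proof of theorem 2.15.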

2.3 Galois cohomology

In this section $M$ will denote a continuous discrete $G_{\mathbb{Q}}$-module of finite cardinality. If $M$ and $N$ are two such modules we will endow the space $\mathrm{Hom}(M, N)$ of homomorphisms of abelian groups with an action of $G_{\mathbb{Q}}$ via the formula $(g(\phi))(m) = g(\phi(g^{-1}m))$.

M . If L is a collection of local conditions for M we define the corresponding

Suppose that $v$ is a finite place of $\mathbb{Q}$. We have an exact sequence

$$(0) \to H^0(G_v, M) \to M^{I_v} \xrightarrow{\ \mathrm{Frob}_v - 1\ } M^{I_v} \to H^1(G_v/I_v, M^{I_v}) \to (0),$$

and hence we see that $\#H^1(G_v/I_v, M^{I_v}) = \#H^0(G_v, M)$.
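The equality of orders reflects the general fact that the kernel and cokernel of an endomorphism of a finite abelian group have the same size. A minimal Python sketch, assuming a toy model in which $M^{I_v} = (\mathbb{Z}/n\mathbb{Z})^k$ and $\mathrm{Frob}_v$ acts through an integer matrix `F` (the function name and the examples are hypothetical):

```python
from itertools import product

def kernel_and_cokernel_sizes(F, n):
    """Return (#ker(F - 1), #coker(F - 1)) for F acting on (Z/n)^k.

    F is a k x k integer matrix given as a list of rows; both sizes are
    found by brute-force enumeration of the n^k vectors.
    """
    k = len(F)

    def frob_minus_one(v):
        # Apply F - 1 to the vector v, reducing entries mod n.
        return tuple((sum(F[i][j] * v[j] for j in range(k)) - v[i]) % n
                     for i in range(k))

    vectors = list(product(range(n), repeat=k))
    image = {frob_minus_one(v) for v in vectors}
    kernel = [v for v in vectors if not any(frob_minus_one(v))]
    return len(kernel), len(vectors) // len(image)

# A unipotent Frobenius on (Z/4)^2: invariants and coinvariants both have order 4.
print(kernel_and_cokernel_sizes([[1, 1], [0, 1]], 4))
```

In this model the first number plays the role of $\#H^0(G_v, M)$ and the second that of $\#H^1(G_v/I_v, M^{I_v})$.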

We have the following important observation of Wiles [W3], inspired by a formula of Greenberg [Gre]. An important theme in number theory has been the calculation of Selmer groups. This result, although it does not allow the absolute calculation of Selmer groups, allows the comparison of two (dual) Selmer groups. In the applications in this paper we shall apply the theorem where the various data have been chosen to make one of the Selmer groups trivial. In such a situation it allows the exact calculation of the order of the non-trivial Selmer group.

Theorem 2.18 If $L$ is a collection of local conditions for $M$ then the Selmer group $H^1_L(\mathbb{Q}, M)$ is finite. Moreover we have the formula

$$\frac{\#H^1_L(\mathbb{Q}, M)}{\#H^1_{L^*}(\mathbb{Q}, M^*)} = \frac{\#H^0(G_{\mathbb{Q}}, M)}{\#H^0(G_{\mathbb{Q}}, M^*)} \prod_v \frac{\#L_v}{\#H^0(G_v, M)}.$$

We note that all but finitely many terms in the product are 1 so that it makes sense.

Proof of theorem 2.18: Choose a finite set, $S$, of places of $\mathbb{Q}$, which contains $\infty$, all the places whose residue characteristic divides the order of $M$, all places at which $M$ is ramified and all places at which $L_v \neq H^1(G_v/I_v, M^{I_v})$. Let $\mathbb{Q}_S$ denote the maximal extension of $\mathbb{Q}$ unramified outside $S$ and let $G_S = \mathrm{Gal}(\mathbb{Q}_S/\mathbb{Q})$. Then we have an exact sequence

$$(0) \to H^1_L(\mathbb{Q}, M) \to H^1(G_S, M) \to \bigoplus_{v \in S} H^1(G_v, M)/L_v.$$

In this section we assume that $\ell$ is an odd prime. If $G$ is any topological group then by a finite $\mathcal{O}[G]$-module we shall mean a discrete $\mathcal{O}$-module of finite cardinality with a continuous action of $G$. By a profinite $\mathcal{O}[G]$-module we shall mean an inverse limit of finite $\mathcal{O}[G]$-modules.

If $M$ is a profinite $\mathcal{O}[G_\ell]$-module then we will call $M$

• good if for every discrete quotient $M'$ of $M$ there is a finite flat group scheme $F/\mathbb{Z}_\ell$ such that $M' \cong F(\overline{\mathbb{Q}}_\ell)$ as $\mathbb{Z}_\ell[G_\ell]$-modules;


• ordinary if there is an exact sequence

$$(0) \to M^{(-1)} \to M \to M^{(0)} \to (0)$$

of profinite $\mathcal{O}[G_\ell]$-modules such that $I_\ell$ acts trivially on $M^{(0)}$ and by $\epsilon$ on $M^{(-1)}$ (equivalently, if for all $\sigma, \tau \in I_\ell$ we have $(\sigma - \epsilon(\sigma))(\tau - 1) = 0$ on $M$);

• semi-stable if $M$ is either good or ordinary.

Suppose that $R$ is a complete Noetherian local $\mathcal{O}$-algebra with residue field $k$. We will call a continuous representation $\rho : G_\ell \to GL_2(R)$ good, ordinary or semistable, if

$$\det \rho|_{I_\ell} = \epsilon \qquad (2.4.1)$$

and if the underlying profinite $\mathcal{O}[G_\ell]$-module, $M_\rho$, is good, ordinary or semistable. We write $\bar\rho$ for $\rho \bmod \mathfrak{m}_R$. We record the following consequence of Nakayama's lemma.

Lemma 2.20 If $M_\rho$ and $\bar\rho$ are ordinary, then $M_\rho^{(-1)}$ and $M_\rho^{(0)}$ are each free of rank one over $R$ and $\rho$ is ordinary.

Remark 2.21
(a) These definitions are somewhat ad hoc, but at the moment that is all that seems to be available (though the work of Fontaine and Laffaille [FL] and its generalisations may well provide a more systematic setting).
(b) For part of the motivation for our definitions, see proposition 2.23 and remark 2.24. For further motivation, we shall see in the next chapter that representations arising from certain modular forms are semistable. Moreover the Fontaine-Mazur conjecture predicts that, conversely, any representation of $G_{\mathbb{Q}}$ with semistable restriction to $G_\ell$ arises from such a modular form. We shall state a weak form of the Fontaine-Mazur conjecture below (conjecture 3.17).
(c) The terminology in [W3] to describe representations of $G_\ell$ is slightly different. In particular, we impose the condition (2.4.1) in our definitions of good and ordinary, as this is all we shall need here. Assuming $\rho$ satisfies this condition, the notion of ordinary in [W3] coincides with the one here, and $\rho$ is flat in the sense of [W3] if and only if it is good but not ordinary. (Note that a representation may be both good and ordinary, for instance $\rho_{E,\ell}$ for an elliptic curve with good, ordinary reduction; see proposition 2.23.)

The following assertions, in the good case, are consequences of results of Raynaud [Ray] (in particular, see sec. 2.1, prop. 2.3.1 and cor. 3.3.6 of [Ray]). In the ordinary case they are elementary.

Lemma 2.22
(a) Good profinite $\mathcal{O}[G_\ell]$-modules are closed under taking sub-objects, quotients and direct products. The same is true for ordinary profinite $\mathcal{O}[G_\ell]$-modules.
(b) Suppose that $M$ is a profinite $\mathcal{O}[G_\ell]$-module and that $\{M_i\}$ is a family of sub-objects with trivial intersection, such that each $M/M_i$ is good (resp. ordinary). Then $M$ is good (resp. ordinary).
(c) If $M$ is a finite $\mathcal{O}[G_\ell]$-module then $M$ is good if and only if there is a finite flat group scheme $F/\mathbb{Z}_\ell$ such that $M \cong F(\overline{\mathbb{Q}}_\ell)$ as $\mathbb{Z}_\ell[G_\ell]$-modules.
(d) If $M$ and $M'$ are profinite $\mathcal{O}[G_\ell]$-modules with $M' \cong M$ as $\mathbb{Z}_\ell[I_\ell]$-modules then $M$ is good (resp. ordinary) if and only if $M'$ is good (resp. ordinary).
(e) Suppose that $M$ is a profinite $\mathcal{O}[G_\ell]$-module which is finite and free over $\mathcal{O}$. Then $M$ is good if and only if there exists an $\ell$-divisible group $F/\mathbb{Z}_\ell$ such that $M$ is isomorphic to the Tate module of $F$ as a $\mathbb{Z}_\ell[G_\ell]$-module.

Together with results stated in section 2.2, we have the following.

Proposition 2.23 Suppose that $E$ is an elliptic curve over $\mathbb{Q}$ and $\mathcal{O} = \mathbb{Z}_\ell$.
• If $E$ has good (resp. semistable) reduction at $\ell$, then $\rho_{E,\ell}$ and $\bar\rho_{E,\ell}$ are good (resp. semistable);
• if $E$ has semistable reduction at $\ell$, then $\rho_{E,\ell}$ is ordinary if and only if $\bar\rho_{E,\ell}$ is ordinary if and only if either $E$ has multiplicative reduction or $E$ has good ordinary reduction.

Remark 2.24 It is also true that if $\rho_{E,\ell}$ is good (resp. semistable), then $E$ has good (resp. semistable) reduction at $\ell$, but the result is more difficult and we shall not need it.

We will need a few more definitions. We will let $\psi : I_\ell \to \mathbb{F}_{\ell^2}^\times$ denote the character $\sigma \mapsto \sigma(\varpi)/\varpi \bmod \varpi$ where $\varpi = \sqrt[\ell^2 - 1]{\ell}$. If $F$ is a field of characteristic other than $\ell$ and if $M$ is a profinite $\mathbb{Z}_\ell[G_\ell]$-module then we set $M(1) = \varprojlim M \otimes_{\mathbb{Z}_\ell} \mu_{\ell^n}(\overline{F})$. If $\rho : G_\ell \to GL_2(R)$ is ordinary the extension

$$(0) \to R(1) \to M_\rho \to R \to (0)$$


of $R[I_\ell]$-modules gives rise to a class $c_\rho \in H^1(I_\ell, R(1))$. Kummer theory and the valuation on $(\overline{\mathbb{Q}}_\ell^{I_\ell})^\times$ give rise to a map $v : H^1(I_\ell, R(1)) \cong H \otimes_{\mathbb{Z}_\ell} R \to R$, where $H$ denotes the $\ell$-adic completion of $(\overline{\mathbb{Q}}_\ell^{I_\ell})^\times$. Then we also have the following lemma.

Lemma 2.25
(a) If $\bar\rho : G_\ell \to GL_2(k)$ is good then either $\bar\rho$ is ordinary or $\bar\rho|_{I_\ell} \otimes_k \bar{k} \cong \psi \oplus \psi^\ell$.
(b) If $\rho : G_\ell \to GL_2(R)$ is such that $M_\rho$ is good and $\bar\rho$ is ordinary, then $\rho$ is good and ordinary.
(c) If $\rho : G_\ell \to GL_2(R)$ is ordinary then $\rho$ is good if and only if $v(c_\rho) = 0$.

Note that we need only consider the case that $R$ has finite cardinality to prove the lemma. Parts (a) and (b) again follow from Raynaud's results [Ray] (for part (b) consider the connected-étale sequences for $M_\rho$ and $M_{\bar\rho}$).

We sketch the proof of part (c) using an argument suggested to us by Edixhoven. As in [Edi], prop. 8.2 it suffices to determine which extensions

$$(0) \to R(1) \to M \to R \to (0) \qquad (2.4.2)$$

of $R[I_\ell]$-modules arise from finite flat group schemes over $\mathbb{Z}_\ell^{\mathrm{nr}}$, the ring of integers of the maximal unramified extension of $\mathbb{Q}_\ell$. By prop. 17.4 of [Oo], the extension (2.4.2) arises from a finite flat group scheme if and only if it corresponds to an extension of sheaves of $R$-modules in the fpqc topology over $\mathbb{Z}_\ell^{\mathrm{nr}}$. Therefore we must compute the image of

$$\mathrm{Ext}^1_{R\text{-mod/fpqc}}(R, R(1)) \to \mathrm{Ext}^1_{R[I_\ell]}(R, R(1)).$$

Since the sheaf $\mathcal{E}xt^1$ of $R$ by $R(1)$ vanishes, this is equivalent to computing the image of

$$H^1_{\mathrm{fpqc}}(\mathbb{Z}_\ell^{\mathrm{nr}}, R(1)) \to H^1(I_\ell, R(1)).$$

Part (c) follows from the fact that this is precisely the kernel of $v$.

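Remark 2.26 The authors expect that the theory of Fontaine and Laffaille [FL] discussed in the next section could be used to prove that if $\rho : G_\ell \to GL_2(R)$ is such that $M_\rho$ is good and $\bar\rho$ is good, then $\det \rho|_{I_\ell}$ is cyclotomic, i.e., $\rho$ is good.

For semistable representations $\rho : G_\ell \to GL_2(\mathcal{O}/\lambda^n)$ we shall define $\mathcal{O}$-submodules

$$H^1_f(G_\ell, \mathrm{ad}\,\rho) \subset H^1_{\mathrm{ss}}(G_\ell, \mathrm{ad}\,\rho) \subset H^1(G_\ell, \mathrm{ad}\,\rho).$$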

Before doing so, let us consider more generally a continuous representation

Suppose that $\rho : G_\ell \to GL_2(\mathcal{O}/\lambda^n)$ is ordinary, so that we have

#Hss1 (G` , ad0 ) #H 0 (G` , ad0 )#(O/n )#(O/(n , c` ))

where c` = (1 /2 )(Frob ` )1. Moreover if is good, then equality holds.(b) If n = 1 and is not good then #Hss1 (G` , ad0 ) = #k.End of proof of proposition 2.27: To deduce proposition 2.27 (in the good case)from proposition 2.28 and lemma 2.29, note that if : G` GL2 (O/n ) isgood, thenH 1 (G` /I` , O/n ) Hf1 (G` , ad) H 1 (G` , O/n )by lemma 2.22 (d), and this gives the inequality in proposition 2.27(a). Furthermore if is also ordinary, then the above group is contained inIm (Hf1 (G` , ad) H 1 (G` , O/n )) H 1 (G` /I` , O/n )where the last inclusion comes from lemma 2.25 (b). Therefore#Hf1 (G` , ad0 ) = #H 0 (G` , ad0 )#(O/n ).2We will prove proposition 2.28 in the next section using the theory ofFontaine and Laffaille. The proof of lemma 2.29 is a somewhat technicalexercise in the Galois cohomology of local fields, for which we refer the readerto ch. 1 of [W3], prop. 1.9, parts (iii) and (iv). We remark that for ourpurposes, inequalities would suffice in part (b) of proposition 2.27 and part (a)68

of lemma 2.29 (and this is all that is proved in [W3]). We have included theprecise formulas for the sake of completeness, since they are not much moredifficult to obtain. The only additional observation required is that if isgood, then the composite of the natural mapsH 1 (G` /I` , (ad0 /ad(1) )I` ) H 1 (G` , ad0 /ad(1) ) H 2 (G` , ad(1) )is trivial. To prove this, rewrite the composite asH 1 (G` /I` , O/n ) H 1 (G` , O/n ) H 2 (G` , ad(1) )with the second map given by c where c is the class in H 1 (G` , ad(1) )defining the extension M and apply lemma 2.25 (c). (In fact, one only needsthe easier half of lemma 2.25 (c): if is good then v(c ) = 0.)

2.5 The theory of Fontaine and Laffaille

In this section we again assume that $\ell$ is an odd prime. We mentioned in the last section the importance of understanding good representations of $G_\ell$. However the definition of good is somewhat indirect, and this makes computations difficult. The key result we use to address the problem is an equivalence between the category $\mathcal{FF}_{\mathcal{O}}$ of good finite $\mathcal{O}[G_\ell]$-modules and a category $\mathcal{MF}_{\mathcal{O}}$ which we define below following Fontaine and Laffaille [FL]. The beauty and utility of the result stems from the elementary algebraic nature of the definition of $\mathcal{MF}_{\mathcal{O}}$; we can convert questions about good representations into questions in linear algebra.

Remark 2.30 Let $\mathcal{GF}_{\mathcal{O}}$ denote the category of finite flat commutative group schemes over $\mathbb{Z}_\ell$ with an action of $\mathcal{O}$. By results of Raynaud [Ray], taking $\overline{\mathbb{Q}}_\ell$ points defines an equivalence between the categories $\mathcal{GF}_{\mathcal{O}}$ and $\mathcal{FF}_{\mathcal{O}}$. The equivalence between $\mathcal{GF}_{\mathcal{O}}$ and a category closely related to $\mathcal{MF}_{\mathcal{O}}$ was first established by Fontaine [Fo2] and [Fo1]. While Fontaine's results would suffice for our purposes here, our formulation will be closer to that in [FL], where an equivalence between $\mathcal{MF}_{\mathcal{O}}$ and $\mathcal{FF}_{\mathcal{O}}$ is defined as part of a more general construction of representations of $G_\ell$ from linear-algebraic data. We caution however that our formulation is not exactly the same as that of [FL] since we wish to work with covariant functors.

We now turn to the definition of $\mathcal{MF}_{\mathcal{O}}$. The objects are $\mathcal{O}$-modules $D$ of finite cardinality together with a distinguished submodule $D^0$ and $\mathcal{O}$-linear maps $\phi_{-1} : D \to D$ and $\phi_0 : D^0 \to D$ which satisfy:

• $\phi_{-1}|_{D^0} = \ell \phi_0$,
• $\mathrm{Im}\, \phi_{-1} + \mathrm{Im}\, \phi_0 = D$.

The morphisms are $\mathcal{O}$-linear maps compatible with the additional data of the distinguished submodules and maps $\phi$.

It is useful to note that if $D$ is an object of $\mathcal{MF}_{\mathcal{O}}$, then there is a surjection

$$\phi_{-1} \oplus \phi_0 : D/D^0 \oplus D^0/\ell D^0 \to D/\ell\phi_0(D^0),$$

and on counting orders we see that this is in fact an isomorphism. Thus there is an isomorphism

$$D/(D^0 + \lambda D) \oplus D^0/\lambda D^0 \cong D/\lambda D.$$

It follows that $D^0/\lambda D^0 \to D/\lambda D$ is injective, and hence that $D^0$ is (non-canonically) a direct summand of $D$ as an $\mathcal{O}$-module. Note also that $\phi_0$ is injective, and if $D = D^0 \oplus D'$ as $\mathcal{O}$-modules, then also $D = \phi_0(D^0) \oplus \phi_{-1}(D')$ as $\mathcal{O}$-modules.

It is then straightforward to check that there is a contravariant functor $\cdot^*$ from $\mathcal{MF}_{\mathcal{O}}$ to itself defined by:

• $D^* = \mathrm{Hom}(D, \mathbb{Q}_\ell/\mathbb{Z}_\ell)$;
• $(D^*)^0 = \mathrm{Hom}(D/D^0, \mathbb{Q}_\ell/\mathbb{Z}_\ell)$;
• $\phi^*_{-1}(f)(z) = f(\ell x + y)$, where $z = \phi_{-1}(x) + \phi_0(y)$;
• $\phi^*_0(f)(z) = f(x \bmod D^0)$, where $z \equiv \phi_{-1}(x) \bmod (\phi_0 D^0)$.

Moreover the canonical isomorphism $D \cong (D^*)^*$ of $\mathcal{O}$-modules defines a natural isomorphism in $\mathcal{MF}_{\mathcal{O}}$.

We leave it as an exercise for the reader to use the above observations to define cokernels and then kernels of morphisms in $\mathcal{MF}_{\mathcal{O}}$ and verify that it is an abelian category (or see [FL], sec. 1).

Theorem 2.31 There is a covariant functor $\mathbb{D} : \mathcal{FF}_{\mathcal{O}} \to \mathcal{MF}_{\mathcal{O}}$ which defines an $\mathcal{O}$-additive equivalence of categories. Moreover if $M$ is an object of $\mathcal{FF}_{\mathcal{O}}$, then we have
(a) $M$ and $\mathbb{D}(M)$ have the same cardinality;
(b) $\mathbb{D}(M) = \mathbb{D}(M)^0$ if and only if $M$ is unramified.
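Because the definition of the category of Fontaine-Laffaille modules is pure linear algebra, its two axioms can be checked mechanically in small cases. The sketch below assumes the toy situation $\mathcal{O} = \mathbb{Z}_\ell$, $D = (\mathbb{Z}/\ell^n)^2$ with basis $e_1, e_2$, and $D^0$ spanned by $e_1$; the function name and the encoding of the two maps by the images of basis vectors are ours.

```python
from itertools import product

def is_mf_object(ell, n, phi0_e1, phim1_e1, phim1_e2):
    """Check the two axioms for D = (Z/ell^n)^2 with D^0 spanned by e1.

    phi0_e1  : phi_0(e1), a pair of integers
    phim1_e1 : phi_{-1}(e1)
    phim1_e2 : phi_{-1}(e2)
    """
    q = ell ** n
    # Axiom 1: phi_{-1} restricted to D^0 equals ell * phi_0.
    restriction_ok = all((x - ell * y) % q == 0
                         for x, y in zip(phim1_e1, phi0_e1))
    # Axiom 2: Im(phi_{-1}) + Im(phi_0) is all of D (brute-force span).
    span = {
        tuple((a * phim1_e1[i] + b * phim1_e2[i] + c * phi0_e1[i]) % q
              for i in range(2))
        for a, b, c in product(range(q), repeat=3)
    }
    return restriction_ok and len(span) == q * q

# ell = 3, n = 2: phi_0(e1) = e1, phi_{-1}(e1) = 3*e1, phi_{-1}(e2) = e2.
print(is_mf_object(3, 2, (1, 0), (3, 0), (0, 1)))
```

Replacing $\phi_{-1}(e_2) = e_2$ by $3e_2$ breaks the second axiom, and replacing $\phi_{-1}(e_1) = 3e_1$ by $e_1$ breaks the first; the checker rejects both.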

Remark 2.32 (a) It follows on applying part (a) to M/i M for each i thatM and D(M ) are in fact (non-canonically) isomorphic as O-modules.(b) As mentioned above, our formulation differs from that of Fontaine andLaffaille in that we are using covariant functors. To deduce theorem 2.31from the results in [FL], sec. 9, we define D as a quasi-inverse of thefunctor U S ( [1]), where U S is defined in [FL] and [1] indicates a shiftby 1 in filtration degrees. In particular, if F is a finite flat group scheme ` ))over Z` with an action of O, then the underlying O-module of D(F(Qcan be identified with the covariant Dieudonne module of F/F` , and 1with F .Suppose now that : G` GL2 (O/n ) is a good continuous representation. Let D = D(M ) be the corresponding object of MFO . ThenD = O/n .= (O/n )2 as an O-module while D0 Lemma 2.33 The following are isomorphic.(a) The group of extensions of M by itself in the category of good finiteO/n [G` ]-modules.(b) The group of extensions of D by itself in the full subcategory of MFOconsisting of objects which are O/n -modules.(c) Pairs (1 , 0 ) where 1 HomO (D , D ), 0 HomO (D0 , D ) and`0 = 1 |D0 , modulo pairs of the form (a,1 ,1 a, a,0 ,0 a|D0 )where a HomO (D , D ) and aD0 D0 .Proof: The first two groups of extensions are isomorphic by the FontaineLaffaille theorem. Following Ramakrishna [Ram] we explain how to calculatethe second group of extensions. We will write D for D . If(0) D E D (0)

Thus the Ext group we want is the quotient of the set of pairs of matrices(1 , 0 ) as above by the submodule generated by

`z w `x , z ) and ( 0 y , 0 .z0`z 00 `z

Note that either z or w is a unit in O/ from which it follows that the Extgroup we want is isomorphic to (O/n )2 O/(z, n ). On the other handH 0 (I` , M ) corresponds to the largest submodule C D 0 such that 0 C = C,i.e. C = {d D 0 : zd = 0} 2= O/(z, n ).Corollary 2.35 Suppose that : G` GL2 (O/n ) is a continuous goodrepresentation. Then Hf1 (G` , ad) is isomorphic to (O/n )2 H 0 (G` , ad0 ).Proof: If is not ordinary then H 0 (G` , ad0 ) and H 0 (I` , M ) are both trivial, so suppose that is ordinary. In this case, let 0 denote the ordinaryrepresentation defined byM0 = Hom(M(0) , M ) ad0 ()(1)

and let M0

= ad0 ()(1) . Then

H 0 (G` , ad0 ) = H 0 (G` , M0 ) H 0 (I` , M0 )

= H 0 (I` , M ).72

(1)

Since H 0 (I` , M0

) is trivial, the restriction homomorphism

(1)

H 1 (G` , M0

(1)

) H 1 (I` , M0

is injective. We deduce from the long exact sequences associated to

(1)

(0) M0

M0 O/n 0

that H 0 (G` , M0 ) = H 0 (I` , M0 ).

2.6 Deformations of representations

In this section we shall review Mazur's theory of deformations of representations of profinite groups (see [Maz3]).

Let $\mathcal{C}_{\mathcal{O}}$ denote the category whose objects are complete noetherian local $\mathcal{O}$-algebras with residue field $k$ and whose morphisms are $\mathcal{O}$-algebra homomorphisms which are local (i.e. take maximal ideals into maximal ideals). (The structure maps from $\mathcal{O}$ to every object of $\mathcal{C}_{\mathcal{O}}$ are also assumed to be local.) Let $G$ denote a topologically finitely generated profinite group and let $\bar\rho$ denote an absolutely irreducible representation of $G$ into $GL_d(k)$. (In fact all we shall use in the sequel is that $k$ is the centraliser in $M_d(k)$ of the image of $\bar\rho$.) Let $\mathcal{D}_0$ denote the category of profinite $\mathcal{O}[G]$-modules with continuous morphisms. We will let $\mathcal{D}$ denote a full subcategory of $\mathcal{D}_0$ which is closed under taking sub-objects, quotients and direct products and which contains $M_{\bar\rho}$. Note that if $M$ is an object of $\mathcal{D}_0$ and $M_i$ is a collection of sub-objects which have trivial intersection and such that each $M/M_i$ is an object of $\mathcal{D}$, then $M$ is an object of $\mathcal{D}$, since $M \subset \prod_i M/M_i$.

Let $\chi : G \to \mathcal{O}^\times$ be a continuous character such that $\det \bar\rho = \chi \bmod \lambda$. By a lifting of $\bar\rho$ of type $D = (\mathcal{O}, \chi, \mathcal{D})$ we shall mean an object $R$ of $\mathcal{C}_{\mathcal{O}}$ and a continuous representation $\rho : G \to GL_d(R)$ such that:
(a) $\rho \bmod \mathfrak{m}_R = \bar\rho$,
(b) $\det \rho = \chi$,
(c) $M_\rho$ is an object of $\mathcal{D}$.

Note that if $\phi : R \to R'$ and $\rho : G \to GL_d(R)$ is of type $D$ so is $\phi \circ \rho$.

Theorem 2.36 There is a lifting $\rho_D^{\mathrm{univ}} : G \to GL_d(R_D)$ of $\bar\rho$ of type $D$ such that if $\rho : G \to GL_d(R)$ is any lifting of $\bar\rho$ of type $D$ then there is a unique homomorphism of $\mathcal{O}$-algebras $\phi_\rho : R_D \to R$ such that $\rho$ is conjugate to $\phi_\rho \circ \rho_D^{\mathrm{univ}}$.

The representation $\rho_D^{\mathrm{univ}}$ is referred to as the universal deformation of type $D$. Mazur [Maz3] proved this theorem for $\mathcal{D}_0$ and certain other categories $\mathcal{D}$. Ramakrishna [Ram] observed that the arguments work with any category $\mathcal{D}$ satisfying the above hypotheses. We will sketch a proof which was suggested by Faltings. (We remark that another explicit construction of the deformation ring has been given by de Smit and Lenstra in [dSL].)

Proof of theorem 2.36: Choose a sequence $g_1, \ldots, g_r$ of topological generators of $G$ and liftings $A_1, \ldots, A_r$ of $\bar\rho(g_1), \ldots, \bar\rho(g_r)$ to $M_d(\mathcal{O})$. Define a mapping $\iota : M_d(\mathcal{O}) \to M_d(\mathcal{O})^r$ which sends $x$ to $(xA_1 - A_1x, \ldots, xA_r - A_rx)$. The map $\iota$ has torsion-free cokernel, so we can decompose

$$M_d(\mathcal{O})^r = \iota(M_d(\mathcal{O})) \oplus V,$$

for some submodule $V \subset M_d(\mathcal{O})^r$. If $\rho : G \to GL_d(R)$ is a lifting of $\bar\rho$ of type $D$ set $v_\rho = (\rho(g_1) - A_1, \ldots, \rho(g_r) - A_r) \in M_d(R)^r$. Note that $v_\rho \equiv 0 \bmod \mathfrak{m}_R$ and that $v_\rho$ completely determines $\rho$. We call the lifting $\rho$ well-placed if $v_\rho$ belongs to $V \otimes_{\mathcal{O}} R \subset M_d(R)^r$. The crucial observation is the following result.

Lemma 2.37 If $\rho : G \to GL_d(R)$ is a lifting of $\bar\rho$ of type $D$ then there is a unique conjugate $\rho'$ of $\rho$ which is well-placed.

The lemma is first proved for algebras $R$ such that $\mathfrak{m}_R^n = (0)$ by induction on $n$, and then one deduces the general case.

In virtue of the lemma it suffices to find a universal well-placed lifting of type $D$. Let $e_1, \ldots, e_s$ be a basis of $V$ as an $\mathcal{O}$-module. If $\rho$ is a well-placed lifting of $\bar\rho$ of type $D$ then we can write $v_\rho = \sum_{i=1}^s v_{\rho,i} e_i$ for unique elements $v_{\rho,i} \in \mathfrak{m}_R$ and we can define a homomorphism $\phi_\rho : \mathcal{O}[[T_1, \ldots, T_s]] \to R$ sending $T_i$ to $v_{\rho,i}$. Note that $\rho$ is completely determined by $\phi_\rho$ ($\rho(g_i) = A_i + \sum_{j=1}^s \phi_\rho(T_j) e_{ji}$, where $e_j = (e_{j1}, \ldots, e_{jr})$). Let $I$ denote the intersection of all ideals $J$ of $\mathcal{O}[[T_1, \ldots, T_s]]$ such that there is a representation $\rho_J : G \to GL_d(\mathcal{O}[[T_1, \ldots, T_s]]/J)$ of type $D$ with $\rho_J(g_i) = A_i + \sum_{j=1}^s T_j e_{ji}$ for all $i$. Let $R_D$ denote the quotient of $\mathcal{O}[[T_1, \ldots, T_s]]$ by $I$. Then one can check that there is a representation $\rho_D^{\mathrm{univ}} : G \to GL_d(R_D)$ with $\rho_D^{\mathrm{univ}}(g_i) = A_i + \sum_{j=1}^s T_j e_{ji}$ for all $i$, and that this is the desired universal representation. □

We will need a few elementary properties of these universal deformations. More precisely we will need to know how these universal rings change when we change the base field, we will need to know how to calculate the equicharacteristic tangent space of these rings and more generally how to calculate $\mathfrak{p}/\mathfrak{p}^2$ for certain prime ideals $\mathfrak{p}$. The first of these lemmas is a remark of Faltings, the second is due to Mazur [Maz3] and the third to Wiles [W3].

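[...] maps $\alpha: R_D \to R_{D'}$ and $\beta: R_{D'} \to R_D \otimes_{\mathcal{O}} \mathcal{O}'$. Moreover they show that the composite $(\alpha \otimes 1)\circ\beta: R_{D'} \to R_{D'}$ is the identity and that $\beta\circ\alpha: R_D \to R_D \otimes_{\mathcal{O}} \mathcal{O}'$ is the natural embedding. Thus $\beta$ is an isomorphism. □

We will let $\mathcal{D}^{(n)}$ denote the full subcategory of $\mathcal{D}$ whose objects are killed by $\lambda^n$. Suppose that $M$ is an object of $\mathcal{D}_0$ which is finite and free over $\mathcal{O}/\lambda^n$. Recall from section 2.4 that $H^1(G, \mathrm{End}(M))$ may be identified with $\mathrm{Ext}^1_{\mathcal{D}_0^{(n)}}(M, M)$. If $M$ is an object of $\mathcal{D}^{(n)}$ which is finite and free over $\mathcal{O}/\lambda^n$, then we have a natural inclusion
$$\mathrm{Ext}^1_{\mathcal{D}^{(n)}}(M, M) \hookrightarrow \mathrm{Ext}^1_{\mathcal{D}_0^{(n)}}(M, M) = H^1(G, \mathrm{End}(M)).$$
We define $H^1_{\mathcal{D}}(G, \mathrm{End}(M))$ to be the image of $\mathrm{Ext}^1_{\mathcal{D}^{(n)}}(M, M)$ in the group $H^1(G, \mathrm{End}(M))$, and $H^1_{\mathcal{D}}(G, \mathrm{End}^0(M))$ to be the intersection $H^1(G, \mathrm{End}^0(M)) \cap H^1_{\mathcal{D}}(G, \mathrm{End}(M))$.

Lemma 2.39 There is a canonical isomorphism of $k$-vector spaces
$$\mathrm{Hom}_k(\mathfrak{m}_{R_D}/(\lambda, \mathfrak{m}_{R_D}^2), k) \cong H^1_{\mathcal{D}}(G, \mathrm{ad}^0\bar\rho).$$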

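Proof: There is a natural bijection between $\mathrm{Hom}_k(\mathfrak{m}_{R_D}/(\lambda, \mathfrak{m}_{R_D}^2), k)$ and the set of $\mathcal{O}$-algebra homomorphisms from $R_D$ to the algebra $k[\epsilon]$ where $\epsilon^2 = 0$ (the correspondence associates $\phi: R_D \to k[\epsilon]$ to $\phi|_{\mathfrak{m}_{R_D}}$). Hence there is a bijection to the set of liftings $\rho: G \to \mathrm{GL}_d(k[\epsilon])$ of $\bar\rho$ of type $D$, modulo conjugation by elements of $1 + \epsilon M_d(k)$. On the other hand, recall from section 2.4 that there is a natural bijection between $\mathrm{Ext}^1_{\mathcal{D}_0^{(1)}}(M_{\bar\rho}, M_{\bar\rho})$ and the set of all continuous liftings $\rho: G \to \mathrm{GL}_d(k[\epsilon])$ of $\bar\rho$, modulo conjugation by elements of $1 + \epsilon M_d(k)$. Moreover a lifting is of type $D$ if and only if $\det\rho = \det\bar\rho$ and the corresponding extension $M_\rho$ is an object of $\mathcal{D}$. The lemma follows on checking linearity. □

Now suppose that $\theta: R_D \to \mathcal{O}$ is an $\mathcal{O}$-algebra homomorphism. Let $\wp$ denote the kernel of $\theta$ and let $\rho = \theta\circ\rho^{\mathrm{univ}}_D$. We set
$$H^1_{\mathcal{D}}(G, \mathrm{ad}^0\rho \otimes K/\mathcal{O}) = \varinjlim H^1_{\mathcal{D}}(G, \mathrm{ad}^0\rho \otimes \lambda^{-n}/\mathcal{O}) \subset H^1(G, \mathrm{ad}^0\rho \otimes K/\mathcal{O}).$$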


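Lemma 2.40 There is a canonical $\mathcal{O}$-linear isomorphism
$$\mathrm{Hom}_{\mathcal{O}}(\wp/\wp^2, K/\mathcal{O}) \cong H^1_{\mathcal{D}}(G, \mathrm{ad}^0\rho \otimes K/\mathcal{O}).$$

Proof: One shows in a very similar manner to the proof of lemma 2.39 that for all $n$ there is a natural isomorphism
$$\mathrm{Hom}_{\mathcal{O}}(\wp/\wp^2, \mathcal{O}/\lambda^n) \cong H^1_{\mathcal{D}}(G, \mathrm{ad}^0\rho \otimes \mathcal{O}/\lambda^n).$$
□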
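The bijection between liftings to $k[\epsilon]$ and cohomology classes used in the proof of lemma 2.39 can be made explicit. The following is a sketch of the standard computation rather than a quotation of the argument; the function $c$ and the matrix $X$ are names introduced here for illustration only.

```latex
% Write a lifting \rho : G \to GL_d(k[\epsilon]) of \bar\rho (with \epsilon^2 = 0) as
\rho(g) = \bigl(1 + \epsilon\, c(g)\bigr)\,\bar\rho(g), \qquad c : G \to M_d(k).
% Expanding \rho(g)\rho(h) and using \epsilon^2 = 0 gives
\rho(g)\rho(h)
  = \bigl(1 + \epsilon\,[\,c(g) + \bar\rho(g)\,c(h)\,\bar\rho(g)^{-1}\,]\bigr)\,\bar\rho(gh),
% so \rho is a homomorphism exactly when c is a 1-cocycle for the adjoint action:
c(gh) = c(g) + \bar\rho(g)\,c(h)\,\bar\rho(g)^{-1}.
% Conjugating \rho by 1 + \epsilon X with X \in M_d(k) replaces c by
c'(g) = c(g) + X - \bar\rho(g)\,X\,\bar\rho(g)^{-1},
% i.e. changes c by a coboundary. Finally
\det\bigl(1 + \epsilon\, c(g)\bigr) = 1 + \epsilon\,\mathrm{tr}\, c(g),
% so \det\rho = \det\bar\rho if and only if \mathrm{tr}\, c(g) = 0 for all g.
```

Under these conventions, liftings with determinant $\det\bar\rho$, taken modulo conjugation by $1 + \epsilon M_d(k)$, correspond to cohomology classes with values in trace-zero matrices; the requirement that the underlying module be an object of the chosen subcategory then cuts out the subspace appearing in lemma 2.39.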

2.7 Deformations of Galois representations

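Again in this section we assume that $\ell$ is an odd prime. Let $\bar\rho: G_{\mathbf{Q}} \to \mathrm{GL}_2(k)$ denote a continuous absolutely irreducible representation. Suppose moreover that $\det\bar\rho = \epsilon$ and that $\bar\rho$ is semi-stable in the sense that
• $\bar\rho|_{G_\ell}$ is semi-stable,
• and if $p \neq \ell$ then $\#\bar\rho(I_p) \mid \ell$.

Note that if $E/\mathbf{Q}$ is a semistable elliptic curve then $\bar\rho_{E,\ell}: G_{\mathbf{Q}} \to \mathrm{GL}_2(\mathbf{F}_\ell)$ satisfies these conditions if it is irreducible. By a theorem of Mazur (theorem 2.9) this will be the case if $\ell > 7$.

Let $\Sigma$ denote a finite set of prime numbers. If $R$ is an object of $\mathcal{C}_{\mathcal{O}}$ then we say that a continuous lifting $\rho: G_{\mathbf{Q}} \to \mathrm{GL}_2(R)$ of $\bar\rho$ is of type $\Sigma$ if the following hold.
• $\det\rho = \epsilon$.
• $\rho|_{G_\ell}$ is semi-stable.
• If $\ell \notin \Sigma$ and $\bar\rho|_{G_\ell}$ is good then $\rho|_{G_\ell}$ is good.
• If $p \notin \Sigma \cup \{\ell\}$ and $\bar\rho$ is unramified at $p$ then $\rho$ is unramified at $p$.
• If $p \notin \Sigma \cup \{\ell\}$ then $\rho|_{I_p} \sim \begin{pmatrix} 1 & * \\ 0 & 1 \end{pmatrix}$.

Roughly speaking we require that at primes $p \notin \Sigma$, $\rho$ is as unramified as could be hoped for and we require that $\rho|_{G_\ell}$ is semi-stable. Note that if $\Sigma \subset \Sigma'$ and $\rho$ is a lifting of type $\Sigma$ then it is also a lifting of type $\Sigma'$. Note also that if $E/\mathbf{Q}$ is an elliptic curve which is semi-stable at $\ell$ and for which $\bar\rho_{E,\ell}$ is irreducible and semi-stable, then $\rho_{E,\ell}: G_{\mathbf{Q}} \to \mathrm{GL}_2(\mathbf{Z}_\ell)$ is a lifting of type $\Sigma$ if $\Sigma$ contains all the primes for which $E$ has bad reduction.

Now suppose that $\rho: G_{\mathbf{Q}} \to \mathrm{GL}_2(\mathcal{O}/\lambda^n)$ is a lifting of $\bar\rho$ of type $\Sigma$. We will write $H^1_\Sigma(\mathbf{Q}, \mathrm{ad}^0\rho)$ for $H^1_{\mathcal{L}_\Sigma}(\mathbf{Q}, \mathrm{ad}^0\rho)$, where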

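(b) $R_\Sigma$ can be topologically generated as an $\mathcal{O}$-algebra by $\dim_k H^1_\Sigma(\mathbf{Q}, \mathrm{ad}^0\bar\rho)$ elements.
(c) If $\theta: R_\Sigma \to \mathcal{O}$ is an $\mathcal{O}$-algebra homomorphism, if $\rho = \theta\circ\rho^{\mathrm{univ}}_\Sigma$ and if $\wp = \ker\theta$ then $\mathrm{Hom}(\wp/\wp^2, K/\mathcal{O}) \cong H^1_\Sigma(\mathbf{Q}, \mathrm{ad}^0\rho \otimes K/\mathcal{O})$.

Proof: Let $L_0$ denote the fixed field of $\bar\rho$. Let $L_n$ (for $n \in \mathbf{Z}_{>0}$) denote the maximal elementary abelian $\ell$-extension of $L_{n-1}$ unramified outside $\Sigma$, $\{\ell\}$ and the primes where $\bar\rho$ ramifies. Let $L_\infty = \bigcup_n L_n$ and let $G = \mathrm{Gal}(L_\infty/\mathbf{Q})$. Note that any lifting of $\bar\rho$ of type $\Sigma$ factors through $G$. $\mathrm{Gal}(L_\infty/L_0)$ is a pro-$\ell$-group and its maximal elementary abelian quotient, $\mathrm{Gal}(L_1/L_0)$, is finite by theorem 2.2. We deduce from the following lemma that $\mathrm{Gal}(L_\infty/L_0)$ and hence $G$ are topologically finitely generated. (See for instance [Koc] Satz 4.10 for a proof of this lemma.)

Lemma 2.42 Let $H$ be a pro-$\ell$-group and $\bar{H}$ its maximal elementary abelian quotient. Suppose $h_1, \dots, h_r \in H$ map to a set of generators of $\bar{H}$. Then $h_1, \dots, h_r$ topologically generate $H$.

Let $\mathcal{D}$ denote the category of profinite $\mathcal{O}[G]$-modules $M$ for which
• $M$ is semi-stable as an $\mathcal{O}[G_\ell]$-module,
• if $\ell \notin \Sigma$ and if $\bar\rho$ is good then $M$ is good as an $\mathcal{O}[G_\ell]$-module,
• if $p \notin \Sigma \cup \{\ell\}$ and if $\bar\rho$ is ramified at $p$ then there is an exact sequence of $\mathcal{O}[I_p]$-modules
$$(0) \to M^{(-1)} \to M \to M^{(0)} \to (0),$$
such that $I_p$ acts trivially on $M^{(-1)}$ and $M^{(0)}$.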

We let $\xi_{q,Q}$ denote the character $\xi: G_q \to R_Q^\times$ in the conclusion of the lemma. The restriction of $\xi_{q,Q}$ to $I_q$ factors through $\Delta_q$. We let $\xi_Q$ denote the character $G_{\mathbf{Q}} \to R_Q^\times$ which is unramified outside the primes of $Q$ and whose restriction to $I_q$ is $\xi_{q,Q}$ for each $q$ in $Q$. Thus $\xi_Q$ factors through $\Delta_Q$. We wish to regard $R_Q$ as an $\mathcal{O}[\Delta_Q]$-algebra, and it will be most convenient to do so via the map which gives rise to $\xi_Q^{2}$. Note the following consequence of lemma 2.44.

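$$H^1_\emptyset(\mathbf{Q}, \mathrm{ad}^0\bar\rho(1)) \longrightarrow \bigoplus_{q \in Q} H^1(\mathbf{F}_q, \mathrm{ad}^0\bar\rho(1))$$

then $\#Q = \dim_k H^1_\emptyset(\mathbf{Q}, \mathrm{ad}^0\bar\rho(1))$ and $R_Q$ can be topologically generated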

as an $\mathcal{O}$-algebra by $\#Q$ elements.

Proof: The first part is a direct calculation using the fact that $\mathrm{Frob}_q$ acts semi-simply on $\mathrm{ad}^0\bar\rho$ with eigenvalues $x, 1, x^{-1}$ for some $x \in k \setminus \{0, 1\}$. The same is true for $\mathrm{ad}^0\bar\rho(1)$ as $q \equiv 1 \bmod \ell$. The second and third parts follow from this and corollary 2.43. □

For the proof of the last theorem of the chapter, we shall need two results on finite groups.

Theorem 2.47
(a) If $H$ is a finite subgroup of $\mathrm{PGL}_2(\mathbf{C})$ then $H$ is isomorphic to one of the following groups: the cyclic group $C_n$ of order $n$ ($n \in \mathbf{Z}_{>0}$), the dihedral group $D_{2n}$ of order $2n$ ($n \in \mathbf{Z}_{>1}$), $A_4$, $S_4$ or $A_5$.
(b) If $H$ is a finite subgroup of $\mathrm{PGL}_2(\bar{\mathbf{F}}_\ell)$ then one of the following holds:
• $H$ is conjugate to a subgroup of the upper triangular matrices;
• $H$ is conjugate to $\mathrm{PSL}_2(\mathbf{F}_{\ell^r})$ or $\mathrm{PGL}_2(\mathbf{F}_{\ell^r})$ for some $r \in \mathbf{Z}_{>0}$;
• $H$ is isomorphic to $A_4$, $S_4$, $A_5$ or the dihedral group $D_{2r}$ of order $2r$ for some $r \in \mathbf{Z}_{>1}$ not divisible by $\ell$.

In fact we shall only need part (b) which is due to Dickson [Dic2], secs. 255, 260 (see also [Hu] II.8.27), but we have included part (a) for later reference.
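The eigenvalue statement used in the proof just before theorem 2.47 (that $\mathrm{Frob}_q$ acts on $\mathrm{ad}^0\bar\rho$ with eigenvalues $x, 1, x^{-1}$) can be checked by a direct matrix computation. The sketch below assumes a basis in which $\bar\rho(\mathrm{Frob}_q)$ is diagonal with distinct entries; the names $\alpha$, $\beta$, $a$, $b$, $c$ are introduced here for illustration.

```latex
% Conjugation by \bar\rho(\mathrm{Frob}_q) = \mathrm{diag}(\alpha, \beta), \alpha \neq \beta,
% on a trace-zero matrix:
\begin{pmatrix} \alpha & 0 \\ 0 & \beta \end{pmatrix}
\begin{pmatrix} a & b \\ c & -a \end{pmatrix}
\begin{pmatrix} \alpha & 0 \\ 0 & \beta \end{pmatrix}^{-1}
=
\begin{pmatrix} a & (\alpha/\beta)\, b \\ (\beta/\alpha)\, c & -a \end{pmatrix}.
% Hence the eigenvalues are x = \alpha/\beta (upper right entry), 1 (diagonal)
% and x^{-1} (lower left entry), with x \neq 0, 1 because \alpha \neq \beta.
```

Passing to the Tate twist multiplies each eigenvalue by the value of the cyclotomic character on $\mathrm{Frob}_q$, namely $q$, which is $1$ in $k$ when $q \equiv 1 \bmod \ell$; this is why the same eigenvalues occur for $\mathrm{ad}^0\bar\rho(1)$.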

Lemma 2.48 Let $\mathbf{F}$ be a finite field of odd characteristic $\ell$. If $\#\mathbf{F} \neq 5$, then
$$H^1(\mathrm{SL}_2(\mathbf{F}), \mathrm{End}^0(\mathbf{F}^2)) = 0.$$

Proof: This is a special case of results of [CPS], table 4.5 (assuming $\#\mathbf{F} \neq 3$). In fact we shall only need it in the case $\ell = 3$, but the proof in the general case is no more difficult, and we sketch it here for the reader's convenience. We let $B$ (resp. $U$) denote the group of upper-triangular (resp. unipotent) matrices in $G = \mathrm{SL}_2(\mathbf{F})$. Since $\ell$ does not divide the index of $B$ in $G$, the restriction homomorphism
$$H^1(G, \mathrm{End}^0(\mathbf{F}^2)) \to H^1(B, \mathrm{End}^0(\mathbf{F}^2))$$
is injective, so it suffices to prove the latter group vanishes. Since $\ell$ does not divide the index of $U$ in $B$, we have
$$H^i(B, M) \cong H^i(U, M)^{B/U}$$
for all integers $i \geq 0$ and $\mathbf{F}[B]$-modules $M$. If $\#\mathbf{F} = 3$, then one checks directly that for $M = \mathrm{End}^0(\mathbf{F}_3^2)$,
$$H^1(U, M) = \ker N/(\sigma - 1)M = 0$$
where $\sigma$ generates $U$ and $N = 1 + \sigma + \sigma^2$ on $M$. If $\#\mathbf{F} > 5$, then one proceeds by writing
$$(0) = M_0 \subset M_1 \subset M_2 \subset M_3 = \mathrm{End}^0(\mathbf{F}^2)$$
as $\mathbf{F}[B]$-modules where the successive quotients $M_i/M_{i-1}$ (for $i = 1, 2, 3$) are one-dimensional over $\mathbf{F}$. The calculation is then straightforward using long exact sequences of cohomology, except in the case $\#\mathbf{F} = 9$ where one must also check that the one-dimensional space of classes in $H^1(U, M_3/M_2)$ fixed by $B/U$ maps injectively to $H^2(U, M_2/M_1)$ via the connecting homomorphism. □

Theorem 2.49 Keep the assumptions of the last section and suppose moreover that if $L = \mathbf{Q}(\sqrt{(-1)^{(\ell-1)/2}\ell})$ then $\bar\rho|_{G_L}$ is absolutely irreducible. Then there exists a non-negative integer $r$ such that for any $n \in \mathbf{Z}_{>0}$ we can find a finite set of primes $Q_n$ with the following properties.
(a) If $q \in Q_n$ then $q \equiv 1 \bmod \ell^n$.
(b) If $q \in Q_n$ then $\bar\rho$ is unramified at $q$ and $\bar\rho(\mathrm{Frob}_q)$ has distinct eigenvalues.
(c) $\#Q_n = r$.

$$H^1(\mathrm{Gal}(F_n/F_0), \mathrm{ad}^0\bar\rho(1))^{G_{\mathbf{Q}}}.$$

Since $F_1/F_0$ is an extension of degree prime to $\ell$, and since $G_{\mathbf{Q}}$ acts trivially

Since $\bar\rho|_{G_L}$ is absolutely irreducible, the cohomology group in equation (2.8.1) vanishes. On the other hand, $G_{F_0}$ acts trivially on $\mathrm{ad}^0\bar\rho$ so the first term vanishes as well unless $\mathrm{Gal}(F_0/\mathbf{Q})$ has order divisible by $\ell$ and has $\mathrm{Gal}(\mathbf{Q}(\zeta_\ell)/\mathbf{Q})$ as a quotient. Recall that $\mathrm{Gal}(F_0/\mathbf{Q})$ is isomorphic to the projective image of $\bar\rho$, so by theorem 2.47 we are reduced to the case $\ell = 3$ and the map

3 Modular forms and Galois representations

3.1 From modular forms to Galois representations

We suppose in this section that $f = \sum a_n(f) q^n$ is a newform of weight two and level $N_f$ (see definition 1.21). Let $K_f$ denote the number field in $\mathbf{C}$ generated by the Fourier coefficients $a_n(f)$. Let $\psi_f$ denote the character of $f$, i.e., the homomorphism $(\mathbf{Z}/N_f\mathbf{Z})^\times \to K_f^\times$ defined by mapping $d$ to the eigenvalue of $\langle d \rangle$ on $f$.

Recall that a construction of Shimura (section 1.7) associates to $f$ an abelian variety $A_f$ of dimension $[K_f : \mathbf{Q}]$. This abelian variety is a certain quotient of $J_1(N_f)$, and the action of the Hecke algebra on $J_1(N_f)$ provides an embedding
$$K_f \hookrightarrow \mathrm{End}_{\mathbf{Q}}(A_f) \otimes \mathbf{Q}.$$

We saw also that for each prime $\ell$ the Tate module $T_\ell(A_f) \otimes_{\mathbf{Z}_\ell} \mathbf{Q}_\ell$ becomes a free module of rank two over $K_f \otimes \mathbf{Q}_\ell$ (lemma 1.48). The action of the Galois group $G_{\mathbf{Q}}$ on the Tate module commutes with that of $K_f$, so that a choice of basis for the Tate module provides a representation
$$G_{\mathbf{Q}} \to \mathrm{GL}_2(K_f \otimes \mathbf{Q}_\ell). \qquad (3.1.1)$$

As $K_f \otimes \mathbf{Q}_\ell$ can be identified with the product of the completions of $K_f$ at its primes over $\ell$, we obtain from $f$ certain 2-dimensional $\ell$-adic representations of $G_{\mathbf{Q}}$.

$\ell$-adic representations: In this discussion, we fix a prime $\ell$ and a finite extension $K$ of $\mathbf{Q}_\ell$. We let $\mathcal{O}$ denote the ring of integers of $K$, $\lambda$ the maximal ideal and $k$ the residue field. We shall consider $\ell$-adic representations with coefficients in finite extensions of our fixed field $K$. We regard $K$ as a subfield of $\bar{\mathbf{Q}}_\ell$ and fix embeddings $\bar{\mathbf{Q}} \hookrightarrow \bar{\mathbf{Q}}_\ell$ and $\bar{\mathbf{Q}} \hookrightarrow \mathbf{C}$. If $K'$ is a finite extension of $K$ with ring of integers $\mathcal{O}'$, then we say that an $\ell$-adic representation $G_\ell \to \mathrm{GL}_2(K')$ is good (respectively, ordinary, semistable) if it is conjugate over $K'$ to a representation $G_\ell \to \mathrm{GL}_2(\mathcal{O}')$ which is good (respectively, ordinary, semistable) in the sense of section 2.4.

Let $K_f'$ denote the $K$-algebra in $\bar{\mathbf{Q}}_\ell$ generated by the Fourier coefficients of $f$. Thus $K_f'$ is a finite extension of $K$, and it contains the completion of $K_f$ at the prime over $\ell$ determined by our choice of embeddings. We let $\mathcal{O}_f'$ denote the ring of integers of $K_f'$ and write $k_f$ for its residue field. We define
$$\rho_f: G_{\mathbf{Q}} \to \mathrm{GL}_2(K_f')$$
as the pushforward of (3.1.1) by the natural map $K_f \otimes \mathbf{Q}_\ell \to K_f'$. We assume the basis is chosen so that $\rho_f$ factors through $\mathrm{GL}_2(\mathcal{O}_f')$. We also let $\psi_f'$ denote the finite order $\ell$-adic character
$$G_{\mathbf{Q}} \to \mathrm{Gal}(\mathbf{Q}(\zeta_{N_f})/\mathbf{Q}) \to (K_f')^\times$$
obtained from $\psi_f$.

The following theorem lists several fundamental properties of the $\ell$-adic representations $\rho_f$ obtained from Shimura's construction. The result is a combination of the work of many mathematicians. We discuss some of the proofs and provide references below. In the statement we fix $f$ as above and write simply $N$, $a_n$, $\rho$, $\psi$, $\psi'$ and $K'$ for $N_f$, $a_n(f)$, $\rho_f$, $\psi_f$, $\psi_f'$ and $K_f'$ respectively.

Theorem 3.1 The $\ell$-adic representation $\rho: G_{\mathbf{Q}} \to \mathrm{GL}_2(K')$ has the following properties.

(a) If $p \nmid N\ell$ then $\rho$ is unramified at $p$ and $\rho(\mathrm{Frob}_p)$ has characteristic polynomial

(d) The conductor $N(\rho)$ is the prime-to-$\ell$-part of $N$.

(e) Suppose that $p \neq \ell$ and $p \| N$. Let $\chi$ denote the unramified character $G_p \to (K')^\times$ satisfying $\chi(\mathrm{Frob}_p) = a_p$. If $p$ does not divide the conductor of $\psi$, then $\rho|_{G_p}$ is of the form
$$\begin{pmatrix} \chi\epsilon & * \\ 0 & \chi \end{pmatrix}.$$
If $p$ divides the conductor of $\psi$, then $\rho|_{G_p}$ is of the form $\chi^{-1}\epsilon\,\psi'|_{G_p} \oplus \chi$.

(f) If $\ell \nmid 2N$, then $\rho|_{G_\ell}$ is good. Moreover $\rho|_{G_\ell}$ is ordinary if and only if $a_\ell$ is a unit in the ring of integers of $K'$, in which case $\rho_{I_\ell}(\mathrm{Frob}_\ell)$ is the unit root of the polynomial $X^2 - a_\ell X + \ell\psi(\ell)$.

(g) If $\ell$ is odd and $\ell \| N$, but the conductor of $\psi$ is not divisible by $\ell$, then $\rho|_{G_\ell}$ is ordinary and $\rho_{I_\ell}(\mathrm{Frob}_\ell) = a_\ell$.

Proof: Part (a) was established by Shimura ([Shi2], [Shi3]). The key ingredient is the Eichler-Shimura congruence relation, theorem 1.29. Recall that $J_1(N)$ has good reduction at primes $p$ not dividing $N$. So the action of $G_p$ on $T_\ell(A_f) \otimes_{\mathbf{Z}_\ell} \mathbf{Q}_\ell$ is unramified and is in fact described by the action of $\mathrm{Frob}_p \in G_{\mathbf{F}_p}$ on the Tate module of the reduction. But this is given by the Frobenius endomorphism $F$ whose characteristic polynomial is computed in corollary 1.41.

The first assertion of (b) follows from (a) on applying the Chebotarev density theorem. The second assertion then follows on noting that $\psi(-1) = 1$.

Part (c) was proved by Ribet (see section 2 of [R3]). Assuming reducibility of the representation, he applies algebraicity results of Lang and Serre to obtain a contradiction to the estimate on the Fourier coefficients stated in theorem 1.24.

Parts (d) and (e) follow from a deep result of Carayol [Ca1], Thm. (A), building on the work of Langlands [Ll1], Deligne and others. In fact, this result and the local Langlands correspondence characterize $\rho|_{G_p}$ in terms of $\psi'|_{G_p}$ and the $L$- and $\epsilon$-factors at $p$ of twists of $f$. The descriptions in the case of $p \| N$ are based on the analysis of Deligne-Rapoport of the reduction mod $p$ of $J_1(N)$ (see [DR], [Ll1]).

The first assertion of (f) follows from the fact that $A_f$ has good reduction at $\ell$ if $\ell$ does not divide $N$. The second assertion of (f) (respectively, all of (g)) follows from the Eichler-Shimura congruence relation (respectively, the work of Deligne-Rapoport), and general results on $\ell$-divisible groups and the reduction of abelian varieties; see thm. 2.2 of [W2], lemma 2.1.5 of [W1], §12 of [Gro] and thms. 2.5 and 2.6 of [Edi]. The restriction to odd $\ell$ in (f) and (g) is made primarily out of lack of suitable definitions. □

Mod $\ell$ representations: We maintain the notation used in the discussion of $\ell$-adic representations. Define
$$\bar\rho_f: G_{\mathbf{Q}} \to \mathrm{GL}_2(k_f)$$
to be the semi-simplification of the reduction of $\rho_f$. (See the discussion following proposition 2.6.) Assertions analogous to those in theorem 3.1 hold for $\bar\rho = \bar\rho_f$, except that
• The representation need not be absolutely irreducible (as in (c)). However if $\ell$ is odd, one checks using (b) that $\bar\rho$ is irreducible if and only if it is absolutely irreducible.
• In (d) one only has divisibility of the prime-to-$\ell$ part of $N_f$ by $N(\bar\rho)$ in general.

The various possibilities for $m_p(\bar\rho)$ to be strictly less than the exponent of $p$ in $N$ (where $p \neq \ell$) were classified independently by Carayol [Ca2] and Livné [Liv]. We record the following consequence of their results (cf. the introduction of [DT1]):

Proposition 3.2 Suppose that $p$ is a prime such that $p \mid N_f$, $p \not\equiv 1 \bmod \ell$ and $\bar\rho_f$ is unramified at $p$. Then $\mathrm{tr}(\bar\rho_f(\mathrm{Frob}_p))^2 = (p + 1)^2$ in $k_f$.

Artin representations: The theory of Hecke operators and newforms (see section 1.3) extends to modular forms on $\Gamma_1(N)$ of arbitrary weight. The construction of $\ell$-adic representations associated to newforms was generalized to weight greater than 1 by Deligne [De] using étale cohomology. There are also Galois representations associated to newforms of weight 1 by Deligne and Serre [DS], but an essential difference is that these are Artin representations.

Remark 3.5 A basis can be chosen so that the representation $\rho_g$ takes values in $\mathrm{GL}_2(K_g)$ (where $K_g$ is the number field generated by the $a_n(g)$). Moreover suppose that $K$ is a finite extension of $\mathbf{Q}_\ell$ in $\bar{\mathbf{Q}}_\ell$ (and we have fixed embeddings of $\bar{\mathbf{Q}}$ in $\mathbf{C}$ and $\bar{\mathbf{Q}}_\ell$). If $K_g$ is contained in $K$, then we can view $\rho_g$ as giving rise to an $\ell$-adic representation $G_{\mathbf{Q}} \to \mathrm{GL}_2(K)$ and hence a mod $\ell$ representation $G_{\mathbf{Q}} \to \mathrm{GL}_2(k)$.

Remark 3.6 A key idea in the construction of $\rho_g$ is to first construct the mod $\ell$ representations using those already associated to newforms of higher weight. More precisely, suppose that $K_g \hookrightarrow K$ as in remark 3.5. One can show that for some newform $f$ of weight 2 and level $N_f$ dividing $N\ell$ we have
$$a_p(g) \equiv a_p(f), \qquad \psi_g(p) \equiv p\,\psi_f(p)$$
for all $p \nmid N\ell$, the congruence being modulo the maximal ideal of the ring of integers of $K_f'$. Thus $\bar\rho_f$ is the semisimplification of the desired mod $\ell$ representation (with scalars extended to $k_f$).

3.2 From Galois representations to modular forms

It is conjectured that certain types of two-dimensional representations of $G_{\mathbf{Q}}$ always arise from the constructions described in section 3.1. We now state some of the conjectures and the results known prior to [W3] and [TW].

Artin representations:

Conjecture 3.7 Let $\rho: G_{\mathbf{Q}} \to \mathrm{GL}_2(\mathbf{C})$ be a continuous irreducible representation with $\det(\rho(c)) = -1$. Then $\rho$ is equivalent to $\rho_g$ for some newform $g$ of weight one.

Recall that $\rho_g$ is the Artin representation associated to $g$ by the Deligne-Serre construction (theorem 3.3).

Remark 3.8 Conjecture 3.7 is equivalent to the statement that the Artin $L$-functions attached to $\rho$ and to all its twists by one-dimensional characters are entire. (The Artin conjecture predicts that the Artin $L$-function $L(s, \rho)$ is entire, for an arbitrary irreducible, non-trivial Artin representation $\rho: G_{\mathbf{Q}} \to \mathrm{GL}_d(\mathbf{C})$.)

A large part of conjecture 3.7 was proved by Langlands in [Ll2], and the results were extended by Tunnell [Tu].

Theorem 3.9 Let $\rho: G_{\mathbf{Q}} \to \mathrm{GL}_2(\mathbf{C})$ be a continuous irreducible representation such that $\rho(G_{\mathbf{Q}})$ is solvable and $\det(\rho(c)) = -1$. Then $\rho$ is equivalent to $\rho_g$ for some newform $g$ of weight one.

Remark 3.10 Note that by theorem 2.47, part (a), the solvability hypothesis excludes only the case where the projective image of $\rho$ is isomorphic to $A_5$.

Remark 3.11 If the projective image of $\rho$ is dihedral, then $\rho$ is induced from a character of a quadratic extension of $\mathbf{Q}$. In this case the result can already be deduced from work of Hecke.

Mod $\ell$ representations: We fix notation as in the discussion of $\ell$-adic and mod $\ell$ representations in section 3.1.

Definition 3.12 We say that a representation $\bar\rho: G_{\mathbf{Q}} \to \mathrm{GL}_2(k)$ is modular (of level $N$) if for some newform $f$ of weight 2 (and level $N$), $\bar\rho$ is equivalent over $k_f$ to $\bar\rho_f$.

By proposition 1.32 the notion is independent of the choices in section 3.1 of embeddings $K \hookrightarrow \bar{\mathbf{Q}}_\ell$, $\bar{\mathbf{Q}} \hookrightarrow \bar{\mathbf{Q}}_\ell$ and $\bar{\mathbf{Q}} \hookrightarrow \mathbf{C}$. Moreover if $K'$ is a finite extension of $K$ with residue field $k'$, then $\bar\rho$ is modular if and only if $\bar\rho \otimes_k k'$ is modular.

The following was conjectured by Serre [Se7], (3.2.3). (See also [Da1] for further discussion and references.)

Conjecture 3.13 Let $\bar\rho: G_{\mathbf{Q}} \to \mathrm{GL}_2(k)$ be a continuous absolutely irreducible representation with $\det(\bar\rho(c)) = -1$. Then $\bar\rho$ is modular.

Some cases of Serre's conjecture can be deduced from theorem 3.9.

Theorem 3.14 Let $\bar\rho: G_{\mathbf{Q}} \to \mathrm{GL}_2(k)$ be a continuous absolutely irreducible representation with $\det(\bar\rho(c)) = -1$. Suppose that one of the following holds:
(a) $k = \mathbf{F}_3$;
(b) the projective image of $\bar\rho$ is dihedral.
Then $\bar\rho$ is modular.

Sketch of proof: For case (a), we consider the surjection
$$\mathrm{GL}_2(\mathbf{Z}[\sqrt{-2}]) \to \mathrm{GL}_2(\mathbf{F}_3)$$
defined by reduction mod $(1 + \sqrt{-2})$. One checks that there is a section $s: \mathrm{GL}_2(\mathbf{F}_3) \to \mathrm{GL}_2(\mathbf{Z}[\sqrt{-2}])$ and applies theorem 3.9 to $s\circ\bar\rho$. The resulting representation arises from a weight one newform, and hence its reduction is equivalent to $\bar\rho_f$ for some $f$ (see remark 3.6).

In case (b), $\bar\rho$ is equivalent to a representation of the form $\mathrm{Ind}_{G_F}^{G_{\mathbf{Q}}} \bar\xi$ where $F$ is a quadratic extension of $\mathbf{Q}$ and $\bar\xi$ is a character $G_F \to k^\times$. (We have here enlarged $K$ if necessary.) Let $n$ be the order of $\bar\xi$, choose an embedding $\mathbf{Q}(e^{2\pi i/n}) \hookrightarrow K$ and lift $\bar\xi$ to a character $\xi: G_F \to \mathbf{Z}[e^{2\pi i/n}]^\times$. We may always choose $\xi$ so that the Artin representation $\rho = \mathrm{Ind}_{G_F}^{G_{\mathbf{Q}}} \xi$ is odd, i.e., $\det(\rho(c)) = -1$. (In the case $\ell = 2$ and $F$ real quadratic, we may have to multiply $\xi$ by a suitable quadratic character of $G_F$.) We then apply theorem 3.9 to $\rho$ and deduce as in case (a) that $\bar\rho$ is modular. □

Serre also proposed a refinement ([Se7], (3.2.4)) of the conjecture which predicts that $\bar\rho$ is associated to a newform of specified weight, level and character. Through work of Mazur, Ribet [R5], Carayol [Ca2], Gross [Gro] and others, this refinement is now known to be equivalent to conjecture 3.13 if $\ell$ is odd. (One also needs to impose a mild restriction in the case $\ell = 3$.) See [R6] and [Di1] for statements of the results and further references; here we give a variant which applies to newforms of weight two. Before doing so, we assume $\ell$ is odd and define an integer $\delta(\bar\rho)$ as follows:
• $\delta(\bar\rho) = 0$ if $\bar\rho|_{G_\ell}$ is good;
• $\delta(\bar\rho) = 1$ if $\bar\rho|_{G_\ell}$ is not good and $\bar\rho|_{I_\ell} \otimes_k \bar{k}$ is of the form
$$\begin{pmatrix} \epsilon^a & * \\ 0 & 1 \end{pmatrix} \quad\text{or}\quad \begin{pmatrix} \psi^a & 0 \\ 0 & \psi^{\ell a} \end{pmatrix}$$
for some positive integer $a < \ell$. (Recall that $\epsilon$ is the cyclotomic character and $\psi$ is the character of $I_\ell$ defined after lemma 2.22.)
• $\delta(\bar\rho) = 2$ otherwise.

Theorem 3.15 Suppose that $\ell$ is odd and $\bar\rho$ is absolutely irreducible and modular. If $\ell = 3$, then suppose also that $\bar\rho|_{G_{\mathbf{Q}(\sqrt{-3})}}$ is absolutely irreducible. Then there exists a newform $f$ of weight two such that
• $\bar\rho$ is equivalent over $k_f$ to $\bar\rho_f$;
• $N_f = N(\bar\rho)\ell^{\delta(\bar\rho)}$;
• the order of $\psi_f$ is not divisible by $\ell$.

Proof: The existence of such an $f$ follows from [Di1] thm. 1.1, thm. 5.1 and lemma 2.1, but with $N_f$ dividing $N(\bar\rho)\ell^{\delta(\bar\rho)}$. By lemma 2.7 above, we see that $N_f$ is divisible by $N(\bar\rho)$. The divisibility of $N_f$ by $\ell^{\delta(\bar\rho)}$ follows from results in sec. 8 of [Gro] and sec. 2.4 of [Edi]. □

$\ell$-adic representations: We again use the notation of section 3.1. Let $\rho: G_{\mathbf{Q}} \to \mathrm{GL}_2(K)$ be an $\ell$-adic representation.

Definition 3.16 We say that $\rho$ is modular if for some weight 2 newform $f$, $\rho$ is equivalent over $K_f'$ to $\rho_f$.

The notion is independent of the choices of embeddings and well-behaved under extension of scalars by proposition 1.32 (cf. definition 3.12).

The following is a special case of a conjecture of Fontaine and Mazur [FM].

Conjecture 3.17 If $\rho: G_{\mathbf{Q}} \to \mathrm{GL}_2(K)$ is an absolutely irreducible $\ell$-adic representation and $\rho|_{G_\ell}$ is semistable (in the sense of section 2.4), then $\rho$ is modular.

(Recall that for us $\ell$-adic representations are defined to be unramified at all but finitely many primes. Recall also that if $\rho|_{G_\ell}$ is semistable, then by definition $\det\rho|_{I_\ell}$ is the cyclotomic character $\epsilon$.)

Remark 3.18 Relatively little was known about this conjecture before Wiles' work [W3]. Wiles proves that under suitable hypotheses, the modularity of $\bar\rho$ implies that of $\rho$.

Remark 3.19 The conjecture stated in [FM] is stronger than the one here; in particular, the semistability hypothesis could be replaced with a suitable notion of potential semistability. On the other hand, one expects that if $\rho|_{G_\ell}$ is semistable, then it is equivalent to $\rho_f$ (over $K_f'$) for some $f$ on $\Gamma_1(N(\rho)) \cap \Gamma_0(\ell)$ (and on $\Gamma_1(N(\rho))$ if $\rho|_{G_\ell}$ is good).

The Shimura-Taniyama conjecture: Conjecture 1.54 can be viewed in the framework of the problem of associating modular forms to Galois representations. Let $E$ be an elliptic curve defined over $\mathbf{Q}$. For each prime $\ell$, we let $\rho_{E,\ell}$ denote the $\ell$-adic representation $G_{\mathbf{Q}} \to \mathrm{GL}_2(\mathbf{Q}_\ell)$ defined by the action of $G_{\mathbf{Q}}$ on the Tate module of $E$ (see section 2.2).

Proposition 3.20 The following are equivalent:
(a) $E$ is modular.
(b) $\rho_{E,\ell}$ is modular for all primes $\ell$.
(c) $\rho_{E,\ell}$ is modular for some prime $\ell$.

Proof: If $E$ is modular, then $E$ is isogenous to $A_f$ for some weight two newform $f$ with $K_f = \mathbf{Q}$ (see section 1.8). It follows that for each prime $\ell$, $\rho_{E,\ell}$ is equivalent to the $\ell$-adic representation $\rho_f$. Hence (a) $\Rightarrow$ (b) $\Rightarrow$ (c).

To show (c) $\Rightarrow$ (b), suppose that for some $\ell$ and some $f$, the representations $\rho_{E,\ell}$ and $\rho_f$ are equivalent. First observe that for all but finitely many primes $p$, we have
$$\mathrm{tr}(\rho_f(\mathrm{Frob}_p)) = \mathrm{tr}(\rho_{E,\ell}(\mathrm{Frob}_p)).$$
We deduce from proposition 2.11 and theorem 3.1, part (a) that for all but finitely many primes $p$
$$a_p(f) = p + 1 - \#\bar{E}_p(\mathbf{F}_p) \in \mathbf{Z}. \qquad (3.2.1)$$
Applying proposition 2.6, we find that for each prime $\ell$, $\rho_{E,\ell}$ is equivalent to $\rho_f$ and is therefore modular.

We finally show that (b) $\Rightarrow$ (a). The equality (3.2.1) holds for all primes $p$ not dividing $N_f$, which by theorem 3.1, part (d), is the conductor of $E$. Since $\det(\rho_f) = \det(\rho_{E,\ell}) = \epsilon$, we see by theorem 3.1, part (b) that $\psi_f$ is trivial. By theorem 1.27 parts (b) and (d) (or [AL] thm. 3), $a_p$ is in $\{0, \pm 1\}$ for primes $p$ dividing $N_f$. Thus $K_f = \mathbf{Q}$ and $A_f$ is an elliptic curve. Faltings' isogeny theorem (see [CS], sec. II.5) now tells us that $E$ and $A_f$ are isogenous and we conclude that $E$ is modular. □

Remark 3.21 Note that the equivalence (b) $\Leftrightarrow$ (c) does not require Faltings' isogeny theorem.

Proposition 3.22 If the Fontaine-Mazur conjecture (conjecture 3.17) holds for some prime $\ell$, then the Shimura-Taniyama conjecture (conjecture 1.54) holds. If Serre's conjecture (conjecture 3.13) holds for infinitely many $\ell$, then conjecture 1.54 holds.

Proof: The first assertion is immediate from proposition 3.20 and the irreducibility of $\rho_{E,\ell}$. See [Se7], sec. 4.6 for a proof of the second. (We have implicitly chosen the field $K$ to be $\mathbf{Q}_\ell$ in the statements of conjectures 3.17 and 3.13, but it may be replaced by a finite extension.) □

Remark 3.23 Note that to prove a given elliptic curve $E$ is modular, it suffices to prove that conjecture 3.17 holds for a single $\ell$ at which $E$ has semistable reduction. Wiles' approach is to show that certain cases of conjecture 3.13 imply cases of conjecture 3.17 and hence cases of conjecture 1.54.
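Equation (3.2.1) in the proof of proposition 3.20 can be illustrated numerically. The curve below is not discussed in this section; it is used only as an example, on the assumption (standard, but not established here) that it corresponds to the weight two newform of level 11 whose $q$-expansion begins $q - 2q^2 - q^3 + 2q^4 + q^5 + \cdots$.

```latex
% Illustrative curve (an assumption of this example): E : y^2 + y = x^3 - x^2, conductor 11.
% For each p list the affine solutions mod p, then add the point at infinity.
\begin{aligned}
p = 2:&\quad (x,y) \in \{(0,0),(0,1),(1,0),(1,1)\}, &\ \#\bar{E}_2(\mathbf{F}_2) &= 5, &\ a_2 &= 2 + 1 - 5 = -2,\\
p = 3:&\quad (x,y) \in \{(0,0),(0,2),(1,0),(1,2)\}, &\ \#\bar{E}_3(\mathbf{F}_3) &= 5, &\ a_3 &= 3 + 1 - 5 = -1,\\
p = 5:&\quad (x,y) \in \{(0,0),(0,4),(1,0),(1,4)\}, &\ \#\bar{E}_5(\mathbf{F}_5) &= 5, &\ a_5 &= 5 + 1 - 5 = 1.
\end{aligned}
```

The three values agree with the coefficients of $q^2$, $q^3$ and $q^5$ in the expansion quoted above, which is the kind of agreement at almost all primes that the proof combines with proposition 2.6.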

3.3 Hecke algebras

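In this section fix the following notation. Let $\ell$ be an odd prime, let $K$ be a finite extension of $\mathbf{Q}_\ell$, let $\mathcal{O}$ denote the ring of integers of $K$, let $\lambda$ denote its maximal ideal and $k$ its residue field. Fix embeddings $K \hookrightarrow \bar{\mathbf{Q}}_\ell$, $\bar{\mathbf{Q}} \hookrightarrow \bar{\mathbf{Q}}_\ell$ and $\bar{\mathbf{Q}} \hookrightarrow \mathbf{C}$. Let $\bar\rho: G_{\mathbf{Q}} \to \mathrm{GL}_2(k)$ denote a continuous representation with the following properties
(a) $\bar\rho$ is irreducible,
(b) $\bar\rho$ is modular,
(c) $\det\bar\rho = \epsilon$,
(d) $\bar\rho|_{G_\ell}$ is semi-stable,
(e) and if $p \neq \ell$ then $\#\bar\rho(I_p) \mid \ell$.

Let us first record the following lemma.

Lemma 3.24 The representation $\bar\rho|_{G_L}$ is absolutely irreducible where $L = \mathbf{Q}(\sqrt{(-1)^{(\ell-1)/2}\ell})$.

Proof: If it were not then we see that $\ell \nmid \#\bar\rho(G_{\mathbf{Q}})$ and so $\bar\rho$ is unramified at all $p \neq \ell$. Moreover we can check that $\bar\rho|_{I_\ell} \sim \begin{pmatrix} \epsilon & 0 \\ 0 & 1 \end{pmatrix}$. If $\ell > 3$ we can use theorem 3.15 to deduce that $\bar\rho$ is modular of weight 2 and level 1 and hence obtain a contradiction. If $\ell = 3$ we see that the splitting field of $\bar\rho$ is everywhere unramified over $\mathbf{Q}(\sqrt{-3})$ and hence must equal $\mathbf{Q}(\sqrt{-3})$, a contradiction. □

Let $\Sigma$ denote a finite set of primes. For the application to modularity of semistable elliptic curves, it suffices to consider sets $\Sigma$ contained in $\Sigma_{\bar\rho}$ where $\Sigma_{\bar\rho}$ is defined as follows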


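Definition 3.25 For a representation $\bar\rho$ as above, we let $\Sigma_{\bar\rho}$ denote the set of primes $p$ satisfying
• $p = \ell$ and $\bar\rho|_{G_\ell}$ is good and ordinary; or
• $p \neq \ell$ and $\bar\rho$ is unramified at $p$.

We shall sometimes assume that $\Sigma \subset \Sigma_{\bar\rho}$ in order to simplify statements and proofs. Let $\mathcal{N}_\Sigma$ denote the set of newforms $f$ such that
$$\rho_f: G_{\mathbf{Q}} \to \mathrm{GL}_2(\mathcal{O}_f')$$
is equivalent to a lifting of $\bar\rho \otimes_k k_f$ of type $\Sigma$ and $\ell^2 \nmid N_f$ (the last condition is presumably not necessary, cf. remark 3.19). From theorem 3.1 and lemma 2.7 one deduces the following description of $\mathcal{N}_\Sigma$.

Lemma 3.26 The set $\mathcal{N}_\Sigma$ consists of newforms $f$ such that
• $\bar\rho_f \cong \bar\rho \otimes_k k_f$,
• $\psi_f$ is trivial,
• $N_f$ divides $\ell^{\delta} N(\bar\rho) \prod_{p \in \Sigma - \{\ell\}} p^{\dim \bar\rho^{I_p}}$, where $\delta = 0$ if $\bar\rho$ is good and $\ell \notin \Sigma$ and $\delta = 1$ otherwise.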

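As $\bar\rho$ is modular it follows from theorem 3.15 that for all $\Sigma$, $\mathcal{N}_\Sigma \neq \emptyset$. Set
$$\tilde{T}_\Sigma = \prod_{f \in \mathcal{N}_\Sigma} \mathcal{O}_f'.$$
If $p$ is a prime not in $\Sigma$ and not dividing $\ell N(\bar\rho)$, we let $T_p$ denote the element $(a_p(f))_f$ in $\tilde{T}_\Sigma$. Then define $T_\Sigma$ to be the $\mathcal{O}$-subalgebra of $\tilde{T}_\Sigma$ generated by the elements $T_p$ for such primes $p$. Then $T_\Sigma$ is a complete noetherian local $\mathcal{O}$-algebra with residue field $k$. Moreover it is reduced and it is a finitely generated free $\mathcal{O}$-module.

Lemma 3.27 There is a continuous representation
$$\rho^{\mathrm{mod}}_\Sigma: G_{\mathbf{Q}} \to \mathrm{GL}_2(T_\Sigma)$$
such that if $p \nmid \ell N(\bar\rho)$ and $p \notin \Sigma$ then $\rho^{\mathrm{mod}}_\Sigma$ is unramified at $p$ and we have $\mathrm{tr}\,\rho^{\mathrm{mod}}_\Sigma(\mathrm{Frob}_p) = T_p$. Moreover we have the following.
(a) $\rho^{\mathrm{mod}}_\Sigma$ is a lift of $\bar\rho$ of type $\Sigma$ and there is a unique surjection $\phi_\Sigma: R_\Sigma \to T_\Sigma$ such that $\rho^{\mathrm{mod}}_\Sigma \sim \phi_\Sigma\circ\rho^{\mathrm{univ}}_\Sigma$.
(b) If $\Sigma \subset \Sigma'$ then there is a unique surjection $T_{\Sigma'} \to T_\Sigma$ such that $\rho^{\mathrm{mod}}_{\Sigma'}$ pushes forward to $\rho^{\mathrm{mod}}_\Sigma$ and $T_p$ maps to $T_p$ for $p \nmid \ell N(\bar\rho)$ and $p \notin \Sigma'$.
(c) If $K'$ is a finite extension of $K$ and $T_\Sigma'$ is constructed in the same way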

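[...] continuity and the Chebotarev density theorem $\mathrm{tr}\,\rho^{\mathrm{mod}}$ [...]

[...] any $g \in G_{\mathbf{Q}}$ the diagonal entries of $\rho^{\mathrm{mod}}(g)$ [...]

[...] where $b \neq 0$. Rescaling $e_+$ we may assume that $\rho^{\mathrm{mod}}(\sigma) = \begin{pmatrix} a & 1 \\ c & d \end{pmatrix}$. Then for all $g \in G_{\mathbf{Q}}$ the lower left entry of $\rho^{\mathrm{mod}}(g)$ lies in $T_\Sigma$ (look at the upper left entry of $\rho^{\mathrm{mod}}(\sigma g)$). Again using the irreducibility of $\bar\rho$ we can find $\tau \in G_{\mathbf{Q}}$ such that $\rho^{\mathrm{mod}}(\tau) = \begin{pmatrix} * & * \\ e & * \end{pmatrix}$ where $e \in T_\Sigma^\times$. Looking at the lower right entry of $\rho^{\mathrm{mod}}(\tau g)$ we see that for any $g \in G_{\mathbf{Q}}$ the upper right entry of $\rho^{\mathrm{mod}}(g)$ lies in $T_\Sigma$. Thus $\rho^{\mathrm{mod}}$ is now in fact valued in $\mathrm{GL}_2(T_\Sigma)$ and will be our candidate for $\rho^{\mathrm{mod}}_\Sigma$. We leave the verification of the other properties of $\rho^{\mathrm{mod}}_\Sigma$ as an exercise. □

Example 3.28 Let $\bar\rho = \bar\rho_{f_{57B},3}$ where $f_{57B}$ is the newform of level 57 discussed in the example of section 1.6. As $f_{57B}$ is not congruent modulo 3 to any form of level 19 or 3 we see by theorem 3.15 that $\bar\rho$ is ramified at 19 and $\bar\rho|_{G_3}$ is not good. On the other hand $\bar\rho$ is semi-stable. The facts that $\bar\rho(\mathrm{Frob}_2)$ has order 8 (see the table below) and $3 \mid \#\bar\rho(I_{19})$ (from the discussion above) imply that $\bar\rho: G_{\mathbf{Q}}$ [...]

Exercise 3.29 Show that there are three algebra homomorphisms $T_\emptyset \to \mathbf{Z}/9\mathbf{Z}$ and hence show that there are at least three liftings of $\bar\rho$ of type $\emptyset$ to $\mathbf{Z}/9\mathbf{Z}$.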

Exercise 3.30 What is the image of ?

We will need two deeper properties of the Hecke algebras $T_\Sigma$ (theorem 3.31 and theorem 3.36 below). These results will be proved in the next chapter (sections 4.3 and 4.4).

Theorem 3.31 Let $Q$ be a finite set of primes as in section 2.8. Then $T_Q$ is a free $\mathcal{O}[\Delta_Q]$-module, where $T_Q$ inherits the structure of an $\mathcal{O}[\Delta_Q]$-module from $R_Q$ via the homomorphism $\phi_Q: R_Q \to T_Q$.

Corollary 3.32 $T_\emptyset = T_Q/\mathfrak{a}_Q$.

Proof: From the definitions and corollary 2.45 we have that
$$T_\emptyset \otimes_{\mathcal{O}} K = (T_Q \otimes_{\mathcal{O}} K)/\mathfrak{a}_Q,$$
and from the theorem we have that $T_Q/\mathfrak{a}_Q$ is torsion free. The corollary follows. □

For the second of these results we will need some additional notation and we restrict our attention to sets of primes contained in $\Sigma_{\bar\rho}$. Suppose that $\Sigma \subset \Sigma_{\bar\rho}$ and that $f$ is an element of $\mathcal{N}_\Sigma$ whose Fourier coefficients are in $\mathcal{O}$. Then $\mathcal{O}_f' = \mathcal{O}$ and projection to the component corresponding to $f$ gives rise to an $\mathcal{O}$-algebra homomorphism $\pi = \pi_f: T_\Sigma \to \mathcal{O}$. The pushforward of $\rho^{\mathrm{mod}}_\Sigma$ by $\pi$ is equivalent to
$$\rho_f: G_{\mathbf{Q}} \to \mathrm{GL}_2(\mathcal{O}).$$

Remark 3.33 Most of the objects considered in the rest of the section will depend on the choice of newform $f$. We also remark that to give an $\mathcal{O}$-algebra homomorphism $T_\Sigma \to \mathcal{O}$ is equivalent to giving a newform $f$ in $\mathcal{N}_\Sigma$ with coefficients in $\mathcal{O}$. Indeed given such a homomorphism, there exists a newform $f$ in $\mathcal{N}_\Sigma$ such that the homomorphism is defined by $T_p \mapsto a_p(f)$ for all $p \notin \Sigma$ with $p$ not dividing $\ell N(\bar\rho)$. The fact that $a_p(f) \in \mathcal{O}$ for such $p$ implies (using for example parts (b) and (d) of theorem 1.27 and lemma 4.1 below) that all the Fourier coefficients of $f$ are in $\mathcal{O}$. The uniqueness of $f$ is a consequence of the theory of newforms, theorem 1.22.


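For $\Sigma'$ satisfying $\Sigma \subset \Sigma' \subset \Sigma_{\bar\rho}$, let $\pi_{\Sigma'}$ denote the composite $T_{\Sigma'} \to T_\Sigma \to \mathcal{O}$. Let
$$\eta_{\Sigma'} = \pi_{\Sigma'}(\mathrm{Ann}_{T_{\Sigma'}}(\ker\pi_{\Sigma'})). \qquad (3.3.1)$$
Note that because $T_{\Sigma'}$ is reduced, $\eta_{\Sigma'} \neq (0)$. Also let $\wp_{\Sigma'}$ denote the kernel of $\pi_{\Sigma'}\circ\phi_{\Sigma'}$. Recall that
$$\#\wp_{\Sigma'}/\wp_{\Sigma'}^2 = \#H^1_{\Sigma'}(\mathbf{Q}, \mathrm{ad}^0\rho_f \otimes_{\mathcal{O}} K/\mathcal{O}).$$

Remark 3.34 We have not yet shown that these groups are finite, but if either is finite, then so is the other and their cardinality is the same.

Note that if $\ell$ is in $\Sigma$ then $T_\ell := (a_\ell(g))_g \in \tilde{T}_\Sigma$ is actually in $T_\Sigma$. This follows from theorem 3.1(f), which shows that $T_\ell = \alpha_\ell + \ell\alpha_\ell^{-1}$ where $\alpha_\ell$ is the eigenvalue of $\mathrm{Frob}_\ell$ on the free rank one $T_\Sigma$-module $M^{(0)}$ where $M$ is the module underlying $\rho^{\mathrm{mod}}_\Sigma$ (see lemma 2.20). Now that we have shown that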

Proposition 3.35 If $\Sigma \subset \Sigma' \subset \Sigma_{\bar\rho}$, then

Moreover if we have equality then the sequence

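is exact.

Finally we state the second theorem on Hecke algebras which shall be proved in section 4.4.

Theorem 3.36 If $\Sigma \subset \Sigma' \subset \Sigma_{\bar\rho}$ and $f$ is a newform in $\mathcal{N}_\Sigma$ with coefficients in $\mathcal{O}$, then we have
$$\eta_{\Sigma'} \subset \eta_\Sigma \prod_{p \in \Sigma' - \Sigma} (c_p).$$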

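Corollary 3.37 With the above notation
$$\#H^1_{\Sigma'}(\mathbf{Q}, \mathrm{ad}^0\rho_f \otimes_{\mathcal{O}} K/\mathcal{O})/H^1_\Sigma(\mathbf{Q}, \mathrm{ad}^0\rho_f \otimes_{\mathcal{O}} K/\mathcal{O}) \leq \#(\eta_\Sigma/\eta_{\Sigma'}).$$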

3.4 Isomorphism criteria

The main thrust of Wiles' approach is to prove that in many circumstances the map $\phi_\Sigma: R_\Sigma \to T_\Sigma$ is an isomorphism. For this we will need two criteria from commutative algebra. The first was found by Wiles [W3] (but is presented here in a slightly stronger form due to Lenstra [Len]); the second was developed by Faltings, from the original arguments of [TW]. In both criteria the notion of complete intersection plays a vital part.

Proofs of all the results in this section, together with some background, references and examples, are given in chapter 5.

Definition 3.38 Suppose that $A$ is an object of $\mathcal{C}_{\mathcal{O}}$ which is finitely generated and free as an $\mathcal{O}$-module. Then we call $A$ a complete intersection if and only if for some $r \in \mathbf{Z}_{\geq 0}$ and some $f_1, \dots, f_r \in \mathcal{O}[[X_1, \dots, X_r]]$ we have
$$A \cong \mathcal{O}[[X_1, \dots, X_r]]/(f_1, \dots, f_r)$$
(i.e. there are the same number of generators as relations).
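As an illustration of definition 3.38, here is a small worked example. It is not taken from the text: $\varpi$ denotes a chosen generator of the maximal ideal of $\mathcal{O}$ and $n \geq 1$ an integer, both introduced here. The example also computes, for this ring, the two quantities (a cotangent space and an annihilator image) that are compared in the first criterion stated later in this section.

```latex
% One generator, one relation (r = 1). By Weierstrass preparation the quotient is
% free over O with basis 1, X, and X^2 = \varpi^n X in A.
A = \mathcal{O}[[X]]/\bigl(X(X - \varpi^n)\bigr) = \mathcal{O} \oplus \mathcal{O}X.
% Take R = T = A with \phi the identity, and \pi : A \to \mathcal{O}, X \mapsto 0. Then
\wp = \ker\pi = \mathcal{O}X, \qquad
\wp^2 = \mathcal{O}X^2 = \varpi^n\mathcal{O}X, \qquad
\wp/\wp^2 \cong \mathcal{O}/\varpi^n.
% An element a + bX kills X iff a + b\varpi^n = 0, so \mathrm{Ann}_A(X) = (X - \varpi^n) and
\eta = \pi\bigl(\mathrm{Ann}_A(\ker\pi)\bigr) = \pi\bigl((X - \varpi^n)\bigr) = (\varpi^n),
\qquad \#\wp/\wp^2 = \#\mathcal{O}/\eta.
```

So in this example the two cardinalities coincide, which is consistent with $A$ being a complete intersection.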

We first record a lemma about complete intersections.

Lemma 3.39 Suppose that $K'/K$ is a finite extension with ring of integers $\mathcal{O}'$ and that $A$ is an object of $\mathcal{C}_\mathcal{O}$ which is finitely generated and free as an $\mathcal{O}$-module. Then $A$ is a complete intersection if and only if $A \otimes_\mathcal{O} \mathcal{O}'$ is.

For the proof, see chapter 5, lemma 5.30.

Now fix objects $R$ and $T$ of $\mathcal{C}_\mathcal{O}$ and a surjection of $\mathcal{O}$-algebras $\phi : R \twoheadrightarrow T$. Also assume that $T$ is a finitely generated, free $\mathcal{O}$-module. The first criterion is as follows.

Theorem 3.40 Suppose that $\pi : T \to \mathcal{O}$. Let $\wp = \ker(\pi \circ \phi) \subset R$ and let $\eta = \pi(\mathrm{Ann}_T(\ker \pi)) \subset \mathcal{O}$. Suppose also that $\eta \neq (0)$. Then the following are equivalent.
(a) The inequality $\#\wp/\wp^2 \leq \#\mathcal{O}/\eta$ is satisfied.
(b) The equality $\#\wp/\wp^2 = \#\mathcal{O}/\eta$ is satisfied.
(c) The rings $R$ and $T$ are complete intersections, and the map $\phi : R \to T$ is an isomorphism.

The proof is explained in chapter 5, sections 5.1 to 5.8. (See theorem 5.3.)

For the second criterion let us also fix a non-negative integer $r$. If $J \subset \mathcal{O}[[S_1, \ldots, S_r]]$ is an ideal contained in $(S_1, \ldots, S_r)$, then by a $J$-structure we mean a commutative diagram in $\mathcal{C}_\mathcal{O}$

$\mathcal{O}[[S_1, \ldots, S_r]]$

3.5 The main theorem

We are now in a position to deduce the main theorems. We will keep the notation and assumptions from the start of section 3.3.

Theorem 3.42 Keep the notation and assumptions of section 3.3. Then, for all finite sets $\Sigma \subset \Sigma_{\bar\rho}$, $\phi_\Sigma : R_\Sigma \to \mathbf{T}_\Sigma$ is an isomorphism and these rings are complete intersections.

Remark 3.43 There seems to be a deep link between the fact that $\phi_\Sigma$ is an isomorphism and the fact that $\mathbf{T}_\Sigma$ is a complete intersection. The proof of the theorem divides into two parts. One first proves it in the minimal case where $\Sigma = \emptyset$. One then deduces the full theorem from this special case by a different argument. In both these steps the facts that $\phi_\Sigma$ is an isomorphism and that $\mathbf{T}_\Sigma$ is a complete intersection are proved simultaneously.

Proof of theorem 3.42: Note that to prove the theorem we may extend scalars if necessary (by lemma 3.39) and hence assume that both of the following hold:
• The eigenvalues of all elements of the image of $\bar\rho$ are rational over $k$.
• There is a newform $f$ in $\mathcal{N}_\emptyset$ with coefficients in $\mathcal{O}$, hence an $\mathcal{O}$-algebra homomorphism $\mathbf{T}_\emptyset \to \mathcal{O}$.

We first prove that $\phi_\emptyset : R_\emptyset \to \mathbf{T}_\emptyset$ is an isomorphism and that the rings $R_\emptyset$ and $\mathbf{T}_\emptyset$ are complete intersections.

Note that according to theorem 2.49 and lemma 3.24 we can find an integer $r \geq 0$ and for each $n \in \mathbf{Z}_{>0}$ we can find a set $Q_n$ of $r$ primes such that
• if $q \in Q_n$ then $q \equiv 1 \bmod \ell^n$;
• if $q \in Q_n$ then $\bar\rho$ is unramified at $q$ and $\bar\rho(\mathrm{Frob}_q)$ has distinct eigenvalues;
• $R_{Q_n}$ can be topologically generated by $r$ elements as an $\mathcal{O}$-algebra.

(a) $R_{Q_n}/(S_1, \ldots, S_r) \xrightarrow{\sim} R_\emptyset$ (see corollary 2.45),

(c) $\mathbf{T}_{Q_n}/(S_1, \ldots, S_r) \xrightarrow{\sim} \mathbf{T}_\emptyset$ (see corollary 3.32).

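Let $J_n = ((S_i + 1)^{\ell^n} - 1 : i = 1, \ldots, r)$. Replacing $\mathbf{T}_{Q_n}$ and $R_{Q_n}$ by $\mathbf{T}_{Q_n}/J_n$ and $R_{Q_n}/J_n$ we see that we have a $J_n$-structure for every $n$. Theorem 3.42 for $\Sigma = \emptyset$ now follows from the criterion of theorem 3.41.

We now turn to the proof of theorem 3.42 in the general case. By theorem 2.41 and theorem 3.40 we see that

$$\#H^1_\emptyset(\mathbf{Q}, \mathrm{ad}^0 \rho_f \otimes_\mathcal{O} K/\mathcal{O}) = \#\mathcal{O}/\eta_\emptyset, \qquad (3.5.1)$$

and so applying corollary 3.37 we see that for any $\Sigma$

$$\#H^1_\Sigma(\mathbf{Q}, \mathrm{ad}^0 \rho_f \otimes_\mathcal{O} K/\mathcal{O}) \leq \#\mathcal{O}/\eta_\Sigma. \qquad (3.5.2)$$

A second application of theorems 2.41 and 3.40 allows us to deduce theorem 3.42. $\Box$

Remark 3.44 In certain cases where $\eta_\emptyset = (1)$, the bound (3.5.1) on the order of the Selmer group $H^1_\emptyset(\mathbf{Q}, \mathrm{ad}^0 \rho_f \otimes_\mathcal{O} K/\mathcal{O})$ also follows from the previous work of Flach, by a different method. See [Fl1] for details.

Corollary 3.45 Keep the notation of theorem 3.42 and suppose that $f$ is a newform in $\mathcal{N}_\Sigma$ with coefficients in $\mathcal{O}$.
(a) We have

$$\#H^1_\Sigma(\mathbf{Q}, \mathrm{ad}^0 \rho_f \otimes_\mathcal{O} K/\mathcal{O}) = \#\mathcal{O}/\eta_\Sigma < \infty,$$

where $\eta_\Sigma$ was defined in equation (3.3.1) after remark 3.33.
(b) If $\Sigma \subset \Sigma'$ then

$$(0) \to H^1_\Sigma(\mathbf{Q}, \mathrm{ad}^0 \rho_f \otimes_\mathcal{O} K/\mathcal{O}) \to H^1_{\Sigma'}(\mathbf{Q}, \mathrm{ad}^0 \rho_f \otimes_\mathcal{O} K/\mathcal{O}) \to \bigoplus_{p \in \Sigma' - \Sigma} H_p \to (0)$$

is exact, where the groups $H_p$ and $H_\ell$ were defined before proposition 3.35.

Proof: The first part now follows as a direct consequence of theorem 3.42 and another application of theorems 2.41 and 3.40. The second part follows from the first, together with proposition 3.35 and theorem 3.36. $\Box$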

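Corollary 3.46 Suppose $\rho : G_\mathbf{Q} \to GL_2(K)$ is a continuous representation and let $\bar\rho$ denote its reduction. Suppose also that
(a) $\bar\rho$ is irreducible and modular,
(b) if $p \neq \ell$ then $\rho|_{I_p} \sim \begin{pmatrix} 1 & * \\ 0 & 1 \end{pmatrix}$,
(c) $\rho|_{G_\ell}$ is semi-stable,
(d) $\det \rho = \epsilon$.
Then $\rho$ is modular.

Proof: We let $\Sigma$ denote the set of primes in $\Sigma_{\bar\rho}$ at which $\rho$ is ramified. Then $\rho : G_\mathbf{Q} \to GL_2(\mathcal{O})$ is a deformation of $\bar\rho$ of type $\Sigma$, so there is an $\mathcal{O}$-algebra homomorphism $R_\Sigma \to \mathcal{O}$ such that $\rho = \rho^{\mathrm{univ}}_\Sigma \otimes_{R_\Sigma} \mathcal{O}$. Since $\phi_\Sigma : R_\Sigma \to \mathbf{T}_\Sigma$ is an isomorphism by theorem 3.42, it follows that there is a homomorphism $\mathbf{T}_\Sigma \to \mathcal{O}$ sending $T_p$ to $\mathrm{tr}(\rho(\mathrm{Frob}_p))$. Since such a homomorphism is necessarily of the form $T_p \mapsto a_p(f)$ for some newform $f$, it follows that $\rho$ is equivalent to $\rho_f$ and hence is modular. $\Box$

Corollary 3.47 Suppose that $E/\mathbf{Q}$ is a semi-stable elliptic curve such that $\bar\rho_{E,3}$ is irreducible. Then $E$ is modular.

Proof: One need only apply the last corollary with $\ell = 3$ and theorem 3.14. $\Box$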

3.6 Applications

The Shimura-Taniyama conjecture for semi-stable elliptic curves:

Theorem 3.48 If $E/\mathbf{Q}$ is a semistable elliptic curve, then $E$ is modular.

Proof: By corollary 3.47, it is enough to show that $E$ is modular when its associated mod 3 representation $\bar\rho_{E,3}$ is reducible, i.e., when $E$ has a subgroup of order 3 defined over $\mathbf{Q}$. Consider the group $E[5]$ of 5-division points of $E$. The mod 5 Galois representation $\bar\rho_{E,5}$ associated to $E[5]$ is irreducible: for otherwise, $E$ would have a subgroup of order 15 defined over $\mathbf{Q}$, and would give rise to a (non-cuspidal) rational point on the modular curve $X_0(15)$. This curve is of genus one, and is known to have only 4 non-cuspidal rational points, which do not correspond to semi-stable elliptic curves (and, at any rate, are known to correspond to modular elliptic curves). Hence we know that $\bar\rho_{E,5}$ satisfies all the assumptions of corollary 3.46, except the (crucial!) modularity property. To show that $\bar\rho_{E,5}$ is modular, one starts with

Lemma 3.49 There is an auxiliary (semi-stable) elliptic curve $A/\mathbf{Q}$ which satisfies
(a) $A[5] \simeq E[5]$ as $G_\mathbf{Q}$-modules.
(b) $A[3]$ is an irreducible $G_\mathbf{Q}$-module.

Proof: Let $Y'(5)$ be the curve over $\mathbf{Q}$ which classifies elliptic curves $A$ together with an isomorphism $E[5] \simeq A[5]$ compatible with Weil pairings. Elliptic curves over $\mathbf{Q}$ satisfying (a) correspond to rational points in $Y'(5)(\mathbf{Q})$. Adjoining a finite set of points to $Y'(5)$ yields its compactification $X'(5)$ which is a twist of the modular curve $X(5)$ with full level 5 structure. (I.e., it becomes isomorphic to this curve over $\bar{\mathbf{Q}}$.) As was shown by Klein, the modular curve $X(5)$ over $\mathbf{C}$ has genus 0. Since $X'(5)$ has a point $x_0$ defined over $\mathbf{Q}$ corresponding to $E$, it is isomorphic over $\mathbf{Q}$ to $\mathbf{P}^1$. The rational points of $Y'(5)$ therefore give a plentiful supply of elliptic curves satisfying condition (a). Now consider the curve $Y'(5,3)$ classifying elliptic curves $A$ with an isomorphism $E[5] \simeq A[5]$ (respecting Weil pairings) and a subgroup of $A$ of order 3. One checks that the compactification of $Y'(5,3)$ has genus greater than 1, hence has only finitely many rational points by Faltings' theorem (the Mordell conjecture). It follows that only finitely many points of $Y'(5)(\mathbf{Q})$ are in the image of $Y'(5,3)(\mathbf{Q})$ under the natural map $Y'(5,3) \to Y'(5)$. Hence for all but finitely many points $x$ in $Y'(5)(\mathbf{Q})$, the corresponding elliptic curve $A$ satisfies (b) since it has no rational subgroup of order 3. Choosing $x$ arbitrarily close in the 5-adic topology to $x_0$, we find that the elliptic curve $A$ associated to $x$ is semistable and satisfies the two conditions in the lemma. $\Box$

We can now finish the proof of theorem 3.48. Applying corollary 3.47 to the curve $A$, we find that $A$ is modular. Hence so is the mod 5 representation $\bar\rho_{A,5} \simeq \bar\rho_{E,5}$. Now applying corollary 3.46 with $\ell = 5$ and the representation $\rho_{E,5}$ of $G_\mathbf{Q}$ acting on the 5-adic Tate module of $E$, we find that $\rho_{E,5}$ is modular, and hence, so is $E$, as was to be shown.

Remark 3.50 Wiles' original argument uses Hilbert's irreducibility theorem where we have used Faltings' theorem. The alternative presented here is based on a remark of Karl Rubin.

Remark 3.51 The results of [W3] and [TW] actually apply to a larger class of elliptic curves than those which are semistable. In [Di2], their methods are further strengthened to prove that all elliptic curves which have semi-stable reduction at 3 and 5 are modular.


Remark 3.52 Rubin and Silverberg observed that an elliptic curve of the form $y^2 = x(x-a)(x+b)$ has a twist with semi-stable reduction at all odd primes, hence is modular by [Di2]. In fact it is shown in [DK] that their observation together with the general results of [W3] and [TW] already implies modularity.

Fermat's Last Theorem: As was already mentioned in the introduction, the Shimura-Taniyama conjecture for semi-stable elliptic curves (and, more precisely, for the elliptic curves that arise in Frey's construction explained in section 2.2) implies Fermat's Last Theorem.

More precisely, suppose that there is a non-trivial solution to the Fermat equation $x^\ell + y^\ell = z^\ell$, with $\ell > 3$. By theorem 2.15 the Frey curve constructed from this solution (cf. section 2.2) is a semistable elliptic curve $E/\mathbf{Q}$ whose associated mod $\ell$ representation $\bar\rho_{E,\ell}$ is irreducible, unramified outside $2\ell$ and is good at $\ell$. Serre's conjecture predicts that $\bar\rho_{E,\ell}$ arises from a newform of weight 2 and level 2; the "lowering the level" result of Ribet [R5] (cf. theorem 3.15) actually proves this, once we know that $E$ is modular, i.e., $\bar\rho_{E,\ell}$ arises from a modular form of weight 2 and some level. But this is a contradiction, since there are no non-zero cusp forms of weight two and level two: such forms would correspond to holomorphic differentials on the modular curve $X_0(2)$ which is of genus 0. This contradiction completes the proof.

Values of L-functions: Also mentioned in the introduction was the relationship between the calculation of the Selmer group (3.5.1) and certain cases of a conjecture of Bloch-Kato [BK], called the Tamagawa number conjecture. It was in this context that partial results were obtained by Flach in [Fl1] (cf. remark 3.44).

If $f$ is a newform of weight 2, then one can associate to $f$ a certain symmetric square L-function $L(\mathrm{Symm}^2 f, s)$. We shall recall the definition in section 4.4 and explain how a method of Hida establishes a relationship between $L(\mathrm{Symm}^2 f, 2)$ and $\mathcal{O}/\eta_\Sigma$ in the setting of corollary 3.45. We may therefore regard part (a) of that corollary as a relationship between $L(\mathrm{Symm}^2 f, 2)$ and the size of a Selmer group. While the result is in the spirit of the Tamagawa number conjecture of [BK], we have not verified that the relevant cases of the conjecture can be deduced from it. We shall however state a partial result in the context of semistable elliptic curves. The reader can consult [Fl1] and [Fl2] for a discussion of the relation to the Tamagawa number conjecture.
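The two genus values used in this section (genus one for $X_0(15)$ in the proof of theorem 3.48, genus zero for $X_0(2)$ in the Fermat argument) can be checked numerically. The following is a minimal sketch, assuming the standard genus formula $g = 1 + \mu/12 - \nu_2/4 - \nu_3/3 - \nu_\infty/2$ for $X_0(N)$; the function names are illustrative and do not come from the text.

```python
from math import gcd, prod

def prime_divisors(n):
    """Distinct prime divisors of n, by trial division."""
    primes, p = [], 2
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.append(n)
    return primes

def euler_phi(n):
    """Euler's totient, computed naively (fine for small n)."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def genus_X0(N):
    """Genus of X_0(N) from the index, elliptic points and cusps of Gamma_0(N)."""
    primes = prime_divisors(N)
    # mu = index of Gamma_0(N) in SL_2(Z) modulo +-1: N * prod (1 + 1/p)
    mu = N
    for p in primes:
        mu = mu // p * (p + 1)
    # Kronecker-type symbols (-1/p) and (-3/p), with value 0 at p = 2 resp. p = 3
    chi4 = lambda p: 0 if p == 2 else (1 if p % 4 == 1 else -1)
    chi3 = lambda p: 0 if p == 3 else (1 if p % 3 == 1 else -1)
    nu2 = 0 if N % 4 == 0 else prod(1 + chi4(p) for p in primes)
    nu3 = 0 if N % 9 == 0 else prod(1 + chi3(p) for p in primes)
    # number of cusps: sum over d | N of phi(gcd(d, N/d))
    cusps = sum(euler_phi(gcd(d, N // d)) for d in range(1, N + 1) if N % d == 0)
    twelve_g = 12 + mu - 3 * nu2 - 4 * nu3 - 6 * cusps
    assert twelve_g % 12 == 0
    return twelve_g // 12

print(genus_X0(15), genus_X0(2))  # prints: 1 0
```

Genus zero for $X_0(2)$ is exactly the statement that there are no holomorphic differentials, hence no non-zero weight 2 cusp forms of level 2.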


Suppose that E is a semistable elliptic curve over Q of conductor NE and

whereis the Neron differential defined in section 1.1. Since E is modularby theorem 3.48, a method of Shimura (see [Shi4] and the introduction of [St])establishes the analytic continuation of L(Symm 2 E, s) to an entire functionand shows that L(Symm 2 E, 2) is a non-zero rational multiple of iE . Wenow explain how to deduce the following theorem from Wiles results and aformula of Hida, corollary 4.21.Theorem 3.53 Suppose that E is a semistable elliptic curve and ` is a primesuch that E,` is irreducible, andQ ` does not divide 2dp .p|NE

Then the `-part of

NE L(Symm 2 E, 2)iE

is the order ofH1 (Q, ad0 E,` Z` Q` /Z` ).105

Sketch of proof: Since E is semistable, it is modular by theorem 3.48. Letting

4.1 Full Hecke algebras

Suppose that $K$ is a finite extension of $\mathbf{Q}_\ell$ for some prime $\ell$. Let $\mathcal{O}$ denote its ring of integers and let $k = \mathcal{O}/\lambda$ where $\lambda$ is the maximal ideal of $\mathcal{O}$. Fix embeddings $K \hookrightarrow \bar{\mathbf{Q}}_\ell$, $\bar{\mathbf{Q}} \hookrightarrow \bar{\mathbf{Q}}_\ell$ and $\bar{\mathbf{Q}} \hookrightarrow \mathbf{C}$. Recall that we defined Hecke algebras over $\mathcal{O}$ in two different contexts:
• In section 1.6 as an algebra $\mathbf{T}_\mathcal{O}$ generated by the full set of Hecke operators acting on a space of modular forms;
• In section 3.3 as a certain subring $\mathbf{T}_\Sigma$ of a product of fields of Fourier coefficients of newforms giving rise to the same mod $\ell$ representation.

The first of these provides a concrete geometric description useful for establishing properties of the fine structure of the algebra; the second yields a reduced ring which is more easily interpreted as the coefficient ring of a Galois representation. In the next section we shall relate the two notions by identifying the reduced Hecke algebras of the form $\mathbf{T}_\Sigma$ as localizations of the full Hecke algebras of the form $\mathbf{T}_\mathcal{O}$. Before doing so we need to recall some fundamental properties of the algebras $\mathbf{T}_K$, $\mathbf{T}_\mathcal{O}$ and $\mathbf{T}_k$.

Let $\Gamma = \Gamma_H(N)$ for some positive integer $N$ and subgroup $H$ of $(\mathbf{Z}/N\mathbf{Z})^\times$ (see section 1.2). We let $\mathbf{T}_\mathbf{Z}$ denote the subring of $\mathrm{End}(S_2(\Gamma))$ generated by the operators $T_n$ for all positive integers $n$ and $\langle d \rangle$ for all $d \in (\mathbf{Z}/N\mathbf{Z})^\times$. If $R$ is a ring, then $\mathbf{T}_R$ denotes the $R$-algebra $\mathbf{T}_\mathbf{Z} \otimes R$. Recall that $\mathbf{T}_R$ acts faithfully on $S_2(\Gamma, R)$ and is finitely generated and free as an $R$-module. (This holds for $R = \mathbf{Z}$, hence for arbitrary $R$.)

We first record the following lemma:

Lemma 4.1 (a) $\mathbf{T}_R$ is generated as an $R$-algebra by either of the following sets of elements:
• $T_n$ for all positive integers $n$.
• $T_p$ for all primes $p$ and $\langle d \rangle$ for all $d$ in $(\mathbf{Z}/N\mathbf{Z})^\times$.

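(b) Suppose that $D$ is a positive integer relatively prime to $N$. If either $D$ is odd or 2 is invertible in $R$, then $\mathbf{T}_R$ is generated as an $R$-algebra by either of the following sets of elements:
• $T_n$ for all positive integers $n$ relatively prime to $D$.
• $T_p$ for all primes $p$ not dividing $D$, and $\langle d \rangle$ for all $d$ in $(\mathbf{Z}/N\mathbf{Z})^\times$.

For a proof of (a), see [DI], prop. 3.5.1; for (b), see p. 491 of [W3].

The spectrum of $\mathbf{T}_\mathcal{O}$: First note that $\mathbf{T}_K$ and $\mathbf{T}_k$ are Artinian, hence have only a finite number of prime ideals, all of which are maximal. Since $\mathbf{T}_\mathcal{O}$ is finitely generated and free as an $\mathcal{O}$-module, its maximal (resp. minimal) prime ideals are those lying over the prime $\lambda$ (resp. $(0)$) of $\mathcal{O}$. (This follows from the going-up and going-down theorems, [Mat] thms. 9.4 and 9.5, for example.) It follows that the natural maps

$$\mathbf{T}_\mathcal{O} \hookrightarrow \mathbf{T}_\mathcal{O} \otimes_\mathcal{O} K = \mathbf{T}_K \quad \text{and} \quad \mathbf{T}_\mathcal{O} \twoheadrightarrow \mathbf{T}_\mathcal{O} \otimes_\mathcal{O} k = \mathbf{T}_k$$

induce bijections

$$\{\text{maximal ideals of } \mathbf{T}_K\} \leftrightarrow \{\text{minimal primes of } \mathbf{T}_\mathcal{O}\} \quad \text{and} \quad \{\text{maximal ideals of } \mathbf{T}_k\} \leftrightarrow \{\text{maximal primes of } \mathbf{T}_\mathcal{O}\}.$$

Moreover, since $\mathcal{O}$ is complete we have (by [Mat] thms. 8.7 and 8.15, for example) that the natural map

$$\mathbf{T}_\mathcal{O} \to \prod_{\mathfrak{m}} \mathbf{T}_\mathfrak{m}$$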


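is an isomorphism, where the product is over the finite set of maximal ideals $\mathfrak{m}$ of $\mathbf{T}_\mathcal{O}$ and $\mathbf{T}_\mathfrak{m}$ denotes the localization of $\mathbf{T}_\mathcal{O}$ at $\mathfrak{m}$. Furthermore each $\mathbf{T}_\mathfrak{m}$ is a complete local $\mathcal{O}$-algebra which is finitely generated and free as an $\mathcal{O}$-module, and each minimal prime $P$ of $\mathbf{T}_\mathcal{O}$ is contained in a unique $\mathfrak{m}$.

Now suppose that $f = \sum a_n q^n$ is a normalized eigenform in $S_2(\Gamma, \bar{K})$ for the operators $T_n$ for all $n \geq 1$. Then $T_n \mapsto a_n$ defines a map $\mathbf{T}_\mathbf{Z} \to \bar{K}$ and induces a $K$-algebra homomorphism $\theta_f : \mathbf{T}_K \to \bar{K}$. The image is the finite extension of $K$ generated by the $a_n$, and the kernel is a maximal ideal of $\mathbf{T}_K$ which depends only on the $G_K$-conjugacy class of $f$. Similarly a $G_k$-conjugacy class of normalized eigenforms in $S_2(\Gamma, \bar{k})$ gives rise to a maximal ideal of $\mathbf{T}_k$.

Recall also that a normalized eigenform $f$ in $S_2(\Gamma, \bar{K})$ has coefficients in the ring of integers of $\bar{K}$, so that it has a reduction $\bar{f}$ in $S_2(\Gamma, \bar{k})$. Furthermore if $f$ and $g$ are $G_K$-conjugate, then $\bar{f}$ and $\bar{g}$ are $G_k$-conjugate. We have thus constructed a diagram of maps of finite sets whose commutativity is easily verified.

$$\begin{array}{ccc}
\{\text{normalized eigenforms in } S_2(\Gamma, \bar{K})\}/G_K & \to & \{\text{normalized eigenforms in } S_2(\Gamma, \bar{k})\}/G_k \\
\downarrow & & \downarrow \\
\{\text{maximal ideals of } \mathbf{T}_K\} & \to & \{\text{maximal ideals of } \mathbf{T}_k\} \\
\downarrow & & \downarrow \\
\{\text{minimal primes of } \mathbf{T}_\mathcal{O}\} & \to & \{\text{maximal primes of } \mathbf{T}_\mathcal{O}\}
\end{array}$$

Proposition 4.2 The vertical maps are bijective, and the horizontal maps are surjective.

Proof: For the injectivity of the upper-left vertical arrow, note that if $\mathfrak{p}$ is a maximal ideal of $\mathbf{T}_K$, then all $K$-algebra homomorphisms $\mathbf{T}_K/\mathfrak{p} \hookrightarrow \bar{K}$ are obtained from a single one by composing with an element of $G_K$. For the surjectivity, let $K' = \mathbf{T}_K/\mathfrak{p}$ and let $\mathfrak{p}'$ denote the kernel of the natural $K'$-algebra homomorphism $\mathbf{T}_{K'} \to K'$. Since $\mathbf{T}_{K'}$ acts faithfully on $S_2(\Gamma, K')$, the localization $S_2(\Gamma, K')_{\mathfrak{p}'}$ is non-zero, hence so is $S_2(\Gamma, K')[\mathfrak{p}']$. (For an $R$-module $M$ and an ideal $I$ of $R$, we write $M[I]$ for the intersection over the elements $r$ in $I$ of the kernels of $r : M \to M$.) It follows that there is a normalized eigenform $f$ in $S_2(\Gamma, K')$ so that $\mathfrak{p}'$ is the kernel of the map $\mathbf{T}_{K'} \to K'$ attached to $f$, and therefore $\mathfrak{p}$ is the kernel of $\theta_f$. To prove that the upper-right vertical arrow is bijective, note that the above arguments carry over with $K$ replaced by $k$. $\Box$

Remark 4.3 The surjectivity of the top arrow is called the Deligne-Serre lifting lemma ([DS], lemma 6.11).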


Suppose that m is a maximal ideal of TO . Note that the maximal ideals

of TK mapping to m are precisely those p for which p TO is contained in m.Note also that the natural mapYm : T m O K Tpp

is an isomorphism, where the product is over such p.

It is straightforward to check that the above constructions are well-behavedwith respect to replacing the field K by an extension K 0 . More precisely,for each set S in the above diagram, there is a natural surjective map from$ : S 0 S where S 0 is defined by replacing K with K 0 , and these maps arecompatible with the maps in the diagram. Furthermore the mapsYYT p K K 0 Tp0 and Tm O O0 T m0p0 $ 1 (p)

m0 $ 1 (m)

are isomorphisms by which m K K 0 can be identified with

m0 $ 1 (m)

m0 .

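Associated Galois representations: Suppose that $\mathfrak{p}$ is a maximal ideal of $\mathbf{T}_K$ and $\mathfrak{m}$ is the associated maximal ideal of $\mathbf{T}_\mathcal{O}$. By lemma 1.39,

$$T_\ell(J_H(N)) \otimes_{\mathbf{Z}_\ell} K$$

is free of rank two over $\mathbf{T}_K$, so reduction mod $\mathfrak{p}$ yields a two-dimensional vector space over the field $\mathbf{T}_K/\mathfrak{p}$ endowed with an action of $G_\mathbf{Q}$. The resulting Galois representation

$$\rho_\mathfrak{p} : G_\mathbf{Q} \to GL_2(\mathbf{T}_K/\mathfrak{p})$$

is unramified at all primes $p$ not dividing $N\ell$, and for such $p$ the characteristic polynomial of $\rho_\mathfrak{p}(\mathrm{Frob}_p)$ is

$$X^2 - T_p X + p\langle p \rangle \bmod \mathfrak{p}.$$

If $\ell$ is odd, then $\bar\rho_\mathfrak{p}$ is defined over $\mathbf{T}_\mathcal{O}/\mathfrak{m}$ and we write $\bar\rho_\mathfrak{m}$ for its semisimplification

$$\bar\rho_\mathfrak{m} : G_\mathbf{Q} \to GL_2(\mathbf{T}_\mathcal{O}/\mathfrak{m}).$$

Thus $\bar\rho_\mathfrak{m}$ is unramified at primes $p$ not dividing $N\ell$ and the characteristic polynomial of $\mathrm{Frob}_p$ is

$$X^2 - T_p X + p\langle p \rangle \bmod \mathfrak{m}.$$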
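For an eigenform with trivial character attached to an elliptic curve, the characteristic polynomial above specializes to $X^2 - a_p X + p$ with $a_p = p + 1 - \#E(\mathbf{F}_p)$. The following minimal sketch computes it by brute-force point counting. The curve $y^2 + y = x^3 - x^2$ (of conductor 11) is an example chosen here for illustration, not one taken from the text, and the function names are hypothetical.

```python
def count_points(p):
    """Number of points on y^2 + y = x^3 - x^2 over F_p, including the point at infinity."""
    count = 1  # the point at infinity
    for x in range(p):
        rhs = (x**3 - x**2) % p
        for y in range(p):
            if (y * y + y - rhs) % p == 0:
                count += 1
    return count

def frobenius_trace(p):
    """a_p = p + 1 - #E(F_p), valid at primes of good reduction (here p != 11)."""
    return p + 1 - count_points(p)

def frobenius_charpoly(p):
    """Coefficients (1, -a_p, p) of X^2 - a_p X + p."""
    return (1, -frobenius_trace(p), p)

for p in (2, 3, 5, 7, 13):
    print(p, frobenius_trace(p), frobenius_charpoly(p))
```

For this curve the traces at $p = 2, 3, 5, 7, 13$ come out as $-2, -1, 1, -2, 4$, each satisfying the Hasse bound $a_p^2 \leq 4p$.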

Suppose that $g$ is in the $G_K$-conjugacy class of eigenforms in $S_2(\Gamma)$ corresponding to $\mathfrak{p}$. If $g$ is a newform then $\mathbf{T}_K/\mathfrak{p}$ is isomorphic to the field denoted $K'_g$ in section 3.1 and $\rho_\mathfrak{p}$ can be identified with the representation

$$\rho_g : G_\mathbf{Q} \to GL_2(K'_g)$$

considered in theorem 3.1. If also $\ell$ is odd then $\bar\rho_g$ is obtained from $\bar\rho_\mathfrak{m}$ by extending scalars. More generally suppose that $g$ is not necessarily a newform and consider the associated newform $f$. Let $D$ denote the product of the primes which divide $N$ but not $N_f$. Let $\mathbf{T}_K^{(D)}$ denote the $K$-subalgebra of $\mathbf{T}_K$ generated by the operators $T_n$ for $n$ relatively prime to $D$ and $\langle d \rangle$ for $d$ in $(\mathbf{Z}/N\mathbf{Z})^\times$. Let $\Gamma' = \Gamma_{H'}(N_f)$ where $H'$ is the image of $H$ in $(\mathbf{Z}/N_f\mathbf{Z})^\times$, and let $\mathbf{T}'_K = \mathbf{T}'_\mathbf{Z} \otimes K$ where $\mathbf{T}'_\mathbf{Z}$ is the Hecke algebra acting on $S_2(\Gamma')$. Restriction of operators defines a natural homomorphism $\mathbf{T}_K^{(D)} \to \mathbf{T}'_K$ which is surjective by lemma 4.1. The composite with $\mathbf{T}'_K \to K'_f$ factors through the field $\mathbf{T}_K^{(D)}/(\mathfrak{p} \cap \mathbf{T}_K^{(D)})$, so we may identify $K'_f$ with a subfield of $\mathbf{T}_K/\mathfrak{p}$ and $\rho_\mathfrak{p}$ is then equivalent to the extension of scalars of $\rho_f$. If $\ell$ is odd, then $\bar\rho_\mathfrak{m}$ and $\bar\rho_f$ are defined and equivalent over a common subfield of $\mathbf{T}_\mathcal{O}/\mathfrak{m}$ and $k_f$.

The structure of $\mathbf{T}_K$: We now give an explicit description of $\mathbf{T}_K$ in the case that $K$ contains the coefficients of all eigenforms of level dividing $N$. Let $\mathcal{N}_\Gamma$ denote the set of newforms in $S_2(\Gamma)$, i.e., the set of newforms $f$ of level $N_f$ dividing $N$ such that $H$ is contained in the kernel of the character

$$\psi_f : (\mathbf{Z}/N\mathbf{Z})^\times \to (\mathbf{Z}/N_f\mathbf{Z})^\times \to K^\times.$$

By theorem 1.22

$$S_2(\Gamma, K) = \bigoplus_{f \in \mathcal{N}_\Gamma} S_{K,f},$$

where $S_{K,f}$ is spanned by the linearly independent elements

$$\{ f(a\tau) \mid a \text{ divides } N/N_f \}.$$

For each $f = \sum a_n(f) q^n$ in $\mathcal{N}_\Gamma$, let
• $\mathbf{T}_{K,f}$ denote the image of $\mathbf{T}_K$ in $\mathrm{End}_K S_{K,f}$;
• $A_{K,f}$ denote the polynomial ring over $K$ in the variables $u_{f,p}$ indexed by the prime divisors of $N/N_f$;
• $I_{K,f}$ denote the ideal in $A_{K,f}$ generated by the polynomials

$$P_{f,p}(u_{f,p}) = u_{f,p}^{v_p(N/N_f) - 1}\,(u_{f,p}^2 - a_p(f) u_{f,p} + \psi_f(p) p),$$

for primes $p$ dividing $N/N_f$ (setting $\psi_f(p) = 0$ if $p$ divides $N_f$).


Consider the $K$-algebra homomorphism $A_{K,f} \to \mathbf{T}_{K,f}$ defined by mapping $u_{f,p}$ to the operator $T_p$. Since $P_{f,p}$ is the characteristic polynomial of $T_p$ on the span of $\{ f(a p^i \tau) \mid i = 0, \ldots, v_p(N/N_f) \}$ for each $a$ dividing $N/(N_f p^{v_p(N/N_f)})$, we see that $I_{K,f}$ is contained in the kernel of $A_{K,f} \to \mathbf{T}_{K,f}$. Taking the product over $f$ in $\mathcal{N}_\Gamma$, we have a surjective $K$-algebra homomorphism

$$\prod_f A_{K,f}/I_{K,f} \to \prod_f \mathbf{T}_{K,f}.$$

Since the natural map $\mathbf{T}_K \to \prod_f \mathbf{T}_{K,f}$ is injective and

$$\dim_K \mathbf{T}_K = \dim_\mathbf{C} S_2(\Gamma, \mathbf{C}) = \sum_f \sigma_0(N/N_f) = \sum_f \dim_K A_{K,f}/I_{K,f},$$

where $\sigma_0$ counts positive divisors, we conclude

Lemma 4.4 There is an isomorphism of $K$-algebras

$$\mathbf{T}_K \to \prod_{f \in \mathcal{N}_\Gamma} A_{K,f}/I_{K,f}$$

whose $f$-component sends
• $T_p$ to $a_p(f)$ if $p$ is a prime not dividing $N/N_f$;
• $T_p$ to $u_{f,p} \bmod I_{K,f}$ if $p$ is a prime dividing $N/N_f$;
• $\langle d \rangle$ to $\psi_f(d)$ if $d$ is relatively prime to $N$.

Remark 4.5 It follows that the algebra $\mathbf{T}_K$ is a complete intersection over $K$ in the sense that it is a finite-dimensional $K$-algebra of the form

$$K[X_1, \ldots, X_r]/(P_1, \ldots, P_r)$$

for some $r$.
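The dimension count behind lemma 4.4 can be illustrated in the simplest case $\Gamma = \Gamma_0(N)$. The sketch below computes $\dim_K A_{K,f}/I_{K,f}$ as the product of the degrees $v_p(N/N_f) + 1$ of the polynomials $P_{f,p}$, and sums over newforms. The table of newform counts is an assumption supplied here (standard values for weight 2 and trivial character at the levels listed, quoted rather than computed), and all names are illustrative.

```python
def factorization(n):
    """Return {p: v_p(n)} by trial division."""
    out, p = {}, 2
    while p * p <= n:
        while n % p == 0:
            out[p] = out.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        out[n] = out.get(n, 0) + 1
    return out

def dim_quotient(N, Nf):
    """dim_K A_{K,f}/I_{K,f}: product over p | N/N_f of deg P_{f,p} = v_p(N/N_f) + 1."""
    dim = 1
    for v in factorization(N // Nf).values():
        dim *= v + 1
    return dim

# Assumed input: number of weight 2 newforms on Gamma_0(M), for the levels
# needed below; levels not listed contribute no newforms.
NEWFORM_COUNTS = {11: 1, 33: 1, 44: 1}

def dim_hecke_algebra(N):
    """dim_K T_K for Gamma_0(N), as the sum over newforms f of dim A_{K,f}/I_{K,f}."""
    return sum(count * dim_quotient(N, M)
               for M, count in NEWFORM_COUNTS.items() if N % M == 0)

print(dim_hecke_algebra(22), dim_hecke_algebra(33), dim_hecke_algebra(44))  # prints: 2 3 4
```

The outputs agree with the genera of $X_0(22)$, $X_0(33)$ and $X_0(44)$, as they should since $\dim S_2(\Gamma_0(N))$ is the genus of $X_0(N)$. For $N = 44$, for instance, the newform of level 11 contributes $\sigma_0(4) = 3$ and the newform of level 44 contributes 1.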

4.2 Reduced Hecke algebras

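As in section 4.1, we suppose $K$ is a finite extension of $\mathbf{Q}_\ell$ with ring of integers $\mathcal{O}$ and residue field $k$, and fix embeddings $K \hookrightarrow \bar{\mathbf{Q}}_\ell$, $\bar{\mathbf{Q}} \hookrightarrow \bar{\mathbf{Q}}_\ell$ and $\bar{\mathbf{Q}} \hookrightarrow \mathbf{C}$. We assume in this section that $\ell$ is odd and we fix a representation

$$\bar\rho : G_\mathbf{Q} \to GL_2(k)$$

which is modular (definition 3.12). Thus $\bar\rho$ is equivalent to $\bar\rho_\mathfrak{m}$ (over $\mathbf{T}_\mathcal{O}/\mathfrak{m}$) for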

some = H (N ) and maximal ideal m of TO .We suppose also that has the properties listed at the beginning of section 3.3 and that is a finite set of primes contained in (definition 3.25).We shall show that and m can be chosen so that the reduced Hecke algebra T can be identified with a localization Tm of the full Hecke algebra TO .The main result is due to Wiles ([W3], prop. 2.15), but we also explain animportant variant ([TW], lemma 1) which arises when considering special setsof primes Q as in section 2.8 above.T and the full Hecke algebra: Let = 0 (N ) whereYYYIpN = ` N ()pdim = `pp2p{`}

p|N ()

(4.2.1)

p{`}

with = 0 if is good and `

/ and = 1 otherwise.Suppose that f is a newform in N . Recall that N , defined in section 3.3,is the set of newforms f in S2 () such that is equivalent to f over kf(lemma 3.26). Note that these representations are equivalent if and only ifap (f ) mod 0 = tr ((Frob p ))

for all p - N `,

where 0 is maximal ideal of the ring of integers of Kf0 . There is then a

Proposition 4.7 There is an isomorphism T Tm of O-algebras such

and Tm O O0 T0m0 (since there is a unique maximal ideal m0 of TO0 over

m). We are therefore reduced to the case where K is as in lemma 4.4. Notethat N is the set of newforms f in N such thatap (f) = tr ((Frob p ))

for all p - N `.

(We have K = Kf0 and write f for f mod .)

We now define an isomorphism of K-algebrasY

K, : Tm K f N

such that for each f N :

(Tp )f = ap (f ) if p is not in ; (Tp )f = 0 if p is in {`}; (T` )f is the unit root of X 2 a` (f )X + f (`)` if ` is in .QRecall that Tm K = p Tp where p runs over the primes of TK whosepreimage in TO is contained in m. Thus according to lemma 4.4 we haveY Y

Tm K (AK,f /IK,f )p ,f N pMf

where Mf is the set of prime ideals in AK,f /IK,f whose preimage in TO is

contained in m. If f is not in N , then Mf is empty. If f is in N , thenMf consists only of the kernel pf of the map defined by up,f 7 ap (g) whereg is the eigenform of lemma 4.6. Furthermore the maps up,f 7 ap (g) induce113

isomorphisms (AK,f /IK,f )pf K, and taking the product over f N , we

obtain the desired isomorphism .QIdentifying T with the O-subalgebra of f N K generated by the elements Tp = (ap (f ))f for all p not dividing N `, it suffices to prove that Tcontains (Tp ) for all p dividing N ()`. Now observe that for each f in Nthe representation f is isomorphic toGQ GL2 (T ) GL2 (K)

Note that (4.2.2) is equivalent to

tr ((Frob p ))2 6= (p + 1)2 .

(4.2.3)

Let Q denote the set of primes in such that q 1 mod `. The set of primes Qis therefore as in section 2.8. Choosing an eigenvalue q of (Frob q ) for eachq Q as in section 2.8, we regard RQ and hence TQQas an O[Q ]-algebra.(Recall that Q is the the maximal quotient of (Z/( qQ q)Z) of `-powerorder.)Instead of working with 0 (N ) as in proposition 4.7, we shall now workwith the group = 0 (N ) 1 (M ),whereM=

pQ

p2

(4.2.4)

q.

qQ

Remark 4.8 We are about to relate $\mathbf{T}_Q$ to a localization of $\mathbf{T}_\mathcal{O}$, where $\mathbf{T}_\mathcal{O}$ is now defined using the Hecke operators on $S_2(\Gamma)$, with $\Gamma$ defined by (4.2.4). Recall that $\mathbf{T}_Q$ is defined using modular forms with trivial character, but note that the modular forms involved in the definition of $\mathbf{T}_\mathcal{O}$ may have non-trivial character. The purpose of establishing this relationship is to give a concrete realization of the image of $\Delta_Q$ in $\mathbf{T}_Q$ for the purpose of proving theorem 3.31.

Suppose that f is a newform pair in N . For each q in Q, let q (f ) be the

root of X 2 aq (f )X + q = 0 (in Kf0 ) whose image in kf is q . We let g denotethe unique normalized eigenform in S2 (, Kf0 ) of trivial character such that ap (g) = ap (f ) if p is not in ; ap (g) = p (f ) if p is in Q; ap (g) = 0 otherwise.The reduction g is the unique normalized eigenform in S2 (, k 0 ) of trivial character such that ap (g ) = tr Ip (Frob p ) if p is not in ; ap (g ) = p if p is in Q; ap (g ) = 0 otherwise.Thus g has coefficients in k and is independent of the choice of f in N . Welet m denote the corresponding maximal ideal of TO .Suppose for the moment that K contains the coefficients of all eigenformsof level dividing N . Let N 0 denote the set of newforms g in N such that g ; ap (g ) = p for all p dividing Ng /N .Suppose we are given a newform g N 0 . Let g denote its character andwrite Qg for the conductor of g . Note that g has trivial reduction andhence `-power order, and that Qg divides Q. By proposition 3.2 we see thatif p Q, then g is unramified at p and hence Ng is not divisible byp. Furthermore, by theorem 3.1 (e) and (4.2.2), Ng = N Qg . Let g denotethe character of (Z/Qg Z) of `-power order such that g2 is the primitivecharacter associated to g . ThenXg g =g (n)an (g)q nis in NQg NQ .

Lemma 4.9 The map g 7 g g defines a bijection between N 0 and NQ .


Proof: Suppose we are given a newform f in NQ ; i.e., f is in N0 (Q) and f .

For q Q, we have (by lemma 2.44 for example) that

f,q 0f | G q 0 1 ,f,q

where f,q : Gq K is a character whose reduction is the unramified character sending Frob q to q . Note that the characters f,q |Iq have `-power order. Inparticular Nf /N is Q2f where Qf is the product of the primes q Q such thatf,q is ramified (theorem 3.1 (d)). Note also that there is a unique characterf : (Z/Qf Z) K such that

Iq G Q Gal (Q(Qf )/Q) K coincides with the restriction of f,q to Iq for each q|Qf . (We have written ffor the character of GQ as well as the corresponding Dirichlet character. Weshall also write f for the corresponding character of Q .) Let g denote thenewform associated to the eigenformXf1 (n)an (f )q n S2 (0 (N ) 1 (Q2f ), K).By proposition 2.6 and theorem 3.1 we have g f f1 , g = f2 , Ng = N Qf and

P/ m. ItSince the order of H 0 is not divisible by `, we find that dH 0 hdi 0follows that if d H then hdi = 1 in Tm . We may therefore regard Tmas an O[Q ]-algebra via the map d 7 hdi. Recall that TQ is considered anO[Q ]-algebra via the mapO[Q ] RQ TQ .If p is a prime not in Q, let xp denote the unique element of Q such thatx2.p = p

Proposition 4.10 There is an isomorphism TQ Tm of O[Q ]-algebras

such that Tp 7 xp Tp for all primes p / with p - `N ().Proof: We may enlarge K so that we are in the setting of lemma 4.9. We thendefine an isomorphism of K-algebrasY

: Tm K KgN 0

such that for each g N 0 :

(Tp )g = ap (g) if p is not in or if p divides Qg , (Tp )g = 0 if p is in Q, (Tp )g is the root of X 2 ap (g) + g (p)p with reduction p if p is in Qbut does not divide Qg , and (d)g = g (d) for all d Q .The existence of such an isomorphism follows from lemma 4.4 which givesY Y

(AK,g /IK,g )p ,Tm K gN pMg

where Mg is the set of prime ideals in AK,g /IK,g whose preimage in TO is

contained in m. If g is not in N 0 , then Mg is empty. If g is in N 0 , then Mgconsists of the prime ideal corresponding to the eigenform whose eigenvaluesare prescribed as above, and one checksQ that (AK,g /IK,g )p = K.Viewing TQ as a subalgebra ofK and matching indices via the bijecf NQ

tion f = g g g, we obtain an injective homomorphism of O[Q ]-algebras

Y0 : TQ KgN 0


such that0 (Tp ) = (g (p)ap (g))g = (xp Tp )for primes p / such that p - `N . Since TQ is generated over O by the setof such Tp , we see that 0 (TQ ) is contained in (Tm ). On the other hand, theimage of 0 contains the image of Q , hence contains (hdi) for d in (Z/M Z)as well as (Tp ) for p / with p - N `.ItQremains to prove that (Tp ) is in the image of 0 (TQ ) for p dividingN ` qQ q. Consider the compositeQ

RQ TQ

K.

gN 0

The pushforward of univ

(Tp ) as the image of Frob p on the Ip -coinvariants, hence (Tp ) is in 0 (TQ ).For q Q and g N 0 , the pushforward of Q |1Gq q,Q to the g-component is anunramified summand of g |Gq whose reduction sends Frob q to q . It followsthat this character maps Frob q to (Tq ) and we conclude that (Tq ) 0 (TQ ).Finally in the case that ` does not divide N , we appeal to lemma 4.1 toconclude that (T` ) 0 (TQ ).2Auxiliary primes: For as in proposition 4.10, i.e. a set of primes satisfying(4.2.2), a somewhat simpler argument provides a similar a description of Twith replaced by0 = 0 (N ) H (M )

(4.2.5)

where H is the `-Sylow subgroup of (Z/M Z) . While we shall make no direct

use of this, the group 0 will play a role in the proof of theorem 3.31 andwe shall need to choose so that 0 has no elliptic elements; i.e., non-trivialelements of finite order. This is the case for example if contains a primep 6 1 mod ` with p > 3. The group 0 is then contained in 1 (N) forsomeinteger N > 3, and therefore has no elliptic elements. Indeed if ac db

has finite order then the roots of X 2 (a + d)X + 1 are roots of unity and wededuce that this matrix is the identity.In order to show that can be so chosen, we appeal to the following lemma(cf. [DT2], lemma 3).118

Lemma 4.11 Suppose that G is a finite group, : G k is a character of

If d > 3, then is reducible.

If d = 3, then is reducible or has projective image isomorphic to A4 . If d = 2, then |ker is reducible.Proof: Note that if (g) is a scalar then (G) = 1. Hence induces inducesa surjective homomorphism 0 : G0 Cd where G0 is the projective image of and Cd is cyclic of order d. Furthermore if d = 2, then every element ofG0 ker 0 has order 2. The lemma then follows from theorem 2.47(b).2

4.3 Proof of theorem 3.31

We shall give a proof of theorem 3.31 which is based on the q-expansion principle rather than the method of de Shalit [dS] employed in [TW]. It will be more convenient to consider the action of the Hecke operators on the full space of modular forms $M_2(\Gamma)$. The Riemann-Roch theorem shows that the dimension of $M_2(\Gamma)$ is $g + s - 1$ where $g$ is the genus of the modular curve $X_\Gamma$ associated to $\Gamma$ and $s$ is the number of cusps on $X_\Gamma$ (see for example [Shi2] thm. 2.23 or [DI] (12.1.5)).

Eisenstein maximal ideals: One can give an explicit description of a space of Eisenstein series $G_2(\Gamma)$ so that

$$M_2(\Gamma) = S_2(\Gamma) \oplus G_2(\Gamma).$$

(See for example [Hi3], lemma 5.2.)

There is a natural action on $M_2(\Gamma)$ by the Hecke operators $\langle d \rangle$ for all $d \in (\mathbf{Z}/N\mathbf{Z})^\times$ and $T_p$ for all primes $p$. We shall only need to use the operators $\langle d \rangle$ and $T_p$ for $p$ not dividing $N$, and we let $\tilde{\mathbf{T}}_\mathbf{Z}$ denote the subring of $\mathrm{End}(M_2(\Gamma))$ these generate. The ring $\tilde{\mathbf{T}}_\mathbf{Z}$ is commutative and is finitely generated and free as a $\mathbf{Z}$-module. That $\tilde{\mathbf{T}}_\mathbf{Z}$ is finitely generated is proved for example by showing that $M_2(\Gamma, \mathbf{Z})$, the set of forms with integer Fourier coefficients at $\infty$, is stable under the Hecke operators and contains a basis for $M_2(\Gamma)$ (see [DI], cor. 12.3.12 and prop. 12.4.1). Alternatively, one can show that $\tilde{\mathbf{T}}_\mathbf{Z}$ acts faithfully on the cohomology of the non-compact modular curve $Y_\Gamma$ (cf. section 1.3).

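For any ring $A$, we write $M_2(\Gamma, A)$ for $M_2(\Gamma, \mathbf{Z}) \otimes A$ and regard this as a module for $\tilde{\mathbf{T}}_A := \tilde{\mathbf{T}}_\mathbf{Z} \otimes A$. If $\mathfrak{n}$ is a maximal ideal of $\tilde{\mathbf{T}}_\mathcal{O}$ we write $\tilde{\mathbf{T}}_\mathfrak{n}$ for the localization. If $\mathfrak{m}$ is a maximal ideal of $\mathbf{T}_\mathcal{O}$ then we let $\tilde{\mathfrak{m}}$ denote its preimage in $\tilde{\mathbf{T}}_\mathcal{O}$.

We say that a maximal ideal $\mathfrak{n}$ of $\tilde{\mathbf{T}}_\mathcal{O}$ is Eisenstein if

$$T_p \equiv p + 1 \bmod \mathfrak{n} \quad \text{for all } p \equiv 1 \bmod N.$$

One sees from the explicit description of the Eisenstein series that $T_p = p + 1$ on $G_2(\Gamma)$ if $p \equiv 1 \bmod N$. We shall be interested in the non-Eisenstein maximal ideals because of the following lemma (see [R5], thm. 5.2(c)).

Lemma 4.12 Suppose that $\ell$ is odd. The representation

$$\bar\rho_\mathfrak{m} : G_\mathbf{Q} \to GL_2(\mathbf{T}_\mathcal{O}/\mathfrak{m})$$

is absolutely irreducible if and only if $\tilde{\mathfrak{m}}$ is not Eisenstein.

Proof: If $\bar\rho_\mathfrak{m}$ is not absolutely irreducible then $\bar\rho_\mathfrak{m} \cong \chi_1 \oplus \chi_2$ with $N$ divisible by the product of the conductors of $\chi_1$ and $\chi_2$. Furthermore if $\ell$ does not divide $N$, then one of the characters is unramified at $\ell$ while the other coincides with the cyclotomic character on $I_\ell$. Therefore $\tilde{\mathfrak{m}}$ is Eisenstein. Conversely if $\tilde{\mathfrak{m}}$ is Eisenstein then proposition 2.6 implies that $\bar\rho_\mathfrak{m}$ restricted to $G_{\mathbf{Q}(\zeta_{N\ell})}$ has trivial semisimplification from which it follows that $\bar\rho_\mathfrak{m}$ is also reducible. $\Box$

Differentials: Suppose now that $\Sigma$ is a finite set of primes satisfying (4.2.2). Moreover we assume that $\Sigma$ contains a prime $p > 3$ with $p \not\equiv 1 \bmod \ell$. Let $\Gamma$ and $\Gamma'$ be defined as in (4.2.4) and (4.2.5); i.e.

$$\Gamma = \Gamma_0(N_\emptyset) \cap \Gamma_1(M), \qquad \Gamma' = \Gamma_0(N_\emptyset) \cap \Gamma_H(M)$$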

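where $H$ is the $\ell$-Sylow subgroup of $(\mathbf{Z}/M\mathbf{Z})^\times$. Let $X$ and $X'$ denote the modular curves associated to $\Gamma$ and $\Gamma'$.

We first consider the case where $\ell \nmid N$. The curve $X$ has a smooth proper model $\mathcal{X}$ over $\mathbf{Z}[1/NM]$ such that the complement of the cusps parametrizes cyclic isogenies $(E_1, i_1) \to (E_2, i_2)$ of degree $N$, where $E_j$ is an elliptic curve and $i_j$ is an embedding $\mu_M \hookrightarrow E_j$. (See [DR] or [Kat] for example.) The action of $H$ on $X$ extends to $\mathcal{X}$ and the quotient $\mathcal{X}' = \mathcal{X}/H$ is a smooth model over $\mathbf{Z}[1/NM]$ for $X'$. The natural projection $\mathcal{X} \to \mathcal{X}'$ is étale. (The fact that it is étale on the complement of the cusps follows from the natural moduli-theoretic description of $\mathcal{X}'$ using the fact that $\Gamma_H(M) \subset \Gamma_1(p)$ for some $p > 3$. One then need only verify that $X \to X'$ is unramified at the cusps.)

For a $\mathbf{Z}[1/NM]$-algebra $A$, we define
\[ \Omega_A = H^0(\mathcal{X}_A, \Omega^1_{\mathcal{X}_A/A}), \qquad \Omega'_A = H^0(\mathcal{X}'_A, \Omega^1_{\mathcal{X}'_A/A}), \]
\[ \tilde\Omega_A = H^0(\mathcal{X}_A, \Omega^1_{\mathcal{X}_A/A}(D)), \qquad \tilde\Omega'_A = H^0(\mathcal{X}'_A, \Omega^1_{\mathcal{X}'_A/A}(D')), \]
where $D$ (resp. $D'$) denotes the reduced divisor defined by the cusps of $\mathcal{X}$ (resp. $\mathcal{X}'$). The q-expansion principle and standard base-change arguments allow us to identify these with $S_2(\Gamma, A)$, $S_2(\Gamma', A)$, $M_2(\Gamma, A)$ and $M_2(\Gamma', A)$ (see [Kat], sec. 1.6, 1.7). We may therefore regard the first two of these as modules for $\mathbf{T}_A$ and all of them as modules for $\tilde{\mathbf{T}}_A$.

Lemma 4.13 Suppose that $A = k$ or $K$.

(a) $\tilde\Omega_{\mathcal{O}} \otimes_{\mathcal{O}} A \cong \tilde\Omega_A$ and $\tilde\Omega'_{\mathcal{O}} \otimes_{\mathcal{O}} A \cong \tilde\Omega'_A$.

(b) The natural map $\mathcal{X} \to \mathcal{X}'$ induces an isomorphism $\tilde\Omega'_A \cong \tilde\Omega_A^H$.

(c) If $\mathfrak{n}$ is a non-Eisenstein maximal ideal of $\tilde{\mathbf{T}}_{\mathcal{O}}$, then the localization at $\mathfrak{n}$ of $\Omega_{\mathcal{O}} \to \tilde\Omega_{\mathcal{O}}$ is an isomorphism.

(d) If $\mathfrak{m}$ is a maximal ideal of $\mathbf{T}_A$ then $\Omega_A[\mathfrak{m}]$ is one-dimensional over $\mathbf{T}_A/\mathfrak{m}$.

Sketch of proof:

(a) In the case $A = K$ this follows from the fact that $K$ is flat over $\mathcal{O}$, so suppose $A = k$. Identify $\Omega^1_{\mathcal{X}_{\mathcal{O}}/\mathcal{O}}(D)/\lambda\,\Omega^1_{\mathcal{X}_{\mathcal{O}}/\mathcal{O}}(D)$ with the direct image of $\Omega^1_{\mathcal{X}_k/k}(D)$ under $\mathcal{X}_k \to \mathcal{X}_{\mathcal{O}}$ and use that $H^1(\mathcal{X}_k, \Omega^1_{\mathcal{X}_k/k}(D))$ vanishes by Serre duality. The case of $\tilde\Omega'$ is similar. (See [Maz1] sec. II.3.)

(b) Using the fact that $\mathcal{X} \to \mathcal{X}'$ is étale one identifies the pull-back of $\Omega^1_{\mathcal{X}'_A/A}$ with $\Omega^1_{\mathcal{X}_A/A}$ and that of $D'$ with $D$.

(c) The map is injective and one proves that its cokernel is free and annihilated by the operators $T_p - (p + 1)$ for all $p \equiv 1 \bmod NM$. To prove the latter assertion, first observe that it holds with $\mathcal{O}$ replaced by $\mathbf{C}$, then by $\mathbf{Z}[1/NM]$.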


(d) This follows from
\[ \Omega_A[\mathfrak{m}] = S_2(\Gamma, A)[\mathfrak{m}] \cong \mathrm{Hom}_A(\mathbf{T}_A, A)[\mathfrak{m}] = \mathrm{Hom}_A(\mathbf{T}_A/\mathfrak{m}, A), \]
where the middle isomorphism is that of proposition 1.34, but let us reformulate the argument in a way more easily generalized to the case of $\ell$ dividing $N$ discussed below. Note that $\Omega_A[\mathfrak{m}]$ is non-zero since $\mathbf{T}_{\mathfrak{m}}$ acts faithfully on $\Omega_{\mathfrak{m}}$. To prove that the dimension is at most one we can enlarge the field $A$ and assume $\mathbf{T}_A/\mathfrak{m} = A$. One then shows that an eigenform $f$ in $\Omega_A$ (for all the Hecke operators $T_p$ and $\langle d \rangle$) is determined by its eigenvalues and the first coefficient of its q-expansion. This follows from the fact that $f$ is determined by its q-expansion and for all $n$, $a_n(f) = a_1(T_n f)$ and $T_n$ can be expressed in terms of the $T_p$ and $\langle d \rangle$. (See [Maz1], sec. II.9.) □

We now explain how the situation changes if $\ell$ divides $N$. In that case $X$ has a regular model $\mathcal{X}$ over $\mathbf{Z}[\ell/NM]$ with the same moduli-theoretic description as above. This model is smooth over $\mathbf{Z}[1/NM]$, but $\mathcal{X}_{\mathbf{F}_\ell}$ has two smooth irreducible components crossing at ordinary double points as in [DR] sec. V.1. The quotient $\mathcal{X}' = \mathcal{X}/H$ is a regular model for $X'$ over $\mathbf{Z}[\ell/NM]$ and $\mathcal{X} \to \mathcal{X}'$ is étale. For a $\mathbf{Z}[\ell/NM]$-algebra $A$ we define $\Omega_A$, $\Omega'_A$, $\tilde\Omega_A$ and $\tilde\Omega'_A$ as before, except that $\Omega^1$ is replaced by the sheaf of regular differentials (see [DR] sec. I.2, [Maz1] sec. II.6 or [MRi] sec. 7). Formation of these modules again commutes with change of the base $A$, but we can no longer identify them with $S_2(\Gamma, A)$, $S_2(\Gamma', A)$, $M_2(\Gamma, A)$ and $M_2(\Gamma', A)$ if $\ell$ is not invertible in $A$. For every $A$, there is a natural action of $\mathbf{T}_A$ (resp. $\tilde{\mathbf{T}}_A$) on $\Omega_A$ and $\Omega'_A$ (resp. $\tilde\Omega_A$ and $\tilde\Omega'_A$). In the case of $\mathbf{T}_A$ this is proved by identifying $\Omega_A$ (resp. $\Omega'_A$) with the cotangent space at the origin for the Néron model over $A$ for the Jacobian of $X$ (resp. $X'$). In the case of $\tilde{\mathbf{T}}_A$ the action is defined using Grothendieck-Serre duality (as in [Maz1] sec. II.6 or [MRi] sec. 7).

Lemma 4.14 If $\ell$ divides $N$ then lemma 4.13 carries over with the above notation and the additional hypothesis $T_\ell \notin \mathfrak{m}$ for part (d).

The proof is essentially the same except that part (d) is more delicate in the case $A = k$. For the proof in that case we refer the reader to [W3], lemma 2.2 (from which the result stated here is immediate). We remark however that we shall only use lemma 4.14 if $\bar\rho_{\mathfrak{m}}$ is not good. In that case one can also deduce the statement here from the argument used in the proof of [MRi] prop. 20, which shows instead that $\dim_k \Omega_k[\mathfrak{m}'] = 1$ where $\mathfrak{m}'$ is a certain maximal ideal corresponding to $\mathfrak{m}$, but contained in an algebra defined using the operators $\langle d \rangle$, $T_p$ for $p \neq \ell$ and $w_\ell$.

Remark 4.15 The approach in [TW] to proving theorem 3.31 is based on the result that under certain hypotheses $(T_\ell(J) \otimes_{\mathbf{Z}_\ell} \mathcal{O})_{\mathfrak{m}}$ is free of rank two over $\mathbf{T}_{\mathfrak{m}}$ (see [W3], thm. 2.1 and its corollaries). This result generalizes work of Mazur, Ribet, and Edixhoven (see secs. 14 and 15 of ch. II of [Maz1], thm. 5.2 of [R5], [MRi] and sec. 9 of [Edi]), and the key to its proof is lemmas 4.13(d) and 4.14. We shall instead give a proof of theorem 3.31 which is based directly on lemma 4.13(d).

The fact that $(T_\ell(J) \otimes_{\mathbf{Z}_\ell} \mathcal{O})_{\mathfrak{m}}$ is free of rank two over $\mathbf{T}_{\mathfrak{m}}$ actually underlies Wiles' approach to many intermediate results along the way to proving the Shimura-Taniyama conjecture for semistable elliptic curves. We shall appeal to a special case in the course of proving theorem 3.36 below.

Proof of the theorem: Suppose we are given a representation $\bar\rho$ as in section 3.3 and a set of primes $Q$ as in the statement of theorem 3.31. We apply lemma 4.11 with $G = G_{\mathbf{Q}}$ and the representation $\bar\rho$ to choose an auxiliary prime $p$ not dividing $6N\ell\prod_{q \in Q} q$ such that
\[ p \not\equiv 1 \bmod \ell, \qquad \mathrm{tr}(\bar\rho(\mathrm{Frob}_p))^2 \neq (p + 1)^2. \]
To prove theorem 3.31 we may replace $K$ by a larger field and assume that $k$ contains the eigenvalues of $\bar\rho(\mathrm{Frob}_p)$.

Adjoin $p$ to $Q$. Thus $Q \cup \{p\}$ is a set of primes as in proposition 4.10 and contains a prime $p > 3$ such that $p \not\equiv 1 \bmod \ell$. We let $\Gamma$ be as in (4.2.4); thus
\[ \Gamma = \Gamma_0(N) \cap \Gamma_1(M), \qquad \text{where } M = p^2 \prod_{q \in Q} q, \]
and we choose the maximal ideal $\mathfrak{m}$ of $\mathbf{T}_{\mathcal{O}}$ as in proposition 4.10. Let $H$ denote the $\ell$-Sylow subgroup of $(\mathbf{Z}/M\mathbf{Z})^\times$ and regard $\mathbf{T}_{\mathcal{O}}$ and hence $\mathbf{T}_{\mathfrak{m}}$ as an $\mathcal{O}[H]$-algebra via $d \mapsto \langle d \rangle$. In view of proposition 4.10, theorem 3.31 is equivalent to the following:

Theorem 4.16 $\mathbf{T}_{\mathfrak{m}}$ is free over $\mathcal{O}[H]$.

Proof: Consider the $\tilde{\mathbf{T}}_{\mathcal{O}}$-module
\[ \tilde{L} = \mathrm{Hom}_{\mathcal{O}}(\tilde\Omega_{\mathcal{O}}, \mathcal{O}) \]
and the $\mathbf{T}_{\mathcal{O}}$-module
\[ L = \mathrm{Hom}_{\mathcal{O}}(\Omega_{\mathcal{O}}, \mathcal{O}). \]

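Since $\mathcal{X} \to \mathcal{X}'$ is unramified, the Riemann-Hurwitz formula implies that
\[ \dim M_2(\Gamma) = g + s - 1 = \#H\,(g' + s' - 1) = \#H \dim M_2(\Gamma'), \qquad (4.3.1) \]
where $g$ (resp. $g'$) is the genus and $s$ (resp. $s'$) is the number of cusps of $X$ (resp. $X'$). On the other hand, by (b) we have that
\[ \tilde{L}/(\lambda, \mathfrak{a})\tilde{L} \cong \mathrm{Hom}_k(\tilde\Omega'_k, k), \]
where $\mathfrak{a} = \ker(\mathcal{O}[H] \to \mathcal{O})$ is the augmentation ideal. By Nakayama's lemma, $\tilde{L}$ is generated as an $\mathcal{O}[H]$-module by $d$ elements, where $d = \dim_k \tilde\Omega'_k$. By (4.3.1) any surjective homomorphism
\[ \mathcal{O}[H]^d \to \tilde{L} \]
is in fact an isomorphism. Hence $\tilde{L}$ is free over $\mathcal{O}[H]$, as is
\[ \tilde{L}_{\mathfrak{n}} = \mathrm{Hom}_{\mathcal{O}}(\tilde\Omega_{\mathfrak{n}}, \mathcal{O}) = \mathrm{Hom}_{\mathcal{O}}(\Omega_{\mathfrak{n}}, \mathcal{O}) \]
for each non-Eisenstein maximal ideal $\mathfrak{n}$ of $\tilde{\mathbf{T}}_{\mathcal{O}}$ (by part (c) of lemmas 4.13 and 4.14). Since $\bar\rho_{\mathfrak{m}}$ is absolutely irreducible, $\tilde{\mathfrak{m}}$ is not Eisenstein (by lemma 4.12), and it follows that $L_{\mathfrak{m}}$ is free over $\mathcal{O}[H]$.

Since $L \otimes_{\mathcal{O}} K$ is free of rank one over $\mathbf{T}_K$ (lemma 1.34), we have that $L_{\mathfrak{m}} \otimes_{\mathcal{O}} K$ is free of rank one over $\mathbf{T}_{\mathfrak{m}} \otimes_{\mathcal{O}} K$. If $\ell \nmid N$ then lemma 4.13 implies that
\[ L/\mathfrak{m}L = \mathrm{Hom}_k(\Omega_k[\mathfrak{m}], k) \]
is one-dimensional over $k$. The same assertion holds if $\ell \mid N$ by lemma 4.14 (from the definition of $\mathfrak{m}$ in this case, we see that it does not contain $T_\ell$). It follows from Nakayama's lemma that $L_{\mathfrak{m}}$ is free of rank one over $\mathbf{T}_{\mathfrak{m}}$. Therefore $\mathbf{T}_{\mathfrak{m}}$ is free over $\mathcal{O}[H]$. □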

4.4 Proof of theorem 3.36

Our proof of theorem 3.36 is based on Wiles' arguments in ch. 2 of [W3], which in turn are based on a method of Ribet [R4]. We shall reformulate the proof somewhat to underscore the relationship observed by Doi and Hida [Hi1] between the size of $\mathcal{O}/\eta_\Sigma$ and the value of an L-function, but after doing so we shall also sketch the more direct argument used by Wiles. Two important ingredients appear in both versions of the argument, and we shall discuss their proofs in section 4.5. These ingredients are

• the generalization of a result of Ihara [Ih] used in Ribet's argument;

• the generalization of a result of Mazur [Maz1] on the structure of the Tate module $T_\ell(J_0(N))$ as a module for $\mathbf{T}_{\mathbf{Z}_\ell}$.

Assume now that we are in the setting of theorem 3.36. In particular, $\bar\rho$ is a representation as in section 3.3, $\Sigma$ is a finite set of primes contained in $\Sigma_{\bar\rho}$ and $\pi$ is an $\mathcal{O}$-algebra homomorphism $\mathbf{T}_\Sigma \to \mathcal{O}$ arising from a newform $f$ in $\mathcal{N}_\Sigma$ with coefficients in $\mathcal{O}$.

The symmetric square L-function: For each prime $p$ let $\alpha_p(f)$ and $\beta_p(f)$ denote the roots of the polynomial $X^2 - a_p(f)X + p\,\delta_p = 0$, where $\delta_p = 0$ if $p$ divides $N_f$ and $\delta_p = 1$ otherwise. Thus we have
\[ L_p(f, s) = (1 - \alpha_p(f)p^{-s})^{-1}(1 - \beta_p(f)p^{-s})^{-1}. \]
If $p = \ell$ divides $N/N_f$, then we require that $\alpha_\ell(f)$ be the root which is a unit in $\mathcal{O}$, i.e. $a_\ell(g)$. We define the symmetric square L-function associated to $f$ as
\[ L(\mathrm{Symm}^2 f, s) = \prod_p L_p(\mathrm{Symm}^2 f, s), \]
where
\[ L_p(\mathrm{Symm}^2 f, s) = (1 - \alpha_p^2(f)p^{-s})^{-1}(1 - \alpha_p(f)\beta_p(f)p^{-s})^{-1}(1 - \beta_p^2(f)p^{-s})^{-1}. \]
(We caution the reader that we have defined the Euler factors rather naively at primes $p$ such that $p^2$ divides $N_f$.) The product converges absolutely for real part of $s > 2$ and can be analytically continued to an entire function.

The calculation of $\eta_\Sigma$: We write $\wp$ for the kernel of $\pi$ and $I$ for the annihilator of $\wp$ in $\mathbf{T}_\Sigma$. Since $\mathbf{T}_\Sigma$ is reduced, we have $\wp \cap I = 0$ and $\wp \oplus I$ has finite index in $\mathbf{T}_\Sigma$. Note that
\[ \mathcal{O}/\eta_\Sigma \cong \mathbf{T}_\Sigma/(\wp \oplus I). \]
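The Euler factor $L_p(\mathrm{Symm}^2 f, s)$ defined above can be checked numerically at a prime not dividing $N_f$. The sketch below assumes only that $\alpha_p + \beta_p = a_p$ and $\alpha_p\beta_p = p$; the sample value $a_p = -2$ at $p = 2$ and the function names are illustrative assumptions. It compares the product over the roots with the expanded form $(1 - (a_p^2 - 2p)x + p^2x^2)(1 - px)$, where $x = p^{-s}$.

```python
import cmath

def local_roots(a_p, p):
    """Roots alpha, beta of X^2 - a_p X + p (assumes p does not divide N_f)."""
    disc = cmath.sqrt(a_p * a_p - 4 * p)
    return (a_p + disc) / 2, (a_p - disc) / 2

def sym2_euler_factor(a_p, p, s):
    """L_p(Symm^2 f, s) computed directly from the roots."""
    alpha, beta = local_roots(a_p, p)
    x = p ** (-s)
    return 1 / ((1 - alpha**2 * x) * (1 - alpha * beta * x) * (1 - beta**2 * x))

def sym2_euler_factor_closed(a_p, p, s):
    """Same factor, written in terms of a_p and p only."""
    x = p ** (-s)
    return 1 / ((1 - (a_p**2 - 2 * p) * x + p**2 * x**2) * (1 - p * x))
```

The closed form makes it visible that the factor depends only on $a_p^2$ and $p$, so no choice of ordering of the roots is involved at such primes.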

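Suppose that we are given a $\mathbf{T}_\Sigma$-module $L$ which is finitely generated and free over $\mathcal{O}$, and such that $L \otimes K$ is free of rank $d$ over $\mathbf{T}_\Sigma \otimes_{\mathcal{O}} K$. Suppose also that $L$ is endowed with a perfect $\mathcal{O}$-bilinear pairing
\[ L \times L \to \mathcal{O}, \qquad (x, y) \mapsto \langle x, y \rangle \]
such that $\langle Tx, y \rangle = \langle x, Ty \rangle$ for all $x, y \in L$ and $T \in \mathbf{T}_\Sigma$. Note that to give such a pairing is equivalent to giving an isomorphism
\[ L \cong \mathrm{Hom}_{\mathcal{O}}(L, \mathcal{O}), \qquad x \mapsto \langle x, \cdot \rangle \]
of $\mathbf{T}_\Sigma$-modules. We shall refer to the module $L$ together with the pairing $\langle\ ,\ \rangle$ as a self-dual $\mathbf{T}_\Sigma$-module of rank $d$. We then can give a lower bound for the size of $\mathcal{O}/\eta_\Sigma$ in terms of a basis $\{x_1, x_2, \ldots, x_d\}$ for the free $\mathcal{O}$-module $L[\wp]$:

Lemma 4.17 We have
\[ \eta_\Sigma^d \subset \mathcal{O}\det(\langle x_i, x_j \rangle)_{i,j}, \]
and equality holds if $L$ is free over $\mathbf{T}_\Sigma$.

Proof: The modules $L[\wp]$ and $L/L[I]$ are free of rank $d$ over $\mathcal{O}$ and the pairing $\langle\ ,\ \rangle$ induces an isomorphism
\[ L/L[I] \cong \mathrm{Hom}_{\mathcal{O}}(L[\wp], \mathcal{O}). \]
The $\mathcal{O}$-module $M = L/(L[\wp] \oplus L[I])$ is annihilated by $\eta_\Sigma$ and is generated by $d$ elements, so that $\#M \le \#(\mathcal{O}/\eta_\Sigma)^d$, with equality if $L$ is free. The cardinality of $M$ is that of the cokernel of the map
\[ L[\wp] \hookrightarrow \mathrm{Hom}_{\mathcal{O}}(L[\wp], \mathcal{O}) \]
arising from the pairing $\langle\ ,\ \rangle$, and this is precisely
\[ \#\bigl(\mathcal{O}/\det(\langle x_i, x_j \rangle)_{i,j}\bigr). \]
□

Recall that proposition 4.7 establishes an isomorphism between $\mathbf{T}_\Sigma$ and $\mathbf{T}_{\mathfrak{m}}$, the localization at a certain maximal ideal $\mathfrak{m}$ of $\mathbf{T}_{\mathcal{O}} = \mathbf{T}_{\mathbf{Z}} \otimes \mathcal{O}$, where
\[ \mathbf{T}_{\mathbf{Z}} \subset \mathrm{End}(S_2(\Gamma_0(N))), \]
and where $N = N_\Sigma$ is defined in (4.2.1). Write $X$ for $X_0(N)$ and $J$ for $J_0(N)$. Recall that $H_1(X, \mathbf{Z})$ is endowed with the structure of a $\mathbf{T}_{\mathbf{Z}}$-module and
\[ T_\ell(J) \otimes_{\mathbf{Z}_\ell} \mathcal{O} \cong H_1(X, \mathbf{Z}) \otimes \mathcal{O} \]
with that of a $\mathbf{T}_{\mathcal{O}}$-module. We also regard
\[ H^1(X, \mathcal{O}) \cong \mathrm{Hom}(H_1(X, \mathbf{Z}), \mathcal{O}) \]
as a $\mathbf{T}_{\mathcal{O}}$-module; as such it is naturally isomorphic to
\[ \mathrm{Hom}_{\mathcal{O}}(T_\ell(J) \otimes_{\mathbf{Z}_\ell} \mathcal{O}, \mathcal{O}). \]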

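The Weil pairing defines a perfect pairing
\[ T_\ell(J) \times T_\ell(J) \to \mathbf{Z}_\ell \]
(with $(e^{2\pi i/\ell^n})_n$ as the chosen generator for $\varprojlim \mu_{\ell^n}(\mathbf{C})$). The composite
\[ T_\ell(J) \xrightarrow{\;w\;} T_\ell(J) \to \mathrm{Hom}_{\mathbf{Z}_\ell}(T_\ell(J), \mathbf{Z}_\ell) \qquad (4.4.2) \]
is an isomorphism of $\mathbf{T}_{\mathbf{Z}_\ell}$-modules, where $w$ is induced by the involution $w = w_N$ of $X$ (see section 1.4). Tensoring with $\mathcal{O}$ and localizing at $\mathfrak{m}$, we regard
\[ L_T = (T_\ell(J) \otimes_{\mathbf{Z}_\ell} \mathcal{O})_{\mathfrak{m}} \]
as a self-dual $\mathbf{T}_\Sigma$-module of rank 2 via the isomorphism $\mathbf{T}_\Sigma \cong \mathbf{T}_{\mathfrak{m}}$ of proposition 4.7. We shall write $\langle\ ,\ \rangle_T$ for the pairing obtained from (4.4.2). Similarly, using $w^*$, the cup product and Poincaré duality, we regard
\[ L_H = H^1(X, \mathcal{O})_{\mathfrak{m}} \]
as a self-dual $\mathbf{T}_\Sigma$-module of rank 2, with $\langle x, y \rangle_H$ defined by the image of $x \cup w^* y$ under the canonical isomorphism of $H^2(X, \mathcal{O})$ with $\mathcal{O}$. Note that $L_H$ is canonically isomorphic as a $\mathbf{T}_\Sigma$-module to $\mathrm{Hom}_{\mathcal{O}}(L_T, \mathcal{O})$. In the next section we shall discuss the following generalization of a result of Mazur [Maz1].

Theorem 4.18 The $\mathbf{T}_\Sigma$-modules $L_T$ and $L_H$ are free.

Corollary 4.19 If $\{x, y\}$ is a basis for $L_T[\wp]$ (resp. $L_H[\wp]$), then $\eta_\Sigma$ is generated by $\langle x, y \rangle_T$ (resp. $\langle x, y \rangle_H$).

This follows from lemma 4.17, theorem 4.18 and skew-symmetry of the pairings.

Hida's formula: We now explain how the value of $\langle x, y \rangle_H$ in corollary 4.19 is related to $L(\mathrm{Symm}^2 f, 2)$ by a formula of Hida (see [Hi1] sec. 5, [W3] sec. 4.1).

Recall that the homomorphism
\[ \mathbf{T}_{\mathcal{O}} \to \mathbf{T}_{\mathfrak{m}} \cong \mathbf{T}_\Sigma \to \mathcal{O} \]
arises as $T_n \mapsto a_n(g)$ for a normalized eigenform $g$ in $S_2(\Gamma_0(N), K)$ whose associated newform is $f$. We shall write $P$ for the kernel of this homomorphism; thus $P$ is the preimage in $\mathbf{T}_{\mathcal{O}}$ of the ideal $\wp$ of $\mathbf{T}_\Sigma$. Note that $P_{\mathfrak{m}}$ corresponds to $\wp$ under the isomorphism $\mathbf{T}_{\mathfrak{m}} \cong \mathbf{T}_\Sigma$.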

Choose a number field K0 containing the Fourier coefficients of g and let

where the last $\langle\ ,\ \rangle$ denotes the Petersson inner product and we have used that $wg^c = (wg)^c$ for forms on $\Gamma_0(N)$. We then appeal to a formula of Shimura (see [Shi5] (2.5) and [Hi1] (5.13)) to obtain
\[ \langle g, (wg)^c \rangle = (48\pi)^{-1}\,[SL_2(\mathbf{Z}) : \Gamma_0(N)]\,\mathrm{res}_{s=2} D(g, wg, s). \]

PBy [Shi5] lemma 1, the Dirichlet series D(g, h, s) is defined by an (g)an (h)nsand if g and h are normalized eigenforms then this has an Euler productexpression in which the factors are(1 p (g)p (g)p (h)p (h)p2s ) (1 p (g)p (h)ps )1 (1 p (g)p (h)ps )1 (1 p (g)p (h)ps )1 (1 p (g)p (h)ps )1 ,(where the p s and p s are defined as they were for f ). Using the recipe forobtaining g from f (see lemma 4.6) we find that if ` is not in thenYD(g, wg, s) = D (f, f, s)p1 ,p|N/Nf

of T0m0 -modules. Note that T is also the composite of the canonical splittingL0T , T` (J 0 ) Z` O

with (4.4.4).The construction of T is similar for p = ` except that we have only twocopies of J and the map L2T LT is given by the matrix (1, `1 ) where ` isthe unit root of (3.3.2). For arbitrary 0 satisfying 0 we define Tas a composite of the maps defined above. It is independent of the choice ofordering of 0 . Recall that 0T is used to denote the dual map LT L0T .We define H : L0H LH as the adjoint of the map 0H which renders thediagram0H : H 1 (X, O)m H 1 (X 0 , O)m0

HomO (LT , O) HomO (L0T , O)

commutative (where the vertical maps are the natural isomorphisms and thebottom one is dual to T ).The second crucial result whose discussion we postpone until the nextsection is the following generalization of a lemma of Ihara.131

Lemma 4.24 T and H are surjective.

Suppose again we are in the case 0 = {p} with p 6= `. One need onlyunravel the definition of 0H to see that the diagramH 1 (X, O0 )[P0 ] H 1 (X 0 , O0 )[P00 ]

LH

0H

L0H ,

commutes, where the top arrow is defined as

p1 (Tp ) + p1 .Extending scalars to C, this map sends g to g0 and gc to g0c where g 0 isdefined by (4.4.3). In the case 0 = {`}, the same assertion holds if O0 ischosen so that it contains (` ) = a` (g 0 ) and the top arrow is defined as (` )1 .We conclude that if {x, y} is a basis for H 1 (X, O0 )[P0 ] and the matrices Aand A0 are defined using the bases {x, y} for H 1 (X, O0 )[P0 ] and {0H x, 0H y}for H 1 (X 0 , O0 )[P00 ], then A0 = A. Theorem 3.36 is now immediate from corollary 4.21. In fact, we have proved thatY0 = (cp ) .p0

2Remark 4.25 Note that to obtain the inclusion stated in theorem 3.36 it suffices to apply theorem 4.18 for , rather than both and 0 (cf. remark 4.22).Wiles argument: The method of [W3] sec. 2.2, like that of Ribet in [R4],is more direct than the one given above but does not explicitly illustrate therelation with values of L-functions. The approach is simply to compute thecomposite T 0T . It suffices to consider the case 0 = {p} and we supposefirst that p 6= `. Let denote the endomorphism of J 3 defined by the matrix

= w0 ( , , )w .

Using the relations w 0 = w and w 0 = w, we find that

!

= ,


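which can be computed for example by considering its effect on the cotangent space $S_2(\Gamma_0(N))^3$. The result is that this endomorphism is given by the matrix
\[ \begin{pmatrix} p(p+1) & pT_p & T_p^2 - (p+1) \\ pT_p & p(p+1) & pT_p \\ T_p^2 - (p+1) & pT_p & p(p+1) \end{pmatrix}, \]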

which we note commutes with the action of TZ on J 3 . One then checks that!

Thus in either case we find that T 0T acts on L0T [P] by an element of O =

Tm /P which is a unit times (cp ). Theorem 3.36 then follows from corollary 4.19 and lemma 4.24.
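The claim that the composite acts by a unit times the image of $c_p$ can be sanity-checked with exact arithmetic. The sketch below rests on two assumptions that are not visible in this part of the text: that the map from three copies to one copy is given by the row vector $v = (1, -p^{-1}T_p, p^{-1})$, and that $c_p = (p-1)(T_p^2 - (p+1)^2)$ as in the statement of theorem 3.36. Under these assumptions, with $T_p$ replaced by a scalar eigenvalue $t$ and $\delta$ denoting the symmetric $3 \times 3$ matrix with diagonal entries $p(p+1)$, off-diagonal neighbours $pt$ and corner entries $t^2 - (p+1)$, one finds $v\,\delta\,v^{t} = -p^{-1}(p-1)(t^2 - (p+1)^2)$.

```python
from fractions import Fraction

def delta_matrix(p, t):
    """The symmetric 3x3 matrix, with T_p replaced by a scalar t."""
    return [
        [p * (p + 1), p * t, t * t - (p + 1)],
        [p * t, p * (p + 1), p * t],
        [t * t - (p + 1), p * t, p * (p + 1)],
    ]

def composite_scalar(p, t):
    """v * delta * v^T for the ASSUMED row vector v = (1, -t/p, 1/p)."""
    v = [Fraction(1), Fraction(-t, p), Fraction(1, p)]
    m = delta_matrix(p, t)
    return sum(v[i] * m[i][j] * v[j] for i in range(3) for j in range(3))

def unit_times_c_p(p, t):
    """-(1/p) * c_p with the ASSUMED c_p = (p - 1)(t^2 - (p + 1)^2)."""
    return Fraction(-(p - 1) * (t * t - (p + 1) ** 2), p)
```

Since $p \neq \ell$, the factor $-p^{-1}$ is a unit in $\mathcal{O}$, which is consistent with the statement that the composite is a unit times the image of $c_p$.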

4.5 Homological results

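In this section we sketch the proofs of theorem 4.18 and lemma 4.24, but shall often refer the reader to ch. 2 of [W3] for more details.

Multiplicities: We first consider theorem 4.18, generalizing a result proved by Mazur in sections II.14 and II.15 of [Maz1]. Recall that $L_T$ (resp. $L_H$) is defined as $(T_\ell(J_0(N)) \otimes_{\mathbf{Z}_\ell} \mathcal{O})_{\mathfrak{m}}$ (resp. $H^1(X_0(N), \mathcal{O})_{\mathfrak{m}}$) where $\mathfrak{m}$ is a maximal ideal of the Hecke algebra $\mathbf{T}_{\mathcal{O}}$ generated by the Hecke operators on $S_2(\Gamma_0(N))$. We are assuming moreover that $\bar\rho_{\mathfrak{m}}$ is irreducible and that one of the following holds:

• $\ell$ does not divide $N$;

• $\ell^2$ does not divide $N$ and $T_\ell \notin \mathfrak{m}$.

We wish to prove that $L_T$ and $L_H$ are free over $\mathbf{T}_{\mathfrak{m}}$.

Since $L_H$ and $L_T$ are isomorphic as $\mathbf{T}_{\mathfrak{m}}$-modules, it suffices to prove that $L_T$ is free. Furthermore, to prove that $L_T$ is free, it suffices to prove that the localization of $T_\ell(J_0(N))$ at $\mathfrak{m} \cap \mathbf{T}_{\mathbf{Z}_\ell}$ is free; i.e., we may replace $\mathcal{O}$ by $\mathbf{Z}_\ell$. Since
\[ T_\ell(J_0(N))/\ell T_\ell(J_0(N)) \cong J_0(N)[\ell] \cong \mathrm{Hom}_{\mathbf{F}_\ell}(J_0(N)[\ell], \mathbf{F}_\ell) \]
as $\mathbf{T}_{\mathbf{F}_\ell}$-modules, it suffices to prove the following cases of [W3] thm. 2.1.

Theorem 4.26 Let $\mathfrak{m}$ be a maximal ideal of $\mathbf{T}_{\mathbf{Z}_\ell}$. Suppose that $\bar\rho_{\mathfrak{m}}$ is irreducible and that either

(a) $\ell$ does not divide $N$, or

(b) $\ell^2 \nmid N$ and $T_\ell \notin \mathfrak{m}$.

Then
\[ \dim_{\mathbf{T}_{\mathbf{Z}_\ell}/\mathfrak{m}} J_0(N)[\mathfrak{m}] = \dim_{\mathbf{T}_{\mathbf{Z}_\ell}/\mathfrak{m}} (J_0(N)[\ell]/\mathfrak{m}) = 2. \]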

The proof of theorem 4.26 in case (b) requires more of the theory of groupschemes and Neron models than we wish to delve into here. We shall thereforeonly explain the proof in the case (a) (which is [R5] thm. 5.2(b), see Ribetspaper for more details) and refer to [W3] sec. 2.1 for the general case. Weremark however that the proof of theorem 3.42 appeals to theorem 3.36 onlyin the case = (cf. remarks 4.22 and remark 4.25). Recall that ` divides Nonly if is not good. So for the purposes of proving the Shimura-Taniyamaconjecture for semistable elliptic curves, (b) can be replaced by the strongerhypothesis(b0 ) `2 - N and m |G` is not good.

The result in this case is due to Mazur and Ribet (the main result of [MRi])and the proof is slightly easier than in the case of (b).Returning to case (a), we appeal to a general property of the functor D oftheorem 2.31 which follows from cor. 5.11 of [Oda].Theorem 4.27 Suppose that A is an abelian variety over Z` (i.e., the Neronmodel of an abelian variety over Q` with good reduction). There is a canonicaland functorial isomorphism of vector spaces over F` ` )[`] )0 D(A(Q= Cot 0 (A/F` ),where Cot 0 denotes the cotangent space at the origin.134

In the case that A is the Jacobian of a smooth proper curve X over Z` , we

Consider now the action of GQ on the points of

V = J0 (N )[`] [m] = (J0 (N )[`]/m) .By the argument in [Maz1] prop. II.14.2 or by the main result of [BLR], weknow that every Jordan-Holder constituent of the representation of GQ on thisT/m-vector space is isomorphic to m = m . It follows thatdimT/m D(V ) = 2 dimT/m D(V )0where we now regard V as a good G` -module. On the other hand (4.5.1)implies thatD(V )0 = S2 (0 (N ), F` )[m],which is one-dimensional over T/m by the q-expansion principle (lemma 1.34for example). We have now shown that D(V ) is two-dimensional over T/m,and it follows that so is V .Iharas lemma: We now sketch the proof of lemma 4.24, which proceedsby analyzing the behavior of the homology of modular curves under certaindegeneracy maps. Note that it suffices to prove the lemma for T .If N and M are positive integers, then we let 1 (N, M ) = 1 (N ) 0 (M )and write Y1 (N, M ) (resp. X1 (N, M )) for the associated non-compactified(resp. compactified) modular curve. The key intermediate result is the following:Lemma 4.28 Suppose that N is a positive integer and p is a prime not dividing N .(a) The mapH1 (X1 (N, p), Z) H1 (X1 (N ), Z)2x7 ( x, x)

is surjective, where is defined by 7 and by 7 p .


(b) If N > 3 and ` is an odd prime different from p, then the sequenceH1 (Y1 (N p, p2 ), Z` ) H1 (Y1 (N p), Z` )2 H1 (Y1 (N ), Z` )x7 (1 x, 1 x); (y, z) 7 2 y 2 zis exact, where the maps 1 and 2 are defined by 7 and 1 and 2by 7 p .To prove the lemma, one first translates the statement into one about thehomology (or cohomology) of the congruence subgroups involved. Part (a)of the lemma is due to Ihara [Ih], but see also the proof of thm. 4.1 of [R4].Part (b) due to Wiles is more elementary and is established in the course ofproving lemma 2.5 of [W3] (see the sequence (2.13)). The result stated thereis in terms of cohomology rather than homology, but the one given here isimmediate from it.Remark 4.29 A method of Khare [Kh], sec. 2 using modular symbols yieldsan alternate proof of part (b) (after localization at a non-Eisenstein maximalideal).To deduce lemma 4.24 from lemma 4.28 we also need the following result:Lemma 4.30 Suppose that O is the ring of integers of a finite extension of Q` .O = TZ OLet M be a positive integer and H a subgroup of (Z/M Z) . Let T Z is the subring of endomorphisms of M2 (1 (M )) generated by thewhere Toperators hdi and Tr for primes r - M . If n is a non-Eisenstein maximal idealthen the localization at n of the natural map(a) H1 (Y1 (M ), O) H1 (X1 (M ), O) is an isomorphism;(b) H1 (X1 (M ), O) H1 (XH (M ), O) is surjective.To prove part (a), one checks thatH1 (Y1 (M ), Z) H1 (X1 (M ), Z)is surjective and that Tr = r + 1 on the kernel for primes r 1 mod M (cf.proposition 4.13, part (c)).Part (b) is most easily proved by showing that GQ acts trivially on thecokernel of the natural mapT` (J1 (M )) T` (JH (M ))(using [LO], prop. 6 for example) and then applying lemma 4.12.136

hence that of (4.4.4) upon localizing at m0 .

Commutative algebra

In this section we collect some basic facts of commutative algebra that areused in the proof. We recall that O is the ring of integers in a finite extensionK of Q` , and that O has residue field k. Let CO denote as in section 2.6 thecategory of complete noetherian local O-algebras with residue field k.


5.1 Wiles' numerical criterion

In this section we state a numerical criterion discovered by Wiles for a map between two rings in $\mathcal{C}_{\mathcal{O}}$ to be an isomorphism.

Complete intersections: We say that a ring $A$ in $\mathcal{C}_{\mathcal{O}}$ is finite flat if it is finitely generated and torsion free as an $\mathcal{O}$-module. A key role in Wiles' isomorphism criterion is played by the concept of complete intersection, for which we give the following naive definition.

Definition 5.1 An object $A$ in $\mathcal{C}_{\mathcal{O}}$ which is finite flat is called a complete intersection if it can be expressed as a quotient
\[ A \simeq \mathcal{O}[[X_1, \ldots, X_n]]/(f_1, \ldots, f_n), \]
where there are as many relations as there are variables.

Remark 5.2 It is also true that if an object $\mathcal{O}[[Y_1, \ldots, Y_r]]/J$ in $\mathcal{C}_{\mathcal{O}}$ which is finite flat is a complete intersection, then necessarily $J$ can be generated by $r$ elements. See for example [Mat], thm. 21.2 (and lemma 5.11 below). We will not use this fact in this chapter, although a special case is proved in the course of establishing lemma 5.30.

The category $\mathcal{C}_{\mathcal{O}}^{\bullet}$: The numerical criterion of Wiles is stated more naturally in terms of rings $A$ in the category $\mathcal{C}_{\mathcal{O}}$ which are endowed with some extra structure: namely, a surjective $\mathcal{O}$-algebra homomorphism $\pi_A : A \to \mathcal{O}$. Let $\mathcal{C}_{\mathcal{O}}^{\bullet}$ be the category whose objects are pairs $(A, \pi_A)$, where $A$ is an object of $\mathcal{C}_{\mathcal{O}}$ and $\pi_A : A \to \mathcal{O}$ is a surjective $\mathcal{O}$-algebra homomorphism, also called the augmentation map attached to $A$. Morphisms in $\mathcal{C}_{\mathcal{O}}^{\bullet}$ are local ring homomorphisms which are compatible in the obvious way with the augmentation maps. By abuse of notation one often omits mentioning the augmentation map $\pi_A$ when talking of objects in $\mathcal{C}_{\mathcal{O}}^{\bullet}$, and simply uses $A$ to denote $(A, \pi_A)$, when this causes no confusion. Objects of $\mathcal{C}_{\mathcal{O}}^{\bullet}$ will also be called augmented rings.

The invariants $\Phi_A$ and $\eta_A$: One associates to an augmented ring $(A, \pi_A)$ two basic invariants:
\[ \Phi_A = (\ker \pi_A)/(\ker \pi_A)^2; \qquad \eta_A = \pi_A(\mathrm{Ann}_A \ker \pi_A). \]
Here $\mathrm{Ann}_A(I)$ denotes the annihilator ideal of the ideal $I$ in $A$.

The invariant $\Phi_A$ can be thought of as a tangent space for the object $A$. (More precisely, it is the cotangent space of the scheme $\mathrm{spec}(A)$ at the point $\ker \pi_A$.) It is a finitely generated $\mathcal{O}$-module.

The invariant $\eta_A$ seems less familiar at first sight. It is called the congruence ideal. (The reason for this terminology should become clearer shortly.) We are now ready to state Wiles' numerical criterion:

Theorem 5.3 Let $\phi : R \to T$ be a surjective morphism of augmented rings. Assume that $T$ is finitely generated and torsion-free as an $\mathcal{O}$-module, and that $\eta_T \neq (0)$. (And hence, in particular, $\#(\mathcal{O}/\eta_T) < \infty$.) Then the following are equivalent:

(a) The inequality $\#\Phi_R \le \#(\mathcal{O}/\eta_T)$ is satisfied.

(b) The equality $\#\Phi_R = \#(\mathcal{O}/\eta_T)$ is satisfied.

(c) The rings $R$ and $T$ are complete intersections, and the map $\phi : R \to T$ is an isomorphism.

Remark 5.4 The above theorem is slightly different from the one that appears in [W3], where it is assumed from the outset that the ring $T$ is Gorenstein. In the form in which we state it above, the theorem is due to H. Lenstra [Len]. Our presentation follows Lenstra's very closely. For the original (and slightly different) point of view, the reader should consult the appendix of [W3].

Some examples: Before going further, it may be good to pause and consider some examples of objects of $\mathcal{C}_{\mathcal{O}}^{\bullet}$ and the invariants associated to them. While logically independent of the proof, the examples should help the reader develop some intuition. (For a systematic way to compute the tangent spaces $\Phi_A$, see the paragraph at the end of section 5.2.)

Example 1: $A = \{(a, b) \in \mathcal{O} \times \mathcal{O},$

A ' O/O O/O,

Example 6: ( = ` 1 (mod 4),

with A (a, b + ci) := a.

A ' Z/`2 Z,

A = (`2 ).

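Example 7: $A = \mathcal{O}[[T]]/(\lambda T) \simeq \mathcal{O} \oplus kT \oplus kT^2 \oplus \cdots$, with $\pi_A(f) = f(0)$.
\[ \Phi_A \simeq k, \qquad \eta_A = (\lambda). \]

5.2 Basic properties of $\Phi_A$ and $\eta_A$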

In this section we collect some of the basic properties of the invariants $\Phi_A$ and $\eta_A$, and prove the equivalence of (a) and (b) in theorem 5.3.

Behaviour of $\Phi_A$ under morphisms: The assignment $A \mapsto \Phi_A$ is a functor from the category $\mathcal{C}_{\mathcal{O}}^{\bullet}$ to the category of $\mathcal{O}$-modules; a morphism $A \to B$ in $\mathcal{C}_{\mathcal{O}}^{\bullet}$ induces a homomorphism $\Phi_A \to \Phi_B$ of $\mathcal{O}$-modules. Moreover, if $A \to B$ is surjective, then so is the induced map on the tangent spaces. Therefore, when $A$ maps surjectively onto $B$ we have
\[ \#\Phi_A \ge \#\Phi_B. \qquad (5.2.1) \]
There is also a converse to this, which will be useful later:

Lemma 5.5 If the homomorphism $\Phi_A \to \Phi_B$ is surjective, then $A \to B$ is also surjective.

Proof: This follows from Nakayama's lemma. (Cf. [Ha], ch. II, sec. 7.4 and [Mat], th. 8.4.) □

Behaviour of $\eta_A$ under (surjective) morphisms: Unlike the assignment $A \mapsto \Phi_A$, the assignment $A \mapsto \eta_A$ is not functorial, but it does have a nice behaviour under surjective morphisms: namely, if $\phi : A \to B$ is surjective, then
\[ \eta_A \subset \eta_B, \quad \text{i.e.,} \quad \#(\mathcal{O}/\eta_A) \ge \#(\mathcal{O}/\eta_B). \qquad (5.2.2) \]
This is simply because in that case $\phi$ induces a map
\[ \mathrm{Ann}_A \ker \pi_A \to \mathrm{Ann}_B \ker \pi_B. \]

Relation between the invariants $\Phi_A$ and $\eta_A$: In general, we have the following inequality:
\[ \#\Phi_A \ge \#(\mathcal{O}/\eta_A). \qquad (5.2.3) \]
The key to proving this inequality is to interpret $\#\Phi_A$ in terms of Fitting ideals.

Digression on Fitting ideals: If $R$ is a ring (in $\mathcal{C}_{\mathcal{O}}$, say) and $M$ is a finitely generated $R$-module, we express $M$ as a quotient of $R^n$ for some $n$:
\[ 0 \to M' \to R^n \to M \to 0. \qquad (5.2.4) \]

The Fitting ideal of $M$, denoted $\mathrm{Fitt}_R(M)$, is the ideal of $R$ generated by the determinants $\det(v_1, \ldots, v_n)$, where the vectors $v_i \in R^n$ range over all possible choices of elements of $M' \subset R^n$. One checks that this ideal does not depend on the choice of exact sequence (5.2.4), and hence is an invariant of the $R$-module $M$. For example, if $M$ is a finitely generated $\mathcal{O}$-module, we may write
\[ M = \mathcal{O}^r \oplus \mathcal{O}/(\lambda^{n_1}) \oplus \mathcal{O}/(\lambda^{n_2}) \oplus \cdots \oplus \mathcal{O}/(\lambda^{n_k}), \]
with $n_1 \ge n_2 \ge \cdots \ge n_k$, and the sequence $n_i$ is completely determined by $M$. The Fitting ideal $\mathrm{Fitt}_{\mathcal{O}}(M)$ is then the ideal of $\mathcal{O}$ generated by $\lambda^{n_1 + \cdots + n_k}$ if $r = 0$, and is the 0-ideal if $r > 0$. Note in particular that, if $M$ is a finite $\mathcal{O}$-module, then
\[ \#M = \#(\mathcal{O}/\mathrm{Fitt}_{\mathcal{O}}(M)). \qquad (5.2.5) \]
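The definition of the Fitting ideal by determinants can be tried out directly. The following is a minimal sketch in which $\mathbf{Z}$ stands in for $\mathcal{O}$ (an assumption made so that the ideal has a single non-negative generator); `relations` is a hypothetical list of generators of $M' \subset \mathbf{Z}^n$, and the function returns the gcd of all $n \times n$ determinants formed from them.

```python
from itertools import combinations
from math import gcd

def det(matrix):
    """Integer determinant by Laplace expansion (adequate for small sizes)."""
    n = len(matrix)
    if n == 0:
        return 1
    if n == 1:
        return matrix[0][0]
    total = 0
    for j, entry in enumerate(matrix[0]):
        if entry != 0:
            minor = [row[:j] + row[j + 1:] for row in matrix[1:]]
            total += (-1) ** j * entry * det(minor)
    return total

def fitting_generator(relations, n):
    """Non-negative generator of Fitt_Z(Z^n / span(relations)).

    Returns 0 when the quotient has a free part, matching the case r > 0.
    """
    generator = 0
    for rows in combinations(relations, n):
        generator = gcd(generator, det([list(r) for r in rows]))
    return generator
```

For a finite quotient the generator equals the order of the module, in line with (5.2.5): `fitting_generator([(2, 0), (0, 4), (2, 4)], 2)` gives 8 for $\mathbf{Z}/2 \oplus \mathbf{Z}/4$, and two different presentations of $\mathbf{Z}/6$ give the same answer.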

Furthermore, if $M$ is any finitely generated $R$-module, it follows directly from the definition that
\[ \mathrm{Fitt}_R(M) \subset \mathrm{Ann}_R(M). \qquad (5.2.6) \]
Fitting ideals behave well under tensor products: in particular, if $M$ is a finitely generated $A$-module, where $A$ is an object in $\mathcal{C}_{\mathcal{O}}^{\bullet}$, then:
\[ \pi_A(\mathrm{Fitt}_A(M)) = \mathrm{Fitt}_{\mathcal{O}}(M \otimes_A \mathcal{O}), \qquad (5.2.7) \]
where the tensor product is taken with respect to the augmentation map $\pi_A$. For more details and references on the Fitting ideal, see [Len] for example.

Now we are ready to prove equation (5.2.3). For, noting that $\Phi_A = \ker \pi_A \otimes_A \mathcal{O}$, where the tensor product is taken with respect to $\pi_A$, and applying equation (5.2.7) with $M = \ker \pi_A$, we have:
\[ \mathrm{Fitt}_{\mathcal{O}}(\Phi_A) = \pi_A(\mathrm{Fitt}_A(\ker \pi_A)) \subset \pi_A(\mathrm{Ann}_A \ker \pi_A) = \eta_A, \qquad (5.2.8) \]
where the containment follows from equation (5.2.6). Now the inequality (5.2.3) follows by combining (5.2.5) and (5.2.8).

As a consequence, we have

Corollary 5.6 The statements (a) and (b) in theorem 5.3 are equivalent.

Proof: If $R \to T$ is a surjective map of augmented rings, then $\#\Phi_R \ge \#\Phi_T$, by equation (5.2.1). But equation (5.2.3) gives the inequality $\#\Phi_T \ge \#(\mathcal{O}/\eta_T)$. Hence the inequality $\#\Phi_R \ge \#(\mathcal{O}/\eta_T)$ always holds, so that (a) implies (b) in theorem 5.3. The reverse implication is clear. □

Remark 5.7 (Computing the tangent spaces $\Phi_A$): Any object $(A, \pi_A)$ in $\mathcal{C}_{\mathcal{O}}^{\bullet}$ can be expressed as a quotient of the object $U = \mathcal{O}[[X_1, \ldots, X_n]]$ of example 5 with augmentation map given by $\pi_U(f) = f(0)$. Indeed, one can take $a_1, \ldots, a_n$ to be $A$-module generators of the finitely generated $A$-module $\ker \pi_A$, and obtain the desired quotient map by sending $X_i$ to $a_i$.

The tangent space $\Phi_U$ of $U$ is a free $\mathcal{O}$-module of rank $n$ which can be written down canonically as
\[ \mathcal{O}X_1 \oplus \mathcal{O}X_2 \oplus \cdots \oplus \mathcal{O}X_n, \]
the natural map from $\ker \pi_U$ being simply the map which sends a power series $f \in U$ with no constant term to its degree 1 term, which we will denote by $\bar{f}$. If $A$ is expressed as a quotient $U/(f_1, \ldots, f_r)$, then one has
\[ \Phi_A = \Phi_U/(\bar{f}_1, \ldots, \bar{f}_r). \qquad (5.2.9) \]

5.3 Complete intersections and the Gorenstein condition

In this section we show that complete intersections satisfy a Gorenstein condition, and that (c) implies (a) and (b) in the statement of Wiles' isomorphism criterion (theorem 5.3).

Definition 5.8 An object $A$ in $\mathcal{C}_{\mathcal{O}}$ which is finite flat is said to be Gorenstein if
\[ \mathrm{Hom}_{\mathcal{O}}(A, \mathcal{O}) \simeq A \quad \text{as } A\text{-modules}. \]

Proposition 5.9 Suppose $A$ in $\mathcal{C}_{\mathcal{O}}$ is finite flat. If $A$ is a complete intersection, then $A$ is Gorenstein.

The remainder of this section will be devoted to proving proposition 5.9. Since the proof is a bit long and involved, and the concepts it uses are not used elsewhere, the reader is advised on a first reading to take it on faith and skip to the next section. A more direct proof of proposition 5.9 which bypasses the arguments of this section is explained in [Len].

We let $A$ be a ring which is finite flat, and is a complete intersection. (Hence, $A$ can be written as $\mathcal{O}[[X_1, \ldots, X_n]]/(f_1, \ldots, f_n)$.) We assume that the augmentation map for $A$ is induced from the map on $\mathcal{O}[[X_1, \ldots, X_n]]$ sending $f$ to $f(0)$. This implies that $f_i(0) = 0$, i.e., the $f_i$ have no constant term.

We recall some definitions from commutative algebra that we will need. An ideal $I$ of a local ring $R$ is said to be primary if $I \neq R$ and every zero divisor in $R/I$ is nilpotent. If $(x_1, \ldots, x_n)$ generates a primary ideal of $R$, and $n = \dim R$, then $(x_1, \ldots, x_n)$ is called a system of parameters for $R$.

Lemma 5.10 The sequence $(f_1, \ldots, f_n, \lambda)$ is a system of parameters for $U = \mathcal{O}[[X_1, \ldots, X_n]]$.

Proof: The quotient ring $U/(f_1, \ldots, f_n, \lambda)$ is local and is finitely generated as a $k$-vector space; therefore every element in its maximal ideal is nilpotent. □

A sequence $(x_1, \ldots, x_n)$ in a ring $R$ is said to be a regular sequence if $x_i$ is not a zero-divisor in $R/(x_1, \ldots, x_{i-1})$ for $i = 1, \ldots, n$.

Lemma 5.11 The sequence $(f_1, \ldots, f_n)$ is a regular sequence for $U$.

Proof: The ring $U$ is Cohen-Macaulay, since $\lambda, X_1, \ldots, X_n$ is a system of parameters of $U$ which is also a regular sequence. Hence, by theorem 17.4 (iii) of [Mat], $(f_1, \ldots, f_n, \lambda)$ is a regular sequence in $U$. A fortiori, the sequence $(f_1, \ldots, f_n)$ is also a regular sequence. □

The proofs of lemma 5.10 and 5.11 use only the fact that $A$ is finitely generated as an $\mathcal{O}$-module, and not that $A$ is flat. As a corollary of this proof, we therefore have:

$(-1)^t x_{i_t}\, u_{i_1} \wedge \cdots \wedge u_{i_{t-1}} \wedge u_{i_{t+1}} \wedge \cdots \wedge u_{i_p}.$
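The differential of the Koszul complex can be written out as explicit matrices. The sketch below assumes the standard convention that $K_p$ is free on the symbols $u_{i_1} \wedge \cdots \wedge u_{i_p}$ with $i_1 < \cdots < i_p$ and that $d_p$ sends such a symbol to the sum over $t$ of the terms displayed above; the base ring is taken to be $\mathbf{Z}$ for illustration, and the function names are hypothetical.

```python
from itertools import combinations

def koszul_differential(x, p):
    """Matrix of d_p : K_p -> K_{p-1} for the sequence x, with p >= 1.

    Columns are indexed by increasing p-tuples, rows by increasing
    (p-1)-tuples, both in lexicographic order.
    """
    n = len(x)
    sources = list(combinations(range(n), p))
    targets = list(combinations(range(n), p - 1))
    row_of = {face: i for i, face in enumerate(targets)}
    matrix = [[0] * len(sources) for _ in targets]
    for col, idx in enumerate(sources):
        for t in range(1, p + 1):
            face = idx[:t - 1] + idx[t:]           # omit the t-th index
            matrix[row_of[face]][col] += (-1) ** t * x[idx[t - 1]]
    return matrix

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def is_zero(matrix):
    return all(entry == 0 for row in matrix for entry in row)
```

With `x = (2, 3, 5)` one can check that `matmul(koszul_differential(x, 1), koszul_differential(x, 2))` vanishes, which is the statement that the maps form a complex; the image of $d_1$ is the ideal $(x)$, consistent with $H_0(x; R) = R/(x)$.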

We denote by $H_p(x; R)$ the homology groups of this complex. We record here the main properties of this complex that we will use.

Proposition 5.13

(a) $H_0(x; R) = R/(x)$.

(b) There is a long exact homology sequence
\[ \cdots \to H_p(x; R) \to H_p(x, x_{n+1}; R) \to H_{p-1}(x; R) \xrightarrow{\;x_{n+1}\;} H_{p-1}(x; R) \to H_{p-1}(x, x_{n+1}; R) \to H_{p-2}(x; R) \to \cdots \]

(c) $H_p(x; R)$ is annihilated by the ideal $(x)$, i.e., it has a natural $R/(x)$-module structure.

(d) If x is a regular sequence, then Hp (x; R) = 0 for all p > 0 (i.e., thecomplex Kp (x; R) is a free resolution of R/(x).)Proof: The first assertion follows directly from the definition. For (b) and (c),see [Mat], th. 16.4. The assertion (d) can be proved by a direct inductionargument on n, using the long exact homology sequence: For p > 1, thissequence becomes0 Hp (x, xn+1 ; R) 0,and for p = 1, it isxn+1

Lemma 5.14 (Tate): The function (f ) = (f (D)) induces an isomorphism

of O[[X]]-modulesHomO[[X]] (A[[X]], O[[X]])/(g1 , . . . , gn ) A.Proof: By lemma 5.11, the sequence (f ) = (f1 , . . . , fn ) is a regular O[[X]]sequence. One can see directly from the definition that the sequence (g) =(gi ) = (Xi ai ) is a regular A[[X]]-sequence. Let K(f ) and K(g) be the Koszulcomplexes associated to these two sequences. It follows from proposition 5.13that the Koszul complex K(f ) is a resolution of A by free O[[X]]-modules,and the Koszul complex K(g) is a resolution of A by free A[[X]]-modules, andhence a fortiori, by free O[[X]]-modules. We define a map : K(f ) K(g)of complexes by letting0 : K0 (f ) K0 (g)be the natural inclusion of O[[X]] into A[[X]], and letting1 : K1 (f ) K1 (g)be the map defined by(1 (u1 ), . . . , 1 (un )) = (v1 , . . . , vn )M,and extending it by skew-linearity to a map of exterior algebras. One cancheck that the resulting map is a morphism of complexes which induces theidentity map A A, and satisfiesn (u1 un ) = D v1 vn .Applying the functor HomO[[X]] (, O[[X]]) to these two free resolutions, andtaking the homology of the resulting complexes, we find that since is a homotopy equivalence of complexes it induces an isomorphism on the cohomology,and in particular, on the nth cohomology:'

n : HomO[[X]] (A[[X]], O[[X]])/(g1 , . . . , gn )

HomO[[X]] (O[[X]], O[[X]])/(f1 , . . . , fn ) ' A,

which is given explicitly by the formula:

n (f ) = (f (D)). □

We finally come to the proof of proposition 5.9, which we can state in a more precise form.

Lemma 5.15 The map : HomO (A, O) A defined by (f ) = (f(D)),

where f : A[[X]] O[[X]] is the base change of f , is an A-module isomorphism, and hence, A is Gorenstein.Proof: The key point is to show that is A-linear. By definition, ifa = (a0 ) A, with a0 O[[X]],then

(af ) = (f(aD)) = (f((a a0 )D)) + (f(a0 D)).

Since a a0 ker , it can be written as an A[[X]]-linear combination of the

Exercise 5.18 Write down explicitly an isomorphism A HomO (A, O) for

The Congruence ideal for complete intersections

Let A be an object of CO, which is finite flat and is a complete intersection asin the previous section, so that A ' O[[X1 , . . . , Xn ]]/(f1 , . . . , fn ).Using the result of the previous section, we will give an explicit formulafor computing A in this case, and prove that (c) implies (b) in theorem 5.3.Let A := HomO (A, O), and let A : O A be the dual map. Fromthe Gorenstein property of A, we may identify A with A (as A-modules). Fix'any identification : A A. (Any two such differ by a unit in A.) Onechecks thatA (O ) = AnnA ker A .

Hence, A is the image of the map A A . By using the explicit construction

of given in lemma 5.15, we find:A A (O ) = (D),where D is the determinant defined in section 5.3. By a direct calculation usingequation (5.3.1), one sees that the right hand side is equal to det(fi /Xj (0)).Hence we have shown:

Proof: The equation (5.2.9) of remark 5.7 implies that

The usefulness of the notion of complete intersections comes from the following two (vaguely stated) principles:

1. Isomorphisms to complete intersections can often be recognized by looking at their effects on the tangent spaces.

2. Isomorphisms from complete intersections can often be recognized by looking at their effects on the invariants $\eta$.

These vague principles are made precise in theorems 5.21 and 5.24 respectively.

Theorem 5.21 Let $\phi : A \to B$ be a surjective morphism of augmented rings, with $B$ a (finite, flat) complete intersection. If $\phi$ induces an isomorphism from $\Phi_A$ to $\Phi_B$, and these modules are finite, then $\phi$ is an isomorphism.

Remark 5.22 Let $A = \mathcal{O}[[X, Y]]/(X(X - \lambda), Y(Y - \lambda))$ be the ring of example 4, let $B = \mathcal{O}[[X, Y]]/(X(X - \lambda), Y(Y - \lambda), XY)$ be the ring of example 2, and let $\phi : A \to B$ be the natural projection. The map $\phi$ induces an isomorphism $\Phi_A \to \Phi_B$, even though $\phi$ is not an isomorphism. The assumption that $B$ is a complete intersection is crucial for concluding that $\phi$ is an isomorphism.

Remark 5.23 The natural map
$$\mathcal{O}[[X]]/(X^3) \to \mathcal{O}[[X]]/(X^2)$$
is a surjective morphism inducing an isomorphism on tangent spaces, and the target ring is a complete intersection. Yet this map is not an isomorphism. This shows that the assumption on the finiteness of the tangent spaces cannot be dispensed with.

Proof of theorem 5.21: Recall that $U = \mathcal{O}[[X_1, \dots, X_n]]$ is the augmented ring of example 5 of section 5.1. Let
$$\nu_B : U \to B$$
be a surjective morphism of augmented rings with $\ker \nu_B = (f_1, \dots, f_n)$. Let $b_1, \dots, b_n \in \ker \pi_B$ denote the images of $X_1, \dots, X_n$ by $\nu_B$, and let $a_1, \dots, a_n \in \ker \pi_A$ denote inverse images of $b_1, \dots, b_n$ by $\phi$. Since $\phi$ is an isomorphism on tangent spaces, the elements $a_i$ generate $(\ker \pi_A)/(\ker \pi_A)^2$. Hence the morphism
$$\nu_A : \mathcal{O}[[X_1, \dots, X_n]] \to A$$
defined by $\nu_A(X_i) = a_i$ induces a surjection $\Phi_U \to \Phi_A$, and so it is surjective, by lemma 5.5.

We claim that $\ker \nu_B$ is contained in $\ker \nu_A$ (and hence, $\ker \nu_B = \ker \nu_A$). For, let $g_1, \dots, g_n$ be elements of $\ker \nu_A$ whose linear terms $\bar{g}_1, \dots, \bar{g}_n$ generate the kernel of
$$\bar{\nu}_A : \Phi_U \to \Phi_A.$$
Since $\ker \nu_A \subset \ker \nu_B$, it follows that there exists an $n \times n$ matrix $M \in M_n(U)$ with entries in $U$ such that
$$(g_1, \dots, g_n) = (f_1, \dots, f_n) M.$$
Let $\bar{M}$ be the matrix of constant terms of the matrix $M$. Then we have
$$(\bar{g}_1, \dots, \bar{g}_n) = (\bar{f}_1, \dots, \bar{f}_n) \bar{M}.$$
Since $(\bar{g}_1, \dots, \bar{g}_n)$ and $(\bar{f}_1, \dots, \bar{f}_n)$ generate the same submodules of rank $n$ and finite index in $\Phi_U$, it follows that $\det \bar{M}$ is a unit in $\mathcal{O}$. Hence, $M$ is invertible, and therefore the $f_i$ can be expressed as a $U$-linear combination of the $g_j$. This implies that $\ker \nu_B \subset \ker \nu_A$. Now we see that $\nu_A \nu_B^{-1}$ gives a well-defined inverse to $\phi$, so that $\phi$ is an isomorphism. □

Theorem 5.24 Let $\phi : A \to B$ be a surjective morphism of augmented rings. Suppose that $A$ and $B$ are finite flat, and that $A$ is a complete intersection. If $\eta_A = \eta_B \neq 0$, then $\phi$ is an isomorphism.

Remark 5.25 The torsion-freeness assumption on $B$ is essential: if $n$ is large enough, then $B = A/(\ker \pi_A)^n$ satisfies $\eta_A = \eta_B$, although the natural map $A \to B$ is not injective when $A \neq \mathcal{O}$.

Proof: By proposition 5.9, $A$ is Gorenstein, i.e., it satisfies
$$A^{\vee} := \mathrm{Hom}_{\mathcal{O}}(A, \mathcal{O}) \simeq A \quad \text{as } A\text{-modules.}$$
Now, we observe that
$$\ker \pi_A \cap \mathrm{Ann}_A \ker \pi_A = 0, \tag{5.5.1}$$

and likewise for $B$. For, let $x$ be a non-zero element of $\eta_A$, and let $x' \in \mathrm{Ann}_A \ker \pi_A$ satisfy $\pi_A(x') = x$. For all $a \in \ker \pi_A \cap \mathrm{Ann}_A \ker \pi_A$, we have
$$0 = a(x - x') = ax,$$
the first equality because $a$ belongs to $\mathrm{Ann}_A \ker \pi_A$ and $(x - x')$ belongs to $\ker \pi_A$, the second equality because $a$ belongs to $\ker \pi_A$ and $x'$ belongs to $\mathrm{Ann}_A \ker \pi_A$. Hence $a$ belongs to the $\mathcal{O}$-torsion submodule of $A$, and therefore is 0.

It follows from (5.5.1) that the homomorphism $\pi_A$ (resp. $\pi_B$) induces an isomorphism from $\mathrm{Ann}_A \ker \pi_A$ (resp. $\mathrm{Ann}_B \ker \pi_B$) to $\eta_A$ (resp. $\eta_B$). Since $\eta_A = \eta_B$, it follows that $\phi$ induces an isomorphism from $\mathrm{Ann}_A \ker \pi_A$ to $\mathrm{Ann}_B \ker \pi_B$, i.e.,
$$\phi(\mathrm{Ann}_A \ker \pi_A) = \mathrm{Ann}_B \ker \pi_B.$$
From (5.5.1) it also follows a fortiori that
$$\ker \phi \cap \mathrm{Ann}_A \ker \pi_A = 0,$$
hence there is an exact sequence of $A$-modules:
$$0 \to \ker \phi \oplus \mathrm{Ann}_A \ker \pi_A \to A. \tag{5.5.2}$$

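The cokernel of the last map is
$$A/(\ker \phi \oplus \mathrm{Ann}_A \ker \pi_A) \simeq B/\phi(\mathrm{Ann}_A \ker \pi_A) \simeq B/(\mathrm{Ann}_B \ker \pi_B),$$
which is torsion-free, since there is a natural injection
$$B/(\mathrm{Ann}_B \ker \pi_B) \hookrightarrow \mathrm{End}_{\mathcal{O}}(\ker \pi_B).$$
Hence, the exact sequence (5.5.2) splits over $\mathcal{O}$. Taking $\mathcal{O}$-duals in (5.5.2) and using the Gorenstein condition for $A$, we thus get an exact sequence of $A$-modules:
$$A \to (\ker \phi)^{\vee} \oplus (\mathrm{Ann}_A \ker \pi_A)^{\vee} \to 0.$$
Applying the functor $- \otimes_A k$ (relative to the map $A \to k$), we find
$$1 = \dim_k (A \otimes_A k) \geq \dim_k ((\ker \phi)^{\vee} \otimes_A k) + \dim_k ((\mathrm{Ann}_A \ker \pi_A)^{\vee} \otimes_A k).$$
Since $\eta_A \neq 0$, it follows that $(\mathrm{Ann}_A \ker \pi_A)^{\vee} \otimes_A k \neq 0$, and hence we must have
$$(\ker \phi)^{\vee} \otimes_A k = 0.$$
Therefore by Nakayama's lemma and duality, $\ker \phi = 0$, which proves the theorem. □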
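The objects used in this proof can be made concrete in a small case. The following is only a sketch under stated assumptions: the ring $A$, the integer $n \geq 1$, and the use of $\lambda$ for a uniformizer of $\mathcal{O}$ are chosen for illustration. Here $\pi_A$ is the augmentation, $\Phi_A = (\ker \pi_A)/(\ker \pi_A)^2$ is the tangent space, and $\eta_A$ is the image of $\mathrm{Ann}_A \ker \pi_A$ under $\pi_A$.

```latex
% Assumed example: A = O[[X]]/(X(X - \lambda^n)), n >= 1,
% with augmentation \pi_A sending X to 0.
\begin{align*}
A &= \mathcal{O} \oplus \mathcal{O}X, \qquad X^2 = \lambda^n X, \\
\ker \pi_A &= \mathcal{O}X, \qquad (\ker \pi_A)^2 = \lambda^n \mathcal{O}X,
  \qquad \Phi_A \simeq \mathcal{O}/\lambda^n, \\
(a + bX)X &= (a + b\lambda^n)X
  \;\Longrightarrow\; \mathrm{Ann}_A \ker \pi_A = \mathcal{O}\,(X - \lambda^n), \\
cX = d(X - \lambda^n) &\;\Longrightarrow\; d\lambda^n = 0
  \;\Longrightarrow\; \ker \pi_A \cap \mathrm{Ann}_A \ker \pi_A = 0, \\
\eta_A &= \pi_A\bigl(\mathcal{O}\,(X - \lambda^n)\bigr) = (\lambda^n).
\end{align*}
```

In this example $\pi_A$ maps $\mathrm{Ann}_A \ker \pi_A$ isomorphically onto $\eta_A$, in agreement with the consequence of (5.5.1) used in the proof, and $\#\Phi_A = \#(\mathcal{O}/\eta_A)$.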


5.6 A resolution lemma

It turns out that objects in $\mathcal{C}_{\mathcal{O}}$ can be resolved (in a weak sense) by a complete intersection, namely,

Theorem 5.26 Let $A$ be an augmented ring which is finite flat over $\mathcal{O}$. Then there is a morphism $\tilde{A} \to A$ in the category $\mathcal{C}_{\mathcal{O}}$ such that:

(a) the ring $\tilde{A}$ is finite flat over $\mathcal{O}$ and is a complete intersection;

(b) the map $\tilde{A} \to A$ induces an isomorphism $\Phi_{\tilde{A}} \to \Phi_A$.
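The two rings of remark 5.22, read in the opposite direction, appear to give an instance of such a resolution. The sketch below records the tangent space computation. It assumes that $\lambda$ denotes a uniformizer of $\mathcal{O}$, that both augmentations send $X$ and $Y$ to $0$, and that $\Phi$ denotes the tangent space $(\ker \pi)/(\ker \pi)^2$. The names $\bar{X}, \bar{Y}$ for the classes of $X, Y$ are introduced only for this illustration.

```latex
% Assumed example: resolving the ring A of example 2 by the ring \tilde{A} of example 4.
\begin{align*}
A &= \mathcal{O}[[X,Y]]/(X(X-\lambda),\, Y(Y-\lambda),\, XY), \\
\tilde{A} &= \mathcal{O}[[X,Y]]/(X(X-\lambda),\, Y(Y-\lambda)), \\
\Phi_U &= \mathcal{O}\bar{X} \oplus \mathcal{O}\bar{Y}
  \quad \text{for } U = \mathcal{O}[[X,Y]], \\
\text{linear terms:}\quad
  X(X-\lambda) &\mapsto -\lambda \bar{X}, \qquad
  Y(Y-\lambda) \mapsto -\lambda \bar{Y}, \qquad
  XY \mapsto 0, \\
\Phi_{\tilde{A}} &\simeq (\mathcal{O}/\lambda)\bar{X} \oplus (\mathcal{O}/\lambda)\bar{Y}
  \simeq \Phi_A .
\end{align*}
```

Here $\tilde{A}$ is free over $\mathcal{O}$ on $1, X, Y, XY$ and is presented by two relations in two variables, so it satisfies (a), and the projection $\tilde{A} \to A$ induces the identity on the tangent spaces computed above, which is (b).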

we find that the ring $V/(f_1, \dots, f_n)$ is a finitely generated $\mathcal{O}$-module: it can be generated by the images of the monomials of degree $\leq n(m + 1)$, since the relations allow us to rewrite any monomial of higher degree in terms of ones of lower degree. Completing at the ideal $(\lambda, X_1, \dots, X_n)$, we find that the ring
$$\tilde{A} = U/(f_1, \dots, f_n)$$
has the desired properties: the natural homomorphism from $\tilde{A}$ to $A$ induces an isomorphism on the tangent spaces, since the linear terms of the $f_i$ generate the kernel of the induced map $\Phi_U \to \Phi_A$ on the tangent spaces, and $\tilde{A}$ is a finitely generated $\mathcal{O}$-module, since $V/(f_1, \dots, f_n)$ is. □

5.7 A criterion for complete intersections

The results we have accumulated so far allow us to give an important criterion for an object $A$ to be a complete intersection:

Theorem 5.27 Let $A$ be an augmented ring which is a finitely generated torsion-free $\mathcal{O}$-module. If $\#\Phi_A \leq \#(\mathcal{O}/\eta_A) < \infty$, then $A$ is a complete intersection.

Proof: Let $\phi : \tilde{A} \to A$ be the surjective morphism given by the resolution theorem (theorem 5.26). Then we have
$$\#(\mathcal{O}/\eta_A) \geq \#\Phi_A = \#\Phi_{\tilde{A}} \geq \#(\mathcal{O}/\eta_{\tilde{A}}),$$
where the first inequality is by assumption, the equality is by the choice of $\tilde{A}$, and the last inequality is by the equation (5.2.3). On the other hand, by equation (5.2.2), we have
$$\#(\mathcal{O}/\eta_A) \leq \#(\mathcal{O}/\eta_{\tilde{A}}).$$
It follows that
$$\eta_A = \eta_{\tilde{A}},$$
so that $\phi$ is an isomorphism by theorem 5.24. It follows that $A$ is a complete intersection. □
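As a sanity check on the hypothesis of theorem 5.27, one can compare the two sides of the inequality for the rings of remark 5.22. This is a sketch under assumptions: $\lambda$ is taken to be a uniformizer of $\mathcal{O}$, $q = \#k$ is notation introduced here, both augmentations send $X, Y$ to $0$, and $\eta$ is computed as the image of $\mathrm{Ann} \ker \pi$ under $\pi$.

```latex
% Assumed example: the rings of remark 5.22, with q = #k (notation introduced here).
\begin{align*}
A &= \mathcal{O}[[X,Y]]/(X(X-\lambda),\, Y(Y-\lambda)): &
  \mathrm{Ann}_A \ker \pi_A &= \mathcal{O}\,(X-\lambda)(Y-\lambda), &
  \eta_A &= (\lambda^2), \\
B &= \mathcal{O}[[X,Y]]/(X(X-\lambda),\, Y(Y-\lambda),\, XY): &
  \mathrm{Ann}_B \ker \pi_B &= \mathcal{O}\,(\lambda - X - Y), &
  \eta_B &= (\lambda), \\
\#\Phi_A &= q^2 = \#(\mathcal{O}/\eta_A), &
\#\Phi_B &= q^2 > q = \#(\mathcal{O}/\eta_B).
\end{align*}
```

So the hypothesis holds (with equality) for $A$, which is visibly a complete intersection, and fails for $B$, so that theorem 5.27 says nothing about $B$. Since $\eta_A \neq \eta_B$, the projection $A \to B$ also does not satisfy the hypothesis of theorem 5.24.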

5.8 Proof of Wiles' numerical criterion

Theorem 5.28 Let $R$ and $T$ be augmented rings such that $T$ is a finitely generated torsion-free $\mathcal{O}$-module, and let $\phi : R \to T$ be a surjective morphism. If
$$\#\Phi_R \leq \#(\mathcal{O}/\eta_T) < \infty,$$
then $R$ and $T$ are complete intersections, and $\phi$ is an isomorphism.

Proof: We have:
$$\#(\mathcal{O}/\eta_T) \leq \#\Phi_T \leq \#\Phi_R \leq \#(\mathcal{O}/\eta_T),$$
where the first inequality is by equation (5.2.3), the second follows from the surjectivity of $\phi$, and the third is by hypothesis. Therefore,
$$\#\Phi_T = \#(\mathcal{O}/\eta_T),$$
and hence $T$ is a complete intersection by theorem 5.27. Since the orders of $\Phi_R$ and $\Phi_T$ are the same, $\phi$ induces an isomorphism between them. Hence $\phi$ is an isomorphism $R \to T$, by theorem 5.21. This completes the proof. □

Let $\mathcal{C}_k$ be the category of complete local noetherian $k$-algebras with residue field $k$. Again all morphisms are assumed to be local. There is a natural functor $A \mapsto \bar{A}$ from $\mathcal{C}_{\mathcal{O}}$ to $\mathcal{C}_k$ which sends $A$ to $\bar{A} := A/\lambda$.

We say that an object $A$ of $\mathcal{C}_k$ which is finite dimensional as a $k$-vector space is a complete intersection if it is isomorphic to a quotient
$$A = k[[X_1, \dots, X_r]]/(f_1, \dots, f_r).$$
Note that if an object $A$ of $\mathcal{C}_{\mathcal{O}}$ is a complete intersection, then $\bar{A}$ is a complete intersection in $\mathcal{C}_k$. As a partial converse, we have:

Lemma 5.29 Suppose that $R \to T$ is a map in the category $\mathcal{C}_{\mathcal{O}}$, and that $T$ is finitely generated and free as an $\mathcal{O}$-module. Then $R \to T$ is an isomorphism of complete intersections, if and only if $\bar{R} \to \bar{T}$ is.

Proof: This is an exercise and is left to the reader.

We now come to the proof of lemma 3.39 of section 3.4:

Lemma 5.30 Suppose that $K \subset K'$ are local fields with rings of integers $\mathcal{O} \subset \mathcal{O}'$ and that $A$ is an object of $\mathcal{C}_{\mathcal{O}}$ which is finitely generated and free as an $\mathcal{O}$-module. Then $A$ is a complete intersection if and only if $A \otimes_{\mathcal{O}} \mathcal{O}'$ is.

Proof: One implication is clear. Let $k$ and $k'$ be the residue fields of $\mathcal{O}$ and $\mathcal{O}'$ respectively. By lemma 5.29 it is enough to prove that, if $R$ is an object of $\mathcal{C}_k$ which is finite dimensional as a $k$-vector space, then

$R' = R \otimes_k k'$ is a complete intersection

We now turn to the proof of theorem 3.41. In view of the last section we will work in characteristic $\ell$. Thus let $\phi : R \to T$ be a surjective morphism in the category $\mathcal{C}_k$, where $R$ and $T$ are finite dimensional as $k$-vector spaces. Let $r$ be a non-negative integer. If $J$ is an ideal of $k[[S_1, \dots, S_r]]$ and $J \subset (S_1, \dots, S_r)$ then by a strong $J$-structure we shall mean a commutative diagram in $\mathcal{C}_k$

[commutative diagram not recovered; its top vertex is $k[[S_1, \dots, S_r]]$]

strong $J_n$-structure. Then $R \xrightarrow{\sim} T$ and these rings are complete intersections.

Before proving theorem 5.31 we shall explain how to deduce theorem 3.41 from it. If $J$ is an ideal of $\mathcal{O}[[S_1, \dots, S_r]]$ we will use $\bar{J}$ to denote its image in $k[[S_1, \dots, S_r]]$. If $\mathcal{S}$ given by

[commutative diagram not recovered; its top vertex is $\mathcal{O}[[S_1, \dots, S_r]]$]

is a strong $\bar{J}$-structure for $(R/\mathfrak{m}_R a) \otimes_{\mathcal{O}} k \to T \otimes_{\mathcal{O}} k$. If $J_n$ is a sequence of ideals as in theorem 3.41 then $\bar{J}_n$ is a sequence of nested ideals with $\bar{J}_0 = (S_1, \dots, S_r)$ and $\bigcap_n \bar{J}_n = (0)$. Moreover if for each $n$ there is a $J_n$-structure $\mathcal{S}_n$ for $R \to T$ then for each $n$ there is a strong $\bar{J}_n$-structure $\bar{\mathcal{S}}_n$ for
$$(R/\mathfrak{m}_R a) \otimes_{\mathcal{O}} k \to T \otimes_{\mathcal{O}} k.$$
Then by theorem 5.31 we see that this map is an isomorphism of complete intersections. Lemma 5.29 shows that theorem 3.41 follows. □

Before returning to the proof of theorem 5.31, let us first make some remarks about strong $J$-structures.

• The set of strong $J$-structures for all ideals $J \subset (S_1, \dots, S_r)$ of $k[[S_1, \dots, S_r]]$ forms a category, with the obvious notion of morphism.

• If $\mathcal{S}$ is a strong $J$-structure and if $(S_1, \dots, S_r) \supset J' \supset J$ then there is a natural $J'$-structure $\mathcal{S} \bmod J'$ obtained by replacing $T'$ by $T'/J'$ and $R'$ by the image of $R' \to R \oplus (T'/J')$.

• If $R$ is a finite dimensional $k$-vector space and if $J$ has finite index in $k[[S_1, \dots, S_r]]$ then there are only finitely many isomorphism classes of strong $J$-structure. (This follows because we can bound the order of $R'$ in any $J$-structure. Explicitly we must have $\#R' \leq (\#R)(\#k[[S_1, \dots, S_r]]/J)^{\dim_k T}$.)

Lemma 5.32 Suppose that $R$ is a finite dimensional $k$-vector space. Suppose also that $\{J_n\}$ is a nested (decreasing) sequence of ideals and that $J = \bigcap_n J_n$. If for each $n$ a strong $J_n$-structure exists then a strong $J$-structure exists.

Proof: We may suppose that each $J_n$ has finite index in $k[[S_1, \dots, S_r]]$. Let $\mathcal{S}_n$ denote a strong $J_n$-structure. Let $\mathcal{S}_{n,m} = \mathcal{S}_n \bmod J_m$ if $m \leq n$. Because there are only finitely many isomorphism classes of strong $J_m$-structure, we may recursively choose integers $n(m)$ such that $\mathcal{S}_{n(m),m} \cong \mathcal{S}_{n,m}$ for infinitely many $n$,

if $m > 1$ then $\mathcal{S}_{n(m),m-1}$

$\mathcal{S}_m \bmod J_{m-1} \cong \mathcal{S}_{m-1}$. One checks that $\mathcal{S} = \varprojlim \mathcal{S}_m$ is the desired strong $J$-structure. □

Lemma 5.33 Suppose that a strong $(0)$-structure exists. Then the map $R \to T$ is an isomorphism, and these rings are complete intersections.

Proof: Because $k[[S_1, \dots, S_r]] \hookrightarrow T'$ (and $T'$ is a finitely generated $k[[S_1, \dots, S_r]]$-module by Nakayama's lemma, cf. [Mat], thm. 8.4) we see that the Krull dimension of $T'$ is at least $r$. On the other hand $k[[X_1, \dots, X_r]] \twoheadrightarrow T'$ and so by Krull's principal ideal theorem this map must be an isomorphism. Thus
$$k[[X_1, \dots, X_r]] \xrightarrow{\sim} R' \xrightarrow{\sim} T'.$$
Hence we have that
$$k[[X_1, \dots, X_r]]/(S_1, \dots, S_r) \xrightarrow{\sim} R'/(S_1, \dots, S_r) \xrightarrow{\sim} T,$$
and the lemma follows.

Theorem 5.31 follows at once from these two lemmas.
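The appeal to Krull's principal ideal theorem in the proof of lemma 5.33 can be spelled out as follows. This is a sketch of one standard way to run the argument. The names $P$, $\psi$ and $f$ are introduced here for illustration and are not taken from the text.

```latex
% Assumed notation: P = k[[X_1,...,X_r]], \psi : P -> T' the given surjection.
\begin{align*}
&P = k[[X_1,\dots,X_r]] \text{ is a local domain with } \dim P = r,
  \qquad \psi : P \twoheadrightarrow T', \qquad \dim T' \geq r. \\
&\text{Suppose } 0 \neq f \in \ker\psi. \text{ Then } f \text{ lies in the maximal ideal of } P,
  \text{ and since } P \text{ is a domain,} \\
&\qquad \dim P/(f) = r - 1
  \quad \text{(Krull's principal ideal theorem).} \\
&\text{But } T' \simeq P/\ker\psi \text{ is a quotient of } P/(f), \text{ so }
  \dim T' \leq \dim P/(f) = r - 1 < r, \\
&\text{a contradiction. Hence } \ker\psi = 0 \text{ and } \psi \text{ is an isomorphism.}
\end{align*}
```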