in the projective plane ${\bf P}^2$, where $a, b \in {\bf F}_q$ are coefficients whose discriminant $-16(4a^3+27b^2)$ is non-vanishing, which is the projective version of the affine elliptic curve $\{ (x,y): y^2 = x^3 + ax + b \}$.

To each such curve $C$ one can associate a genus $g$, which we will define later; for instance, elliptic curves have genus $1$. We can also count the cardinality $|C({\bf F}_q)|$ of the set $C({\bf F}_q)$ of ${\bf F}_q$-points of $C$. The Hasse-Weil bound relates the two:

Theorem 1 (Hasse-Weil bound) $\big| |C({\bf F}_q)| - q - 1 \big| \leq 2g\sqrt{q}$.
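As a quick sanity check of the bound in the genus one case, here is a minimal brute-force sketch, assuming a prime field ${\bf F}_p$ with $p > 3$ and a curve in short Weierstrass form $y^2 = x^3 + ax + b$. The function names `count_points` and `satisfies_hasse` are hypothetical, introduced only for this illustration.

```python
def count_points(a, b, p):
    """Count projective points on y^2 z = x^3 + a x z^2 + b z^3 over F_p (p an odd prime)."""
    # sqrt_count[r] = number of y in F_p with y^2 = r.
    sqrt_count = [0] * p
    for y in range(p):
        sqrt_count[(y * y) % p] += 1
    affine = sum(sqrt_count[(x**3 + a * x + b) % p] for x in range(p))
    return affine + 1  # add the single point at infinity [0, 1, 0]


def satisfies_hasse(a, b, p):
    """Check | |C(F_p)| - p - 1 | <= 2 sqrt(p) (genus one), using integer arithmetic only."""
    if (4 * a**3 + 27 * b**2) % p == 0:
        raise ValueError("singular curve: discriminant vanishes mod p")
    deviation = count_points(a, b, p) - p - 1
    return deviation * deviation <= 4 * p


if __name__ == "__main__":
    for p in (5, 7, 11, 13, 101):
        print(p, count_points(1, 1, p), satisfies_hasse(1, 1, p))
```

Squaring both sides of the inequality avoids floating-point square roots. The count is quadratic in $p$ and is only meant for small primes.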

The usual proofs of this bound proceed by first establishing a trace formula of the form

$$|C({\bf F}_{q^n})| = q^n - \sum_{i=1}^{2g} \alpha_i^n + 1 \ \ \ \ \ (1)$$

for some complex numbers $\alpha_1,\dots,\alpha_{2g}$ independent of $n$; this is in fact a special case of the Lefschetz-Grothendieck trace formula, and can be interpreted as an assertion that the zeta function associated to the curve $C$ is rational. The task is then to establish a bound $|\alpha_i| \leq \sqrt{q}$ for all $i = 1,\dots,2g$; this (or more precisely, the slightly stronger assertion $|\alpha_i| = \sqrt{q}$) is the Riemann hypothesis for such curves. This can be done either by passing to the Jacobian variety of $C$ and using a certain duality available on the cohomology of such varieties, known as the Rosati involution; alternatively, one can pass to the product surface $C \times C$ and apply the Riemann-Roch theorem for that surface.
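To see what the trace formula says in the simplest non-trivial case, here is a worked instance for genus $g=1$. It is a sketch under the standard fact that the two weights are then the roots of $T^2 - aT + q$ with $a := q + 1 - |C({\bf F}_q)|$; the symbol $a$ is notation introduced here for illustration.

```latex
\begin{align*}
|C({\bf F}_{q^n})| &= q^n - \alpha_1^n - \alpha_2^n + 1,
  \qquad \alpha_1 \alpha_2 = q, \qquad \alpha_1 + \alpha_2 = a, \\
|C({\bf F}_{q^2})| &= q^2 + 1 - (\alpha_1^2 + \alpha_2^2) = q^2 + 1 - (a^2 - 2q).
\end{align*}
```

For example, a genus one curve with $9$ points over ${\bf F}_5$ has $a = -3$, so the formula predicts $25 + 1 - (9 - 10) = 27$ points over ${\bf F}_{25}$. In this case the Riemann hypothesis $|\alpha_i| = \sqrt{q}$ is equivalent to $|a| \leq 2\sqrt{q}$, which is exactly the Hasse-Weil bound for $g = 1$.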

In 1969, Stepanov introduced an elementary method (a version of what is now known as the polynomial method) to count (or at least to upper bound) the quantity $|C({\bf F}_q)|$. The method was initially restricted to hyperelliptic curves, but was soon extended to general curves. In particular, Bombieri used this method to give a short proof of the following weaker version of the Hasse-Weil bound:

Theorem 2 (Weak Hasse-Weil bound) If $q$ is a perfect square, and $q > (g+1)^4$, then $|C({\bf F}_q)| \leq q + (2g+1)\sqrt{q} + 1$.

In fact, the bound on $|C({\bf F}_q)|$ can be sharpened a little bit further, as we will soon see.

Theorem 2 is only an upper bound on $|C({\bf F}_q)|$, but there is a Galois-theoretic trick to convert (a slight generalisation of) this upper bound to a matching lower bound, and if one then uses the trace formula (1) (and the “tensor power trick” of sending $n$ to infinity to control the weights $\alpha_i$) one can then recover the full Hasse-Weil bound. We discuss these steps below the fold.

I’ve discussed Bombieri’s proof of Theorem 2 in this previous post (in the special case of hyperelliptic curves), but now wish to present the full proof, with some minor simplifications from Bombieri’s original presentation; it is mostly elementary, with the deepest fact from algebraic geometry needed being Riemann’s inequality (a weak form of the Riemann-Roch theorem).

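The first step is to reinterpret $|C({\bf F}_q)|$ as the number of points of intersection between two curves $C_1, C_2$ in the surface $C \times C$. Indeed, if we define the Frobenius endomorphism $\hbox{Frob}_q$ on any projective space by

$$\hbox{Frob}_q( [x_0,\dots,x_n] ) := [x_0^q, \dots, x_n^q]$$

then this map preserves the curve $C$, and the fixed points of this map are precisely the ${\bf F}_q$-points of $C$:

$$C({\bf F}_q) = \{ z \in C: \hbox{Frob}_q(z) = z \}.$$

Thus one can interpret $|C({\bf F}_q)|$ as the number of points of intersection between the diagonal curve

$$\{ (z,z): z \in C \}$$

and the Frobenius graph

$$\{ (z, \hbox{Frob}_q(z)): z \in C \},$$

which are copies of $C$ inside $C \times C$. But we can use the additional hypothesis that $q$ is a perfect square to write this more symmetrically, by taking advantage of the fact that the Frobenius map has a square root

$$\hbox{Frob}_q = \hbox{Frob}_{\sqrt{q}}^2$$

with $\hbox{Frob}_{\sqrt{q}}$ also preserving $C$. One can then also interpret $|C({\bf F}_q)|$ as the number of points of intersection between the curve

$$C_1 := \{ (z, \hbox{Frob}_{\sqrt{q}}(z)): z \in C \} \ \ \ \ \ (2)$$

and its transpose $C_2 := \{ (\hbox{Frob}_{\sqrt{q}}(w), w): w \in C \}$.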

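Let $k(C \times C)$ be the field of rational functions on $C \times C$ (with coefficients in the algebraic closure $k$ of ${\bf F}_q$), and define $k(C_1)$, $k(C_2)$, and $k(C_1 \cap C_2)$ analogously (although $C_1 \cap C_2$ is likely to be disconnected, so $k(C_1 \cap C_2)$ will just be a ring rather than a field). We then (morally) have the commuting square

$$\begin{array}{ccc} k(C \times C) & \rightarrow & k(C_1) \\ \downarrow & & \downarrow \\ k(C_2) & \rightarrow & k(C_1 \cap C_2) \end{array}$$

if we ignore the issue that a rational function on, say, $C \times C$, might blow up on all of $C_1$ and thus not have a well-defined restriction to $C_1$. We use $\pi_1: k(C \times C) \rightarrow k(C_1)$ and $\pi_2: k(C \times C) \rightarrow k(C_2)$ to denote the restriction maps. Furthermore, we have obvious isomorphisms $\iota_1: k(C_1) \rightarrow k(C)$, $\iota_2: k(C_2) \rightarrow k(C)$, coming from composing with the graphing maps $z \mapsto (z, \hbox{Frob}_{\sqrt{q}}(z))$ and $w \mapsto (\hbox{Frob}_{\sqrt{q}}(w), w)$.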
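To make the Frobenius map and its square root concrete, here is a minimal sketch on the affine line, assuming the model ${\bf F}_9 = {\bf F}_3[i]/(i^2+1)$ of the field with $q = 9$ elements. The helper names (`mul`, `power`, `frob_sqrt`, `frob`) are hypothetical and chosen only for this illustration.

```python
P = 3  # plays the role of sqrt(q); the field below has q = P * P = 9 elements


def mul(u, v):
    """Multiply u = a + b*i and v = c + d*i in F_9 = F_3[i] / (i^2 + 1)."""
    a, b = u
    c, d = v
    return ((a * c - b * d) % P, (a * d + b * c) % P)


def power(u, n):
    """Compute u^n by repeated multiplication (n >= 0)."""
    result = (1, 0)
    for _ in range(n):
        result = mul(result, u)
    return result


def frob_sqrt(u):
    """The map z -> z^sqrt(q)."""
    return power(u, P)


def frob(u):
    """The map z -> z^q."""
    return power(u, P * P)


F9 = [(a, b) for a in range(P) for b in range(P)]

# Frob_q is the square of Frob_sqrt(q) as a map.
assert all(frob(u) == frob_sqrt(frob_sqrt(u)) for u in F9)

# The points fixed by z -> z^3 are exactly the copy of F_3 inside F_9.
fixed_by_sqrt = [u for u in F9 if frob_sqrt(u) == u]
print(fixed_by_sqrt)
```

In this toy model every element of ${\bf F}_9$ is fixed by $z \mapsto z^9$, while only the three elements of ${\bf F}_3$ are fixed by $z \mapsto z^3$, which is the fixed-point description of rational points in miniature.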

The idea now is to find a rational function $f \in k(C \times C)$ on the surface $C \times C$ of controlled degree which vanishes when restricted to $C_2$, but is non-vanishing (and not blowing up) when restricted to $C_1$. On $C_1$, we thus get a non-zero rational function $f \downharpoonright_{C_1}$ of controlled degree which vanishes on $C_1 \cap C_2$, which then lets us bound the cardinality of $C_1 \cap C_2$ in terms of the degree of $f \downharpoonright_{C_1}$. (In Bombieri’s original argument, one required vanishing to high order on the $C_2$ side, but in our presentation, we have factored out a $\hbox{Frob}_{\sqrt{q}}$ term which removes this high order vanishing condition.)

To find this $f$, we will use linear algebra. Namely, we will locate a finite-dimensional subspace $V$ of $k(C \times C)$ (consisting of certain “controlled degree” rational functions) which projects injectively to $k(C_1)$, but whose projection to $k(C_2)$ has strictly smaller dimension than $V$ itself. The rank-nullity theorem then forces the existence of a non-zero element $f$ of $V$ whose projection to $k(C_2)$ vanishes, but whose projection to $k(C_1)$ is non-zero.

Now we build $V$. Pick an ${\bf F}_q$-point $P_\infty$ of $C$, which we will think of as being a point at infinity. (For the purposes of proving Theorem 2, we may clearly assume that $C({\bf F}_q)$ is non-empty.) Thus $P_\infty$ is fixed by $\hbox{Frob}_q$. To simplify the exposition, we will also assume that $P_\infty$ is fixed by the square root $\hbox{Frob}_{\sqrt{q}}$ of $\hbox{Frob}_q$; in the opposite case when $\hbox{Frob}_{\sqrt{q}}$ has order two when acting on $P_\infty$, the argument is essentially the same, but all references to $P_\infty$ in the second factor of $C \times C$ need to be replaced by $\hbox{Frob}_{\sqrt{q}} P_\infty$ (we leave the details to the interested reader).

For any natural number $n$, define $R_n$ to be the set of rational functions $f \in k(C)$ which are allowed to have a pole of order up to $n$ at $P_\infty$, but have no other poles on $C$; note that as we are assuming $C$ to be smooth, it is unambiguous what a pole is (and what order it will have). (In the fancier language of divisors and Cech cohomology, we have $R_n = H^0( C, {\mathcal O}_C(n P_\infty) )$.) The space $R_n$ is clearly a vector space over $k$; one can view $R_n$ intuitively as the space of “polynomials” on $C$ of “degree” at most $n$. When $n=0$, $R_0$ consists just of the constant functions. Indeed, if $f \in R_0$, then the image $f(C)$ of $f$ avoids $\infty$ and so lies in the affine line $k = {\bf P}^1 \backslash \{\infty\}$; but as $C$ is projective, the image $f(C)$ needs to be compact (hence closed) in ${\bf P}^1$, and must therefore be a point, giving the claim.

For higher $n \geq 1$, we have the easy relations

$$\hbox{dim}(R_{n-1}) \leq \hbox{dim}(R_n) \leq \hbox{dim}(R_{n-1})+1. \ \ \ \ \ (3)$$

The former inequality just comes from the trivial inclusion $R_{n-1} \subset R_n$. For the latter, observe that if two functions $f, g$ lie in $R_n$, so that they each have a pole of order at most $n$ at $P_\infty$, then some non-trivial linear combination of these functions must have a pole of order at most $n-1$ at $P_\infty$; thus $R_{n-1}$ has codimension at most one in $R_n$, giving the claim.

From (3) and induction we see that each of the $R_n$ is finite dimensional, with the trivial upper bound

$$\hbox{dim}(R_n) \leq n+1. \ \ \ \ \ (4)$$

Riemann’s inequality complements this with the lower bound

$$\hbox{dim}(R_n) \geq n+1-g, \ \ \ \ \ (5)$$

thus one has $\hbox{dim}(R_n) = \hbox{dim}(R_{n-1})+1$ for all but at most $g$ exceptions (in fact, exactly $g$ exceptions as it turns out). This is a consequence of the Riemann-Roch theorem; it can be proven from abstract nonsense (the snake lemma) if one defines the genus $g$ in a non-standard fashion (as the dimension of the first Cech cohomology $H^1(C, {\mathcal O}_C)$ of the structure sheaf ${\mathcal O}_C$ of $C$), but to obtain this inequality with a standard definition of $g$ (e.g. as the dimension of the zeroth Cech cohomology $H^0(C, \Omega^1_C)$ of the line bundle of differentials) requires the more non-trivial tool of Serre duality.
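For a concrete feel for these dimension bounds, here is a minimal sketch for an elliptic curve $y^2 = x^3 + ax + b$ (so genus $1$), with the point at infinity playing the role of the distinguished point. It assumes the standard facts that $x$ has a pole of order $2$ and $y$ a pole of order $3$ there, so that the monomials $x^i$ and $x^i y$ of pole order at most $n$ form a basis of the space of functions with a pole of order at most $n$. The function name `dim_R` is hypothetical.

```python
def dim_R(n):
    """Dimension of the space of functions on an elliptic curve with a pole of
    order at most n at infinity and no other poles (assumed monomial basis)."""
    orders = set()
    for i in range(n // 2 + 1):
        orders.add(2 * i)          # pole order of x^i
        if 2 * i + 3 <= n:
            orders.add(2 * i + 3)  # pole order of x^i * y
    return len(orders)


if __name__ == "__main__":
    g = 1
    for n in range(8):
        # Riemann's inequality (lower bound) and the trivial upper bound.
        assert n + 1 - g <= dim_R(n) <= n + 1
        print(n, dim_R(n))
```

The only value of $n \geq 1$ at which the dimension fails to increase is $n = 1$ (no function has a simple pole at a single point of a genus one curve), which matches the count of exactly $g = 1$ exceptions.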

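At any rate, now that we have these vector spaces $R_n$, we will define $V \subset k(C \times C)$ to be a tensor product space

$$V = R_\ell \otimes R_m$$

for some natural numbers $\ell, m \geq 0$ which we will optimise later. That is to say, $V$ is spanned by functions of the form $(z,w) \mapsto f(z) g(w)$ with $f \in R_\ell$ and $g \in R_m$. This is clearly a linear subspace of $k(C \times C)$ of dimension $\hbox{dim}(R_\ell) \hbox{dim}(R_m)$, and hence by Riemann’s inequality we have

$$\hbox{dim}(V) \geq (\ell+1-g) (m+1-g) \ \ \ \ \ (6)$$

if

$$\ell, m \geq g-1. \ \ \ \ \ (7)$$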

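Observe that $\iota_1 \circ \pi_1$ maps a tensor product $(z,w) \mapsto f(z) g(w)$ to a function $z \mapsto f(z) g(\hbox{Frob}_{\sqrt{q}} z)$. If $f \in R_\ell$ and $g \in R_m$, then we see that the function $z \mapsto f(z) g(\hbox{Frob}_{\sqrt{q}} z)$ has a pole of order at most $\ell + m\sqrt{q}$ at $P_\infty$. We conclude that

$$\iota_1 \circ \pi_1( V ) \subset R_{\ell + m\sqrt{q}} \ \ \ \ \ (8)$$

and in particular, by (4),

$$\hbox{dim}(\pi_1(V)) \leq \ell + m \sqrt{q} + 1 \ \ \ \ \ (9)$$

and similarly

$$\hbox{dim}(\pi_2(V)) \leq \ell \sqrt{q} + m + 1. \ \ \ \ \ (10)$$

From (6) and (10) we see that if

$$(\ell+1-g) (m+1-g) > \ell \sqrt{q} + m + 1 \ \ \ \ \ (11)$$

(together with (7)), then $\pi_2$ cannot be injective on $V$. On the other hand, we have the following basic fact:

Lemma 3 (Injectivity) If

$$\ell < \sqrt{q}, \ \ \ \ \ (12)$$

then $\pi_1: V \rightarrow \pi_1(V)$ is injective.

Proof: From (3), we can find a linear basis $f_1,\dots,f_a$ of $R_\ell$ such that each of the $f_i$ has a distinct order $d_i$ of pole at $P_\infty$ (somewhere between $0$ and $\ell$ inclusive). Similarly, we may find a linear basis $g_1,\dots,g_b$ of $R_m$ such that each of the $g_j$ has a distinct order $e_j$ of pole at $P_\infty$ (somewhere between $0$ and $m$ inclusive). The functions $z \mapsto f_i(z) g_j(\hbox{Frob}_{\sqrt{q}} z)$ then span $\iota_1(\pi_1(V))$, and the order of pole at $P_\infty$ is $d_i + \sqrt{q} e_j$. But since $\ell < \sqrt{q}$, these orders are all distinct, and so these functions must be linearly independent. The claim follows.

This gives us the following bound:

Proposition 4 Let $\ell, m$ be natural numbers such that (7), (11), (12) hold. Then $|C({\bf F}_q)| \leq \ell + m\sqrt{q}$.

Proof: As $\pi_2$ is not injective, we can find a non-zero $f \in V$ with $\pi_2(f)$ vanishing. By the above lemma, the function $\iota_1(\pi_1(f))$ is then non-zero, but it must also vanish on $\iota_1(C_1 \cap C_2)$, which has cardinality $|C({\bf F}_q)|$. On the other hand, by (8), $\iota_1(\pi_1(f))$ has a pole of order at most $\ell + m\sqrt{q}$ at $P_\infty$ and no other poles. Since a non-zero rational function on a projective curve has as many zeroes as poles (counting multiplicity), the claim follows.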

If , we may make the explicit choice

and a brief calculation then gives Theorem 2. In some cases one can optimise things a bit further. For instance, in the genus zero case $g=0$ (e.g. if $C$ is just the projective line ${\bf P}^1$) one may take $\ell = 1$, $m = \sqrt{q}$ and conclude the absolutely sharp bound $|C({\bf F}_q)| \leq q+1$ in this case; in the case of the projective line ${\bf P}^1$, the function $f$ is in fact the very concrete function $f(z,w) := z - w^{\sqrt{q}}$.
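The optimisation over the two degree parameters can be explored numerically. The following is a minimal sketch, assuming the constraints take the form suggested by the dimension count above: both parameters are at least $g-1$, the first parameter $\ell$ is less than $\sqrt{q}$, and $(\ell+1-g)(m+1-g) > \ell\sqrt{q} + m + 1$, with resulting bound $\ell + m\sqrt{q}$. The function name `best_parameters` is hypothetical, and the brute-force search is not part of the original argument.

```python
def best_parameters(g, sqrt_q):
    """Search for (l, m) minimising l + m*sqrt_q subject to the assumed constraints.

    Returns a tuple (bound, l, m), or None if no admissible pair is found.
    """
    best = None
    lo = max(g - 1, 0)                       # both parameters are at least g - 1
    for l in range(lo, sqrt_q):              # injectivity condition: l < sqrt(q)
        for m in range(lo, sqrt_q * sqrt_q + 1):
            # dimension of V exceeds the dimension of its second projection
            if (l + 1 - g) * (m + 1 - g) > l * sqrt_q + m + 1:
                bound = l + m * sqrt_q
                if best is None or bound < best[0]:
                    best = (bound, l, m)
                break                        # larger m only worsens the bound for this l
    return best


if __name__ == "__main__":
    print(best_parameters(0, 3))
    print(best_parameters(1, 7))
```

For $g = 0$ and $\sqrt{q} = 3$ the search returns the bound $10 = q + 1$ at $(\ell, m) = (1, 3)$, in line with the genus zero choice mentioned above.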

Remark 1 When $q = p^{2n+1}$ is not a perfect square, one can try to run the above argument using the factorisation $\hbox{Frob}_q = \hbox{Frob}_{p^n} \hbox{Frob}_{p^{n+1}}$ instead of $\hbox{Frob}_q = \hbox{Frob}_{\sqrt{q}} \hbox{Frob}_{\sqrt{q}}$. This gives a weaker version of the above bound, of the shape $|C({\bf F}_q)| \leq q + O( \sqrt{p} \sqrt{q} )$. In the hyperelliptic case at least, one can erase this loss by working with a variant of the argument in which one requires $f$ to vanish to high order at $C_2$, rather than just to first order; see this survey article of mine for details.

— 1. Additional notes —

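One can get a “cheap” proof of Riemann’s inequality (with the “wrong” definition of the genus $g$) from Cech cohomology as follows. For any natural number $n$, let ${\mathcal O}_C(nP_\infty)$ denote the sheaf on $C$, defined by setting the sections on any open set $U$ of $C$ to be the rational functions with no poles in $U$, except possibly for a pole of order up to $n$ at $P_\infty$ in the case that $U$ contains $P_\infty$. Then $R_n$ is the space of global sections of this sheaf, that is to say the zeroth Cech cohomology $H^0(C, {\mathcal O}_C(nP_\infty))$. On the other hand, we also have the skyscraper sheaf $k_{P_\infty}$ on $C$, defined by setting the sections on $U$ to be $k$ if $U$ contains $P_\infty$ and $0$ otherwise. For $n \geq 1$, we have an obvious short exact sequence of sheaves

$$0 \rightarrow {\mathcal O}_C((n-1)P_\infty) \rightarrow {\mathcal O}_C(nP_\infty) \rightarrow k_{P_\infty} \rightarrow 0$$

which upon taking Cech cohomology (and using the snake lemma) gives the long exact sequence

$$0 \rightarrow H^0(C, {\mathcal O}_C((n-1)P_\infty)) \rightarrow H^0(C, {\mathcal O}_C(nP_\infty)) \rightarrow H^0(C, k_{P_\infty}) \rightarrow H^1(C, {\mathcal O}_C((n-1)P_\infty)) \rightarrow H^1(C, {\mathcal O}_C(nP_\infty)) \rightarrow H^1(C, k_{P_\infty}) \rightarrow \dots$$

One can compute that $H^0(C, k_{P_\infty}) = k$ and $H^1(C, k_{P_\infty}) = 0$, so the long exact sequence becomes

$$0 \rightarrow R_{n-1} \rightarrow R_n \rightarrow k \rightarrow H^1(C, {\mathcal O}_C((n-1)P_\infty)) \rightarrow H^1(C, {\mathcal O}_C(nP_\infty)) \rightarrow 0.$$

The first part of this long exact sequence already recovers the trivial bounds (3). But it also shows that the dimensions of the spaces $H^1(C, {\mathcal O}_C(nP_\infty))$ are non-increasing in $n$, and decrease by one precisely when the left inequality in (3) is sharp. If one then makes the non-standard definition $g := \hbox{dim} H^1(C, {\mathcal O}_C)$, then this decrease can occur at most $g$ times, and Riemann’s inequality then follows. (However, it is not immediately obvious with this definition that $g$ is finite; this requires some additional effort, e.g. invoking the Riemann-Hurwitz formula.) To relate this definition of $g$ to the more usual notions of genus requires the use of Serre duality, which will not be discussed here.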

The upper bound in Theorem 2 (or more precisely, a generalisation of this bound) can be converted into a comparable lower bound, namely

Theorem 5 Let $C$ be an absolutely irreducible quasiprojective curve of bounded degree, defined over ${\bf F}_q$, and let $q$ be a perfect square. Then $|C({\bf F}_q)| = q + O(\sqrt{q})$,

where the implied constant can depend on but is uniform in in the limit (holding and fixed).

Proof: The upper bound follows from Theorem 2, after first removing all the singularities from $C$, normalising $C$ to be projective, and noting that the genus $g$ does not depend on $q$. (Note that removing singularities only deletes $O(1)$ points at most.) So the remaining task is to establish the matching lower bound. Unfortunately, the trace formula (1) is not enough by itself to convert upper bounds to lower bounds, due to the case in which all the $\alpha_i$ are positive, although strangely enough it can be used for the reverse task of converting lower bounds to upper bounds. Instead, we use a Galois-theoretic argument, though for (somewhat idiosyncratic) reasons, I will disguise the Galois theory by writing everything in a geometric language rather than an algebraic one.

The basic idea is to embed $C$ (perhaps with a bounded number of points removed) as a component of a larger (reducible) curve $\tilde C$ for which (a) the number of “rational points” on this larger curve can be counted almost exactly; and (b) an upper bound similar to Theorem 2 exists for the number of “rational points” on each of the components of $\tilde C$, so that a lower bound can then be established by subtraction.

An easy instance of this trick arises in the case of hyperelliptic curves, which we will write affinely (deleting the point at infinity, by abuse of notation) as $C = \{ (x,y): y^2 = f(x) \}$ for some polynomial $f$ defined over ${\bf F}_q$ that is not a perfect square. We can embed this in the reducible curve

$$\tilde C := \{ (x,y): (y^2 - f(x)) (y^2 - a f(x)) = 0 \}$$

where $a$ is an arbitrarily chosen quadratic non-residue of ${\bf F}_q$. The curve $\tilde C$ is the union of the two curves $\{ y^2 = f(x) \}$ and $\{ y^2 = a f(x) \}$, one of which is a dilate of the other, so they both have the same genus $g$, and so they each have at most $q + O(\sqrt{q})$ points (in the regime when $g$ is bounded and $q$ is a perfect square), thanks to Theorem 2. On the other hand, if $f(x)$ is non-zero, then exactly one of $f(x)$ and $a f(x)$ is a quadratic residue, and so for all but a bounded number of $x \in {\bf F}_q$, there are exactly two values of $y \in {\bf F}_q$ with $(x,y) \in \tilde C$, and so $|\tilde C({\bf F}_q)| = 2q + O(1)$. One can then subtract the upper bound for one of the components from this estimate to obtain a matching lower bound for the other component, and in particular $|C({\bf F}_q)| = q + O(\sqrt{q})$.
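The counting step in this twisting trick is easy to test numerically. Here is a minimal sketch over a prime field ${\bf F}_p$ with $p$ odd (the counting identity does not need the field size to be a perfect square), using a sample cubic `sample_f` chosen purely for illustration; the helper names `affine_count` and `non_residue` are also hypothetical.

```python
def sample_f(x):
    """A sample polynomial, chosen only for illustration."""
    return x**3 + x + 1


def non_residue(p):
    """Return the smallest quadratic non-residue modulo an odd prime p."""
    residues = {(y * y) % p for y in range(1, p)}
    return next(a for a in range(2, p) if a not in residues)


def affine_count(f, a, p):
    """Number of (x, y) in F_p^2 with y^2 = a * f(x)."""
    sqrt_count = [0] * p
    for y in range(p):
        sqrt_count[(y * y) % p] += 1
    return sum(sqrt_count[(a * f(x)) % p] for x in range(p))


if __name__ == "__main__":
    for p in (5, 7, 11, 13):
        a = non_residue(p)
        curve = affine_count(sample_f, 1, p)
        twist = affine_count(sample_f, a, p)
        # For each x, the curve and its twist together contribute exactly two points.
        print(p, curve, twist, curve + twist == 2 * p)
```

In this affine model the two counts add up to exactly twice the field size: when the polynomial is non-zero at $x$, exactly one of the two curves has two points above $x$, and when it vanishes, each curve has the single point with $y = 0$. An upper bound for the twist therefore turns into a lower bound for the original curve.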

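In general we can proceed as follows. Let $d$ be the degree of $C$, so $d = O(1)$. After deleting $O(1)$ points from $C$, we can view $C$ as a degree $d$ cover of the affine line ${\bf A}^1$ with $O(1)$ many points removed, with projection map $\pi: C \rightarrow {\bf A}^1$. Once we do so, we can pass to the lifted curve $\tilde C \subset C^d$, defined as the collection of distinct $d$-tuples in $C$ that lie over the same point in ${\bf A}^1$, thus $\tilde C := \{ (z_1,\dots,z_d) \in C^d: \pi(z_1) = \dots = \pi(z_d); z_i \neq z_j \hbox{ for } i \neq j \}$. This is a degree $d!$ cover of (most of) ${\bf A}^1$, and is defined over ${\bf F}_q$. It also carries a free action of the permutation group $S_d$: if $\sigma \in S_d$, we define

$$\sigma (z_1,\dots,z_d) := (z_{\sigma^{-1}(1)}, \dots, z_{\sigma^{-1}(d)})$$

(the inverse is there to make the action a left action; one could also work with right actions instead if desired).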
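The role of the inverse can be checked directly on small examples. Below is a minimal sketch, assuming the action permutes the entries of a tuple by sending the entry in position $i$ to position $\sigma(i)$ (equivalently, the new $i$-th entry is the old entry in position $\sigma^{-1}(i)$). Permutations are encoded as tuples on $\{0,\dots,d-1\}$, and the helper names `inverse`, `compose` and `act` are hypothetical.

```python
from itertools import permutations


def inverse(sigma):
    """Inverse of a permutation given as a tuple: sigma[i] is the image of i."""
    inv = [0] * len(sigma)
    for i, s in enumerate(sigma):
        inv[s] = i
    return tuple(inv)


def compose(sigma, tau):
    """The product sigma * tau, acting as i -> sigma(tau(i))."""
    return tuple(sigma[t] for t in tau)


def act(sigma, z):
    """Left action on tuples: the i-th entry of the result is z[sigma^{-1}(i)]."""
    inv = inverse(sigma)
    return tuple(z[inv[i]] for i in range(len(z)))


if __name__ == "__main__":
    z = ("a", "b", "c")  # a tuple of distinct entries
    S3 = list(permutations(range(3)))
    # Left action law: (sigma * tau) . z == sigma . (tau . z)
    print(all(act(compose(s, t), z) == act(s, act(t, z)) for s in S3 for t in S3))
    # Freeness on tuples with distinct entries: only the identity fixes z.
    print([s for s in S3 if act(s, z) == z])
```

Dropping the inverse in `act` would instead satisfy the right action law, with the order of composition reversed.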

The curve $\tilde C$ need not be absolutely irreducible, so we break it into absolutely irreducible components. The permutation group $S_d$ acts transitively on these components, so if we pick one of these components, say $\tilde C_0$, and let $G$ be the stabiliser, then $G$ is a subgroup of $S_d$ (in particular, $|G| = O(1)$), and one can index the components of $\tilde C$ as $\sigma \tilde C_0$, as $\sigma$ ranges over a set of coset representatives of $S_d / G$. Each fibre of $\tilde C_0$ over a generic point of the base is an orbit of $G$; since the map $(z_1,\dots,z_d) \mapsto z_1$ projects $\tilde C_0$ to the connected curve $C$, we conclude that the projection of this orbit must have cardinality $d$. In other words, the permutation group $G$ is a transitive group.

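Even though $\tilde C$ is defined over ${\bf F}_q$, the individual components of $\tilde C$ need not be. In particular, the Frobenius image $\hbox{Frob}_q(\tilde C_0)$ of $\tilde C_0$ might be another component of $\tilde C$, say $\sigma_0 \tilde C_0$.

Now let $x$ be a generic point of ${\bf A}^1$ defined over ${\bf F}_q$; then the fibre of $\tilde C_0$ above $x$ is a free $G$-orbit and thus has cardinality $|G|$. Let $\tilde z$ be a point in this fibre. Applying Frobenius, we see that $\hbox{Frob}_q(\tilde z)$ lies in the fibre of $\sigma_0 \tilde C_0$ above $x$, and hence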

for a unique . Of course, if does not lie over a point defined over , then the equation (13) cannot hold. Since there are points in with points removed that are defined over , and each of these give rise to points in , we conclude that

for each . Strictly speaking, Theorem 2 is only directly applicable in the case when is the identity. However, even when is not the identity, one can “twist” the proof of Theorem 2 to establish the above estimate, basically by replacing the curve in (2) by a twisted variant

which is still isomorphic to ; we leave the details to the interested reader. If we subtract instances of (15) from (14), we obtain that
