Can anyone help me with references to the current fastest algorithms for computing the exact sum of the primes less than some number n? I'm specifically curious about the best known asymptotic running times, of course. I'm pretty familiar with the various fast algorithms for prime counting, but I'm having a harder time tracking down sum-of-primes algorithms...

Thanks, Gerry - both interesting links, and the tabulation in particular is handy for me. I notice that the largest value (11138479445180240497) requires a full 64 bits to express in binary - it looks like you can't sum much higher without taking precision concerns pretty seriously (at least in languages like C).
– Nathan McKenzie, Nov 20 '11 at 23:12


You should be able to get away with using only 64 bits for a pretty long time, since the first few bits are determined by the asymptotic formula. Even beyond the point where you're comfortable holding a few bits implicitly, it's not hard to use a second 64-bit variable to extend your precision to 128 bits -- you need only check for overflow every million iterations unless you're going past half a trillion primes.
– Charles, Nov 20 '11 at 23:53

5 Answers

Deléglise-Dusart-Roblot [1] give an algorithm which determines $\pi(x,k,l)$, the number of primes up to $x$ that are congruent to $l$ modulo $k$, in time $O(x^{2/3}/\log^2x)$. Using this algorithm to count the primes in every residue class modulo each prime $k<2\log x$ takes
$$1+\sum_{p<2\log x}(p-2)\sim\frac{2\log^2x}{\log\log x}$$
invocations of Deléglise-Dusart-Roblot for a total of $O(x^{2/3}/\log\log x)$ time.

This allows one to determine the value of $\sum_{p\le x}p$ mod all primes up to $2\log x$ and hence, by the Prime Number Theorem and Chinese Remainder Theorem, the value of the sum mod $\exp(\vartheta(2\log x))=x^2(1+o(1)).$ Together with bounds on the value of $\sum_{p\le x}p$ [2], this allows the computation of the sum.

Note that primes slightly beyond $2\log x$ may be required, depending on the value of $\vartheta(2\log x)$. Practically speaking, except for tiny $x$, $2\log x+\log x/\log\log x$ suffices. This does not change the asymptotics.

I do not know if it is possible to modify the Lagarias-Odlyzko analytic formula [3] to count in residue classes. If so, this would allow an $O(x^{1/2+o(1)})$ algorithm.

An excellent answer! The Lagarias-Odlyzko analytic formula can certainly be obtained in this setting as well (although I am not sure whether someone has done it), giving the $x^{1/2+o(1)}$ answer. This basically comes down to being able to calculate multiple values (or multiple zeros) of Dirichlet L-functions fast (an Odlyzko-Schönhage algorithm for Dirichlet L-functions), and this is possible for Dirichlet L-functions as well as for the Riemann zeta function (see e.g. Platt's thesis: maths.bris.ac.uk/~madjp/thesis5.pdf).
– Johan Andersson, Nov 21 '11 at 18:16

Edit Nov 22: Changed the condition of the test function $\Phi$ somewhat to simplify my argument (removed one sum in the identity below).

Although I like Charles' answer and its application of the Chinese remainder theorem, let me give you a different take on the problem. It is possible to use an analytic method more directly (similar to the Lagarias-Odlyzko method), just using the Riemann zeta-function and not any theory of the Dirichlet L-functions (or primes in arithmetical sequences). This method can treat rather arbitrary sums
$$ \sum_{ p < x } f(p) $$ for smooth functions $f$.

The idea is to consider the identity
$$\sum_{ p < x } f(p) =\sum_{p} f(p)\Phi \left( \frac {p-x} {\sqrt x} \right )
- \sum_{x< p < x + \sqrt x } f(p) \Phi \left(\frac {p-x} {\sqrt x} \right), $$
where $\Phi(x)$ is a smooth test function such that $\Phi(x)=1$ for $x<0$ and $\Phi(x)=0$ for $x>1$. It is clear that the last sum can be calculated in $O( x^{1/2} )$ time (say by sieving out the primes in the interval and summing explicitly). The idea is now to use the Mellin-transform identity
$$ \sum_{p}f(p)\Phi \left( \frac {p-x} {\sqrt x} \right )= \frac 1 {2 \pi i} \int_{c-\infty i}^{c+\infty i} \sum_{p} p^{-s} \int_0^\infty \Phi \left(\frac{y-x}{\sqrt x} \right)f(y) y^{s-1} \,dy\,ds \qquad (c>1).$$
The point here is that the Mellin transform $\int_0^\infty \Phi(\frac{y-x}{\sqrt x})f(y) y^{s-1}dy$ is small when $|\Im(s)|>x^{1/2+\epsilon}$ and $\Re(s)=c$ for any $\epsilon>0$, so that part of the integral can be discarded. The Dirichlet series $\sum_{p} p^{-s}$ can be expressed in terms of the logarithm of the zeta function at the arguments $ks$ (summing over $k$).

Now the Odlyzko-Schönhage algorithm (quite a nice algorithm; it uses the fast Fourier transform) allows us to calculate the values of the Riemann zeta function at $s=c+it$ for $ |t| < x^{1/2+\epsilon}$ in $O(x^{1/2+\epsilon+o(1)})$ time. We also remark that the Weil explicit formula can be used instead of this complex integral (then the zeros of the Riemann zeta function need to be calculated fast, but this can also be done by the Odlyzko-Schönhage algorithm). This means that the total time to calculate $\sum_{ p < x } f(p)$ will be $O(x^{1/2+\epsilon})$ for any $\epsilon>0$.

Note that the same argument applies when primes in arithmetic progression are concerned (see my comment on Charles' answer). Since the Odlyzko-Schönhage algorithm also holds for Dirichlet L-functions, this case can be treated in the same way.

I'll put in a plug for my original paper with Lagarias and Odlyzko, as well as a recent paper by Bach, Klyve and Sorenson: http://www.ams.org/journals/mcom/2009-78-268/S0025-5718-09-02249-2/home.html ("Computing prime harmonic sums", Math. Comp. 78 (2009), 2283-2305). Although the algorithms of Lagarias and Odlyzko (with the extra algorithm of Odlyzko-Schönhage for computing good approximations to a bunch of values of $\zeta(s)$) are asymptotically the best, having complexity $O(x^{1/2 + \epsilon})$, the combinatorial algorithms will probably work best for any reasonable range. One should also look at J. Buethe, J. Franke, A. Jost, and T. Kleinjung, "Conditional Calculation of pi(10^24)", posting to the Number Theory Mailing List, Jul 29 2010. A little more detail (that's all I have right now) is contained in the talk of David Platt: www.maths.bris.ac.uk/~madjp/junior%20talk.pdf

The idea in the combinatorial method is the following:

Suppose that $f(n)$ is a completely multiplicative function of the positive integers. In the case of calculating $\pi(x)$, we take $f(n) = 1$. In the case of calculating the sum of the primes $\le x$ we take $f(n) = n$. In the case of calculating $\pi(x,a,q)$ we take $f(n)$ to be a Dirichlet character mod $q$.

Define $\phi(x,a)$ as the sum of $f(n)$ for all integers $n \le x$ which are not divisible by the first $a$ primes. Clearly $\phi(x,0) = \sum_{n \le x} f(n)$. We have the recursion:

$\phi(x,a+1) = \phi(x,a) - f(p_{a+1})\phi(x/p_{a+1},a).$

Imagine that you have a labeled tree, where the labels are $\pm f(k) \phi(x/k,b)$ for some $k$ and $b$. You can expand any node in the tree you want by applying the above formula. This has the property that it leaves the sum of the values at the leaves the same. So the idea is to start with a tree consisting of one node labeled $\phi(x,\pi(\sqrt x))$ and keep expanding until either $b=0$ or $x/k < $ some cutoff. If you accumulate all such nodes you can evaluate all of them by sieving (and a special data structure -- see the Lagarias-Miller-Odlyzko paper for details), and the optimal value for the cutoff is $x^{2/3}$ perhaps multiplied by some logarithmic factors.

This has been great. Some nice answers here - I'll accept one shortly.

I was looking for a lit review because I have implemented an approach that runs in $O(n^\frac{2}{3} \log n)$ time and $O(n^\frac{1}{3} \log n)$ space for summing primes to non-negative integer powers, and I wanted to see how it compared. I wasn't exactly ready to write up what I'd done, but I guess I'll take a stab. For the curious:

SumPrimes[0,n] will return the count of primes, SumPrimes[1,n] will return the sum of primes less than n, and so on.

PART 2 - The Specific Algorithm

Here's my fastest approach for calculating $D_{a,k}(n)$. It is inspired by this method for calculating the Mertens function. (If you find the following writeup interesting but rushed, I've written about my approach in somewhat more detail here and, for the special case of prime counting alone, here. A C++ implementation is here.)

Bear with me; this is a bit involved. It requires two observations.

First, we can use sieving to calculate values of $D_{a,k}(n)$ up to $n^\frac{2}{3}$ in roughly $O(n^\frac{2}{3} \log n)$ time and $O(n^\frac{1}{3} \log n)$ space with the following approach.

We want to sieve numbers up to $n^\frac{2}{3}$ in such a way that we have their full power signature, with $ n = {p_1} ^ {a_1} \cdot {p_2} ^ {a_2} \cdot {p_3} ^ {a_3} \cdot ... $. With this, the normal count of divisors function is $d_k(n) = \binom {{a_1} + k - 1} {a_1} \cdot \binom {{a_2} + k - 1} {a_2} \cdot \binom {{a_3} + k - 1} {a_3} \cdot ...$ The strict count of divisors function, a count of divisors function excluding 1 as a factor, is, in turn, $d_k'(n) = \sum_{j=0}^k (-1)^{k-j} \binom {k}{j} d_j(n)$. With the strict count of divisors function, we can say $ D_{a,k}(n) = D_{a,k}(n-1) + d_k'(n) \cdot n^a $. Thus, if we sieve all the numbers from 1 to $n^\frac{2}{3}$, for each number we can compute its value of $d_k(n)$, from which we can calculate $d_k'(n)$, which in turn lets us keep a running total of $D_{a,k}(n)$. We segment the sieve into blocks of size $n^\frac{1}{3}$ to stay within our memory bound.

The second observation is that, using the following identity, we can express $D_{a,k}(n)$ as sums that rely only on $D_{a,k}(j)$ where $j < n^\frac{2}{3}$ and on arbitrary values of $D_{a,1}(j) = \sum_{k=2}^j k^a$, which can generally be computed in constant or log time for any value of $j$ for the powers $a$ that we're working with.

If we let $d_{a,k}(n) = D_{a,k}(n) - D_{a,k}(n-1)$, we start with this combinatorial identity, whose derivation I have written up elsewhere but don't have the room to show:

For this identity, $t$ is some value such that $1 < t < n$. If you look through the sums closely, you should see that the right-hand side doesn't rely on any values of $d_{a,k}(j)$ with $j > t$ or of $D_{a,k}(j)$ with $ j > \frac{n}{t}$, except when $k = 1$, as desired.

Even if we set $t=n^\frac{2}{3}$ and sieve values of $D_{a,k}(j)$ up to $n^\frac{2}{3}$, these sums are too large to be computed within our time bound. Instead, we have to take advantage of certain symmetries to reduce the work in these sums. I'm going to hand-wave over the process leading to this - you can find a description of it and much else here - but, taking one last leap, the following Mathematica code takes the identity we were just discussing and develops it further as the function labeled DDFast. Remember, when tracing through the code, that in a full implementation any instances of DD and d inside DDFast would be looked up thanks to sieving and caching within our time and space bounds, so what follows isn't an accurate reflection of its execution time, just the accurate mechanics of DDFast.

The function labeled DDFast, which is really the key here, can be computed in $O(n^\frac{2}{3})$ time as long as all of the functions it references have been cached.

So there we have it. If you compute SumPrimesFast and, in turn, DDFast, but interleave the computation of those sums with the sieving process described above, you end up with roughly an $O(n^\frac{2}{3} \log n)$ time and $O(n^\frac{1}{3} \log n)$ space algorithm that sums primes raised to non-negative integer powers, including, for $a = 0$, a prime counting function.

It's worth taking a look, too, at rng[ A_, start_, end_ ] := Sum[m^A, {m, start, end}] - the reason this algorithm works for non-negative integer powers is that rng can be computed in constant time for such powers with Faulhaber's formula. The algorithm could possibly work for other powers if rng can be usefully extended to them with sufficient speed and precision.

I have an implementation of this algorithm in C++ here. It has a few C math precision issues I haven't tracked down, so it's often off by 2 or 3, but it does essentially implement what I've described here, and might be useful for stepping through with breakpoints.

A somewhat more fleshed-out description of this can be found here; several details that I've hand-waved over are spelled out there. Better written up still is a description of this as a prime counting algorithm, here, which does a better job of showing a few key derivations.

This is a difficult problem. I asked about it here on math.se and here on cstheory. In both cases my question was somewhat broader: allowing sums over different exponents rather than just 1. In the second link I also allowed interactive proofs.

I noted that for exponent 0 there are efficient algorithms (superior to enumerating primes): the various analytic or combinatorial $\pi(x)$ algorithms. But no one was able to suggest an efficient means of solving the problem for any other exponent. An interactive proof was proposed that allows a person to check a proposed value in linear time, but the confidence in the proof is asymptotically 0% against a cunning adversary.

I am not aware of any hardness results, though, so this seems to be wide-open.

Unless I am missing something, computing the sum of the kth powers of the positive primes less than n should be equivalent in polynomial complexity to finding all the primes less than n, when k is a positive integer. Also, it may be quick to approximate a sum of kth powers by considering the related sum for numbers of the form, say, 210m + j, where j is coprime to, and less than, 210. Gerhard "Ask Me About System Design" Paseman, 2011.11.20
– Gerhard Paseman, Nov 21 '11 at 1:07

The complexity of counting primes is sublinear. I've seen $n^{2/3}\log n$ for time with $n^{1/3}\log n$ for memory, but I don't know if that is the current record. However, that counting is based on some non-trivial sieve-type identities, and I do not immediately see whether there are any analogs of those identities for power sums.
– fedja, Nov 21 '11 at 5:32

Fedja: Are you saying counting primes is sub-linear, or do you mean summing primes? If the latter, I would be fascinated to see a reference. If the former, there is a method from Lagarias-Odlyzko that is commonly listed as having variants with O(x^(3/5+epsilon)) time and O(x^epsilon) space, or O(x^(1/2+epsilon)) time and O(x^(1/4+epsilon)) space (though with quite difficult constant factors).
– Nathan McKenzie, Nov 21 '11 at 6:15

I know only about counting (I just responded to Gerhard). It is, indeed, an interesting question whether we can do better than $n$ for the sum rather than for the count. I haven't seen it anywhere, but the first natural step would be to go back to all those partial sieve formulae and see if they can be modified to give the sums rather than the counts. I do not see how to do it immediately, but I do not see why it is impossible in principle either.
– fedja, Nov 21 '11 at 12:23

Unfortunately, fedja and I are experiencing a misunderstanding. I am referring to finding, not counting, primes. Only if pi(n) and pi(n+1) are different have we found a prime; knowing pi(n) and pi(n+k) does not find the primes in between in general. Gerhard "Ask Me About System Design" Paseman, 2011.11.21
– Gerhard Paseman, Nov 21 '11 at 19:07