No prime in the sequence can be obtained by adding distinct earlier primes of
the sequence (not necessarily all of them).

Starting with 2, the sequence goes
like this: 2, 3, 7, 11, 17, 41, 47, ...

Starting with 3, the sequence goes
like this: 3, 5, 7, 11, 13, 17, 47, ...

Questions:

1. Can you devise an efficient algorithm to calculate the sequence?

2. Calculate both sequences up to the 100th member.

______________

Historical note: The sequences asked for here are the
prime version of the so-called "A sequences", defined in the
following way: "The infinite sequence 1<=a1<a2<... is
named an A sequence if no ai is the sum of distinct members of the
sequence other than ai" (see E28, UPiNT, R.K.Guy).
One interesting question about this kind of sequence is whether the sum of
its reciprocals converges. As a matter of fact, Erdős once
proved that S (the sum of the reciprocals of all the members) is less than 103.
Later Levine & O'Sullivan improved this bound to 4. What
would you say this bound can be for the prime-version sequences asked
for here?

Jud McCranie wrote (24/2/2001):

Question 2, a) : here are 42 terms, it looks like getting 100 will be
very hard.

"I am making two lists - one is necessary, the other is to make
it much faster. Start with two empty lists, L and C. As we go along, build
a list of the primes in the sequence 2,3,7,11 ... Ln, and the cumulative
sum of these numbers 2,5,12,23 ... Cn (and keep track how many terms there
are in the list). To test to see if a new prime p can be expressed as the
sum of terms already in the list, I call a recursive subroutine with
parameter p to see if p is the sum of smaller terms on the list. The
subroutine takes x=p-Ln and calls itself recursively with parameter x. If
x is in the list, then it has found a solution (a way to sum to p, p=Ln+x).
If not the subroutine calls itself recursively with parameter x-Li, where
Li is the largest member of the list L that is < x. This recursion
continues until either a solution is found, or the parameter passed to the
subroutine is >Ci, in which case the remaining i terms can't possibly
sum to the parameter, so we can pop out of the recursion. At each level of
the recursion, if x-Li fails to produce a solution, it continues with
x-L(i-1) and calls itself recursively."
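A recursive search of this kind, with the cumulative-sum cutoff, might look like the following minimal Python sketch. It is an illustration under assumptions, not McCranie's program: the names `is_prime`, `can_sum` and `non_adding_primes` are hypothetical, trial division stands in for whatever primality test he used, and the recursion tries each list member in turn as the largest term of the sum.

```python
def is_prime(n):
    """Trial division; adequate only for the small values in this sketch."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True


def can_sum(x, members, cums, i):
    """True if x is a sum of distinct terms chosen from members[0:i].

    cums[j] is the cumulative sum members[0] + ... + members[j].
    """
    if x == 0:
        return True
    for j in range(i - 1, -1, -1):
        if x > cums[j]:
            # Even all of members[0..j] together fall short of x: give up.
            return False
        if members[j] <= x and can_sum(x - members[j], members, cums, j):
            return True
    return False


def non_adding_primes(start, count):
    """First `count` members of the sequence whose first member is `start`."""
    members, cums = [], []
    p = start
    while len(members) < count:
        if is_prime(p) and not can_sum(p, members, cums, len(members)):
            members.append(p)
            cums.append(p + (cums[-1] if cums else 0))
        p += 1
    return members
```

Calling `non_adding_primes(2, 7)` or `non_adding_primes(3, 7)` reproduces the seven terms quoted at the top of the page for each sequence. The search is exponential in the worst case, so this form is only practical for the early terms.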

***

Paul Jobling did it (27/2/2001) with
another approach that, although memory-intensive, is algorithmically
very simple. In his own words:

"1. Allocate 50Mb of RAM - this
allows a bitmap representing 200 million numbers. Clear each entry in
the bitmap.
2. Set p to 1.
3. Increment p. If bitmap[p] is set or p is not prime, goto 3.
4. For i=200 million downto 0: if bitmap[i] is set then set bitmap[i+p]
5. Set bitmap[p]
6. Goto 3.

Unfortunately I don't think that this approach will help get the
100th term - the bitmap would have to be far too large...every
bit that is set is the sum of 1 or more of the primes that we have found
so far. That is what this algorithm does for you - it keeps track of all
possible sums of the values found, and it is very easy given a new value
to find the new set of all possible sums."
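The same bookkeeping can be sketched in a few lines of Python by using one arbitrary-size integer as the bitmap. This is a sketch under assumptions, not Jobling's code: the function names and the `limit` parameter are hypothetical, and bit 0 is kept set to stand for the empty sum, so that a single shift-and-OR performs both step 4 and step 5.

```python
def is_prime(n):
    """Trial division; adequate only for the small values in this sketch."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True


def non_adding_primes_bitmap(start, limit):
    """Members of the sequence below `limit`, first candidate `start`.

    Bit s of `sums` is set when s is a sum of distinct members found so far.
    """
    mask = (1 << limit) - 1   # discard sums that fall outside the bitmap
    sums = 1                  # bit 0: the empty sum
    members = []
    for p in range(start, limit):
        if is_prime(p) and not (sums >> p) & 1:
            members.append(p)
            # Every old sum s yields a new sum s + p (and 0 + p marks p itself).
            sums |= (sums << p) & mask
    return members
```

With `limit=50` this returns the seven terms listed earlier for either starting value. The memory problem Jobling describes shows up directly: the integer needs one bit for every value up to `limit`.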

Using this approach Paul got the
first 37 members of the first sequence and the first 40 members of the
second one in no more than 5 minutes.

***

I (C.R.) translated Paul's
algorithm to Ubasic, making some minor changes to save memory and
make it faster. This is the code to manage primes less than 65500:

It may be interesting to note that it
produces the first 21 members of either of the two sequences in less than a
second.

***

Additional question: How
large do you think that the 100th member of the sequence will be?

***

Felice Russo has calculated an
empirical trend equation relating the number of digits of each member to its
index in these sequences; extrapolating it gives a forecast of 20-22
digits for the 100th member of both sequences. See below the beautiful graph
he made:

***

An extension of this puzzle is to ask for the
sequences of primes such that none is an algebraic sum (primes can be
added or subtracted) of the previous
ones.
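To make the extension concrete, here is a small Python sketch that tracks every value reachable by adding or subtracting distinct members. It assumes the reading that each earlier prime may be used at most once, with either sign; the function names are hypothetical. Because the reachable values are symmetric about zero, only their absolute values are stored.

```python
def is_prime(n):
    """Trial division; adequate only for the small values in this sketch."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True


def non_algebraic_sum_primes(start, count):
    """First `count` primes, from `start` on, that are not a signed sum
    of distinct earlier members."""
    reach = {0}      # absolute values of all signed sums (0 = empty sum)
    members = []
    p = start
    while len(members) < count:
        if is_prime(p) and p not in reach:
            members.append(p)
            # Each old value v gives v + p and |v - p| once p is available.
            reach |= {v + p for v in reach} | {abs(v - p) for v in reach}
        p += 1
    return members
```

Under that reading, the sequence starting with 2 begins 2, 3, 7, 11, 29: for instance 5 = 2 + 3, 13 = 11 + 2 and 17 = 11 + 7 + 2 - 3 are excluded, while 29 exceeds 2 + 3 + 7 + 11 = 23. The set grows quickly, so the sketch is meant for a handful of terms only.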

***

Paul Jobling sent (3/2/2001) the
following Ubasic version of his original algorithm to get the sequences
just asked for:

Result was
obtained by combining the two methods. Call Paul Jobling’s approach
Method A and Jud McCranie’s approach Method B. The program runs Method
A until adding another member to sequence A would require setting bits
outside of the array bounds. Then it switches to Method B, but instead
of trying to find p-Ln=0, it tries to find p-Ln in the Method A bit
array. (The 0th element in the bit array is set to 1 for the
case of including only Seq. B members in the sum.) This approach ran
orders of magnitude faster than either method by itself. Speed was then
enhanced by a factor of several thousand by forming a jump table from
the Method A bit array. The jump table tells you how many odd jumps you
have to make from a given number to find the next OFF bit. Because
there are extremely long strings of ON bits, you can rule out thousands
or even millions of numbers at a time that do not need to be tested.
The bigger the Method A bit array is, the bigger the jumps tend to be
between OFF bits, and the more efficient the program becomes.

Because the
code was more complicated and therefore more prone to error, I did a
series of tests. I ran Method A alone as far as I could (I made a giant
packed array with 320 billion bits using 40GB of virtual memory; this
ran over a weekend). Then I ran the final version of the mixed-method code
with different array sizes, and the results always matched the Method A
results (but with runtimes on the order of one second!).
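The combination of Method A and Method B described above can be sketched in Python as follows. This is only an illustration of the switch-over idea under assumptions: the names and the `array_bits` parameter are hypothetical, the bit array is a Python integer, and the jump table that gave the large speed-up is not included.

```python
def is_prime(n):
    """Trial division; adequate only for the small values in this sketch."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True


def hybrid_non_adding_primes(start, count, array_bits):
    """First `count` members, using a bit array of `array_bits` bits for
    Method A and a recursive search (Method B) for later members."""
    sums = 1          # bit s set <=> s is a sum of Method A members; bit 0 set
    total = 0         # sum of all Method A members (highest set bit of sums)
    a_members = []    # members folded into the bit array
    b_members = []    # members found after the switch to Method B
    use_a = True

    def reachable(x, i):
        """True if x = (distinct terms of b_members[0:i]) + (a Method A sum)."""
        if (sums >> x) & 1:          # remainder found in the Method A array
            return True
        for j in range(i - 1, -1, -1):
            if b_members[j] <= x and reachable(x - b_members[j], j):
                return True
        return False

    p = start
    while len(a_members) + len(b_members) < count:
        if is_prime(p) and not reachable(p, len(b_members)):
            if use_a and total + p < array_bits:
                sums |= sums << p    # all new sums still fit in the array
                total += p
                a_members.append(p)
            else:
                use_a = False        # from now on, Method B only
                b_members.append(p)
        p += 1
    return a_members + b_members
```

As in the tests described above, the result should not depend on the array size: `hybrid_non_adding_primes(2, 7, 16)` and `hybrid_non_adding_primes(2, 7, 1000)` both return the seven terms listed at the top of the page.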

***

Paul Schmidt wrote in June 2011:

An efficient algorithm to find
the first 100 members of the non-adding prime sequence must be
efficient in both time and storage space. The original solution by Jobling
is time efficient but uses too much space. The following solution
uses the Jobling algorithm but is also space efficient.

To implement the algorithm, the
100th member will need to be represented by a large integer. A
program was written in C++ to solve the problem using the GMP library
to do arbitrary precision math:

That's the whole program.
Unfortunately, it could not finish because the computer ran out of
memory.

To solve this problem, a class
was written to compress the 'bits' variable into another
representation. This class is an array of runs of ones or zeros in
the bit array. So the binary number (starting with the least
significant bit) 1011 can be represented as (0,1,1,2). The first 0
means no zeros, then one 1, then one zero, then 2 ones. The functions
that need to be implemented for this class are scan0, shift, and
inclusive or.

scan0 and shift are trivial
to code. The 'inclusive or' is a little more challenging. Although it
was not taken to this extent, one could write this class and change
the program above by changing the class for 'bits', and modify the
line to call scan0.
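The run-length representation can be illustrated with a short Python sketch. It is not Schmidt's class: the function names are hypothetical, `scan0` is assumed to mean "index of the first zero bit at or after a given position", and shift and inclusive-or are left out. Runs alternate zeros, ones, zeros, ones, starting from the least significant bit.

```python
def to_runs(value):
    """Encode a non-negative integer bitmap as alternating run lengths
    (zeros, ones, zeros, ...), least significant bit first."""
    runs = []
    bit = 0                       # the first run counts zeros
    while value:
        n = 0
        while value and (value & 1) == bit:
            value >>= 1
            n += 1
        runs.append(n)
        bit ^= 1
    return runs


def from_runs(runs):
    """Decode alternating run lengths back into an integer bitmap."""
    value, pos = 0, 0
    for k, n in enumerate(runs):
        if k % 2 == 1:            # odd positions are runs of ones
            value |= ((1 << n) - 1) << pos
        pos += n
    return value


def scan0(runs, start=0):
    """Index of the first zero bit at or after `start`."""
    pos = 0
    for k, n in enumerate(runs):
        if k % 2 == 0 and n > 0 and start < pos + n:
            return max(start, pos)
        pos += n
    return max(start, pos)        # past the stored runs every bit is zero
```

The example from the text checks out: the bits 1011 (least significant first) are the integer 13, and `to_runs(13)` gives `[0, 1, 1, 2]`. The appeal of the representation is that its size depends on the number of runs rather than on the magnitude of the largest sum.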

The functions are also time
efficient. So most of the time in the algorithm is running the
probable prime function (Miller-Rabin).

Since the code uses the
arbitrary precision math library, the code can find the sequence much
further than 100 members. The code was used to find the first 250
members in less than a minute. It is also noted that the primes are
"probable" primes. But the test was run to >99.9999% probability for
each member.
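For reference, a Miller-Rabin probable-prime test can be sketched in Python as below. This is a generic textbook version, not the routine used in the program above; the function name and the default of 10 rounds are assumptions. With the usual worst-case bound, a composite survives one random round with probability at most 1/4, so 10 rounds give an error below 4^-10, about 9.5e-7, which is consistent with a confidence above 99.9999%.

```python
import random


def is_probable_prime(n, rounds=10, rng=random):
    """Miller-Rabin test with `rounds` random bases."""
    if n < 2:
        return False
    for q in (2, 3, 5, 7, 11, 13):      # quick trial division
        if n % q == 0:
            return n == q
    # Write n - 1 as d * 2**r with d odd.
    d, r = n - 1, 0
    while d % 2 == 0:
        d //= 2
        r += 1
    for _ in range(rounds):
        a = rng.randrange(2, n - 1)     # base in [2, n - 2]
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False                # a is a witness: n is composite
    return True
```

A true prime always passes, whatever bases are drawn; only composites can be misjudged, and then only with the small probability estimated above.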

The program found the first 100 members in 15 seconds.
I ran up to member 284 in the sequence in 1 minute and 45 seconds. To
go beyond that, I would need to write the program to be more space
efficient.