I would like to write (or improve) a function that returns an array of integers containing all the prime numbers from 2 to n, and that runs as quickly and efficiently as possible (ideally handling n = 1,000,000,000 in a matter of seconds). I am using Swift 4.1 and Xcode 9.3.

Where I'm currently at

So far, I've used the Sieve of Eratosthenes to make this function that can calculate prime numbers under 1,000,000 in 0.28s, but when I try it with a number like 1 billion or 1 trillion, it takes too long.

1 Answer

That is not the Sieve of Eratosthenes

The Sieve of Eratosthenes computes multiples
of each found prime to mark subsequent composite numbers in the sieve.
Your algorithm computes the remainder of all subsequent numbers instead.
That makes a huge difference. I'll come back to that later; let's start
with a

Review of your current code

The function – as I understand the to parameter –
computes all primes up to and including n. On the other hand,
stride(from: 3, to: n, by: 2) does not include the upper bound,
and that is easily verified with

print(generatePrimes(to: 11)) // [2, 3, 5, 7]

So either rename the function to func generatePrimes(below n: Int)
or use stride(from: 3, through: n, by: 2) to include the upper bound.
I'll do the latter for this review.
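The difference between the two stride variants can be seen in isolation. A minimal sketch (the constant names are only for illustration):

```swift
// `to:` stops before the upper bound; `through:` includes it when it is hit exactly.
let exclusive = Array(stride(from: 3, to: 11, by: 2))
let inclusive = Array(stride(from: 3, through: 11, by: 2))

print(exclusive) // [3, 5, 7, 9]
print(inclusive) // [3, 5, 7, 9, 11]
```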

It would also be a good idea to add a documentation comment which
unambiguously documents the function parameters.

The explicit type annotation is not needed, and the array creation
can be simplified to

var arr = Array(stride(from: 3, through: n, by: 2))

Why is the limit number expected to be greater than 5? That is an
unnecessary restriction and would be unexpected for a caller of your function. It may be that your implementation does not work if
\$ n \le 5 \$, but it would be easy to handle that case separately:
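One way to do that, as a minimal sketch: the helper name `primesForSmallLimit` is hypothetical and not part of your code, and the main function would call it before running the general algorithm. It also shows what a documentation comment for the limit parameter might look like.

```swift
/// Returns the primes up to and including `n` for small limits.
///
/// - Parameter n: The inclusive upper bound.
/// - Returns: The primes `p` with `2 <= p <= n` if `n <= 5`,
///   or `nil` if `n` is larger and the general algorithm has to run.
func primesForSmallLimit(_ n: Int) -> [Int]? {
    guard n <= 5 else { return nil }
    // Only 2, 3 and 5 can occur; keep those that do not exceed n.
    return [2, 3, 5].filter { $0 <= n }
}
```

With this, a limit of 1 or less yields an empty array instead of an error, which is probably what a caller would expect.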

demonstrates, you are leaving the loop too early.
I would also compute the square root only once, and use double-precision
arithmetic: The 24 bit significand of a Float cannot represent large
integers correctly (compare Computing the integer square root of large numbers).
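To illustrate the precision issue, here is a small sketch. The helper `integerSquareRoot` is a hypothetical name, and it assumes that `n` is far below `Int.max`, so that the products in the correction loops cannot overflow.

```swift
// 2^24 + 1 is the smallest positive integer that a Float cannot represent.
let big = 16_777_217
print(Int(Float(big)))  // 16777216 (rounded)
print(Int(Double(big))) // 16777217 (exact)

/// Integer square root via Double, meant to be computed once before the loop.
/// Assumes n is far below Int.max so that (r + 1) * (r + 1) cannot overflow.
func integerSquareRoot(_ n: Int) -> Int {
    precondition(n >= 0, "n must not be negative")
    var r = Int(Double(n).squareRoot())
    // Correct a possible off-by-one caused by floating-point rounding.
    while r * r > n { r -= 1 }
    while (r + 1) * (r + 1) <= n { r += 1 }
    return r
}
```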

Finally, the function should compute the result, but not print it, i.e. this

should be removed. It is generally a good habit to separate computation from I/O.
In addition, this distorts the benchmark results, because you also measure the
time for converting the numbers to strings, and the time to print these strings
(which depends on the output device: printing to a file is faster than printing
to the Terminal or the Xcode console).

With all changes suggested so far, your function would look like this:

Using the Sieve of Eratosthenes

As I said above, your algorithm is different from the Eratosthenes sieve.
Each time a prime number \$ p \$ is found, your code does trial divisions for
all remaining numbers to remove multiples of \$ p \$.
The Eratosthenes sieve computes multiples of \$ p \$ instead:

p*p, p*p + p, p*p + 2*p, p*p + 3*p, ...

and marks these as composite. Note that there are far fewer values to compute.

Also your algorithm removes and inserts array elements frequently.
That is slow because the remaining elements must be shifted to the left
or to the right.

The Eratosthenes sieve works with a fixed-sized boolean array instead.
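A minimal sketch of such a sieve follows. The function name `eratosthenesSieve(through:)` is chosen for illustration, and the code is one straightforward way to write it, not necessarily the fastest.

```swift
/// Returns all primes p with 2 <= p <= n, using a boolean sieve.
func eratosthenesSieve(through n: Int) -> [Int] {
    guard n >= 2 else { return [] }

    // isComposite[k] becomes true once k is known to be a multiple of a smaller prime.
    var isComposite = [Bool](repeating: false, count: n + 1)
    var primes: [Int] = []
    let root = Int(Double(n).squareRoot())

    for p in 2...n {
        if isComposite[p] { continue }
        primes.append(p)
        // Only primes up to sqrt(n) have unmarked multiples left in the sieve.
        if p <= root {
            for multiple in stride(from: p * p, through: n, by: p) {
                isComposite[multiple] = true
            }
        }
    }
    return primes
}

print(eratosthenesSieve(through: 30)) // [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

No elements are inserted or removed here. The only array that grows is the result, and that happens by appending at the end.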

This can surely be improved further. A first step would be to handle the case p = 2
separately and to use the sieve only for odd numbers (which, however, complicates the
index calculations a bit).
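As a sketch of that index calculation (the name `oddSieve(through:)` is hypothetical): index `i` stands for the odd number `2 * i + 3`, so stepping the index by `p` advances the value by `2 * p` and skips the even multiples.

```swift
/// Returns all primes p with 2 <= p <= n, sieving odd numbers only.
func oddSieve(through n: Int) -> [Int] {
    guard n >= 2 else { return [] }
    guard n >= 3 else { return [2] }

    // Index i represents the odd number 2 * i + 3.
    let count = (n - 3) / 2 + 1
    var isComposite = [Bool](repeating: false, count: count)
    var primes = [2] // The only even prime is handled separately.

    for i in 0..<count {
        if isComposite[i] { continue }
        let p = 2 * i + 3
        primes.append(p)
        // `p <= n / p` is equivalent to p * p <= n and avoids overflow.
        if p <= n / p {
            // p * p is odd; its index is (p * p - 3) / 2.
            for j in stride(from: (p * p - 3) / 2, to: count, by: p) {
                isComposite[j] = true
            }
        }
    }
    return primes
}
```

The sieve array is about half as large as before, and each prime marks only half as many entries.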

But, as you see, computing the primes up to 1 billion is feasible with the sieve
of Eratosthenes (and the numbers are correct, you can compare them with
https://primes.utm.edu/howmany.html).

Going for the trillion

You were also thinking of computing all primes up to one trillion. That might be
a problem, no matter which algorithm you choose. According to
https://primes.utm.edu/howmany.html, there are 37,607,912,018 prime numbers
below 1 trillion. Even at only 4 bytes per prime (and a 32-bit integer is already
too small to hold values up to one trillion), that would still require approx.
140 GiB of memory, which makes returning an array with all those numbers difficult.

So that would require a different approach, one that computes the prime numbers in
some given range instead of all prime numbers up to a limit.
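One possible shape for such an approach is a sieve over a window, sketched below under the assumption that a single range fits into memory. The name `primes(in:)` and its signature are hypothetical. It needs memory proportional to the length of the range plus the square root of the upper bound, not to the upper bound itself.

```swift
/// Returns all primes inside `range`, sieving only that window.
func primes(in range: ClosedRange<Int>) -> [Int] {
    let lo = max(range.lowerBound, 2)
    let hi = range.upperBound
    guard lo <= hi else { return [] }

    // Integer square root of hi; the loops correct possible rounding errors.
    var root = Int(Double(hi).squareRoot())
    while root * root > hi { root -= 1 }
    while (root + 1) * (root + 1) <= hi { root += 1 }

    var baseIsComposite = [Bool](repeating: false, count: root + 1)
    var isComposite = [Bool](repeating: false, count: hi - lo + 1)

    var p = 2
    while p <= root {
        if !baseIsComposite[p] {
            // Continue the small sieve that produces the base primes up to sqrt(hi).
            for m in stride(from: p * p, through: root, by: p) {
                baseIsComposite[m] = true
            }
            // Mark multiples of p inside [lo, hi], starting no lower than p * p.
            let firstInRange = (lo + p - 1) / p * p
            for m in stride(from: max(p * p, firstInRange), through: hi, by: p) {
                isComposite[m - lo] = true
            }
        }
        p += 1
    }
    return (lo...hi).filter { !isComposite[$0 - lo] }
}
```

A caller could then process one window at a time, for example `primes(in: 1_000_000_000_000...1_000_001_000_000)`, instead of materializing every prime up to the limit at once.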