2 Answers

See this site for a summary of the key strength estimates used by various researchers and organizations.

Your "512-bits in 12μs" is completely bogus. Let's see where it comes from. 1999 was the year when the first 512-bit general factorization was performed, on a challenge published by RSA (the company) and called RSA-155 (because the number consists of 155 decimal digits -- in binary, the length is 512 bits). That factorization took six months. At the Eurocrypt event organized the same year (in May; at that time the 512-bit factorization effort had begun but was not yet complete), Adi Shamir, from the Weizmann Institute, presented a theoretical device called TWINKLE which, supposedly, could help quite a bit in a factorization effort. It was to consist of a huge number of diodes flashing at carefully selected frequencies, inside a kind of black tube. Shamir brought a custom device which, from 10 meters away, looked like a coffee machine. He asked people to switch off the light, so that the Eurocrypt attendees could marvel at the four red diodes flashing at intervals of 2, 3, 5 and 7 seconds. Ooh! and Aah! they went, although the actual machine, were it ever built, would require a few million diodes flashing at frequencies of 10 or 100 gigahertz. So the idea is fun (at least for researchers in cryptology, who are known to have a strange sense of humor) but has not yet gone beyond the theoretical-sketch stage. Shamir is a great showman.

However, TWINKLE is only "help". The best known factorization algorithm is called the General Number Field Sieve (GNFS); the two algorithms which come next are the Quadratic Sieve (QS) and the Elliptic Curve Method (ECM). A 512-bit number is out of reach of QS and ECM with today's technology, and a fortiori with 1999's technology. GNFS is very complex (mathematically speaking), especially since it requires a careful selection of some critical parameters ("polynomial selection"). So there must be an initial effort by very smart brains (with big computers, but the brains are the most important part here). Afterward, GNFS consists of two parts, the sieving and the linear reduction. The sieving can be run in parallel over hundreds or thousands of machines, which must still be relatively big (in RAM), but this is doable. The linear reduction involves computing with a matrix which is too big to fit in a computer (by several orders of magnitude, even if we assume that said computer has terabytes of fast RAM). There are algorithms to keep the matrix (which is quite sparse) in a compressed format and still compute on it, but this is hard. In the 512-bit factorization, sieving took about 80% of the total time, but for bigger numbers the linear reduction becomes the bottleneck.

TWINKLE is only about speeding up the sieving part. It does nothing about the linear reduction. In other words, it speeds up the part which is easy (relatively speaking). Even TWINKLE-enhanced, the sieving half would be nowhere near 12μs; rather, it might help bring a four-month sieving effort down to, say, three weeks. Which is good, in a scientific way, but not a record breaker, especially since linear reduction dominates for larger sizes. The 12μs figure seems to come from a confusion with an even more mythical beast, the Quantum Computer, which could easily factor big numbers if a QC with 512 "qubits" could be built. D-Wave has recently announced a quantum computer with 128 qubits, but it turned out that these were not "real" qubits, and they are unsuitable for factorization (theoretically, they can still do some efficient approximations in optimization problems, which is great but basically not applicable to cryptography, because cryptographic algorithms are not amenable to approximations -- they are designed so that a single wrong bit scrambles the whole thing). The best "real" QC so far seems to be the prototype by IBM which, as far as I recall, has 5 qubits, enabling it to establish that 15 is equal to 3 times 5.

The current RSA factorization record is for a 768-bit integer, announced in December 2009. It took four years and involved the smartest number theorists currently living on Earth, including Lenstra and Montgomery, who have somewhat god-like status in those circles. I recently learned that the selection of the parameters for a 1024-bit number factorization has begun (that's the "brainy" part); the sieving is technically feasible (it will be expensive and involve years of computation time on many university clusters) but, for the moment, nobody knows how to do the linear reduction part for a 1024-bit integer. So do not expect a 1024-bit break any time soon.

Right now, a dedicated amateur using the published code (e.g. Msieve) may achieve a 512-bit factorization if he has access to powerful computers (several dozen big PCs, at least one of them chock-full of fast RAM) and a few months of free time; basically, "dedicated amateur" means "bored computer science student in a wealthy university". Anything beyond 512 bits is out of reach of an amateur.

Summary: in your code, you can return "practically infinite" as cracking time for all key lengths. A typical user will not break a 1024-bit RSA key, not now and not in ten years either. There are about a dozen people on Earth who can, with any credibility, claim that it is conceivable, with a low but non-zero probability, that they might be able to factor a single 1024-bit integer at some unspecified time before year 2020.

(However, it is extremely easy to botch an implementation of RSA, or of any application using RSA, in such a way that the confidential data it holds can be recovered without bothering with the RSA key at all. If you use 1024-bit RSA keys, you can be sure that when your application is hacked, it will not be through RSA key factorization.)

+1. I didn't want to claim that the 512-bit factoring thing was bogus since I know far less about quantum (or the whole area in general) than you do, but I had a strong feeling it was. I also had a fair idea factoring algorithms were tailored per number, but not to the extent you describe, as my number theory just isn't that good yet. So +2 if I could.
–
user2213Jun 12 '11 at 16:23

Short answer: The easiest method would be to use the prime number theorem, but be aware it's an approximation. Estimate how long it would take you to try each of those primes; time per prime × number of primes gives you the total time. This gives you an estimate for a brute-force search.

You could also use the running time estimation for the quadratic sieve or the general number field sieve. This will give you estimates for factoring algorithms actually used by people breaking RSA numbers.

Long background:

Number theory time!

Firstly, let's have a look at the size of the numbers you're talking about. 2^3 = 8, which is 1000 in binary, is a four-bit number: the smallest value for which four bits are needed. Likewise 2^2 = 4 is a three-bit number (100). So for a given x, the smallest value needing x bits is 2^(x-1). 2^2047 = 16158503035655503650357438344334975980222051334857742016065172713762327569433945446598600705761456731844358980460949009747059779575245460547544076193224141560315438683650498045875098875194826053398028819192033784138396109321309878080919047169238085235290822926018152521443787945770532904303776199561965192760957166694834171210342487393282284747428088017663161029038902829665513096354230157075129296432088558362971801859230928678799175576150822952201848806616643615613562842355410104862578550863465661734839271290328348967522998634176499319107762583194718667771801067716614802322659239302476074096777926805529798115328. That's the size of number you're dealing with here, i.e. the size of the modulus n to be factored.
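The bit-length arithmetic above is easy to check with Python's arbitrary-precision integers; a quick sketch (the variable names are just illustrative):

```python
# Smallest 2048-bit value: 2^(2048-1)
n_min = 2 ** 2047
print(n_min.bit_length())       # 2048

# The small examples from the text:
print((2 ** 3).bit_length())    # 8 = 0b1000 -> 4 bits
print((2 ** 2).bit_length())    # 4 = 0b100  -> 3 bits
```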

Next big question then is how is n constructed? n=pq as you know from the definition of RSA, so you're looking for two primes as factors of that number. The question then becomes how do we determine a number is prime and can we count them?

So by definition a number n is prime if gcd(n, x) = 1 for every x with 1 < x < n. However, we can improve on that. Any number is either prime or not. If it is not prime, then its gcd with at least one prime must be greater than 1 (otherwise it would be prime). We conclude from this that any non-prime integer must be divisible by some set of primes. A formal mathematical proof is not actually much of a leap from here.

This is called the fundamental theorem of arithmetic and simplifies matters slightly. So now, when working out if a number is prime, we no longer need to try every number, just the numbers we already know are prime!

This is clearly still really slow, so let's make another observation: since factors occur in pairs, the smaller of the two is at most the square root of the number. Restricted to N (the set of natural numbers), this gives an upper bound on the largest value we need to check. So now, for any number N, we search each integer from 2 up to sqrt(N), determine which are prime, and check whether each prime divides N. I'm not going to estimate the run time of that because I'd undoubtedly get it wrong, but it will take a long time.
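The trial-division idea just described can be sketched in a few lines; this is a minimal illustration (dividing by every integer up to sqrt(N) rather than only primes, which is wasteful but simpler), not anything you could use against RSA-sized numbers:

```python
import math

def is_prime(n: int) -> bool:
    """Trial division up to sqrt(n) -- correct, but hopeless at RSA sizes."""
    if n < 2:
        return False
    for d in range(2, math.isqrt(n) + 1):
        if n % d == 0:
            return False
    return True

def smallest_factor(n: int) -> int:
    """Smallest prime factor of n (returns n itself when n is prime)."""
    for d in range(2, math.isqrt(n) + 1):
        if n % d == 0:
            return d
    return n
```

For example, `smallest_factor(33)` returns 3 long before the loop reaches sqrt(33), which is the point the last comment below makes.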

Now you see the strength of RSA. Pick a very large prime and you end up with a long way to go. As it currently stands, we have to start from 2, which is clearly awful.

Primality testing aims to improve on that using a variety of techniques. The naive method is the one we've just discussed. I think a detailed discussion of these techniques is probably more appropriate for Math, so let me sum it up: all of the runtimes are rubbish and using this as a way to count primes would be horrendous.

So, we cannot reliably count the number of primes less than a number without taking forever, since doing it this way is effectively analogous to integer factorisation. What about a function that somehow counts primes some other way?

Enter \pi(n) \approx \frac{n}{\log(n) - 1.08366}, Legendre's approximation arising from the Prime Number Theorem, which estimates the number of primes up to n. It is, however, exactly that: an approximation. Ideally such a function would count the primes exactly, but at present it simply gives you an estimate. For your purposes, this could be considered good enough.
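Plugging this approximation into the "time per prime × number of primes" recipe from the short answer gives a concrete (and deliberately rough) brute-force estimate. The 10^9 trials per second figure below is an assumed rate, not a measurement, and the computation is done in log space because 2^1024 overflows a double:

```python
import math

LN2 = math.log(2)

def log10_prime_count(bits: int) -> float:
    """log10 of Legendre's estimate pi(2^bits) = 2^bits / (bits*ln 2 - 1.08366)."""
    return bits * math.log10(2) - math.log10(bits * LN2 - 1.08366)

bits = 1024                                    # candidate primes for a 2048-bit modulus
log10_candidates = log10_prime_count(bits)     # about 10^305 primes below 2^1024
log10_seconds = log10_candidates - 9           # assumed 10^9 trials per second
log10_years = log10_seconds - math.log10(3600 * 24 * 365.25)
print(f"~10^{log10_candidates:.0f} primes, ~10^{log10_years:.0f} years")
```

Whatever rate you assume, the answer stays so far beyond the age of the universe that the exact constant is irrelevant; that is the whole point.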

However, it is absolutely still an approximation. Take a look at the rest of the article. Amongst other things, other estimations are dependent on the Riemann Hypothesis.

Ok, so, what about integer factorisation? Well, the second-best method to date is called the Quadratic Sieve and the best is the General Number Field Sieve. Both of these methods touch some fairly advanced maths; if you're serious about factoring large numbers, I'd get reading up on these. Certainly you should use their running-time estimates rather than the prime number theorem, since anyone attacking a large RSA modulus will be using these and not a brute-force search.
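The GNFS running-time estimate mentioned above is the heuristic L-notation cost L_n[1/3, (64/9)^(1/3)] = exp((64/9)^(1/3) · (ln n)^(1/3) · (ln ln n)^(2/3)). A sketch of how you might evaluate it (note the o(1) term in the real formula is dropped here, so treat the outputs as orders of magnitude only):

```python
import math

def gnfs_ops_log10(bits: int) -> float:
    """log10 of the heuristic GNFS cost L_n[1/3, (64/9)^(1/3)], o(1) dropped."""
    ln_n = bits * math.log(2)
    c = (64 / 9) ** (1 / 3)
    return c * ln_n ** (1 / 3) * math.log(ln_n) ** (2 / 3) / math.log(10)

for b in (512, 768, 1024, 2048):
    print(f"{b}-bit modulus: ~10^{gnfs_ops_log10(b):.0f} operations")
```

For a 512-bit modulus this lands around 10^19 operations (months on late-1990s hardware, consistent with the RSA-155 effort), and each doubling of the key size adds several more orders of magnitude.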

But I want to know about quantum?

Ok, fair enough. Integer factorisation on a quantum computer can be done in ridiculously short amounts of time assuming we will be able to implement Shor's Algorithm. I should point out, however, this requires a quantum computer. As far as I'm aware, the development of quantum computers of the scale that can crack RSA is currently a way off. See quantum computing developments.

In any case, Shor's Algorithm runs in polynomial time, which is exponentially faster than any known classical method. The page on it gives you an estimate of its run time, which you may like to include in your estimates.
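To make "polynomial" concrete: a common back-of-envelope figure for Shor's algorithm is on the order of (log2 n)^3 quantum gate operations. That cubic exponent is a rough illustrative assumption, not an exact gate count, but it shows the key property -- doubling the key size only multiplies the quantum work by 8:

```python
def shor_gates(bits: int) -> int:
    """Illustrative gate count for Shor's algorithm, ~ (log2 n)^3."""
    return bits ** 3

for b in (512, 1024, 2048):
    print(f"{b}-bit modulus: ~{shor_gates(b):.1e} gate operations")
```

Compare that polynomial growth with the GNFS estimate above, where each doubling of the key size adds many orders of magnitude, and you see why a large quantum computer would break RSA outright.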

+1 Thanks for the reply. I know it takes real effort to write such a reply. But as a non-math person, I hardly understand the details.
–
PredatorJun 12 '11 at 12:00


The links provided are too advanced for me. Can you roughly estimate the required time for cracking those RSA key lengths?
–
PredatorJun 12 '11 at 12:18

@Gens your best bet is to have a go with pari/gp and stick to the brute force approximation; you can actually calculate real numbers with that. The wikipedia links talk about computational complexity; you can plug numbers into these - the trick is working out what this means in terms of computational expense. In fact, computational complexity only estimates the largest, dominating, set of terms (see mathematical analysis) so actual run times in terms of work to be done will vary per implementation and processor.
–
user2213Jun 12 '11 at 13:50

Thanks for suggestion, but maybe I need to put a bounty for someone doing that works for me :)
–
PredatorJun 12 '11 at 13:57

One small concern with this answer. You say that the "largest possible factor" of a number is its square root. This is incorrect, of course; it's the (inclusive) upper bound of the series of primes which must be considered before the number can be declared prime. That is, given the number 33, whose square root is 5.75(-ish), we need not test whether 11 is a factor, because we would have already found a factor when we tested 3. This can be shown by the fact that a worst-case scenario for the algorithm is the square of a prime, in which case 5, for example, is the highest test needed for 25.
–
BMDanMar 20 '13 at 18:30