It’s a fairly well-known function these days, having first become widely known when it appeared in the Quake III Arena source code released in 2005. It was originally attributed to John Carmack but turned out to have a long history before Quake, going back through SGI and 3dfx to Ardent Computer in the mid ’80s and the original author, Greg Walsh. The concrete code above is an adapted version of the Quake code (that’s where the comments are from).

This post has a bit of fun with this hack. It describes how it works, how to generalize it to any power between -1 and 1, and sheds some new light on the math involved.

(It does contain a fair bit of math. You can think of the equations as notes – you don’t have to read them to get the gist of the post but you should if you want the full story and/or verify for yourself that what I’m saying is correct).

Why?

Why do you need to calculate the inverse of the square root – and need it so much that it’s worth implementing a crazy hack to make it fast? Because it’s part of a calculation you do all the time in 3D programming. In 3D graphics you use surface normals, 3-coordinate vectors of length 1, to express lighting and reflection. You use a lot of surface normals. And calculating them involves normalizing a lot of vectors. How do you normalize a vector? You find the length of the vector and then divide each of the coordinates by it. That is, you multiply each coordinate with

1/√(v₁² + v₂² + v₃²)

Calculating v₁² + v₂² + v₃² is relatively cheap. Finding the square root and dividing by it is expensive. Enter FastInvSqrt.

What does it do?

What does the function actually do to calculate its result? It has 4 main steps. First it reinterprets the bits of the floating-point input as an integer.

int i = *(int*)&x; // evil floating point bit level hack

It takes the resulting value and does integer arithmetic on it which produces an approximation of the value we’re looking for:

i = 0x5f3759df - (i >> 1); // what the fuck?

The result is not the approximation itself though, it is an integer which happens to be, if you reinterpret the bits as a floating point number, the approximation. So the code does the reverse of the conversion in step 1 to get back to floating point:

x = *(float*)&i;

And finally it runs a single iteration of Newton’s method to improve the approximation.

x = x*(1.5f-(xhalf*x*x));

This gives you an excellent approximation of the inverse square root of x. The last part, running Newton’s method, is relatively straightforward so I won’t spend more time on it. The key step is step 2: doing arithmetic on the raw floating-point number cast to an integer and getting a meaningful result back. That’s the part I’ll focus on.
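Putting the four steps together gives the complete function. This is a sketch reassembled from the snippets above; I’ve used memcpy rather than the original pointer casts to stay within defined C behavior, but the bits moved around are the same:

```c
#include <math.h>
#include <string.h>

float FastInvSqrt(float x) {
    float xhalf = 0.5f * x;
    int i;
    memcpy(&i, &x, sizeof i);          /* step 1: float bits as integer */
    i = 0x5f3759df - (i >> 1);         /* step 2: the magic approximation */
    memcpy(&x, &i, sizeof x);          /* step 3: bits back to float */
    x = x * (1.5f - xhalf * x * x);    /* step 4: one Newton iteration */
    return x;
}
```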

What the fuck?

This section explains the math behind step 2. (The first part of the derivation below, up to the point of calculating the value of the constant, appears to have first been found by McEniry).

Before we can get to the juicy part I’ll just quickly run over how standard floating-point numbers are encoded. I’ll just go through the parts I need, for the full background wikipedia is your friend. A floating-point number has three parts: the sign, the exponent, and the mantissa. Here’s the bits of a single-precision (32-bit) one:

s e e e e e e e e m m m m m m m m m m m m m m m m m m m m m m m

The sign is the top bit, the exponent is the next 8 bits, and the mantissa the bottom 23. Since we’re going to be calculating the inverse square root, which is only defined for positive values, I’m going to be assuming the sign is 0 from now on.

When viewing a floating-point number as just a bunch of bits the exponent and mantissa are just plain positive integers, nothing special about them. Let’s call them E and M (since we’ll be using them a lot). On the other hand, when we interpret the bits as a floating-point value we’ll view the mantissa as a value between 0 and 1, so all 0s means 0 and all 1s is a value very close to but slightly less than 1. And rather than use the exponent as an 8-bit unsigned integer we’ll subtract a bias, B, to make it a signed integer between -127 and 128. Let’s call the floating-point interpretation of those values e and m. In general I’ll follow McEniry and use upper-case letters for values that relate to the integer view and lower-case for values that relate to the floating-point view.

Converting between the two views is straightforward:

m = M / L        e = E - B

For 32-bit floats L is 2^23 and B is 127. Given the values of e and m you calculate the floating-point number’s value like this:

x = (1 + m) · 2^e

and the value of the corresponding integer interpretation of the number is

I_x = E · L + M

Now we have almost all the bits and pieces I need to explain the hack so I’ll get started and we’ll pick up the last few pieces along the way. The value we want to calculate, given some input x, is the inverse square root, or

y = 1/√x = x^(-1/2)

For reasons that will soon become clear we’ll start off by taking the base-2 logarithm on both sides:

log2(y) = -(1/2) · log2(x)

Since the values we’re working with are actually floating-point we can replace x and y with their floating-point components:

log2(1 + m_y) + e_y = -(1/2) · (log2(1 + m_x) + e_x)

Ugh, logarithms. They’re such a hassle. Luckily we can get rid of them quite easily but first we’ll have to take a short break and talk about how they work.

On both sides of this equation we have a term that looks like this:

log2(1 + v)

where v is between 0 and 1. It just so happens that over that interval this function is pretty close to a straight line. Or, in equation form:

log2(1 + v) ≈ v + σ

where σ is a constant we choose. It’s not a perfect match but we can adjust σ to make it pretty close. Using this we can turn the exact equation above that involved logarithms into an approximate one that is linear, which is much easier to work with:

m_y + σ + e_y ≈ -(1/2) · (m_x + σ + e_x)

Now we’re getting somewhere! At this point it’s convenient to stop working with the floating-point representation and use the definitions above to substitute the integer view of the exponent and mantissa:

M_y/L + σ + E_y - B ≈ -(1/2) · (M_x/L + σ + E_x - B)

If we shuffle these terms around a few steps we’ll get something that looks very familiar (the details are tedious, feel free to skip them):

M_y + L·E_y ≈ (3/2)·L·(B - σ) - (1/2)·(M_x + L·E_x)

After this last step something interesting has happened: among the clutter we now have the value of the integer representations on either side of the equation:

I_y ≈ (3/2)·L·(B - σ) - (1/2)·I_x

In other words the integer representation of y is some constant minus half the integer representation of x. Or, in C code:

i = K - (i >> 1);

for some K. Looks very familiar right?

Now what remains is to find the constant. We already know what B and L are but we don’t have σ yet. Remember, σ is the adjustment we used to get the best approximation of the logarithm, so we have some freedom in picking it. I’ll pick the one that was used to produce the original implementation, σ = 0.0450465. Using this value you get:

(3/2)·L·(B - σ) = (3/2) · 2^23 · (127 - 0.0450465) = 1597463007

Want to guess what the hex representation of that value is? 0x5f3759df. (As it should be of course, since I picked σ to get that value.) So the constant is not a bit pattern as you might think from the fact that it’s written in hex, it’s the result of a normal calculation rounded to an integer.
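You can reproduce this computation in a couple of lines (the function name is mine):

```c
/* K = (3/2) * L * (B - sigma), with L = 2^23 and B = 127. */
long long magic_constant(double sigma) {
    const double L = 8388608.0;  /* 2^23 */
    const double B = 127.0;
    return (long long)(1.5 * L * (B - sigma));
}
```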

But as Knuth would say: so far we’ve only proven that this should work, we haven’t tested it. To give a sense for how accurate the approximation is here is a plot of it along with the accurate inverse square root:

This is for values between 1 and 100. It’s pretty spot on right? And it should be – it’s not just magic, as the derivation above shows, it’s a computation that just happens to use the somewhat exotic but completely well-defined and meaningful operation of bit-casting between float and integer.

But wait there’s more!

Looking at the derivation of this operation tells you something more than just the value of the constant though. You will notice that the derivation hardly depends on the concrete value of any of the terms – they’re just constants that get shuffled around. This means that if we change them the derivation still holds.

First off, the calculation doesn’t care what L and B are. They’re given by the floating-point representation. This means that we can do the same trick for 64- and 128-bit floating-point numbers if we want, all we have to do is recalculate the constant, which is the only part that depends on them.
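For example, a double-precision sketch. I haven’t derived the 64-bit constant here; 0x5fe6eb50c7b537a9 is the commonly quoted value you get from the same formula with L = 2^52 and B = 1023, so treat it as an assumption rather than something this post establishes:

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

double FastInvSqrtD(double x) {
    double xhalf = 0.5 * x;
    int64_t i;
    memcpy(&i, &x, sizeof i);
    i = 0x5fe6eb50c7b537a9LL - (i >> 1);   /* 64-bit magic constant */
    memcpy(&x, &i, sizeof x);
    x = x * (1.5 - xhalf * x * x);         /* one Newton iteration */
    return x;
}
```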

Secondly it doesn’t care which value we pick for σ. The σ that minimizes the difference between the logarithm and v + σ may not, and indeed does not, give the most accurate approximation. That’s due to a combination of floating-point rounding and the interaction with the Newton step. Picking σ is an interesting subject in itself and is covered by McEniry and Lomont.

Finally, it doesn’t depend on -1/2. That is, the exponent here happens to be -1/2 but the derivation works just as well for any other exponent between -1 and 1. If we call the exponent p (because e is taken) and do the same derivation with that instead of -1/2 we get:

I_y ≈ (1 - p)·L·(B - σ) + p·I_x

Let’s try a few exponents. First off p = 0.5, the normal non-inverse square root:

I_y ≈ (1/2)·L·(B - σ) + (1/2)·I_x

or in code form,

i = 0x1fbd1df5 + (i >> 1);

Does this work too? Sure does:

This may be a well-known method to approximate the square root but a cursory Google and Wikipedia search didn’t suggest that it was.

It even works with “odd” powers, like the cube root, p = 1/3:

I_y ≈ (2/3)·L·(B - σ) + (1/3)·I_x

which corresponds to:

i = (int) (0x2a517d3c + (0.333f * i));

Since this is an odd factor we can’t use shift instead of multiplication. Again the approximation is very close:

At this point you may have noticed that when changing the exponent we’re actually doing something pretty simple: just adjusting the constant by a linear factor and changing the factor that is multiplied onto the integer representation of the input. These are not expensive operations so it’s feasible to do them at runtime rather than pre-compute them. If we pre-multiply just the two other factors:

L·(B - σ) = 0x3f7a3bea

we can calculate the value without knowing the exponent in advance:

i = (1 - p) * 0x3f7a3bea + (p * i);

If you shuffle the terms around a bit you can even save one of the multiplications:

i = 0x3f7a3bea + p * (i - 0x3f7a3bea);

This gives you the “magic” part of fast exponentiation for any exponent between -1 and 1. The one piece we still need to get a fast exponentiation function that works for all exponents, and is as accurate as the original inverse square root function, is a generalized Newton approximation step. I haven’t looked into that so that’s for another blog post (most likely by someone other than me).
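Here is a sketch of that runtime-adjusted version, without any Newton step, so expect errors of a few percent (the function name is mine):

```c
#include <math.h>
#include <string.h>

/* Approximate x^p for -1 < p < 1 by scaling the distance of the input's
   bit pattern from the constant 0x3f7a3bea = L*(B - sigma). */
float FastPow(float x, float p) {
    int i;
    memcpy(&i, &x, sizeof i);
    i = 0x3f7a3bea + (int)(p * (double)(i - 0x3f7a3bea));
    memcpy(&x, &i, sizeof x);
    return x;
}
```

Setting p = -0.5 makes the expression collapse back into the original magic step, since 0x3f7a3bea + (-1/2)·(i - 0x3f7a3bea) = (3/2)·0x3f7a3bea - (i >> 1).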

The expression above contains a new “magical” constant, 0x3f7a3bea. But even if it’s in some sense “more magical” than the original constant (since the original can be derived from it), it depends on an arbitrary choice of σ so it’s not universal in any way. I’ll call it Cσ and we’ll take a closer look at it in a second.

But first, one sanity check we can try with this formula is p = 0. For a p of zero the result should always be 1 since x^0 = 1 independent of x. And indeed the second term falls away because it is multiplied by 0 and so we get simply:

i = 0x3f7a3bea;

Which is indeed constant – and interpreted as a floating-point value it’s 0.977477 also known as “almost 1” so the sanity check checks out. That tells us something else too: Cσ actually has a meaningful value when cast to a float. It’s 1; or very close to it.

That’s interesting. Let’s take a closer look. The integer representation of Cσ is

Cσ = L·(B - σ)

This is almost but not quite the shape of a floating-point number, the only problem is that we’re subtracting rather than adding the second term. That’s easy to fix though:

Cσ = L·(B - 1) + L·(1 - σ)

Now it looks exactly like the integer representation of a floating-point number. To see which one we’ll first determine the exponent and mantissa and then calculate the value, cσ. This is the exponent:

E = B - 1, that is, e = E - B = -1

and this is the mantissa:

M = L·(1 - σ), that is, m = M/L = 1 - σ

So the floating-point value of the constant is (drumroll):

cσ = (1 + m)·2^e = (1 + 1 - σ)·2^(-1) = 1 - σ/2

And indeed if you divide our original σ from earlier, 0.0450465, by 2 you get 0.02252325; subtract it from 1 and you get 0.97747675, or our friend “almost 1” from a moment ago. That gives us a second way to view Cσ, as the integer representation of a floating-point number, and to calculate it in code:

float c = 1.0f - 0.5f * 0.0450465f;
int C_sigma = *(int*)&c;

Note that for a fixed σ these are all constants and the compiler should be able to optimize this whole computation away. The result is 0x3f7a3beb – not exactly 0x3f7a3bea from before but just one bit away (the least significant one) which is to be expected for computations that involve floating-point numbers. Getting to the original constant, the title of this post, is a matter of multiplying the result by 1.5.

With that we’ve gotten close enough to the bottom to satisfy at least me that there is nothing magical going on here. For me the main lesson from this exercise is that bit-casting between integers and floats is not just a meaningless operation; it’s an exotic but very cheap numeric operation that can be useful in computations. And I expect there are more uses of it out there waiting to be discovered.

(Update: I’ve added an appendix about the intuition behind the part where I shuffle around the floating-point terms and the integer terms appear.)

92 Responses to 0x5f3759df

This is a work of genius. The first half is a brilliant and readable explanation of how it works. The second half smashes through the looking glass. Brilliant. It’s especially impressive that you managed to do it with a regular square root.

Assuming those graphs are the approximate result without the Newton step, I wonder why that step is required at all.

A few notes:
– In the C code sample, return x is on the same line as the comment.
– “Enter InvSqrt” should be “Enter FastInvSqrt”
– ln_2 should be log_2, shouldn’t it? ln specifically refers to log_e.
– log_2(1 + m)e should be log_2(1 + m) + e, shouldn’t it? Looks like the step after this is correct, so it’s not a maths error, just a missing plus. Also you need parens around this expression in the halved version on the right-hand side.

Another thought: I’m assuming that this approach fails completely when it comes to denormalized numbers. You conveniently skipped discussing normalization, which you introduced in the step where you introduced (1 + m). This will cover all numbers greater than or equal to 2^-126. For any smaller numbers, (1 + m) should just be m. Not that this really matters — I doubt most applications are going to require numbers that small, nor is the inverse square root meaningful in the zero case, but still interesting to know whether it will handle these numbers (and I suspect it won’t).

Thanks for the notes, you’re right on all points. The C code issue is wordpress messing up my formatting. The rest is just sloppiness transcribing my notes.

Around the Newton step my guess would be that they tried without it and got visible artifacts in whatever it was they were rendering. There are some visible breaks in the approximation graph (“disdifferentianuities”?) which I would guess shows up visibly if you’re trying to render a smooth surface where the normals vary slowly and cross one of the breaks. But that’s a pure guess.

I think you’re right that it doesn’t give the correct result for denormalized values. The approximation of log2 breaks down majorly if you’re outside 1+(0..1). This is also an issue for the output so even if your input is normalized, if the correct output would have been denormalized then the algorithm probably gives the wrong result too.

However, the accuracy of approximation generally improves with the magnitude of the input. When you’re near 0 the result is already quite inaccurate so I suspect the error caused by not interpreting denormalized values correctly may not be significantly worse than the error for tiny normalized values. And hopefully the Newton step helps somewhat there.

Excellent read, thanks for posting. Very nice walk-through the logic and implementation behind the algorithm (even if it is just an approximation). I’m curious, what (program?) did you use to make your graphs? They look very nice and crisp!

You say that it leaves 27 bits for the mantissa, but the correct value should be 23. There is 1 bit for the sign, 8 for the exponent, and that leaves only 23 open for the mantissa. Or did I make a mistake? Otherwise a very nice article.

Great write-up and nostalgia. Unfortunately time has somewhat passed these ultra-optimizations, as FPUs and GPUs are more tightly integrated with the CPU. I did a quick’n’dirty benchmark and at least on Java the performance of the fast-inv was 25% faster than using Math.sqrt (which does double-precision). While certainly an improvement, it’s not necessarily enough to warrant changing your implementation unless it’s really time-critical. I believe the difference has been significantly larger earlier.

Could you post the test code you’re using? On a 64-bit machine using VS2010, the following function is consistently within 5% of the speed of the posted FastInvSqrt function:

double NormalInvSqrt(double x)
{
return (1.0 / sqrt(x));
}

This uses the sqrt() defined in the standard math header (Win7 x64 Pro, Visual Studio 2010). I’d assume the performance penalties are significantly lesser on 64-bit machines now, since the 64-bit double-precision floating-point variables are now 1 word long (from the perspective of the CPU, not the instruction set nomenclature).

Interestingly (or maybe not, considering the above), I found the following code to be 3 times slower than the posted FastInvSqrt() function:

float NormalInvSqrt(float x)
{
return (1.0f / sqrt(x));
}

The only difference between this code snippet and the above is the variable types used (float instead of double).

Where this really shines is in limited hardware. Take an ARM Cortex-M4F. It’s 32-bit and has a few single-precision float instructions, but sqrt takes 14 clocks and divide takes the same 14 clocks. FP multiply takes 3, so you can get an inverse square root in 14 or 15 clocks rather than 28 using this method. This also makes a nice fast method for calculating non-2 fractional roots.

This is a fantastic and fascinating post! I was playing around with some of your equations and noticed that the original derivation (for p = -0.5) is not a special case of the general equation listed. I think that the generic formula should read:

I_y ≈ (1-p)*L(sigma – B) + p*I_x

instead of

I_y ≈ (p-1)*L(sigma – B) + p*I_x

The error propagates through the rest of the post (e.g., it is the cause of the spurious negative number you get in your sanity check…).

I think there is (at least) one more artifact of that sign mistake: i = (1 - p) * 0x3f7a3bea + (p * i);
would expand to i = 0x3f7a3bea - p * (0x3f7a3bea - i);
and not with the plus sign in the parentheses as you have.

@Sampo: I haven’t done any benchmarks (I was more interested in the math) but I expect there might be better and more efficient ways to do most of this on a modern high-end architecture or if you have a GPU. But for something like my Arduino where you have floating-point numbers but a limited instruction set I can definitely see this being practically useful.

Going down in processor size is like travelling back in time, you just have to go low enough and there’ll be a use for any old-timey trick :-).

Also, for those interested, the Newton’s Method step needs to be updated when the exponent is generalized. This step isn’t really covered in the article, but based on some of my numbers, you’ll get on average about a 20x reduction in the % error (going from ~10% error for the raw estimate to ~0.6% for the post-Newton method estimate).

The generalized equation for the Newton’s Method step is:

x = x*(1 - power + power*input*x^(-1/power));

where input is the number you’re finding the power of and power is the power you’re raising the input to.

Great read. I really like this bit hack. I was first shown it back in high school and it has always been something I check back on to see if there are any new papers on it. A good friend of mine showed me your blog post today and I really liked it. Perhaps you would be interested in these two papers: A Brief History of InvSqrt and Floating-Point Tricks. The first is my own work, so a little shameless self-promotion :). The second is very interesting because it is a much older (less known) and more generic approach that 1/√x is just a special case of.

You have some errors in your code due to how C handles data-type conversions in arithmetic – when you do an arithmetic action with an integer and a float, you’ll get a float, so the result of (i * 0.333f) is a float (i is converted to a float). This also means 0x2a517d3c is converted to a float, and not the other way around.

The “magic number” is a string of 28 bits from the first 1 to the last 1 in this example, so it doesn’t fit in full accuracy into a regular float (only 23-bit mantissa). To show you that’s the case, try running the code with the constant 0x2a517d30 instead and verify that you get the _exact_ same results (I did this test myself).
So, you’d get better accuracy by writing the code as 0x2a517d3c + (int)(i * 0.333f), but I feel that this is missing the point of the original algorithm, since it means we’re adding extra integer->float and float->integer conversions. An implementation which keeps the spirit of the original function would be to multiply i by a fixed-point representation of 1/3 and take the higher dword of the result:
i = 0x2a517d3c + (int)(((unsigned long long)i * 0x55555555U)>>32)

(The idea is to say 0x100000000 represents 1, 0x100000001 is 1 + 1/2^32, and so on, so then 1/3 = 0x100000000/3 = 0x55555555)

GCC can produce good code for this (multiplying two 32-bit numbers gives a 64-bit result, simply take the higher 32 bits of the result), in visual studio I ended up writing it using a compiler intrinsic (i = 0x2a517d3c + (int)(__emulu(i, 0x55555555U)>>32);) since the optimizer isn’t smart enough otherwise.

I think that the original derivation is a special case of the fixed-point version of the algorithm, because skipping the two conversions is a significant factor in run-time (or at least was back in the day). (1/2 is 0x80000000, and (i*0x80000000)>>32 == i >> 1)

It’s a good point about where to apply the cast – the way it’s written discards accuracy for no good reason. It shouldn’t affect the correctness though, just the accuracy.

I like the fixed-point multiplication trick, I’ll have to remember that. However there’s a fundamental issue with relying on efficient arithmetic at twice the size of the approximation. The original 32-bit trick was for a 32-bit architecture so it’s not obvious that you’d get good code for a 64-bit multiplication. Similarly, on your machine you’d be more likely to want a 64-bit approximation and so you’d need a 128-bit multiplication.

I think that maybe this code is undefined (i.e., not legal C, and a conforming C compiler can do whatever it wants with it, not necessarily what you want). That is, I think you need to use memcpy instead of those casts (don’t worry though: the generated assembly will still be good).

I know many people have already said that, but still, thank you very much. I heard of the constant for the first time four years ago and finally now I can understand the mechanics behind it. Thank you, thank you so much.

There’s a large design space for how to set sigma: by which norm do you want to measure deviation, the calculation uses floating-point and not real numbers, there is nontrivial interaction with the Newton step, etc. I suspect even though 0x5f34ff59 is optimal in the L2 norm there may be choices that are less optimal by that criterion but better overall for those other reasons.

The article by Lomont linked above has a lot to say about that and suggests 0x5f375a86 as a better choice.

Hm, I’m thinking that approximating ln(1 + v) as v is just taking the first term of its Maclaurin series. What if we try to do the same trick by approximating it better, with the first two terms: ln(1 + v) as v - 1/2*v*v?

Is a fast algorithm possible for fixed point (1 bit is 2’s complement sign, 31 bits integer part, 32 bits fractional, meaning range -2147483648 to 2147483647.9999999998) addition, subtraction, multiplication, division, multi-addition, square root, cube root, any base root, powering, logarithms in any base, sine, cosine and tangent with full overflow detection and precision to bit level with half-way results rounded down? I currently implemented addition and subtraction, and the minus signs in numbers don’t conflict with operation signs. Any number outside the range is represented as “Error” and numbers are shown to 8 decimal places by default. Currently the decimal to binary converter uses double float internally so conversion may be inaccurate for large numbers. To enter a calculation, you are supposed to press ” “. The numbers are shown in Custom Font, which is my favorite font. You are not allowed to modify Custom Font, which means that intentional use or thought of Custom Font in mind but it’s not exact (such as by spacing, placing over any pixel, etc.), is not allowed.
Download the fixed point calculator: https://www.dropbox.com/s/m4db2bzkhqqfjlp/new32-32.sb2?dl=0
Download Custom Font: https://www.dropbox.com/s/x9rpnk33qeurk65/customfont.fon?dl=0
Please note that you need to use “Scratch” available online and offline to use my calculator but it also allows you to edit it.

I have now implemented correct decimal-to-binary code.
In the old version, the number, treated as a double float (Scratch’s number format), was directly converted to binary, so float precision loss naturally occurred.
In the new version, the number is split at the decimal point, so the integer and fractional parts are obtained from separate float numbers, avoiding inaccuracies.