I find it strange that whenever I tell people that C compilers these days optimize code to such lengths that it is a waste of time to write routines in assembly, there is always someone who says that hand-written assembly is usually still way faster than what C compilers spit out.

If I say otherwise, there are people claiming the opposite: compilers these days generate code fast enough to beat hand-written assembly.

I wonder how it really is. I guess it is case-specific, but is there any general rule, as in under which circumstances C or asm is notably faster or slower than the other, if any? Any empirical experience?

And most importantly, any proof for the claims? I've never seen any decent comparison really.

I rarely find it worth the effort to write hand-coded asm - but I quite often look at the assembly output from the compiler (using my profiler) and tweak the C++ code to give better (and faster) asm output. It can make a huge difference to performance, especially in my blitting code, while still keeping the code nice and portable.
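As a rough illustration of that kind of tweak (not the poster's actual blitting code), here is a minimal C sketch of a 32-bit pixel blit. The function name blit32, its parameters, and the pitch-in-pixels convention are all assumptions made up for this example. The restrict qualifiers promise the compiler that the two surfaces do not overlap, and the row pointers are computed once per row; both changes often let an optimizer emit a tighter or vectorized inner loop, though that depends on the compiler and flags, so check the generated asm.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical blit: copy a w x h block of 32-bit pixels.
   dst_pitch and src_pitch are row strides measured in pixels.
   'restrict' asserts that dst and src never overlap. */
void blit32(uint32_t *restrict dst, size_t dst_pitch,
            const uint32_t *restrict src, size_t src_pitch,
            size_t w, size_t h)
{
    for (size_t y = 0; y < h; ++y) {
        /* Row addresses are computed once per row, outside the inner loop. */
        uint32_t *d = dst + y * dst_pitch;
        const uint32_t *s = src + y * src_pitch;

        for (size_t x = 0; x < w; ++x)
            d[x] = s[x];
    }
}
```

Comparing the compiler's asm output for this version against one without restrict (for example with gcc -O2 -S) is one way to see whether the hint made any difference on a given target.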

If you write code for a processor whose C/C++ compilers are not well developed, use assembly patches to fix the speed problems. If you're on an Intel or AMD processor, forget about assembly as much as possible. (If it weren't for the fact that the code density is excellent on these processors, thus making the caches more efficient, assembly would have no impact on the code quality. The x86 compilers are so widely used that they have been tweaked and optimized for every use you can imagine.)

Last time I looked at some of my assembly output, I realized that the compiler seemed unwilling to use the cmovcc (conditional move) instructions, so I had to write those manually to avoid a couple of jumps. Handwritten assembly is needed sometimes, but it is (fortunately) rare.
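Before dropping to asm for this, it can be worth trying C forms that compilers commonly turn into a conditional move or other branch-free code. The sketch below is only an illustration under that assumption; the function names clamp_max_branchy and select_mask are invented for this example, and whether a given compiler actually emits cmovcc for the first one depends on the target, flags, and its own heuristics.

```c
#include <stdint.h>

/* Simple "if" form. Compilers often, but not always, compile this
   to a compare plus conditional move instead of a jump. */
int32_t clamp_max_branchy(int32_t x, int32_t limit)
{
    if (x > limit)
        x = limit;
    return x;
}

/* Explicit branch-free select: returns a if cond is nonzero, else b.
   The mask is all ones when cond is nonzero and zero otherwise.
   Converting the result back to int32_t assumes the usual
   two's-complement behaviour of mainstream compilers. */
int32_t select_mask(int cond, int32_t a, int32_t b)
{
    uint32_t mask = (uint32_t)-(cond != 0);
    return (int32_t)(((uint32_t)a & mask) | ((uint32_t)b & ~mask));
}
```

The mask version trades the jump for a few ALU operations, which tends to pay off only when the branch is hard to predict; profiling both forms is the safer way to decide.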

So these days, with gcc for example, is inline hand-written asm usually unnecessary speed-wise?

Size-wise I guess that asm is still the king; just look at how many 4k demos there are written in asm/C. This interests me quite a lot, as 4k prods seem to be more and more something I would like to attempt. gcc + strip + upx + a few other tricks seem to be able to cut down the binary size quite well for C, but I believe asm is still pretty much required to be able to fit actual content in 4096 bytes.

...however, that's not why I created the thread; the size advantage is obvious.

In what scenario can C code beat hand-written ASM code, assuming your ASM follows the logical line and doesn't execute loops for nothing? Never; you just live with it. In most cases the difference is insignificant, but learning the platform-specific ASM might require a great deal of time, and time generally equals money. C abstracts the platform for you, which is the main benefit. It adds some overhead, but who cares? Scripting languages add even more overhead, but who cares? Use one of those for prototyping, test, take it to C, test and profile, and eventually take it over to ASM. It would be a waste of money, time, and life to start coding something in ASM, unless I know it's gonna be a 256b/1/2/4k-demo. But even then I'll prototype my ideas.

Conclusion: we are all aware of the difference, but the languages exist for a reason. Does the compiler do a bad job? Yes, it's a well-known fact: compilers use heuristics, and hand-written assembler is much faster!

I believe asm is still quite much required to be able to fit actual content in 4096 bytes.

Quickman is probably one of the most influential demonstrations of how raw ASM can outperform your typical compiler code generation. The Mandelbrot iteration is very easy to implement, but a standard C/C++ implementation gets nowhere near that level of performance.
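For reference, the "standard" implementation being compared against looks roughly like the sketch below. This is a plain scalar escape-time loop written for illustration, not code taken from Quickman; the function name mandel_iters and its parameters are assumptions for this example.

```c
/* Escape-time iteration z = z*z + c for one point c = (cr, ci).
   Returns the number of iterations completed before |z| exceeds 2,
   capped at max_iter. */
unsigned mandel_iters(double cr, double ci, unsigned max_iter)
{
    double zr = 0.0, zi = 0.0;
    unsigned n = 0;

    while (n < max_iter) {
        double zr2 = zr * zr;
        double zi2 = zi * zi;

        /* |z|^2 > 4 means the point has escaped. */
        if (zr2 + zi2 > 4.0)
            break;

        zi = 2.0 * zr * zi + ci;   /* imaginary part of z*z + c */
        zr = zr2 - zi2 + cr;       /* real part of z*z + c */
        ++n;
    }
    return n;
}
```

Each iteration depends on the previous one, so a compiler has little room to vectorize a single point on its own. Hand-tuned versions typically gain their speed by processing several points at once with SIMD instructions, which is a restructuring a compiler will not usually do for a loop written this way.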

If you have the time and cunning, ASM can make a significantly positive impact on performance.