The C# compiler is like 5-10x faster than the C++ compiler. Does that prove that real-world managed code runs faster than native code?

There are several reasons for this. Most importantly, C++ is a far more complex language (grammar-wise) than C#, so it takes longer to parse (particularly thanks to templates, which form a Turing-complete compile-time language), and the C++ compiler performs far more extensive optimizations (in the .NET world, most optimizations are performed by the JIT, so the work of the C# compiler is relatively light).

Whether managed or native is faster has no simple answer. With native code, it is possible to perform more optimizations, so it is possible to write faster code than with managed. However, in practice it may require extremely complex and hard-to-maintain code to actually beat managed code. A good example of this is the optimization series that Raymond Chen and Rico Mariani once did: Chen's final optimized C++ version was faster than the .NET version, but he had to write his own memory allocation algorithms and the like in order to get there. By comparison, the naïve .NET version was orders of magnitude faster than the naïve C++ version, and not even that much slower than the final optimized C++ version.

And that was done back in 2005, with .NET 2.0. There have been many significant performance improvements in .NET (particularly regarding start-up and the GC) since then.

@SteveRichter: No. Perhaps the C# compiler is better parallelized during compilation. C++ is the fastest language there is (in production use).

OK. I am just guessing that the C++ compiler, having been written in C++, is impossible to refactor. I am finding it very sluggish, and the compiler errors are oftentimes misleading. The other day I had the header declaration of a function returning a different type than the definition, and the compiler just went nuts, telling me I had over 100 errors. I run into that kind of scenario very frequently, forcing me to change my coding style so that I recompile very frequently, after small code changes.

OK. I am just guessing that the C++ compiler, having been written in C++, is impossible to refactor.

Nice theory, except the C# compiler is written in C++ too (Mono's C# compiler is written in C#, but MS.Net's isn't).

It's C++'s insane complexity, I tell you. C++'s grammar isn't even LALR, so you can't use automated parser-generation tools like yacc or bison to create a parser for it. It is just stupidly convoluted in many places.

Whether managed or native is faster has no simple answer. With native code, it is possible to perform more optimizations, so it is possible to write faster code than with managed. However, in practice it may require extremely complex and hard-to-maintain code to actually beat managed code.

In my C++ code all strings are std::wstring, and I am making a lot of use of unique_ptr&lt;class&gt;. And since I do a lot of function calls that return these types, there must be a lot of constructors and destructors being called, compared to C# simply returning a reference.

In my C++ code all strings are std::wstring, and I am making a lot of use of unique_ptr&lt;class&gt;. And since I do a lot of function calls that return these types, there must be a lot of constructors and destructors being called.

Could be, but not necessarily. In a lot of cases, the C++ compiler is able to avoid copying objects (copy elision), even more so since the introduction of r-value references and move constructors in C++11. And these things are aggressively inlined, even across compilation units. Doing those kinds of optimizations is what makes your code faster, but the compilation slower.

And std::unique_ptr doesn't have a copy constructor; that's kind of its main purpose. (It does have a move constructor, however, but that's no more expensive than copying a raw pointer.)

Nice theory, except the C# compiler is written in C++ too (Mono's C# compiler is written in C#, but MS.Net's isn't).

What was Anders talking about last year when describing the data structures used in the Roslyn project, where they did not have to recreate the entire parse tree (or whatever it was called) for every source code change? I recall him saying they had rewritten the entire compiler.

Nice theory, except the C# compiler is written in C++ too (Mono's C# compiler is written in C#, but MS.Net's isn't).

It's C++'s insane complexity, I tell you. C++'s grammar isn't even LALR, so you can't use automated parser-generation tools like yacc or bison to create a parser for it. It is just stupidly convoluted in many places.

LOL... it's funny; way back I loved C, and to this day I will code in C or in C#, but not in C++ unless I am forced to. I have never liked what they did with C++, and I never found a real-world case where I could not write what I needed in C if I had to use C or C++.

As for speed, well, .NET can be faster or slower depending on what is being done, how it's done, and many other factors.

Runtime speed, not the speed of the compiler.

Compiler speed -- well, as was posted, what has to be done to compile C/C++ is way different than .NET.

Also remember that a .NET compiler only generates IL, not native code; a C/C++ compiler does generate native code.

One thing I found in both C# and C++ (STL) is to stay away from enumerators (foreach and begin()). Instead, use simple integer loop indexing. There is a measurable performance drop in both cases when using enumerators.

the C# compiler is like 5 - 10x faster than the c++ compiler. Does that prove that real world managed code runs faster than native code?

That's an absurd conclusion. Depending on language complexity and the optimizations applied, compilation time can vary wildly. One of the Go compilers can compile large code bases in less than a second. The Go language was designed to allow for that, but it's also a product of how much optimization is applied in that particular compiler. There is still a need for a release compiler with a different compile-time vs run-time trade-off. The MLton compiler performs whole-program analysis and optimization; that is obviously going to be more time-consuming than local optimizations. The conclusion is so absurd that it leads one to believe you want to spur a heated debate.

One thing I found in both C# and C++ (STL) is to stay away from enumerators (foreach and begin()). Instead, use simple integer loop indexing. There is a measurable performance drop in both cases when using enumerators.

Where performance doesn't matter, using an enumerator is fine.

If the use of an STL enumerator rather than an integer indexer is the thing tanking your performance, then something has gone so far wrong in your app it's unreal.

My suggestion is that you benchmark your code before optimising it. Optimise only the bits that are actually in your hot path, and choose algorithms that can be easily parallelised or that have known-correct library implementations with lower big-O complexity than the ones you're using.

The rest of your code should focus on being readable and correct instead of being fast.

If the use of an STL enumerator rather than an integer indexer is the thing tanking your performance, then something has gone so far wrong in your app it's unreal.

Wow, that is quite a generalization for someone who has no clue what my code looks like. I'm talking about tight loops that do DSP processing (realtime FFT, filtering, FFT analysis, etc.) as well as code that does asynchronous multithreaded low-level IO. In such cases the overhead of using enumerators can easily add a 30% performance drop in such tight loops. I've done the benchmarks and that is what they show. What do your benchmarks show for such cases?

If you think "something has gone wrong" in my code, then with all due respect you should stick to writing simple code where no one gives a crap about actual performance.

Wow, that is quite a generalization for someone who has no clue what my code looks like. I'm talking about tight loops that do DSP processing (realtime FFT, filtering, FFT analysis, etc.) as well as code that does asynchronous multithreaded low-level IO. In such cases the overhead of using enumerators can easily add a 30% performance drop in such tight loops. I've done the benchmarks and that is what they show. What do your benchmarks show for such cases?

I think you might get a bigger performance gain by taking a step back and asking if using other types of data structures and coding styles might give you better performance.

For example, a flat array will give you faster data accesses than an STL list, even when using integer indexers.

And if you're really spending all of your time in a hot loop doing memory/CPU operations like FFTs, try writing them as a GPU shader in HLSL; you'll get orders-of-magnitude speedups by doing so. That's what I do for password cracking, for example. I certainly don't use the STL at all for the hot loops of mine that actually make the room so hot you need four water-cooling pumps and a heat sink the size of a table to keep the processors from melting (but as you say, I only write code where performance doesn't matter, right?).

Focusing on minor things like iterators versus integer indexers tends to miss the wood for the trees. As you said before, you spend a lot of time doing low-level asynchronous IO; the syscall to kick that off will take tens of thousands of times longer to complete than the difference between an STL iterator and the overloaded array indexer.

And since the STL iterator is not contractually bound to be slower, it might actually be faster on some machines or in future versions of the CRT. For example, integer array accesses may need to be bounds-checked for safety, whereas iterators do not.

Moral of the story: writing easy-to-read, obviously-correct, good-practices code for the most part, and only ever optimising (and liberally commenting and benchmarking) genuine hot loops, tends to lead to better, more reliable and longer-lived code.

Well I was throwing different examples in there, so different approaches are required in each case.

To simplify things, for this example let's just say I'm implementing a DSP algorithm in C#. In that case there is quite a big difference between using for and foreach. In my algorithm I analyze the FFT results, then classify peaks based on certain criteria. Then additional algorithms enumerate over those results and analyze the musical relationships between all the collected peaks. So, as you can see, there is a lot of looping.

What I have found is that the performance increase when switching from foreach loops to for loops is roughly 30%.

The IO example is a bit more complex, because profiling just a specific part of the management code itself (C++) showed that that part also had a roughly 30% increase in performance when going to simple integer indexing. The IO in other parts of the code is a much bigger bottleneck; however, changing the management code to use integer indexing resulted in an overall improvement of 5% or so. Not as big, but still worthwhile.

BTW, in this particular case the DSP algorithm needs to run on WP7/8, so I can't use the GPU. I could use C++ on WP8, but to keep it backwards compatible with WP7 I prefer not to. The point is that with the right coding approach it is fast enough that I don't need to resort to something else.

That's an absurd conclusion. Depending on language complexity and the optimizations applied, compilation time can vary wildly. One of the Go compilers can compile large code bases in less than a second.

I do not know other compilers. I was just figuring that the compiler would be the flagship app of a language and that a lot of effort would be put into making it run well. And my recent experience with Visual C++ is that it is a bit shabby, kind of like the language and the compiler have been hacked together over the years. Which would be fine if I had an alternative, but C# does not handle structs very well, and Microsoft says Windows shell code should be written in C++.