PS. FPS I get in my engine: 50 - 70 fps at 1920x1080. The system can run Crysis 3 at high settings, so the hardware is not the problem.

Scene:

Right now I'm facing another issue: the performance of my engine. I don't think the problem is my architecture, so I tried running a profiler.

Most of the time was spent in initialization, of course. Later on, in the render loop, rendering the depth, normal and diffuse maps is quite fast relative to other things. The real problem comes when swapchain->present(0,0) is called. As I understand it, it takes a long time because it's waiting for the previous frame to finish, right?

I have heard rumors that if statements in a shader behave oddly: even when the condition is false, the GPU still semi-runs/checks everything inside the if statement, which makes it quite slow. Is this true or just rubbish?

Also, my FPS is NOT affected if I change the regular shader used for all the objects (I tried stripping it down so that only the positions were calculated in world space and a white color was returned). But if I change my post-processing shader to a very simple version which ONLY returns the diffuse map, my FPS is boosted 2x - 3x! But why? (Shader is below if you're interested, but there are still errors.)

You can put them in the same hlsl file, as long as you're willing to compile it more than once.

You simply swap tests like "if(blur ==1) { ... }" for "#if defined(BLUR) ... #endif" and compile once with "/D BLUR" on the command line, once with "/D SSAO", etc. You can also set the defines in code if you're compiling shaders at runtime.

You then pick the correct shader to use, instead of setting a constant.
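
If you compile at runtime, the CPU side of "pick the correct shader" could look roughly like the sketch below. This is only an illustration under assumptions: the Effect flags, definesFor and PermutationCache are hypothetical names, ShaderHandle is a placeholder type, and the actual compile step is left as a callback because it depends on your setup (with Direct3D you would typically turn the define names into the macro array you pass to the shader compiler).

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical effect flags; one bit per post-processing effect.
enum Effect : std::uint32_t {
    EFFECT_BLUR     = 1u << 0,
    EFFECT_SSAO     = 1u << 1,
    EFFECT_GOD_RAYS = 1u << 2,
};

// Turn a flag mask into the macro names to define for that variant,
// i.e. the equivalent of "/D BLUR /D SSAO" on the command line.
inline std::vector<std::string> definesFor(std::uint32_t mask) {
    std::vector<std::string> defines;
    if (mask & EFFECT_BLUR)     defines.push_back("BLUR");
    if (mask & EFFECT_SSAO)     defines.push_back("SSAO");
    if (mask & EFFECT_GOD_RAYS) defines.push_back("GOD_RAYS");
    return defines;
}

// ShaderHandle stands in for whatever your API returns (assumption).
using ShaderHandle = int;
using CompileFn = std::function<ShaderHandle(const std::vector<std::string>&)>;

// Compiles each variant at most once and hands back the cached result.
class PermutationCache {
public:
    explicit PermutationCache(CompileFn compile) : compile_(std::move(compile)) {
        assert(compile_ && "a compile callback is required");
    }

    ShaderHandle get(std::uint32_t mask) {
        auto it = cache_.find(mask);
        if (it == cache_.end()) {
            // First request for this combination: compile it now.
            it = cache_.emplace(mask, compile_(definesFor(mask))).first;
        }
        return it->second;
    }

private:
    CompileFn compile_;
    std::map<std::uint32_t, ShaderHandle> cache_;
};
```

At draw time you would call something like cache.get(EFFECT_BLUR | EFFECT_SSAO) and bind the result, instead of uploading "blur = 1" as a constant.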

SSAO and god rays can be very pixel heavy effects and I'm guessing you're doing them at full resolution (1920x1080). Even full screen blurs can put a fair amount of pressure on fill rate at high resolutions. When you take into account that more than likely every one of your branches is being evaluated even if the conditions are false, this could be adding up to make a very expensive shader.

A lot of these effects are rendered to smaller render targets, something like 1/4 the size of the backbuffer (experiment with the size to get a good trade-off between image quality and performance). And as mentioned above, even though it's 2013 we still really need to be using the preprocessor for our branches rather than if statements. My recommendation would be to render SSAO to a small target by itself, then god rays to another small target by themselves, then have your big post-process shader at the end composite those effects along with blurs, distortion, etc., using #defines to turn effects on and off as needed.
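
To get a feel for how much fill a smaller target saves, here is a small helper that computes the reduced size. It is only a sketch: Extent, scaledTarget and pixelCount are hypothetical names, and it assumes the reduction is applied as a divisor on each axis, which you would tune by eye.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

struct Extent {
    std::uint32_t width;
    std::uint32_t height;
};

// Size of an off-screen target whose sides are 1/divisor of the backbuffer's.
// Clamped to at least 1x1 so a tiny backbuffer never yields an empty target.
inline Extent scaledTarget(Extent backbuffer, std::uint32_t divisor) {
    assert(divisor > 0);
    return Extent{ std::max<std::uint32_t>(1u, backbuffer.width / divisor),
                   std::max<std::uint32_t>(1u, backbuffer.height / divisor) };
}

// Number of pixels the pixel shader has to fill for one full-target pass.
inline std::uint64_t pixelCount(Extent e) {
    return static_cast<std::uint64_t>(e.width) * e.height;
}
```

At 1920x1080, a divisor of 2 gives 960x540, which is a quarter of the pixels; a divisor of 4 gives 480x270, one sixteenth of the pixels.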

If statements can be cheap in hlsl, if there's no branch involved and the code you execute is simple. For example this if statement is a cheap one:

if (x > 7) x = 7;

They are also free if the condition can be evaluated at compile time.

They get expensive when the extra code that gets executed is significant, because the compiler will generally execute the code anyway and multiply the result by either 0 or 1 depending on the result of the if.
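
As a CPU-side analogy only (this is C++, not what the HLSL compiler literally emits, and expensiveEffect is a made-up stand-in for a costly block), the "execute anyway and multiply by 0 or 1" behaviour looks roughly like the first function below, while a real branch looks like the second:

```cpp
#include <cassert>

// Counts how often the "expensive" code actually runs.
static int g_expensiveCalls = 0;

// Hypothetical stand-in for a costly block such as an SSAO sampling loop.
inline float expensiveEffect(float x) {
    ++g_expensiveCalls;
    return x * 0.5f + 1.0f;
}

// Flattened: the expensive code always runs, then a 0/1 mask selects the result.
inline float flattened(bool enabled, float x) {
    const float mask = enabled ? 1.0f : 0.0f;
    const float effect = expensiveEffect(x);  // paid for even when disabled
    return mask * effect + (1.0f - mask) * x;
}

// Branched: the expensive code is skipped entirely when the condition is false.
inline float branched(bool enabled, float x) {
    if (enabled) {
        return expensiveEffect(x);
    }
    return x;
}
```

Both functions return the same value for the same inputs; the difference is that flattened(false, x) still increments the call counter, which is the cost being described here.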

You can also ask the compiler to [branch] instead of evaluating the whole thing and throwing away the result. The expense of that depends on things like what pattern of pixels goes down each path, but it can be beneficial if you avoid executing the extra code a lot of the time. While the compiler will sometimes automatically decide to do a real branch, you're best off specifying it yourself, as you get extra errors back if it can't do a branch due to texturing issues (i.e. tex2D vs tex2Dlod).
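
Why the pattern of pixels matters can be shown with a toy cost model. This is a simplification under stated assumptions, not a description of any particular GPU: it assumes pixels run in lock-step groups of groupSize, and that a group pays for the expensive path if any one of its pixels takes it. The function name is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model (assumption): returns how many lock-step groups have to execute
// the expensive side of a real branch, given which pixels take that branch.
inline std::size_t groupsPayingForBranch(const std::vector<bool>& takesBranch,
                                         std::size_t groupSize) {
    assert(groupSize > 0);
    std::size_t paying = 0;
    for (std::size_t start = 0; start < takesBranch.size(); start += groupSize) {
        const std::size_t end = std::min(start + groupSize, takesBranch.size());
        bool any = false;
        for (std::size_t i = start; i < end; ++i) {
            any = any || takesBranch[i];
        }
        if (any) {
            ++paying;  // one pixel in the group is enough to pay for all of it
        }
    }
    return paying;
}
```

With two pixels taking the branch out of eight and groups of four, neighbouring pixels cost one group while scattered pixels cost two, even though the same number of pixels needs the effect.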

Your best option when optimizing is to use a tool like GPU Shader Analyzer to see what instructions get generated, as well as profiling the performance yourself, because if statement performance depends on the input data.

Ok, now I've improved the frame rate a bit. Basically, I now have an individual material (a shader) for each mesh, which can be modified by the user on creation. This also helped me escape the fixed shading. The only remaining problem is that I need to write a class that can parse any kind of shader and its needs, because some shaders need a specific input and some don't, and the class needs to detect that.

And a funny note: whilst doing this I lost some shader data, basically my whole post-processing shader, because I closed Visual Studio without undoing. But then I realized that I had a copy here in this forum.


Did you, or did you not, create a separate shader with just one effect (Bloom), as I proposed above, to check the actual difference in performance?

The easiest way would be to add a different shader technique to the same HLSL file. If you are not familiar with techniques, and are not willing to learn them, then just create a separate HLSL shader file with the new, shorter pixel shader.