I dispute that this is adequate, for reasons shown below. This is with a Sony A7-series 14-bit uncompressed ARW file of 24 MPx, on a 2.5 GHz quad-core AMD Phenom II X4 905e, with an NVidia GeForce 1060 6GB as OpenCL renderer (with the latest NVidia beta driver):

So we see that even if the preview pipeline can run on the CPU in parallel with the other GPU pipelines, it takes 1 s longer than if we run everything sequentially on the GPU.

Suggestion: the OpenCL scheduler should benchmark the CPU and GPU pipes to determine their relative speed and, on fast OpenCL devices, allow the preview pipe to run on the GPU as well. It could even be a parallel trial and error: if OpenCL is present, dispatch to both in parallel and abort the slower pipe once the faster one has delivered its result. The latter approach might be at a disadvantage, though, when CPU-only computation steps are in the pipeline.

History

OK, so we might need speculative execution (i.e. dispatch CPU + GPU in parallel, see which arrives first at a synchronization point, and then abort the other path while letting the winner continue until the next "CPU or GPU" dispatch decision) if we want to get it right in the general case.

Can you please elaborate on these settings and results? According to my understanding, the preview and full pixelpipes are started in parallel in the cases relevant for this topic (change of module parameters, etc.). The first pixelpipe gets the first free OpenCL device; the second one then normally runs on the CPU. How did you manage to have the GPU process both of them?

> OK, so we might need speculative execution (i.e. dispatch CPU + GPU in parallel, see which arrives first at a synchronization point, and then abort the other path while letting the winner continue until the next "CPU or GPU" dispatch decision) if we want to get it right in the general case.

Frankly, I don't like that idea. It's wasting energy and increasing the noise level.

> Can you please elaborate on these settings and results? According to my understanding, the preview and full pixelpipes are started in parallel in the cases relevant for this topic (change of module parameters, etc.). The first pixelpipe gets the first free OpenCL device; the second one then normally runs on the CPU. How did you manage to have the GPU process both of them?

Frankly, I don't know how it happens, but it's about 95% consistent. Once in a while one of the pipes gets dispatched on the CPU, and then it's slow, noticeably so, because the GUI update takes many seconds longer than usual. This is my entire set of OpenCL-related settings:

> Frankly, I don't like that idea. It's wasting energy and increasing the noise level.

If the noise level during this fraction-of-a-second computation peak is a concern, then the system design neglected noise entirely. I can run the Unigine Valley OpenGL benchmark for extended periods, e.g. 10 runs in a row (about 3 minutes each) at roughly 35 fps in "Extreme HD", and even after half an hour the computer is not audibly louder than when left idle. Yes, the GPU gets some 35 K warmer, up to 70 °C, and its two fans spin up to about 1200 rpm, but noise is not a concern at all.