(Question) Workload of OpenCL

Hello, I have a question regarding the 'workload' of OpenCL: months ago I rendered some "Flames" with a different software using CUDA, and my graphics card sounded like a vacuum cleaner after a few minutes.

At the moment I am rendering an HD animation of 10000 frames with OpenCL in Mandelbulber, and my graphics card sounds like it is "not in use"... Is it because it's "just" OpenCL, which cannot utilize the GPU the same way as CUDA? Or because OpenCL is not yet fully implemented in Mandelbulber?

What I want to ask is: is there any possibility for you to utilize the GPU more?

Yes, it was under constant load while rendering the flame fractal file (flam4cuda/Apophysis). And I see that there are short interruptions while rendering my job with Mandelbulber... but the interruptions are really very short.

Maybe I will check the load with some software later, during my next rendering... but so far I can only hear that the fan is doing little or nothing.

Hm, the workload is as I assumed and as my ears were telling me... far from doing much.

It is currently rendering an animation in 1920x1080. While I made this screenshot, it took around 3 seconds per picture for rendering and ~0.1 second for SSAO and starting the next picture/writing 'OpenCl - rendering picture'.

I do not know much about the technical/programming background, but wouldn't it be possible to feed the GPU with 2 or more pictures at the same time, for example, to speed things up a bit and give the GPU a bit more to do?

When this job is done, or tomorrow, I will do another test with just 1 picture in a very high resolution, to see what difference it makes...

Pic 2 was taken while rendering an almost empty picture in high resolution from my animation, heavily zoomed out with just a few Menger boxes in the middle... GPU load is at 2-3% all the time (and I wonder a bit why it still needs more than 3 minutes to finish).

Pic 1 was taken while rendering 'amazing surf multi 001' in high resolution, and this looks "as it should" (and takes around 40 minutes to finish).

This is to be expected. If you render "nothing", the proportion of GPU to CPU workload is bad. The CPU always needs to exchange data with the GPU, refresh the image and run the UI. If you render an image where the GPU calculation time is much longer, this proportion is much better.
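To make that proportion concrete, here is a minimal sketch in Python. It assumes the GPU simply idles while the CPU does its per-frame work (no overlap), and all timings are made-up numbers for illustration, not measurements from Mandelbulber. The function name `gpu_utilization` is hypothetical.

```python
def gpu_utilization(gpu_seconds, cpu_overhead_seconds):
    """Fraction of wall-clock time the GPU is busy, assuming the GPU idles
    while the CPU does its per-frame work (no overlap)."""
    total = gpu_seconds + cpu_overhead_seconds
    if total <= 0:
        raise ValueError("total frame time must be positive")
    return gpu_seconds / total


# Hypothetical per-frame timings in seconds, NOT measured in Mandelbulber.
examples = [
    ("almost empty frame", 0.05, 2.0),
    ("heavy fractal frame", 40.0, 2.0),
]

for label, gpu_s, cpu_s in examples:
    print(f"{label}: GPU busy {gpu_utilization(gpu_s, cpu_s):.0%} of the time")
```

With the same fixed CPU overhead, the almost empty frame leaves the GPU idle nearly all the time, while the heavy frame keeps it busy for most of it. The real behaviour is surely more complicated, but the trend should be the same.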

I hadn't realized the scale of the problem. Thanks to your problem report I have looked more deeply into the code and found several bottlenecks. It was possible to reduce the rendering time of a 12000x8000 mandelbulb from 60s to 20s (this fractal didn't load the GPU much). To achieve this I optimized image refreshing.

Thanks, I just installed it and am trying it out. Fast-relaxed math compared to normal does not make any difference in my dancing-boxes fractal, but in general I would say this version is around 2x as fast as before in this fractal. The complete animation took around 3-4 hours before, now it says 1.5 hours... wow. Is this speedup all caused by "MarkRenderedTiles"? I will definitely leave it turned off.

The picture I attach is a comparison to reply #3 in this thread. I forgot exactly which frames I rendered there, but it should be the same setting. The GPU is still not fully used, but as said, this version seems to be much faster overall.

I'll play around a bit more now with this version... really looking forward to the next update.

This version is much quicker than before; here I add a picture made while rendering my "Bombardinho" animation in 1920x1080. It shows that GPU usage goes up to around 50%.

Just a thought: I've read in another forum regarding GPU utilization, "Hi, did you use big tiles for render? OpenCL like it, try one tile for example." In Mandelbulber you can see that every frame is also rendered in tiles. Would it be possible to make the tile size changeable as an option in Mandelbulber (maybe in the animation dock, on/off or something like that)? Or is that hard to code? On single/large pictures it may be interesting to watch how the image is revealed... but for an animation, speed is probably more important.

Would it make a difference if the tile size in Mandelbulber were increased? Do you have any experience with this? What does the actual tile size depend on? The comment sounds interesting; maybe the GPU would then have a bit more to do and rendering would get faster... just a thought, maybe worth a test/benchmark. Would a larger tile size have a disadvantage I do not see at the moment?

Edit: Or is it maybe possible to parallelise something in the code/tiles? However, the more I read about tiling in OpenCL, the more complicated it seems to me. And I do not know what exactly happens in Mandelbulber's OpenCL rendering or how it is all done. Just a few thoughts; the only thing I see is that the GPU is able to do more in many cases... the question is "just" how...
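As a rough way to think about the tile-size question, here is a toy model in Python. It only assumes that every tile carries some fixed cost (launching the work, copying the result back, refreshing the image) on top of the per-pixel compute time. The function names and all numbers are hypothetical; this is not how Mandelbulber actually schedules its tiles.

```python
def tile_count(width, height, tile_size):
    """Number of square tiles needed to cover the image (edge tiles count fully)."""
    if tile_size <= 0:
        raise ValueError("tile_size must be positive")
    tiles_x = -(-width // tile_size)   # ceiling division
    tiles_y = -(-height // tile_size)
    return tiles_x * tiles_y


def estimated_frame_seconds(width, height, tile_size,
                            seconds_per_pixel, overhead_per_tile):
    """Toy estimate: per-pixel compute time plus a fixed overhead per tile."""
    compute = width * height * seconds_per_pixel
    overhead = tile_count(width, height, tile_size) * overhead_per_tile
    return compute + overhead


# Hypothetical costs, chosen only to show the trend.
SECONDS_PER_PIXEL = 1e-6
OVERHEAD_PER_TILE = 0.005

for size in (64, 256, 2048):
    n = tile_count(1920, 1080, size)
    t = estimated_frame_seconds(1920, 1080, size,
                                SECONDS_PER_PIXEL, OVERHEAD_PER_TILE)
    print(f"tile size {size}: {n} tiles, estimated {t:.2f} s per frame")
```

In this model fewer, larger tiles always win, because the compute part stays the same and only the per-tile overhead shrinks. It does not capture any downside of large tiles, such as less frequent progress updates on screen, so it would still need a real benchmark.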

Seems I have chosen the "best" fractals for my previous screenshots... these here all look a bit better... there are still many situations where the GPU is far from fully used, but it is much better than in the ones from my music videos.

I've just opened some examples, added some little movements in the Animdock and rendered them in 1920x1080.

The first thing I found is that the worst cases are images where the DOF effect is used. But this is to be expected, because a lot of the computation for this effect can only be done by the CPU (sorting and randomizing of the z-buffer). This effect also involves a lot of memory processing, which further reduces GPU load (a higher Memory Controller Load and Bus Interface Load are visible).