Description

Almost everywhere in Blender, the code assumes there is only one OpenCL platform and blindly uses the first one. With the Intel OpenCL SDK installed alongside a non-Intel GPU, Blender always crashes when showing the "System" tab of User Preferences. The same assumption also caused a crash when running in background mode with multiple OpenCL platforms present. This patch fixes both crashes.
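
As a rough illustration of the platform handling described above, the sketch below walks every platform and records which platform each device came from, instead of reading only index 0. `Platform`, `DeviceRef` and `list_all_devices` are hypothetical names used for illustration, not code from this patch; with the real OpenCL API the two loops would be driven by `clGetPlatformIDs` and `clGetDeviceIDs`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for an OpenCL platform and the devices it exposes.
struct Platform {
  std::string name;
  std::vector<std::string> devices;
};

// Identifies a device by the platform it belongs to,
// not only by its position on the first platform.
struct DeviceRef {
  std::size_t platform_index;
  std::size_t device_index;
  std::string label;
};

// Collect the devices of every platform, not just platforms[0].
std::vector<DeviceRef> list_all_devices(const std::vector<Platform> &platforms)
{
  std::vector<DeviceRef> result;
  for (std::size_t p = 0; p < platforms.size(); p++) {
    for (std::size_t d = 0; d < platforms[p].devices.size(); d++) {
      result.push_back({p, d, platforms[p].name + ": " + platforms[p].devices[d]});
    }
  }
  return result;
}
```

Keeping the platform index next to the device index means later code can look up the right platform for a device rather than assuming the first one.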

The Multi device was broken: it tried to reuse a single device_memory object across its subdevices, which cannot work because named const vars store references to those objects. Each subdevice of a multi device must therefore have its own device_memory object. This patch makes that change, which fixes the Multi device.
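
The per-subdevice ownership described above could look roughly like the following. This is a simplified model, not the patch itself: `MemDesc`, `SubDevice` and `MultiDevice` are hypothetical types, and `const_bind` is an assumed name for the call that binds a named constant. The point is only that each subdevice stores a reference to a descriptor nobody else writes to.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical, simplified memory descriptor: host data plus a device-side handle.
struct MemDesc {
  const void *host = nullptr;
  std::uint64_t device_handle = 0;  // filled in by the subdevice that owns it
};

// Hypothetical subdevice: it stores a reference to each named constant's
// descriptor, so the descriptor must not be shared with another subdevice.
class SubDevice {
 public:
  explicit SubDevice(std::uint64_t first_handle) : next_handle_(first_handle) {}

  void const_bind(const std::string &name, MemDesc &mem)
  {
    mem.device_handle = next_handle_++;  // pretend to allocate on this device
    bound_[name] = &mem;                 // a reference is kept, not a copy
  }

  std::uint64_t handle_of(const std::string &name) const
  {
    return bound_.at(name)->device_handle;
  }

 private:
  std::uint64_t next_handle_;
  std::map<std::string, MemDesc *> bound_;
};

// Hypothetical multi device: one private descriptor copy per subdevice.
class MultiDevice {
 public:
  explicit MultiDevice(int count)
  {
    for (int i = 0; i < count; i++) {
      subs_.emplace_back(1000 * (i + 1));
      owned_.emplace_back();
    }
  }

  void const_bind(const std::string &name, const MemDesc &mem)
  {
    for (std::size_t i = 0; i < subs_.size(); i++) {
      std::unique_ptr<MemDesc> copy = std::make_unique<MemDesc>(mem);
      subs_[i].const_bind(name, *copy);
      owned_[i].push_back(std::move(copy));  // keeps the referenced object alive
    }
  }

  const SubDevice &sub(std::size_t i) const { return subs_[i]; }

 private:
  std::vector<SubDevice> subs_;
  std::vector<std::vector<std::unique_ptr<MemDesc>>> owned_;
};
```

If both subdevices were handed the same `MemDesc`, the second bind would overwrite the handle the first one is still pointing at, which is the kind of aliasing the description refers to.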

The multi device performed each memory copy with a blocking write, so the copies ran sequentially, one subdevice after another. This patch adds an asynchronous memory copy capability, which falls back to synchronous copies if the target implementation doesn't implement the new *_async extensions. Each async call returns an integer that can later be used to wait for that operation. You can also implicitly gather the operations and wait for them all later. The multi device uses this to convert each synchronous operation into a pair of concurrent async operations.
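
A minimal sketch of the interface shape described above: a copy call that returns an integer ticket, a wait on one ticket, a wait on everything, and a synchronous fallback. `CopyQueue`, `copy_async`, `wait`, `wait_all` and `multi_copy` are hypothetical names, not the ones used in the patch. The model is single-threaded and merely defers the copy until it is waited on; a real backend would overlap the transfer with other work.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <functional>
#include <map>
#include <vector>

// Hypothetical transfer queue of one subdevice.
class CopyQueue {
 public:
  explicit CopyQueue(bool supports_async) : supports_async_(supports_async) {}

  // Start a copy and return a ticket for it. Without async support the copy
  // is done immediately (blocking fallback); the ticket is still valid.
  int copy_async(void *dst, const void *src, std::size_t size)
  {
    int ticket = next_ticket_++;
    std::function<void()> op = [=] { std::memcpy(dst, src, size); };
    if (supports_async_) {
      pending_[ticket] = op;  // deferred until someone waits on it
    }
    else {
      op();
    }
    return ticket;
  }

  // Wait for one operation; a no-op if it has already completed.
  void wait(int ticket)
  {
    std::map<int, std::function<void()>>::iterator it = pending_.find(ticket);
    if (it == pending_.end()) {
      return;
    }
    it->second();
    pending_.erase(it);
  }

  // Wait for everything gathered so far.
  void wait_all()
  {
    for (auto &entry : pending_) {
      entry.second();
    }
    pending_.clear();
  }

  std::size_t pending() const { return pending_.size(); }

 private:
  bool supports_async_;
  int next_ticket_ = 1;
  std::map<int, std::function<void()>> pending_;
};

// Start the copy on every subdevice first, then wait, so that no subdevice
// has to finish before the next one can begin.
void multi_copy(std::vector<CopyQueue> &queues,
                std::vector<std::vector<char>> &dst,
                const std::vector<char> &src)
{
  assert(queues.size() == dst.size());
  std::vector<int> tickets;
  for (std::size_t i = 0; i < queues.size(); i++) {
    dst[i].resize(src.size());
    tickets.push_back(queues[i].copy_async(dst[i].data(), src.data(), src.size()));
  }
  for (std::size_t i = 0; i < queues.size(); i++) {
    queues[i].wait(tickets[i]);
  }
}
```

Returning a ticket even on the fallback path keeps callers identical for both kinds of backend, which is one reasonable way to make the fallback transparent.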

Blender was completely overloading the GPU queue, causing severe GUI slowdown to the point of losing control of the machine. The cause was far too much work being put into the OpenCL work queue at once. A self-tuning throttling mechanism now manages queue depth, greatly improving GUI responsiveness while still keeping the GPU supplied with work.
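
The description does not say how the throttle tunes itself, so the following is only one possible approach, not the patch's actual heuristic: an additive-increase, multiplicative-decrease controller that shrinks the allowed queue depth when draining the queue takes longer than a target time and grows it slowly otherwise. `QueueThrottle`, `report` and the target time are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical queue-depth controller (additive increase, multiplicative decrease).
class QueueThrottle {
 public:
  QueueThrottle(double target_ms, int min_depth, int max_depth)
      : target_ms_(target_ms), min_depth_(min_depth), max_depth_(max_depth), depth_(min_depth)
  {
    assert(min_depth >= 1 && min_depth <= max_depth);
  }

  // How many batches may be in flight before the host must wait.
  int depth() const { return depth_; }

  // Feed back how long the queue took to drain at the current depth.
  void report(double drain_ms)
  {
    if (drain_ms > target_ms_) {
      depth_ = std::max(min_depth_, depth_ / 2);  // too slow: back off quickly
    }
    else {
      depth_ = std::min(max_depth_, depth_ + 1);  // responsive: probe for more
    }
  }

 private:
  double target_ms_;
  int min_depth_;
  int max_depth_;
  int depth_;
};
```

Backing off quickly and growing slowly favors GUI responsiveness over peak throughput, which matches the priority stated above; other policies are equally plausible.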

I made the device_update function perform all of its transfers asynchronously using the new functionality (it synchronizes at the end of the top-level update). This allows nearly all (all?) memory transfers to proceed asynchronously, overlapping with subsequent processing.

Most of the compilation error fixes and the multi device issues are likely solved by now anyway. There are still some interesting parts about async API usage, but before getting to that I think it makes sense to finish the kernel split first, which is happening in D1200.

Assigning to myself so that the work does not drop off the radar entirely.

@Aaron Carlisle (Blendify), it's not lost, but priorities changed with the split kernel work. I would also ask that priority triaging of reports be left to developers, especially when the priority was explicitly set by a developer.