What I am imagining might make it easier to add support for ATI and various other GPUs.

Doing it this way would never outperform the optimized CUDA apps that we have, since they 'reserve' the whole GPU, but it seems like it might significantly boost 2, 4, or 8 non-CUDA application streams. It might also increase the overall utilization of the available GPUs and, perhaps, allow our apps to benefit from GPUs with less memory.

Any thoughts? Has it been tried? Any pointers to other discussions about anything like this?