The OpenCL State Tracker Nears Working State

01-25-2012, 09:40 PM

Phoronix: The OpenCL State Tracker Nears Working State

There's an update to the ongoing X.Org Endless Vacation of Code work, which is currently funding a developer to work on the OpenCL bring-up for open-source graphics drivers. The latest work has been redesigning and largely rewriting the Clover state tracker, which will provide OpenCL support to Gallium3D graphics drivers...

Comment

With Francisco's and Tom's work, will it be possible to run OpenCL programs on both my graphics cards with this state tracker and the open source driver? Or would you need Crossfire to do this? AFAIK, Crossfire is more about sharing a graphics load than a compute load.

Comment

With Francisco's and Tom's work, will it be possible to run OpenCL programs on both my graphics cards with this state tracker and the open source driver? Or would you need Crossfire to do this? AFAIK, Crossfire is more about sharing a graphics load than a compute load.

Crossfire is about splitting a single graphics load and then recombining the results into a single framebuffer, so you wouldn't need it for OpenCL. Whether or not the first working implementation will support multiple GPUs for compute is another question, and one of the devs working on it would have to answer that.

AFAIK OpenCL's support for multiple devices uses an "expose both devices to the application and let it decide what to do" model rather than "pretend to be a single device and invisibly split/reassemble the work", so supporting multiple devices with OpenCL should be a lot less work than a full Crossfire implementation.
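To make the "let the application decide" model a bit more concrete, here is a rough sketch in plain Python. It does not use the real OpenCL API: `Device`, `split_range`, and the idea of weighting by `compute_units` are all made up for illustration. In a real host program the devices would come from something like `clGetDeviceIDs`, and each one would get its own command queue. The point is only that the application, not the driver, picks which slice of the work goes to which device.

```python
from dataclasses import dataclass


@dataclass
class Device:
    # Hypothetical stand-in for an OpenCL device handle.
    name: str
    compute_units: int


def split_range(global_size, devices):
    """Split the index range [0, global_size) across devices,
    proportionally to their (assumed) compute unit counts."""
    total_units = sum(d.compute_units for d in devices)
    chunks = []
    start = 0
    for i, dev in enumerate(devices):
        if i == len(devices) - 1:
            # The last device takes whatever is left, so nothing is dropped
            # to integer rounding.
            end = global_size
        else:
            end = start + global_size * dev.compute_units // total_units
        chunks.append((dev.name, start, end))
        start = end
    return chunks


def run_slice(start, end):
    # Placeholder "kernel": square each index in this device's slice.
    return [i * i for i in range(start, end)]


# Two GPUs exposed separately; the application decides the split.
devices = [Device("gpu0", 10), Device("gpu1", 6)]
chunks = split_range(1024, devices)

# The application also reassembles the results itself.
result = []
for _name, start, end in chunks:
    result.extend(run_slice(start, end))
```

With Crossfire-style splitting, all of the above would have to happen invisibly inside the driver, which is where the extra work comes from.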

Comment

So, how does this fit in with the work that AMD is doing to support OpenCL through LLVM? Will there be two separate OpenCL state trackers - one that emits LLVM, and another that emits TGSI?

On a related note - LLVM has already started to support VLIW with the MachineInstrBundle, and they say there will be a generic VLIW scheduler in the future - so my 2 cents is that LLVM is looking better and better as a GPU IR. So, what exactly is missing from the backend then to prevent everybody from just switching over now? Are there some vector instructions missing?

Comment

So, how does this fit in with the work that AMD is doing to support OpenCL through LLVM? Will there be two separate OpenCL state trackers - one that emits LLVM, and another that emits TGSI?

The clang front end generates LLVM IR, and clover was written around LLVM IR (it used LLVM to generate x86 code to run the OpenCL kernels). I don't know if Francisco kept all the LLVM-to-x86 functionality, but if not I guess we'll need to hook into the LLVM IR before it gets converted to TGSI. Tom was planning to go through Gallium3D but with a flag to pass LLVM rather than TGSI, so hopefully they're heading in the same direction, plus or minus a flag setting anyways.

We could go C99 => LLVM IR => TGSI => LLVM IR => ISA rather than C99 => LLVM IR => ISA, but that seems like something we should avoid if possible.

Comment

The clang front end generates LLVM IR, and clover was written around LLVM IR (it used LLVM to generate x86 code to run the OpenCL kernels). I don't know if Francisco kept all the LLVM-to-x86 functionality, but if not I guess we'll need to hook into the LLVM IR before it gets converted to TGSI. Tom and Francisco talk frequently, so hopefully one isn't demolishing what the other is building on.

We could go C99 => LLVM IR => TGSI => LLVM IR => ISA rather than C99 => LLVM IR => ISA, but that seems like something we should avoid if possible.

So, I'll ask a question from a different perspective - what IR would you recommend for someone who was starting from scratch on a new driver? All LLVM, LLVM for compute and TGSI for graphics, or all TGSI? Would it be more work re-implementing optimizations/features already in LLVM, or adding needed functionality to LLVM?