Rob Bredow on GPU-accelerated rendering

To uncover the real pros and cons of the GPU as a rendering platform, the first thing you need to do is cut through the hype. Sony Pictures Imageworks has long been associated with the development of GPU-accelerated production tools, both for rendering and for simulation, so we took the opportunity to speak to Imageworks CTO Rob Bredow for an exclusive look at the tools the major studios use.

3D World: When did Imageworks first start using the GPU for rendering?

Rob Bredow: We have been using the GPU to generate elements of our final frames all the way back to The Polar Express, which we started work on in 2002. So we’ve been doing it for the past eight years.

The work on The Polar Express used Splat, which is a renderer we’re still using today. It basically gives us the ability to render lots and lots of overlapping sprites: great for simulating smoke, certain kinds of fire, and relatively low-detail atmospheres. For that particular renderer, we can also render in software, and that’s a hundred times slower – sometimes even a thousand times slower – so the GPU is a real big boost.
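To make the idea of rendering "lots and lots of overlapping sprites" concrete, here is a minimal sketch of back-to-front sprite compositing. It is not Splat's code: the `Sprite` class, the `composite_sprites` function and the greyscale framebuffer are all hypothetical, and the blend is the standard "over" operator, assumed here purely for illustration. On a GPU the same per-pixel blend runs in parallel across all pixels, which is the kind of work Bredow describes as being far faster in hardware.

```python
from dataclasses import dataclass


@dataclass
class Sprite:
    # Hypothetical sprite description, for illustration only.
    x: int        # left edge in pixels
    y: int        # top edge in pixels
    size: int     # side length of the square sprite in pixels
    value: float  # greyscale intensity, 0..1
    alpha: float  # opacity, 0..1
    depth: float  # distance from camera; larger means farther away


def composite_sprites(sprites, width, height):
    """Blend sprites back to front using the 'over' operator."""
    frame = [[0.0] * width for _ in range(height)]
    # Draw the farthest sprites first so nearer ones blend on top.
    for s in sorted(sprites, key=lambda sp: sp.depth, reverse=True):
        # Clip the sprite's footprint to the framebuffer.
        for py in range(max(s.y, 0), min(s.y + s.size, height)):
            for px in range(max(s.x, 0), min(s.x + s.size, width)):
                frame[py][px] = (
                    s.value * s.alpha + frame[py][px] * (1.0 - s.alpha)
                )
    return frame


if __name__ == "__main__":
    smoke = [
        Sprite(x=0, y=0, size=2, value=1.0, alpha=0.5, depth=10.0),
        Sprite(x=1, y=1, size=2, value=0.0, alpha=0.5, depth=1.0),
    ]
    for row in composite_sprites(smoke, 3, 3):
        print(row)
```

Every pixel of every sprite is touched once, so the cost grows with the total screen area the sprites cover. That is why a scene made of a very large number of overlapping sprites is so slow in a serial software loop like this one.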

3DW: But Splat isn’t your main production renderer, is it?

RB: No, it’s a special-purpose renderer. It has been used on a lot of shows, but our primary renderer is Arnold [the GI renderer originally written by Marcos Fajardo, now developed jointly by Fajardo and Imageworks’ in-house team]. I believe that every show in the facility is now using it as its primary renderer.

3DW: So could Arnold, or components of Arnold, become GPU-based in future?

RB: One of the goals behind the Open Shading Language [part of Imageworks’ current open-source development programme] is to make a clear area where we could customise the renderer for future computing platforms: a break between the language the engineers and artists are using and the actual execution backend.

The first target for that backend is the CPU, and that’s what we’re using now in production. But the design goals of OSL include having a GPU backend, and if you were to browse on the discussion lists for OSL right now, you would see people working on GPU-accelerated renderers. So that could happen in future: that a component of the rendering could happen on the GPU, even for something like Arnold.
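The "break" Bredow describes between the shading language and the execution backend can be sketched in a few lines. This is a toy illustration, not OSL or its API: the `checker` shader, the `CpuLoopBackend` and `BatchedBackend` classes and their `run` method are all hypothetical names. The batched backend stands in for a GPU-style target only in the sense that it processes points in fixed-size groups; it still runs on the CPU.

```python
from typing import Callable, List, Sequence, Tuple

# A "shader" here is just a function from surface coordinates to a value.
Shader = Callable[[float, float], float]
Point = Tuple[float, float]


class CpuLoopBackend:
    """Hypothetical backend: evaluates the shader one point at a time."""

    def run(self, shader: Shader, points: Sequence[Point]) -> List[float]:
        return [shader(u, v) for u, v in points]


class BatchedBackend:
    """Hypothetical stand-in for a GPU-style backend: works in batches."""

    def __init__(self, batch_size: int = 4) -> None:
        self.batch_size = batch_size

    def run(self, shader: Shader, points: Sequence[Point]) -> List[float]:
        results: List[float] = []
        for start in range(0, len(points), self.batch_size):
            batch = points[start:start + self.batch_size]
            # A real GPU backend would evaluate the whole batch in parallel.
            results.extend(shader(u, v) for u, v in batch)
        return results


def checker(u: float, v: float) -> float:
    """Example shader: a 4x4 checkerboard over the unit square."""
    return 1.0 if (int(u * 4) + int(v * 4)) % 2 == 0 else 0.0


if __name__ == "__main__":
    pts = [(i / 10.0, j / 10.0) for i in range(10) for j in range(10)]
    # The shader is written once; only the backend changes.
    assert CpuLoopBackend().run(checker, pts) == BatchedBackend().run(checker, pts)
```

The point of the split is that `checker` never mentions how or where it is executed, so a new backend can be added without touching the shaders that artists and engineers write.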

3DW: But only a component of the rendering, rather than the entire process?

RB: Given the complexity of the textures and the complexity of the geometry, the need for full GI, motion blur, hair, subsurface scattering – all the things we take for granted in a production-quality renderer – that doesn’t all fit today in the GPU, so it would have to be components.

3DW: So which components can be calculated efficiently on the GPU?

RB: Any time you get an opportunity to do something – even something very complicated – many millions of times in a row, as long as it doesn’t require more onboard memory than you have on the GPU, that can be faster than on the CPU.

But it is amazing what CPUs are capable of today. We’re doing things [on the CPU] that would have been unthinkable a few years ago: scenes with full GI and millions of hairs, for example. And we’re getting render times of less than ten hours, sometimes less than six hours, per frame.

The Splat renderer in action, on the 'spaghetti tornado' scene from 2009's Cloudy With a Chance of Meatballs.

3DW: Will CPU architectures continue to evolve fast enough to take the impetus off developing for the GPU?

RB: It’s hard to figure out the right balance. We have the full spectrum here. For example, we have two renderers we use for smoke. Splat is very fast, and when you need a wispy element, something that fills the screen in an atmospheric way, it’s the perfect solution because you can do an element in a few hours from beginning to end.

But we also have a [CPU-based] package called Svea, which is our volume renderer. If you need a pyroclastic cloud with lots of nice sharp detail or the kind of incredible atmospheric effects you see in Alice in Wonderland – and you’re willing to put a day or two into those elements – it’s the right tool for that job.

So even in the area of special effects rendering, there are certain compromises that come along with what the GPU is optimised for. If you’re willing to design a system around that and put it in the hands of a talented artist, they can do 20 iterations in a day, instead of one or two. But it doesn’t make sense to cram the kinds of scenes we throw at Arnold every day, with tens of thousands of pieces of geometry and millions of textures, at the GPU. Not today. Maybe in a few years it will.

3DW: Was Splat written for the GPU from the ground up?

RB: Yes. It was targeted towards fitting the entire scene into the GPU, with all of the constraints that that imposes. We do have a version of Splat that runs on the CPU, but we wrote it after we wrote the GPU version. And when we look at the render times we get out of [the CPU version], it is highly disappointing, always. But that’s because it isn’t a very good design for a CPU. The whole philosophy you take when developing for the GPU is quite different to that for the CPU.

3DW: So why does the GPU version of Splat work so well?

RB: Because we’re able to use the GPU in the way its first APIs were designed. The way we talk to the GPU is through OpenGL, and that’s a highly tuned, highly optimised path. Every time Nvidia or anyone else comes out with a new driver, it’s going to increase the performance of Splat. If you’re writing code for the GPU, you want to be exercising the same parts of the hardware that are used for videogames.

New GPUs like Nvidia’s 6GB Quadro 6000 are overcoming early issues with onboard memory.

3DW: How much of an issue is the onboard memory of the GPU?

RB: It’s getting less important every year. A few years ago, it was all about what you could do in the very limited memory you had available to address directly, and there were all sorts of strategies that people used to maximise its use and handle the I/O back to the machine’s main memory.

Now the standard graphics cards that we have in everybody’s workstations [mid-to-high-end professional Nvidia GPUs] have 256 or 512MB of memory, and that’s a much more useful amount. It’s still a consideration, but it’s not anywhere near the issue it was.

3DW: What about the standards for parallel computing? Imageworks uses CUDA in some tools: is that simply because you use Nvidia hardware, or is there a more fundamental reason you don't use OpenCL?

RB: The promise of OpenCL is a great one: that you code once, then run that code on the architecture that makes the most sense, or that you have available at the time when you actually need to run it. That would be the Holy Grail of this parallel computing world that we’re heading towards.

But we haven’t seen – and perhaps it’s there, and we just aren’t aware of it yet – code running interchangeably on CPUs and GPUs in the kind of production environment we need. We’d love for it to happen, but at the moment we’re still writing code that’s specially tuned in CUDA for the GPUs we have today, or in OpenGL, which will always run in an optimised fashion on the GPUs of tomorrow.

We haven’t yet written anything designed around OpenCL and the idea of being able to target both CPU and GPU interchangeably. If that happened, one of the first places we’d leverage it would be in Open Shading Language, in the shaders that run inside Arnold. That would be a great optimisation for us.

---

This interview is an exclusive online supplement to our feature 'Rendering on the GPU: hits and myths' in issue 138 of 3D World magazine, in which we spoke to developers of off-the-shelf software, including mental images, Pixar and Next Limit. To learn more about the pros and cons of rendering on the GPU, you can buy a copy of issue 138.