Imagination Technologies is one of those companies simultaneously ubiquitous and invisible. Its PowerVR graphics processors drive high-profile electronics like Sony's PlayStation Vita, Apple's iPhones and iPads, and any number of past-and-present smartphones, tablets, and laptops. But you'd probably be hard-pressed to find anyone outside of technology circles who actually knows the name.

While most of our coverage of Imagination is driven by these mobile GPU designs, the company also has its eyes on other markets. We stopped by its CES meeting room to get a glimpse at the Series6 PowerVR GPUs that are going to begin making their way into consumer products this year, but the company was also showing off something else: a pair of workstation-class PCI Express add-in cards that allow 3D rendering programs to do something called ray tracing in real time. This is something hardware developers have been chasing (and we've been covering) for many years, so we took some time to see the hardware in action.

What is ray tracing, and why do I want it?

Ray tracing algorithms are designed to accurately render light and its interaction with various objects.

Chris Foresman

To put it as simply as possible, ray tracing is used to render light and its interactions with objects. A ray tracing algorithm will track rays of light from a light source to an object. Once the light hits that object, the algorithm can account for how much light will be absorbed by the surface, how much will be reflected or refracted by the surface, and how that reflected and refracted light interacts with other surfaces, among other things.
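To make that concrete, here is a minimal sketch in Python of the core loop described above: intersect a ray with the scene, shade the hit point, and recurse for the reflected share of the light. Everything here (the sphere-only scene, the trace and intersect_sphere helpers, the fixed bounce cap) is invented for illustration and is not Caustic's implementation; shadow rays and refraction are omitted for brevity.

```python
import math

def intersect_sphere(origin, direction, center, radius):
    """Return the distance t to the nearest hit, or None if the ray misses.
    Assumes `direction` is normalized."""
    oc = [o - c for o, c in zip(origin, center)]
    b = 2.0 * sum(d * e for d, e in zip(direction, oc))
    c = sum(e * e for e in oc) - radius * radius
    disc = b * b - 4.0 * c
    if disc < 0.0:
        return None
    t = (-b - math.sqrt(disc)) / 2.0
    return t if t > 1e-6 else None

def trace(origin, direction, spheres, light, depth=0):
    """Follow one ray: find the nearest surface, shade it against the light,
    and recurse for the reflected share of the energy."""
    if depth > 3:
        return 0.0
    hits = [(intersect_sphere(origin, direction, c, r), c, refl)
            for c, r, refl in spheres]
    hits = [h for h in hits if h[0] is not None]
    if not hits:
        return 0.0  # the ray escaped the scene
    t, center, reflectivity = min(hits)
    point = [o + t * d for o, d in zip(origin, direction)]
    normal = [p - c for p, c in zip(point, center)]
    n_len = math.sqrt(sum(n * n for n in normal))
    normal = [n / n_len for n in normal]
    to_light = [l - p for l, p in zip(light, point)]
    l_len = math.sqrt(sum(v * v for v in to_light))
    to_light = [v / l_len for v in to_light]
    diffuse = max(0.0, sum(n * v for n, v in zip(normal, to_light)))
    # Split the energy: part is absorbed/shaded locally, part bounces on.
    d_dot_n = sum(d * n for d, n in zip(direction, normal))
    reflected = [d - 2.0 * d_dot_n * n for d, n in zip(direction, normal)]
    return ((1.0 - reflectivity) * diffuse
            + reflectivity * trace(point, reflected, spheres, light, depth + 1))

# One slightly reflective sphere lit from the upper right:
spheres = [([0.0, 0.0, -3.0], 1.0, 0.3)]
print(trace([0.0, 0.0, 0.0], [0.0, 0.0, -1.0], spheres, light=[5.0, 5.0, 0.0]))
```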

The end result is an image with very accurate, realistic light. There are plenty of reasons this would be useful: better-looking 3D games, movies, and computer-generated imagery on the entertainment side, and faster, more lifelike CAD renderings on the professional side. But ray tracing has typically been too intensive a task to do in real time, a workload ill-suited to even the most powerful of today's graphics processors. Intel briefly promised that real-time ray tracing would become an affordable reality in its discrete graphics project, codenamed Larrabee, but that initiative was axed before any consumer graphics hardware made it to market.

This is where Caustic comes in. The company has been talking up its real-time ray tracing technology since it was a scrappy young startup back in 2009, and an acquisition by Imagination in 2010 has only increased its ambition. Its first commercially available real-time ray tracing cards go on sale this month, but as we'll discuss, the company has even bigger plans for the technology in the future.

The Caustic R2500: Work smarter, not harder

The faster of the two cards, the $1,500 Caustic R2500, is physically large, but the actual hardware driving it is much less powerful (and power-hungry) than what's found in a high-end workstation graphics card. Two of Caustic's ray tracing units, or RTUs, are located under the card's small cooling fans. Each of them has 8GB of memory dedicated to it for a total of 16GB on the entire card.

This sky-high amount of memory, which far exceeds what's available on most graphics cards (and even in many computers), is the card's highest-end spec. The heavy lifting is done by silicon that seems ancient by today's standards: the RTUs are manufactured on a 90nm process and use DDR2 memory, both of which were cutting-edge circa 2005 or so. The upside is that the card requires relatively little power. While its peak power consumption is rated at 60 watts, we were told that realistically it maxes out at about 40 watts, much less than a modern high-end (or even low-end) graphics card and well under the 75 watts a PCI Express slot can supply, so the card needs no auxiliary power connector.

For those with less cash, the $800 Caustic R2100 offers a single RTU paired with 4GB of RAM and consumes roughly half the power of the R2500, peaking between 30 and 40 watts. Because it has half the RTUs and a quarter of the RAM, its rendering speed should be less than half that of the R2500, though we weren't able to see the lower-end card in action to verify.

The R2500's secret sauce, then, isn't raw power but the patented algorithms Imagination and Caustic are using to solve the real-time ray tracing problem.

"The way the algorithm works, it turns ray tracing from a high-performance compute, memory-intensive problem to one that's more like a database problem," Imagination Technologies Director of Product Management Michael Kaplan told Ars. "It's highly optimized. We can store about 120 million triangles on that card."
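Imagination isn't publishing the algorithm itself, so the following is only a generic illustration of what a "database-like" approach to ray queries usually looks like: an acceleration structure such as a bounding volume hierarchy (BVH), in which a ray descends a tree of nested boxes and skips whole subtrees it cannot hit instead of testing every triangle. The Node layout and the intersect_triangle callback here are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Node:
    box_min: tuple                       # axis-aligned bounding box corners
    box_max: tuple
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    triangles: Optional[List] = None     # only leaf nodes hold geometry

def ray_hits_box(origin, inv_dir, box_min, box_max):
    """Slab test: does the ray pass through this axis-aligned box?
    `inv_dir` is the per-axis reciprocal of the ray direction."""
    t_near, t_far = float("-inf"), float("inf")
    for o, inv, lo, hi in zip(origin, inv_dir, box_min, box_max):
        t0, t1 = (lo - o) * inv, (hi - o) * inv
        t_near = max(t_near, min(t0, t1))
        t_far = min(t_far, max(t0, t1))
    return t_near <= t_far and t_far >= 0.0

def closest_hit(node, origin, inv_dir, intersect_triangle):
    """Walk the tree, pruning every subtree whose box the ray misses."""
    if node is None or not ray_hits_box(origin, inv_dir, node.box_min, node.box_max):
        return None
    if node.triangles is not None:       # leaf: test the handful of triangles
        hits = [intersect_triangle(origin, inv_dir, tri) for tri in node.triangles]
        return min((h for h in hits if h is not None), default=None)
    children = [closest_hit(c, origin, inv_dir, intersect_triangle)
                for c in (node.left, node.right)]
    return min((h for h in children if h is not None), default=None)
```

With the scene pre-sorted into such a tree, each ray costs roughly a logarithmic number of box tests rather than one test per triangle, which is the kind of indexed lookup that could make a 120-million-triangle scene queryable at interactive rates.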

The end result is easier to show than it is to describe—Imagination Technologies Director of Business Development Alex Kelley was on hand to give us a demonstration of just what the Caustic R2500 was capable of.

Video by Chris Foresman

The Caustic card enables CAD programs to realistically render light in real time and allows designers to change things like the angle of a windshield or the color of the paint without having to wait for the computer to re-render the image every time they make a change.

"You're getting all of this information very early on in the design process, where normally you'd have to sit, hit the render button, wait, and then move on," said Kelley.

Developers will have to build support for the Caustic cards into their applications, but Imagination is providing tools to make that as simple as possible. OpenRL is the low-level API, roughly equivalent to OpenGL on the graphics side, while the company's Brazil SDK is the high-level toolkit that developers can use to implement Caustic support in their software. The demo seen above was given in Autodesk Maya, for which a plugin is already available, and McNeel's Rhino 5 CAD software will also support the card. Imagination is working with several other vendors to implement support, but it couldn't give us any more specifics as of this writing.

Everything we saw in this demo was running on the faster R2500, but smaller shops with less cash will appreciate the cheaper $800 R2100 option mentioned above.

An ambitious future: Real-time ray tracing in a tablet?

We came away impressed by the technology in Caustic's add-in card, but the reality is it's difficult to sell an extra add-in card that isn't a GPU these days—just ask Ageia, whose dedicated physics processing add-in cards went pretty much nowhere before Nvidia snapped them up and integrated PhysX support into its GeForce cards. The Caustic cards will appeal to people in the high-end, high-margin workstation market, but by Imagination's own admission that market is quite small.

The Caustic technology's path to the mass market will be similar to the one PhysX took: Imagination intends to integrate it into future versions of its PowerVR GPUs. This isn't going to happen anytime soon (the Imagination representative gave us a tentative estimate of "four to five years" from now), but it may be that the phones and tablets of tomorrow will be capable of 3D rendering that is only now beginning to hit high-end workstations.

This would dovetail nicely with the way the industry is moving. Mobile devices are already getting more productive as they get more powerful, and the major hardware manufacturers seem determined to deliver devices that can be all things to all people—phones that can double as tablets, tablets that can double as laptops, and so on. By 2018, it's easy to imagine a tablet that can also do high-end CAD work, and if Imagination has its way, the Caustic ray tracing technology will be leading that charge.

Promoted Comments

I wrote the following (slightly redacted) up a little while ago for another company looking at consumer-level ray tracing hardware as it relates to games. I do think workstation applications are the correct entry point for ray tracing acceleration, rather than games, so the same level of pessimism might not be appropriate. I have no details on Imagination's particular technology (feel free to send me some, guys!).

------------

The primary advantages of ray tracing over rasterization are:

Accurate shadows, without explicit sizing of shadow buffer resolutions or massive stencil volume overdraw. With reasonable area light source bundles for softening, this is the most useful and attainable near-term goal (see the sketch after this list).

Accurate reflections without environment maps or subview rendering. This benefit is tempered by the fact that it is only practical at real-time speeds for mirror-like surfaces. Slightly glossy surfaces require a bare minimum of 16 secondary rays to look decent, and even mirror surfaces alias badly in larger scenes with bump mapping. Rasterization approximations are inaccurate, but mip-map-based filtering greatly reduces aliasing, which is usually more important. I was very disappointed when this sunk in for me during my research – I had thought that there might be a place for a high-end "ray traced reflections" option in upcoming games, but it requires a huge number of rays for it to actually be a positive feature.
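As a sketch of the area-light bundles mentioned in the first item above, here is a generic Monte Carlo visibility estimate. The occluded scene query is an assumed callback (for example, a BVH occlusion test), not anyone's shipping implementation.

```python
import random

def soft_shadow(point, light_center, light_radius, occluded, samples=16):
    """Estimate the fraction of an area light visible from `point`.
    `occluded(a, b)` is an assumed scene query returning True if any
    geometry blocks the segment from a to b."""
    visible = 0
    for _ in range(samples):
        # Jitter each shadow ray toward a different spot on the light
        # (a square emitter here, for brevity).
        target = [c + random.uniform(-light_radius, light_radius)
                  for c in light_center]
        if not occluded(point, target):
            visible += 1
    return visible / samples  # 0.0 = fully shadowed, 1.0 = fully lit
```

One shadow ray per pixel gives hard, binary edges; a bundle of 16 jittered rays buys the penumbra at 16x the ray count, which is why soft shadows are singled out as the most attainable near-term goal.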

Some other “advantages” that are often touted for ray tracing are not really benefits:

Accurate refraction. This won’t make a difference to anyone building an application.

Global illumination. This requires BILLIONS of rays per second to approach usability. Trying to do it with a handful of tests per pixel just results in a noisy mess.

Because ray tracing cost scales with the log2 of the number of primitives while rasterization scales linearly, it appears that highly complex scenes will render faster with ray tracing. But it turns out that the constant factors are so different that no dataset that fits in memory actually crosses the time order threshold.
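A toy cost model makes the constant-factor argument concrete. The constants below are invented purely for illustration; the point is only that even a modest constant gap pushes the crossover past any scene that fits in memory.

```python
import math

# Invented per-unit costs: one rasterized triangle vs. one ray's traversal.
# The 100x gap stands in for ray tracing's heavier per-ray constants.
C_RASTER, C_RAY = 1.0, 100.0
RAYS = 2_000_000                      # roughly one primary ray per 1080p pixel

def raster_cost(n_triangles):
    return C_RASTER * n_triangles     # linear in primitive count

def trace_cost(n_triangles):
    return C_RAY * RAYS * math.log2(n_triangles)   # logarithmic per ray

for n in (10**6, 10**9, 10**12):
    print(f"{n:>16,} tris: raster {raster_cost(n):.1e}  trace {trace_cost(n):.1e}")
```

With these made-up constants the curves only cross in the multiple billions of triangles, hundreds of gigabytes of geometry, which is the commenter's point about datasets that fit in memory.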

Classic Whitted ray tracing is significantly inferior to modern rasterization engines for the vast majority of scenes that people care about. Only when two orders of magnitude more rays are cast to provide soft shadows, glossy reflections, and global illumination does the quality commonly associated with "ray tracing" become apparent. For example, all surfaces that are shaded with interpolated normals will have an unnatural shadow discontinuity at the silhouette edges with single shadow ray traces. This is most noticeable on animating characters, but also visible on things like pipes. A typical solution if the shadows can't be filtered better is to make the characters "no self shadow" with additional flags in the datasets. There are lots of things like this that require little tweaks in places that won't be very accessible with the proposed architecture.

The huge disadvantage is the requirement to maintain acceleration structures, which are costly to create and more than double the memory footprint. The tradeoffs that get made for faster build time can have significant costs in the delivered ray tracing time versus fully optimized acceleration structures. For any game that is not grossly GPU bound, a ray tracing chip will be a decelerator, due to the additional cost of maintaining dynamic acceleration structures.

Rasterization is a tiny part of the work that a GPU does. The texture sampling, shader program invocation, blending, etc., would all have to be duplicated on a ray tracing part as well. Primary ray tracing can give an overdraw factor of 1.0, but hierarchical depth buffers in rasterization-based systems already deliver very good overdraw rejection in modern game engines. Contrary to some popular beliefs, most of the rendering work is not done to be "realistic," but to be artistic or stylish.

I am 90% sure that the eventual path to integration of ray tracing hardware into consumer devices will be as minor tweaks to the existing GPU microarchitectures.

A ray tracing algorithm will track rays of light from a light source to an object. Once the light hits that object, the algorithm can account for how much light will be absorbed by the surface, how much will be reflected or refracted by the surface, and how that reflected and refracted light interacts with other surfaces, among other things.

I could be getting completely mixed up here, but if I remember correctly you're actually describing photon mapping, not ray tracing. Ray tracing is a step below that: instead of tracing where light goes from the perspective of the light sources and building up a 'map' of what's lit, it starts at the 'eye' and works backwards, working out for each given light whether that light manages to make it to the portion of the view you're currently rendering and, if so, what colour you should end up with.

I'm pretty sure you're right. Ray tracing starts at the eye, the ray intersects the screen and is then followed through the scene as it bounces around. This way you optimise the work by only caring about what is displayed on the screen - you don't want to do all the work of following a photon only to find it never goes to the screen for display!

As someone who lugs around a laptop with a workstation graphics card in it, I can imagine that a low-power card capable of reasonably rendering a CAD model would really appeal to engineers such as myself. It's not a huge market and the card would have to shrink, but the margins are high and we're used to paying a $1k+ premium for a certified card and driver. Laptops would be the halfway step from full-sized cards to tablets.

I could be getting completely mixed up here, but if I remember correctly you're actually describing photon mapping, not ray tracing. [...]

You are correct. If you start at the light source and trace light paths forward, most paths do not end up at the camera, so they are wasted effort. Instead, you trace back from the camera so that you are only tracing paths that you know end at the camera.

However, conceptually it amounts to the same thing. Working backwards just allows you to better choose which paths to calculate and which paths to ignore.
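In code, the difference is just which loop sits on the outside. Here's a minimal eye-based camera-ray generator; conventions like the z = -1 image plane and the field of view are assumed for the example.

```python
import math

def camera_rays(width, height, fov_deg=60.0):
    """Yield one (origin, direction) pair per pixel, shot from an eye at the
    origin through an image plane at z = -1: the reverse of following photons
    out of the light source."""
    scale = math.tan(math.radians(fov_deg) / 2.0)
    aspect = width / height
    for y in range(height):
        for x in range(width):
            # Map pixel centers onto the [-1, 1] image plane.
            px = (2.0 * (x + 0.5) / width - 1.0) * aspect * scale
            py = (1.0 - 2.0 * (y + 0.5) / height) * scale
            length = math.sqrt(px * px + py * py + 1.0)
            yield (0.0, 0.0, 0.0), (px / length, py / length, -1.0 / length)
```

Every ray produced this way is guaranteed to matter to a visible pixel, which is exactly the wasted-work argument above.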

I'm surprised that they're saying 4-5 years before integration with PowerVR. I'd be fast-tracking that product and releasing it as a desktop part. I'd get a little market share (like the PowerVR Series 3 a decade ago), but the main reason would be to give developers a platform to work with before it's integrated into mobile devices. On that note, Series6 GPUs are due in mobile devices later this year.

The article is a bit incorrect that Larrabee went nowhere: it was rebranded Xeon Phi and is being sold as a coprocessor.

I can understand this taking a while to trickle down to consumer devices, but surely they're going to sell a boatload to dedicated render-farms like those needed for Pixar movies etc. Is this even compatible with those sorts of operations or is it only for small workstations and a designer running a single CAD program?

I'm also curious about it being a 90nm process chip. Now that it has been made and proven, can they now skip ahead to a 22nm process easily or is it necessary to visit all stages (90nm - 65nm - 45nm - 32nm - 22nm) along the way?

I see tablets displaying the ray tracing results beamed from a desktop or tower, but not doing the ray tracing themselves anytime in the near future.

The creative process still mainly requires fine control, which is rather lacking on most tablets where the input device is still mainly a clumsy finger or shaky stylus hand.

Also, tablets are rather restricted on power (most have no mechanical cooling system), so processor technology has to progress tremendously before the computational power for ray tracing becomes the norm in tablet-like devices.

However, it certainly has a more immediate future on 3D gaming consoles and Ultra HDTV, where power and cooling are virtually unlimited and its evolution is supported by income from the gaming industry.

I can understand this taking a while to trickle down to consumer devices, but surely they're going to sell a boatload to dedicated render-farms like those needed for Pixar movies etc. [...]

This only helps someone like Pixar if their rendering pipeline is compatible (or could be made compatible) with Imagination's new algorithm.

While they're not giving any concrete details about their proprietary algorithm, it's clearly highly memory-intensive. That's going to be the sticking-point when it comes to migration - the custom processor doesn't seem to be anything special, but bumping mobile GPUs up to 4-8GB of RAM is going to be hard.

Some background would have been nice: these people bought Splutterfish, if I am not mistaken, makers of one of the best renderers for 3ds Max, Brazil r/s. It seems like they've transformed it to run in hardware. A shame they don't license Brazil anymore, forcing you to use V-Ray or mental ray. I hope it's not too little too late for them.

I'm also curious about it being a 90nm process chip. Now that it has been made and proven, can they now skip ahead to a 22nm process easily or is it necessary to visit all stages (90nm - 65nm - 45nm - 32nm - 22nm) along the way?

There can be multiple reasons for them starting out with a 90nm fab process. The most prominent of these is price, as it is much cheaper to manufacture wafers with components at a larger size, e.g., 90nm or 65nm, than it is at smaller ones, e.g., 28nm or 22nm. Secondly, it's possible, though highly unlikely, that Caustic Graphics used some custom cell libraries that were only available at 90nm.

As for your question, there is nothing that prevents them from licensing standard cell libraries, or making their own, and converting their design to one that is less than 90nm, e.g., directly to 32nm, especially if they're synthesizing from VHDL or a similar hardware description language.

OldMacGuy wrote:

Also, tablets are rather restricted on power (most have no mechanical cooling system), so processor technology has to progress tremendously before the computational power for ray tracing becomes the norm in tablet-like devices.

Actually, it's possible to do 1080p, 24 FPS+ ray/path tracing for reasonably complex, animated/static scenes using a custom-designed processor that could feasibly be integrated with a tablet SoC. In fact, I recently submitted some journal papers to the IEEE Trans. Visualization and Computer Graphics and IEEE J. Solid-State Circuits that discussed a multiple-instruction, multiple-thread graphics architecture and a 22nm hardware implementation (albeit not physically fabricated) for exactly that, along with a scaled-up design destined for workstations.

This only helps someone like Pixar if their rendering pipeline is compatible (or could be made compatible) with Imagination's new algorithm.

I doubt that someone like Pixar will change PRMan (their implementation of RenderMan) to rely on a single manufacturer's technology, unless that manufacturer is Nvidia, of course. Also, they already do a rasterizer/ray-tracing hybrid, so adding a third data structure they need to take into account seems very complicated.

Chuckstar wrote:

Path tracing is much more computationally intensive than ray tracing.

Yes and no. In ray tracing you would normally shoot, e.g., one ray per pixel from your camera into the scene. When you hit a surface that is NOT a perfect mirror but still has some reflectivity, you spawn a number of new rays, tracing from the hit point into the scene, etc. The bad thing about ray tracing is that the ratio between the original "camera" rays and the level-3/4/5/6 reflection rays is extremely skewed, meaning we do a lot of work where it affects the final result only a tiny bit.

In path tracing you send, e.g., 20 rays into your scene per pixel. When they hit something (that is not a perfect mirror), you determine a new direction to trace in based upon some probability function. We need to cast many more rays per pixel to eliminate overly bright pixels and other artifacts, but the point is that the work here is placed closer to the final pixels.
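That "probability function" step is commonly cosine-weighted hemisphere sampling around the surface normal, so each path carries a single continuation ray instead of a widening tree of them. A generic sketch of the standard textbook math, not tied to any particular renderer:

```python
import math
import random

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normalize(v):
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def cosine_sample_hemisphere(normal):
    """Pick one continuation direction for a path, favoring directions near
    the unit surface normal in proportion to their cosine weight
    (less variance, i.e., less noise, per sample)."""
    r1, r2 = random.random(), random.random()
    r, phi = math.sqrt(r1), 2.0 * math.pi * r2
    # Sample over the hemisphere around local +z...
    local = (r * math.cos(phi), r * math.sin(phi), math.sqrt(1.0 - r1))
    # ...then rotate it into an orthonormal basis built around the normal.
    axis = (1.0, 0.0, 0.0) if abs(normal[0]) < 0.9 else (0.0, 1.0, 0.0)
    t = normalize(cross(axis, normal))
    b = cross(normal, t)
    return tuple(local[0] * t[i] + local[1] * b[i] + local[2] * normal[i]
                 for i in range(3))
```

Each bounce spends exactly one ray, so the effort stays proportional to the pixels you actually see; the price is noise, hence the many paths per pixel needed to average it out.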

I have a very hard time seeing their selling point. Apparently they have a novel ray tracing algorithm that uses some form of database-like search structure (just like this one that went viral some time ago: http://www.youtube.com/watch?feature=pl ... KUuUvDSXk4), but it sounds like you would have to put your entire data structure into their database format to make it work. I doubt that will happen, as modern renderers have extremely complicated and optimized data structures already. Rewriting those to encompass this is likely to have far-reaching consequences which would be hard to predict.

I also fail to see the ground-breaking news in that demo. V-Ray RT has been able to do that for years: http://www.youtube.com/watch?v=6ZmuI2xQp2M This is a renderer running on the GPU and is widely used in the feature, commercial, and architectural industries.

While they're not giving any concrete details about their proprietary algorithm, it's clearly highly memory-intensive. That's going to be the sticking-point when it comes to migration - the custom processor doesn't seem to be anything special, but bumping mobile GPUs up to 4-8GB of RAM is going to be hard.

One thing to consider is that smartphone and tablet rendering workloads will most likely be output at a lower resolution than high-end CAD design. That should considerably reduce the memory footprint.

I also fail to see the ground-breaking news in that demo. V-Ray RT has been able to do that for years. [...]

Those cards are dead before they were ever printed.

I would imagine it's like wanting to look up items in (amortized) constant time while also being able to get the highest-priority element efficiently: you have to keep the data (or pointers to the data items) in both a hash map and a priority queue. I'm guessing that's why the card has an absurd amount of RAM: to store another copy of the data in its own data structure.
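For what it's worth, the analogy looks like this in code: the same items held in two structures at once, trading memory for two fast query types. This is only a guess at why the extra RAM helps, not a description of Caustic's actual format.

```python
import heapq

class IndexedQueue:
    """The same items referenced from two structures: a dict for O(1)
    lookups by key and a min-heap for cheap best-priority retrieval.
    The price is keeping every item reachable twice."""

    def __init__(self):
        self.by_key = {}   # key -> item: constant-time point lookups
        self.heap = []     # (priority, key): lowest priority value pops first

    def add(self, key, priority, item):
        self.by_key[key] = item
        heapq.heappush(self.heap, (priority, key))

    def lookup(self, key):
        return self.by_key.get(key)

    def pop_best(self):
        while self.heap:
            _, key = heapq.heappop(self.heap)
            if key in self.by_key:          # skip entries already removed
                return self.by_key.pop(key)
        return None
```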

While they're not giving any concrete details about their proprietary algorithm, it's clearly highly memory-intensive. That's going to be the sticking-point when it comes to migration - the custom processor doesn't seem to be anything special, but bumping mobile GPUs up to 4-8GB of RAM is going to be hard.

Well, in principle, you can stick however much VRAM you want in a GPU. It's just going to cost more.

One thing to consider is that smartphone and tablet rendering workloads will most likely be output at a lower resolution than high-end CAD design. That should considerably reduce the memory footprint.

Still, I don't doubt that it's a long way off right now.

True, the memory demands are quite likely exponential, but then there are all these high-end smartphones with 1080p displays... And you'll need to render over the whole screen, not just a window.

mhall1 wrote:

Well, in principle, you can stick however much VRAM you want in a GPU. It's just going to cost more.

This thing uses DDR2 RAM. I just bought two 4GB *DDR3 SODIMMS* for $35 (for the pair) on Newegg. RAM is cheap. Even 16GB of RAM is cheap.

Here's a demo of one semi-modern Nvidia card running at 30+ fps, with refractions and reflections and no noise, unlike the demo. At this point the Nvidia card would cost less than half the cost of even the low-end Caustic card.

The big difference, by the way, between a "workstation" card and the average GPU a gamer might have is software. Nvidia and AMD just disable CAD acceleration on the normal cards and then charge three times as much for the "workstation" versions, mostly because they can.

It is good to note that while many people can easily get caught up with the "ray-tracing" buzzword, it isn't necessarily the be-all and end-all of computer graphics. It's certainly an elegant solution, and you get a lot of stuff for free (reflections and shadows, for instance). However, it's not inherently better than rasterization. The reason we have been using rasterization for our high-end computer graphics (even movies, these days) is that, for one, it does a good enough job, and very quickly. Sometimes this "good enough" job is imperceptibly different from a similar scene rendered through a ray-tracer. The key is that many of the same algorithms ray tracers use to compute light and effects are used in rasterization, with today's powerful shader languages and programmable shader hardware. The only real difference is that rasterization renders the image by projecting the scene and figuring out what's under each pixel, whereas ray-tracers follow the ray of light itself from each point on the view plane. In effect, the results can be extremely similar, if programmed properly. Besides, ray-traced images can look just as bad as old rasterized images. I can implement a Phong shader (or even a Gouraud shader) in both a rasterizer and a ray-tracer quite easily. Ray-tracing doesn't automatically lead to realistic graphics.
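To the point about shared shading math: a Phong term is the same function whether a rasterizer evaluates it per covered pixel or a ray-tracer evaluates it per hit point. A textbook version, not tied to either pipeline:

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def phong(normal, to_light, to_eye, shininess=32.0,
          ambient=0.1, diffuse_k=0.7, specular_k=0.2):
    """Classic Phong: ambient + diffuse + specular. All vectors unit length.
    Nothing here cares whether the caller is a rasterizer or a ray-tracer."""
    n_dot_l = max(0.0, dot(normal, to_light))
    # Reflect the light direction about the normal for the specular lobe.
    r = tuple(2.0 * n_dot_l * n - l for n, l in zip(normal, to_light))
    r_dot_e = max(0.0, dot(r, to_eye))
    return ambient + diffuse_k * n_dot_l + specular_k * (r_dot_e ** shininess)
```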

And there's one advantage that rasterization methods have for gaming over ray tracing: they're more easily adapted to non-realistic graphics. Sure, sometimes ultra-realism in computer graphics is desired; however, it's very hard to get a ray-tracer to render a scene that uses other visual rules, and it's flat-out worthless for sprite-based graphics. Ray-tracing is an excellent algorithm for simulating the physics of light through a camera, but it's just that: it's focused on tracing light rays, and that only.

The reason that ray-tracing is useful for CAD-style applications is that, there, they want that attention to fine detail. They want to see exactly how the final product would look in real life. When you're doing engineering work, physical accuracy is very important. In artistic applications, though, like games or movies, it's not always as high a priority.

This only helps someone like Pixar if their rendering pipeline is compatible (or could be made compatible) with Imagination's new algorithm.

Pixar uses their own interface, called RenderMan (PRMan is their implementation), which can actually be layered over any rendering approach you'd like. There are implementations built on ray-tracing or rasterization methods, so I'm sure they could easily implement it over this hardware and its API. (In fact, I believe that Pixar actually uses rasterization in their PRMan tool these days.)

But isn't a GPU just a collection of algorithms implemented in hardware when it comes down to it?

Isn't this true of all hardware design?

I don't mean the process details and how individual parts like transistors are built. But something like a new branch predictor or ARM's big.LITTLE is fundamentally no different than software.

What is patentable in a GPU is the combined algorithms and processes in the form of a working product.

Software is simply an implementation of a mathematical algorithm. A graphics card is an implementation of several mathematical operations and other systems coordinating together in a certain way on a certain piece of hardware such that it produces some desired work. There is a difference, there.

I would really love to put one of these in my home workstation. If they can bring the price down, there's demand for this in all sorts of hobbyist circles as well as in the markets they're already targeting. I'm looking at buying a pair of 7970s to use to accelerate ray tracing. I could definitely make do with a single $300 GPU and a $500 ray tracing accelerator card.

I also fail to see the ground-breaking news in that demo. V-Ray RT has been able to do that for years. [...]

Those cards are dead before they were ever printed.

Hardly. Just from looking at that demo, I can easily see offhand a number of tricks they're using to get quick updates, which are incredibly rough to say the least. The real render still takes a few seconds to complete. I've seen other similar renderers several years ago, too. They may be fast, but not really "real-time".

Software is simply an implementation of a mathematical algorithm. A graphics card is an implementation of several mathematical operations and other systems coordinating together in a certain way on a certain piece of hardware such that it produces some desired work. There is a difference, there.

Several mathematical algorithms working together is just a bigger mathematical algorithm. The fact that they are hardwired onto a chip doesn't change the "algorithmness" of the system.

Software is simply an implementation of a mathematical algorithm. A graphics card is an implementation of several mathematical operations and other systems coordinating together in a certain way on a certain piece of hardware such that it produces some desired work. There is a difference, there.

Not really, any hardware can be simulated in software, even down to the gate-level. The only difference is a matter of efficiency in producing the same result.

Not really, any hardware can be simulated in software, even down to the gate level. The only difference is a matter of efficiency in producing the same result.

They can, but there are mechanical properties at work, too. There's engineering work going on in these products. They not only have to worry about the programming logic but also electrical properties, layouts, timing, heat output, etc.

The logic itself is not patentable (though it may be a trade secret). The design itself (or even elements of it) could be. And that doubly goes for the fabrication process.

Cards like these aren't simply software printed on a chip. There's a lot more going on than just the logic gates. Stuff that can't be reliably simulated on the computer.

This thing uses DDR2 RAM. I just bought two 4GB *DDR3 SODIMMS* for $35 (for the pair) on Newegg. RAM is cheap. Even 16GB of RAM is cheap.

Workstation-class cards tend to have more memory than consumer cards, generally double, with a handful having four times as much. (The Quadro 6000 has 6GB of memory, for example, while its consumer counterpart, the GTX 480, typically shipped with only 1.5GB.) Considering the targeted market, equipping the card with plenty of RAM is expected. The real question is how much is actively used.

The other thing is that GPUs tend to avoid expandable memory for a handful of reasons. First is signal integrity, which impacts bandwidth; using soldered memory is cleaner, which allows for higher clocks and lower latencies. Another factor is that high-performance GPUs tend to use high-speed GDDRx memory for further increases in memory bandwidth, and the GDDRx spec has no provisions for expansion or DIMM formats. More commodity GPUs do use DDRx memory, but there is another catch: GPUs have relatively cut-down memory controllers that only support certain capacities and bank configurations, so there would be memory compatibility issues with various DIMMs on the market. Certainly a memory controller capable of memory expansion is possible, but GPU designs have avoided such luxuries to keep die size down. Speaking of size, there would also be the issue of physically fitting two or four DIMM slots on a PCIe card along with a high-performance cooler and good power circuitry. Adding slots also increases board cost, which no board manufacturer wants given the razor-thin profits and cutthroat nature of the GPU market. Don't get the wrong impression, I'd love to see some GPUs with DIMM slots, but there are several technical and a few business reasons why this won't happen.

The logic itself is not patentable (though it may be a trade secret). The design itself (or even elements of it) could be. And that doubly goes for the fabrication process.

MOSFET fabrication issues are a different matter altogether. Chip designs are usually made for an established fab process rather than requiring any novel modifications that would warrant a patent. These chips are being made on an old 90nm process. There's unlikely to be anything novel about the actual fabrication itself.

MOSFET fabrication issues are a different matter altogether. Chip designs are usually made for an established fab process rather than requiring any novel modifications that would warrant a patent. These chips are being made on an old 90nm process. There's unlikely to be anything novel about the actual fabrication itself.

Well, that's true, given that 90nm has been out for a long time.

I still contend, though, that it is inherently different from patenting the algorithms themselves. From my understanding, patents cover the full picture: here, they have the algorithm but also the fact that it's implemented in hardware. When you're dealing with PC software, you have a general instruction set, which the computer operates on in a well-defined manner to produce some outcome or behavior. The hardware itself, however, while employing mathematical principles, is a machine, and a very specific implementation of one. It doesn't sound like they're patenting the idea of hardware-accelerated ray-tracing, but their particular implementation of one.

I'm thinking in comparison to patenting a home brewing kit design. It uses a well-defined process and employs natural, physical laws in order to make the beer itself. However, the design itself of the kit is what is patented. Someone else could certainly create and sell their own beer making kit, and it wouldn't violate the patent, despite it performing the same general task using the same scientific mechanisms.

The thing is that no one has really implemented ray tracing in hardware in this fashion. It's been programmed, yes, but not baked into silicon itself. Digital design is its own beast, and they do truly have a novel system here that at least merits patentability for now. Besides, it's not like patenting computer chips is all that unprecedented.

Andrew Cunningham: Andrew has a B.A. in Classics from Kenyon College and over five years of experience in IT. His work has appeared on Charge Shot!!! and AnandTech, and he records a weekly book podcast called Overdue.