Imagination Technologies is one of those companies simultaneously ubiquitous and invisible. Its PowerVR graphics processors drive high-profile electronics like Sony's PlayStation Vita, Apple's iPhones and iPads, and any number of past-and-present smartphones, tablets, and laptops. But you'd probably be hard-pressed to find anyone outside of technology circles who actually knows the name.

While most of our coverage of Imagination is driven by these mobile GPU designs, the company also has its eyes on other markets. We stopped by its CES meeting room to get a glimpse at the Series6 PowerVR GPUs that are going to begin making their way into consumer products this year, but the company was also showing off something else: a pair of workstation-class PCI Express add-in cards that allow 3D rendering programs to do something called ray tracing in real time. This is something hardware developers have been chasing (and we've been covering) for many years, so we took some time to see the hardware in action.

What is ray tracing, and why do I want it?

Ray tracing algorithms are designed to accurately render light and its interaction with various objects. Photo: Chris Foresman

To put it as simply as possible, ray tracing is used to render light and its interactions with objects. A ray tracing algorithm will track rays of light from a light source to an object. Once the light hits that object, the algorithm can account for how much light will be absorbed by the surface, how much will be reflected or refracted by the surface, and how that reflected and refracted light interacts with other surfaces, among other things.
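Production renderers usually trace in the opposite direction, from the eye back into the scene, since only rays that eventually reach the camera matter for the final image (a distinction the comments below pick apart at length). Purely as an illustration, and with no connection to Caustic's hardware, here is the whole idea in a few dozen lines of Python: one ray per pixel, one sphere, one point light.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sub(a, b):
    return [x - y for x, y in zip(a, b)]

def normalize(v):
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]

def intersect_sphere(origin, direction, center, radius):
    """Distance along a unit-length ray to the first sphere hit, or None."""
    oc = sub(origin, center)
    b = 2.0 * dot(direction, oc)
    c = dot(oc, oc) - radius * radius
    disc = b * b - 4.0 * c          # quadratic coefficient a == 1 for unit rays
    if disc < 0:
        return None
    t = (-b - math.sqrt(disc)) / 2.0
    return t if t > 1e-6 else None

# One sphere, one point light, one ray per pixel.
eye, sphere, radius, light = [0, 0, -3], [0, 0, 0], 1.0, [2, 2, -2]
for y in range(8):                   # absurdly low resolution, for printing
    row = ""
    for x in range(16):
        direction = normalize([(x - 8) / 8.0, (4 - y) / 4.0, 1.0])
        t = intersect_sphere(eye, direction, sphere, radius)
        if t is None:
            row += " "
        else:
            hit = [e + t * d for e, d in zip(eye, direction)]
            normal = normalize(sub(hit, sphere))
            # Lambertian term: how directly the surface faces the light
            brightness = max(0.0, dot(normal, normalize(sub(light, hit))))
            row += ".:-=+*#%"[min(7, int(brightness * 8))]
    print(row)
```

Everything a real ray tracer adds, like shadows, reflections, and refraction, comes from firing more rays from each hit point; that multiplication is exactly what makes the technique so expensive.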

The end result is an image with very accurate, realistic light. There are plenty of reasons this would be useful—better-looking 3D games, movies, and computer-generated imagery on the entertainment side, faster and more lifelike CAD renderings on the professional side—but ray tracing has typically been too intensive a task to do in real time. It's a workload that is ill-suited to even the most powerful of today's graphics processors. Intel briefly promised real-time ray tracing would become an affordable reality in its discrete graphics card, codenamed Larrabee, but that initiative was axed before any real hardware could see the light of day.

This is where Caustic comes in. The company has been talking up its real-time ray tracing technology since it was a scrappy young startup back in 2009, and an acquisition by Imagination in 2010 has only increased its ambition. Its first commercially available real-time ray tracing cards go on sale this month, but as we'll discuss, the company has even bigger plans for the technology in the future.

The Caustic R2500: Work smarter, not harder

The faster of the two cards, the $1,500 Caustic R2500, is physically large, but the actual hardware driving it is much less powerful (and power-hungry) than what's found in a high-end workstation graphics card. Two of Caustic's ray tracing units, or RTUs, are located under the card's small cooling fans. Each of them has 8GB of memory dedicated to it for a total of 16GB on the entire card.

This sky-high amount of memory, which far exceeds the amount available on most graphics cards (and even many computers), is the highest-end spec the card has. The heavy lifting is done by silicon that seems ancient by today's standards: the RTUs are manufactured on a 90nm process and use DDR2 memory, both of which were cutting-edge circa 2005 or so. Despite this, the card still requires a relatively small amount of power. While its peak power consumption is rated at 60 watts, we were told that realistically it maxes out at about 40 watts—much less than a modern high-end (or even a low-end) graphics card, and well under the amount that would require the card to have a separate power plug.

For those with less cash, the $800 Caustic R2100 offers one RTU paired with 4GB of RAM and consumes about half the power of the R2500, with a peak power consumption of between 30 and 40 watts—because it has half the RTUs and a quarter of the RAM, its rendering speed should be less than half that of the R2500, though we weren't able to see the low-end card in action to be sure.

The cutting edge secret sauce of the R2500 isn't in its raw power, then, but the patented algorithms that Imagination and Caustic are using to solve the real-time ray tracing problem.

"The way the algorithm works, it turns ray tracing from a high-performance compute, memory-intensive problem to one that's more like a database problem," Imagination Technologies Director of Product Management Michael Kaplan told Ars. "It's highly optimized. We can store about 120 million triangles on that card."

The end result is easier to show than it is to describe—Imagination Technologies Director of Business Development Alex Kelley was on hand to give us a demonstration of just what the Caustic R2500 was capable of.

Video by Chris Foresman

The Caustic card enables CAD programs to realistically render light in real time and allows designers to change things like the angle of a windshield or the color of the paint without having to wait for the computer to re-render the image every time they make a change.

"You're getting all of this information very early on in the design process, where normally you'd have to sit, hit the render button, wait, and then move on," said Kelley.

Developers will have to build support for the Caustic cards into their applications, but Imagination is providing tools to make that as simple as possible. OpenRL is the low-level API, roughly equivalent to OpenGL on the graphics side, while the company's Brazil SDK is the high-level toolkit that developers can use to implement Caustic support in their software. The demo seen above was given in Autodesk Maya, for which a plugin is already available, and McNeel's Rhino 5 CAD software will also support the card. Imagination is working with several other vendors to implement support, but it couldn't give us any more specifics as of this writing.

Everything we saw in this demo was based on the performance of the higher-end R2500, but smaller shops will definitely appreciate the cheaper R2100.

An ambitious future: Real-time ray tracing in a tablet?

We came away impressed by the technology in Caustic's add-in card, but the reality is it's difficult to sell an extra add-in card that isn't a GPU these days—just ask Ageia, whose dedicated physics processing add-in cards went pretty much nowhere before Nvidia snapped them up and integrated PhysX support into its GeForce cards. The Caustic cards will appeal to people in the high-end, high-margin workstation market, but by Imagination's own admission that market is quite small.

The Caustic technology's path to the mass market will be similar: Imagination intends to integrate it into future versions of its PowerVR GPUs. This isn't going to happen anytime soon—the Imagination representative gave us a tentative estimate of "four to five years" from now—but it may be that the phones and tablets of tomorrow will be capable of 3D rendering that is only now beginning to hit high-end workstations.

This would dovetail nicely with the way the industry is moving. Mobile devices are already getting more productive as they get more powerful, and the major hardware manufacturers seem determined to deliver devices that can be all things to all people—phones that can double as tablets, tablets that can double as laptops, and so on. By 2018, it's easy to imagine a tablet that can also do high-end CAD work, and if Imagination has its way, the Caustic ray tracing technology will be leading that charge.

Promoted Comments

I wrote the following (slightly redacted) up a little while ago for another company looking at consumer-level ray tracing hardware as it relates to games. I do think workstation applications are the correct entry point for ray tracing acceleration, rather than games, so the same level of pessimism might not be appropriate. I have no details on Imagination’s particular technology (feel free to send me some, guys!).

------------

The primary advantages of ray tracing over rasterization are:

Accurate shadows, without explicit sizing of shadow buffer resolutions or massive stencil volume overdraw. With reasonable area light source bundles for softening, this is the most useful and attainable near-term goal.

Accurate reflections without environment maps or subview rendering. This benefit is tempered by the fact that it is only practical at real time speeds for mirror-like surfaces. Slightly glossy surfaces require a bare minimum of 16 secondary rays to look decent, and even mirror surfaces alias badly in larger scenes with bump mapping. Rasterization approximations are inaccurate, but mip map based filtering greatly reduces aliasing, which is usually more important. I was very disappointed when this sunk in for me during my research – I had thought that there might be a place for a high end “ray traced reflections” option in upcoming games, but it requires a huge number of rays for it to actually be a positive feature.

Some other “advantages” that are often touted for ray tracing are not really benefits:

Accurate refraction. This won’t make a difference to anyone building an application.

Global illumination. This requires BILLIONS of rays per second to approach usability. Trying to do it with a handful of tests per pixel just results in a noisy mess.

Because ray tracing involves a log2 scale of the number of primitives, while rasterization is linear, it appears that highly complex scenes will render faster with ray tracing, but it turns out that the constant factors are so different that no dataset that fits in memory actually crosses the time order threshold.
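A quick back-of-the-envelope calculation shows the shape of this argument. The constants below are invented for illustration (ours, not the commenter's; real ratios differ), but with each ray-tracing step priced a couple of hundred times above a rasterized primitive, the curves don't cross until roughly 10^10 primitives, i.e. hundreds of gigabytes of geometry:

```python
import math

# Invented constants: cost per ray-traversal step vs. per rasterized primitive.
# Real ratios differ, but the shape of the result is the point.
COST_RT_STEP = 200.0          # a BVH step touches memory incoherently
COST_RASTER = 1.0             # rasterization streams primitives coherently
RAYS_PER_FRAME = 1920 * 1080  # one primary ray per pixel

def raytrace_cost(n_primitives):
    return RAYS_PER_FRAME * COST_RT_STEP * math.log2(n_primitives)

def raster_cost(n_primitives):
    return COST_RASTER * n_primitives

for exponent in range(6, 16):      # 10^6 .. 10^15 primitives
    n = 10 ** exponent
    print(f"n = 10^{exponent}: ray tracing {raytrace_cost(n):.2e}, "
          f"rasterization {raster_cost(n):.2e}")
```

Similar arithmetic backs the global illumination point above: 1920×1080 pixels at 30 frames per second is already roughly 62 million primary rays per second before a single bounce, so even 16 diffuse samples per hit with a second bounce pushes the budget into the billions.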

Classic Whitted ray tracing is significantly inferior to modern rasterization engines for the vast majority of scenes that people care about. Only when two orders of magnitude more rays are cast to provide soft shadows, glossy reflections, and global illumination does the quality commonly associated with “ray tracing” become apparent. For example, all surfaces that are shaded with interpolated normal will have an unnatural shadow discontinuity at the silhouette edges with single shadow ray traces. This is most noticeable on animating characters, but also visible on things like pipes. A typical solution if the shadows can’t be filtered better is to make the characters “no self shadow” with additional flags in the datasets. There are lots of things like this that require little tweaks in places that won’t be very accessible with the proposed architecture.

The huge disadvantage is the requirement to maintain acceleration structures, which are costly to create and more than double the memory footprint. The tradeoffs that get made for faster build time can have significant costs in the delivered ray tracing time versus fully optimized acceleration structures. For any game that is not grossly GPU bound, a ray tracing chip will be a decelerator, due to the additional cost of maintaining dynamic accelerator structures.

Rasterization is a tiny part of the work that a GPU does. The texture sampling, shader program invocation, blending, etc, would all have to be duplicated on a ray tracing part as well. Primary ray tracing can give an overdraw factor of 1.0, but hierarchical depth buffers in rasterization based systems already deliver very good overdraw rejection in modern game engines. Contrary to some popular beliefs, most of the rendering work is not done to be “realistic”, but to be artistic or stylish.

I am 90% sure that the eventual path to integration of ray tracing hardware into consumer devices will be as minor tweaks to the existing GPU microarchitectures.

73 Reader Comments

PowerVR ... wasn't that the product so weak that it couldn't stand up against NVIDIA and AMD graphics cards ? And therefore left the PC graphics card market.

Regarding specs : The circuit board looks like a late 90s design but hardly something from 2013. 90nm design .... ahhh. I guess this company is still 5 to 10 years behind what is cutting edge technology.


PowerVR is doing a nice job in mobile though (this time it's Nvidia and AMD who are playing catch-up).

And regarding the manufacturing process: IT'S EASIER AND CHEAPER. For a piece of hardware already priced at $2,500, that's a big difference.

It is good to note that while many people can easily get caught up with the "ray-tracing" buzzword, it isn't necessarily the be-all and end-all of computer graphics. It's certainly an elegant solution and you get a lot of stuff for free (reflections and shadows, for instance). However, it's not inherently better than rasterization.

Yeah, so far I haven't really seen anything that makes me want ray tracing more than rasterization. Ray tracing is fun and a very elegant way of rendering, but it's not very efficient. There are edge cases though, such as when your polygons become smaller than one pixel in size. Carmack talked about this (and he alludes to id Tech 6 having properties similar to the Euclideon Unlimited Detail engine which was linked previously) in an old interview with PCPer back in 2008 (http://www.pcper.com/reviews/Graphics-C ... re?aid=532). In that interview he talks about Larrabee which, at that time, was still set to be a GPU. But basically, given a set number of transistors it's going to be more efficient to use them for rasterization than for ray tracing. Although you may need to use a lot of tricks to get the same results.

Regarding this specific ray-tracing card I have to say that the nVidia demo seemed a lot more impressive to my layman eyes (I have implemented a simple ray-tracer as a school project, but it was fairly basic):

...you don't want to do all the work of following a photon only to find it never goes to the screen for display!

But that's actually what we want, right? Because even if it doesn't directly hit the immediate view, it will likely impact hidden parts that in turn change the way the scene is lit by diffuse light bounces - in turn actually affecting what we see on the display a great deal.

That's why basic ray tracing produces such mediocre results: it doesn't take diffuse bounces into account (radiosity, global illumination, whatever you call it), and photon mapping is one way of attacking this problem to produce more realistic results.

Any solution that cannot do at least a few diffuse bounces from each simulated light photon (with a decent interpolation at that) is truly only useful for clinical CAD drawings, not for rendering any realistic scenery.

I really wish somebody would make a high-spec add in card for PC gaming (etc.) based on PowerVR.

Or even just a decent Virtu / Synergy implementation with any 3D architecture.

These days, we almost all have IGPs, all of which are good enough for most desktop uses. I don't need a complete replacement / duplication of its functionality. I just want an add-in card that will boost rendering and transfer the buffer back for display on the IGP - much like the original PowerVR cards.

But it needs to be a good implementation - it needs to shut down the add-in card when it's not in use. Especially if it has active cooling.

While they're not giving any concrete details about their proprietary algorithm, it's clearly highly memory-intensive. That's going to be the sticking-point when it comes to migration - the custom processor doesn't seem to be anything special, but bumping mobile GPUs up to 4-8GB of RAM is going to be hard.

Well, in principle, you can stick however much VRAM you want in a GPU. It's just going to cost more.

Hmm, that's quite a strong value of 'in principle' - VRAM generally isn't designed for multiple devices per channel, and 150 pins on the BGA per channel plus a set of 64 RAM-pin-controller blocks on the silicon gets expensive fast.

I don't quite understand what their custom processor with slow RAM (unless they've actually built sixteen channels of DDR2, which I think is impractical particularly in 90nm) is offering over a Xeon E5-2670 with much higher clock rate, vast caches in comparison to what can be done in 90nm, and potentially much larger amounts of likely significantly faster RAM.

I also fail to see the ground-breaking news in that demo. VRay RT has been able to do that for years: http://www.youtube.com/watch?v=6ZmuI2xQp2M This is a renderer running on GPU and is widely used in the feature, commercial, and architectural industries.

Those cards are dead before they were ever printed.

Hardly. Just from looking at that demo, I can easily see offhand a number of tricks they're using to get quick updates, which are incredibly rough to say the least. The real render still takes a few seconds to complete. I've seen other similar renderers several years ago too. They may be fast, but not really "real-time".

And neither is this one. Still lots of noise that takes a millisecond to disappear. From the viewpoint of a working artist, the difference doesn't matter at all. You can easily live with a little noise while you are tweaking shaders/light/etc. If they had demo'ed a complex scene with the same performance, I would have been impressed though.

The cutting edge secret sauce of the R2500 isn't in its raw power, then, but the patented algorithms that Imagination and Caustic are using to solve the real-time ray tracing problem.

Then why isn't Imagination Technologies a software company developing software for stock database server/workstation hardware? Surely then, their software would work much faster and more economically? Why then do they present this as a hardware+software proposition?

sidran32 wrote:

I still contend, though, that it is inherently different than patenting the algorithms themselves. From my understanding, patents will cover the full picture. Here, they have the algorithm but also the fact that it's implemented in hardware. When you're dealing with PC software, you have a general instruction set, which the computer operates on in a well-defined manner to produce some outcome or behavior. The hardware itself, however, while employing mathematical principles, is a machine, and a very specific implementation of one. It doesn't sound like they're patenting the idea of hardware-accelerated ray-tracing, but their particular implementation of one... The thing is that no one has really implemented ray tracing in hardware in this fashion. It's been programmed, yes, but not baked into silicon itself. Digital design is its own beast, and they do truly have a novel system here that at least merits patentability for now... It's more than mathematical algorithms. I can't stress that enough.

Have you never heard of hardware/software equivalence? I'm guessing you don't have a degree in computer science, and that you've never implemented any custom electronic hardware, at least on any significant scale; because if you had either of these experiences, you would know that any algorithm (or, computer program for a von Neumann/ Turing machine) can be compiled into a piece of calculation hardware (an ASIC design or FPGA netlist), and vice-versa! This equivalence (long-since theoretically proven) is the subject of active current practical research that could easily disrupt this project at Imagination Technologies if they expose themselves to the hardware market in this way.

15-20 years ago, I designed in rough outline an algorithm that would do real-time realistic 3D graphics using hierarchies of shapes (perhaps not doing true ray-tracing, but something like it that would be MUCH faster and almost as good quality). I wonder if they're doing something similar? If so, Imagination's lawyers might like to note that I published my work at the time, among a small group of friends; one of whom later became a 3D games physicist/ graphics engineer. A LOT of work has been done in this area over the first 60 years of electronic computing. Imagination Technologies' "patented algorithms" may well not be as original as they seem to think, and if their hardware is second-rate as well, their business model may be especially vulnerable to nasty surprises.

At the price these things are being offered at, I think the market penetration potential for this product will be about the same as the Matrox Millennium right now; with or without the "patented algorithm" secret sauce... More worryingly still, Imagination Technologies seems determined to pursue a similar technical development/ product differentiation strategy to that run by Matrox in the early 1990s against similarly dynamic competition. I predict that the end results of this project will be similar. At 90nm, in other words with (90nm/22nm)²=16× less feature density and less hardware speed than some cheaper competitors, I struggle to imagine how Imagination Technologies' ASICs (even if highly optimised) can have any long-term technical advantage over a competitor's more flexible, customisable, future-proof and ultimately more economical 22nm FPGA-based or GPU-based solution.

Andrew Cunningham wrote:

By 2018, it's easy to imagine a tablet that can also do high-end CAD work...

If the CAD market stands still between now and 2018 (so that CAD calculations remain at the same level of requirements for complexity and quality, despite improving workstation hardware); then perhaps you will be right about this.

90nm, holy crap. That was what, late Pentium 4 era? I would think if they shrank this down to 28nm it would be so power frugal that you could fit it in a lot of places, but then the limitation becomes the RAM: that 8GB has to go somewhere, and on-die RAM isn't that large yet. Perhaps in another NAND fabrication process shrink, as some SoCs already house up to 2GB within themselves. But the chips themselves, they could already be very very small and low power.

And certainly if you shrank it all down and put it on a GPU sized board like this, you could add much more power to it as well.

PowerVR ... wasn't that the product so weak that it couldn't stand up against NVIDIA and AMD graphics cards ? And therefore left the PC graphics card market.

Regarding specs : The circuit board looks like a late 90s design but hardly something from 2013. 90nm design .... ahhh. I guess this company is still 5 to 10 years behind what is cutting edge technology.

They were weak in raw power, but got roughly the same FPS. PowerVR has always been about efficiency, but ATI/nVidia were more popular and had more mainstream support.

I could see PowerVR as the "no-name" company that suddenly blind-sides the current GPU industry.

So if I'm understanding how ray tracing works, it could be used by other disciplines besides 3d modeling. For example if you change "how a photon reacts with a surface" to "how an electrical impulse emitted from an X- or S-band radar interacts with a surface" you have a tool that would be very useful to military aircraft designers.

Am I onto something or on something?

I can't say anything about your ideas outside of normal graphics, but one of the large benefits of ray tracing is how well it scales with the resolution of geometry.

Current raster-based rendering has horrible scaling with geometry, and it was said a few years back that ray tracing would break even in performance with rasterizing in only a few years' time, and that it would take only 1-3 years for ray tracing to become clearly dominant for performance.

It was stated that any graphics company that is not ready for the transition will be effectively destroyed because the performance gap will widen insanely fast, and any that are poised to have effective ray tracing stand to gain a lot of market share.

The feeling given was that the switch to ray-tracing will be quick and violent because it will be so much better but will be so different from what the industry is used to.

Have you never heard of hardware/software equivalence? I'm guessing you don't have a degree in computer science, and that you've never implemented any custom electronic hardware, at least on any significant scale; because if you had either of these experiences, you would know that any algorithm (or, computer program for a von Neumann/ Turing machine) can be compiled into a piece of calculation hardware (an ASIC design or FPGA netlist), and vice-versa! This equivalence (long-since theoretically proven) is the subject of active current practical research that could easily disrupt this project at Imagination Technologies if they expose themselves to the hardware market in this way.

sigh

I'm at the point of repeating myself, here.

I'm a CS major. I'm a DV engineer. I have worked directly on these sorts of products. I am fully aware of how these things come together.

I'm talking about the engineering side of things.

I hate software patents as much as the next guy, because you're patenting the process of doing math. It doesn't fulfill the machine-or-transformation test. Programming a computer isn't "building a new machine", as proponents like to say.

However, when it's baked into hardware, they *are* building a new machine. Specialized hardware like this card (or a GPU) is specific and designed to do one thing and one thing only. It uses math, but that's hardly at issue, here. It can be simulated in a computer program, but no one cares, because it's irrelevant. We could simulate lots of patentable stuff in software and it makes little difference, because it's a simulation.

You can't patent math. But you can patent an invention that uses math to achieve some result.

Now, I will note that since all this blew up, I did go back and reread the article and noted that it says:

Quote:

our patented algorithms

which is very ambiguous. So maybe they did patent just the algorithm and not the device? I doubt that, but if that's the case, then sure, I'll complain alongside you.

However, if what they patented was their add-in card, here, then I have no qualms.

...You can't patent math. But you can patent an invention that uses math to achieve some result...

Are you American? US patent laws don't apply worldwide (thankfully, since America has a corrupted patent system which now rewards "first to publish", approves excessive numbers of low-quality speculative patent applications while patent examiners get rewarded for "closing" application cases quickly, and bullies other countries into "normalising" their patent laws toward the one-sided American system; while the entire language of the American system uses the terms "invention" and "inventor" so broadly that those terms lose all meaning).

In most software patents (especially those published in Europe), "machines" are just a linguistic device used to fool the patent examiners into thinking that the algorithm (comprising pure mathematics, in some cases, to varying degrees) should be patentable. A "machine" is patented (which just happens to be, the description of a hardware implementation of what was originally described as software by the actual engineers). And then when some 3rd party produces a similar computer algorithm (purely to run on some generic computation hardware) or even an entirely different computer program that does a similar job or "process"; the patent holder claims their patent on the "machine" has been infringed upon, and exercises that patent! They then present the 3rd party's software implementation as a "bad faith" effort to evade the patent holder's hardware patents.

So the lawyers have shown us that they know (as well as most computer scientists know) that there is no real distinction between computer software and calculation machinery. This distinction is entirely artificial for them, and software "inventions" (or sometimes mere implementations of mathematics that was discovered by someone else) are regularly transposed into patent legalese describing equivalent "machinery". The computer scientists have found ways to make the two (software/hardware based calculation) interchangeable and equivalent (and are increasingly doing this in practice on real-world cutting-edge hardware), and the lawyers are not now respecting the difference: the process of equivalence is being industrialised by engineers and lawyers alike. Therefore, anyone who tries to use this distinction to discriminate between software (which should be unpatentable, you suggest) and "machinery" (which should be patentable, you suggest); is barking up the wrong tree. This method is not going to help you end the software patent debate.

Any suggestion that the development costs (the particular costs which patents are supposed to help the "inventor" to recover) are the moral basis for making calculation hardware patentable but making software unpatentable is wrong, because all the developers of calculation hardware must do is to compile their "patented algorithm" into a hardware design (using an automated and tooled-up process), manually optimise the design (similarly to how one might optimise software for a particular platform), and send the compiled design to fabrication partners (the process is then already industrialised). Formal verification of computer software and hardware are basically one and the same thing. The design process for this kind of hardware is therefore equivalent to the design process for creating good quality, reliable software; so there is no moral basis for separating the two (application-specific hardware implementations of algorithms, or, algorithms for general-purpose computing hardware).

As we all agree that software patents are bad; where are the true outer limits of the domain of "invention"?

May I suggest that the outer borders of "invention" are precisely where they meet the borders of "discovery". Whether an "invention" is implemented or described as mathematical algorithms, computer software or industrial process machinery; these all amount to the same thing in the field of computing: once an "invention" is mature enough to be described in this way (if it ever can be), the distinction between hardware and software becomes a totally irrelevant question, morally speaking. The important question is whether we are dealing with:

a purely mathematical discovery or the equivalent (in which case, trade secrets might be an appropriate way to reap the rewards of your hard work), or

a truly creative invention (in which case, registered designs, copyrights, trademarks and trade secrets should be used), or

something in between creation and discovery (again, I feel there's no legitimate space here for patents at all, for any invention in any field that is sufficiently mature to be fully described in computer software; but that's my personal feeling.)

The central problem with what you have written is that you have given us no practical basis at all for discriminating between patentable computer machinery and unpatentable computer software, so I think your position is currently indefensible. Let us see if we can improve upon it.

To illustrate how we might implement such a distinction (of the type I am suggesting) in practice; I would suggest a minimal next-step toward fixing the software patent system: let's broaden the grounds for invalidation of a patent. Let's say that if the pure mathematicians and computer scientists can prove that a patented method (or a set of patented methods) is either the best method or the only possible method of correctly solving a given problem; then the relevant patents to those methods should be invalidated, and all relevant lawsuits dismissed from the time when such formal proof was discovered. Let's further say that where the mathematicians & natural scientists can prove that any particular method (or set of methods) is optimal (or universally superior to any other for all practical instances of a problem), then patents on all sub-optimal methods should at that point be invalidated forthwith, so that the market can't be held hostage for inordinate amounts of time by petty "inventors" of trivial "inventions". (As a by-product of releasing the industrial & commercial hostages, an entire industry would be created for big commercial sponsorship of basic mathematical research & publication.)

There are many more steps required before we will come close to fixing this monumental mess, but I'd suggest this would be a good start, and that this method offers us a much more useful and natural distinction than the old "hardware/software" humbug.

~~~

My personal guess is that either Imagination's lawyers told them to integrate this "patented algorithm" into a piece of economically irrelevant hardware to artificially bolster their claims for patentability and strengthen their case in any litigation that might take place in jurisdictions that don't officially support software patents; or perhaps, their marketing people told them that including a piece of hardware in the oversized carton for people to heft would make the oversized price-tag more defensible. Who knows, they may not even intend to produce these devices in economically serious numbers; but the whole exercise might just be a way to strengthen their lawyers' hands with some "real machinery". Based on the specifications of that hardware, are there any other explanations that make sense?

Well, when I say software is math, I don't mean mathematical equations. I mean that it is math. There was an excellent essay on this topic that I wish I could still locate, but you could likely read similar things on groklaw: http://www.groklaw.net/article.php?stor ... 8075658894

The thing is that a computer chip itself is not software. It is a machine, with a specific design and form. Software is an instruction set, and has no specific design or form. Any number of different designs of Turing machines could follow these instructions and achieve the same result. You could also feed another machine with these instructions and get a different result. Feed the raw bits of an x86 assembly program into a MIPS processor and you'll see entirely different behavior. You could follow the instructions using pen/paper and your brain and reach the same result. Of course it might take much longer but given enough time and resources, you could.

To try and tie this into something people colloquially see as "math", think of a scientific calculator. It is a computer that accepts instructions. The instructions would be a list of commands you feed the calculator. This list of commands could be the steps to solve a quadratic equation. These steps would undoubtedly be mathematics, even to someone with a basic understanding of math. But they are the exact same thing in essence as any software program, because they are both sets of instructions fed into the memory of the computer device in order to perform some particular task.
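To put that analogy in concrete form, here is the same "list of commands" written as a short program (a toy example of ours): the quadratic formula, step by step. Nothing in it distinguishes the unpatentable math from the software expressing it:

```python
import math

def solve_quadratic(a, b, c):
    """The same steps you'd key into a calculator, written as instructions."""
    disc = b * b - 4 * a * c
    if disc < 0:
        return None                      # no real roots
    root = math.sqrt(disc)
    return ((-b + root) / (2 * a), (-b - root) / (2 * a))

print(solve_quadratic(1, -3, 2))         # roots of x^2 - 3x + 2: (2.0, 1.0)
```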

Solving a quadratic equation is not patentable, yet software is. This results from the courts' apparent lack of understanding of what computers and software are. It's understandable, but it should be fixed.

A machine, however, is a physical device with form and function. A particular computer that was built could be patentable, because it's not abstract. Because it can be simulated in software doesn't change that fact. I can simulate a hammer in software, but I can still patent the physical hammer itself. I can simulate a slide-rule, but I can still patent one.

That has to be the least impressive demo for the most impressive technology out there. Come on, even showing some chrome balls would have been more impressive than that! What you displayed could have been done without ray tracing at much less power.

I could be getting completely mixed up here, but if I remember correctly you're actually describing photon mapping, not ray tracing. Ray tracing is a step below that: instead of tracing where light goes from the perspective of the light sources and building up a 'map' of what's lit, it starts at the 'eye' and works backwards, working out for each given light whether that light manages to make it to the portion of the view you're currently rendering and, if so, what colour you should end up with.

I think it's worth pointing out that photon mapping and ray tracing are not necessarily competing algorithms. Once you have done a lighting step you can use ray tracing to generate your output image but where you use the photon map to determine the color of the object (the eye rays can still be reflected and refracted though, that way you can get caustic effects like what you see at the bottom of a swimming pool).
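As a toy illustration of that combination (a sketch under simplifying assumptions, not production code): treat the photon map as a bag of stored light hits, and during the eye-ray pass estimate brightness at a hit point from the density of nearby photons. Real renderers use a k-d tree for the nearest-neighbor lookup; this brute-force version just shows the math:

```python
import math, random

# A "photon map": positions where forward-traced light rays landed,
# each carrying some power. Here they're random stand-ins.
photons = [([random.uniform(0, 10) for _ in range(3)], 0.01)
           for _ in range(5000)]

def radiance_estimate(hit_point, k=50):
    """Average the k nearest photons over the disc they cover."""
    def dist2(p):
        return sum((a - b) ** 2 for a, b in zip(p[0], hit_point))
    nearest = sorted(photons, key=dist2)[:k]       # k-d tree in real code
    radius2 = dist2(nearest[-1])                   # squared gather radius
    power = sum(p for _, p in nearest)
    return power / (math.pi * radius2)             # power per unit area

print(radiance_estimate([5.0, 5.0, 5.0]))
```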

The basics of ray tracing are quite simple and a fun programming project to try. The trick is to get the speed up.

I can't say anything about your ideas outside of normal graphics, but one of the large benefits of ray tracing is how well it scales with the resolution of geometry.

Current raster-based rendering has horrible scaling with geometry, and it was said a few years back that ray tracing would break even in performance with rasterizing in only a few years' time, and that it would take only 1-3 years for ray tracing to become clearly dominant for performance.

...

The feeling given was that the switch to ray-tracing will be quick and violent because it will be so much better but will be so different from what the industry is used to.

I think the doom of rasterizers is a bit hyped to be honest. Real time ray tracing is one of those things that have been "a few years away" for a long time now.

Carmack has commented quite a lot on this for the past few years. E.g. in the previous PCPer interview from 2008 I linked. He also talks about it every now and then in his QuakeCon keynotes (which are available on YouTube). One of the things he's mentioned is that we often see "tech demos" that show some special trick in a set environment, but going from that to a real game engine is a pretty big step. IIRC he mentions that if you take a normal modern game and remove everything that is not rendering graphics then you can push the rendering to hundreds or 1000 FPS. So even if you can demonstrate a ray tracer that can run at 30 FPS that's quite far from what's actually needed. (Naturally all tech demos start at this stage, and you can bring them up to be optimized enough or wait for technology to get fast enough to run it well.)

He also mentions that he would like to make a hybrid rendering engine which uses rasterization for some parts and ray tracing for some parts. One problem with many tech demos is that they lack the assets for creating a good demo. Basically if your assets are made for a rasterizer they will probably not have the elements which would look a lot better in a ray tracer. But he hasn't talked all that much about future engines in the last few years so who knows what's happening with that. :-)


I don't have a lot of man-crushes and it's pretty much just you and David Bowie. Also, Axl Rose but deal with it. Also something-something about the post but I loved Rage despite the haters. Also, Anderson Cooper.

...you don't want to do all the work of following a photon only to find it never goes to the screen for display!

But that's actually what we want, right? Because even if it doesn't directly hit the immediate view, it will likely impact hidden parts that in turn change the way the scene is lit by diffuse light bounces - in turn actually affecting what we see on the display a great deal.

That's why basic ray tracing produces such mediocre results: it doesn't take diffuse bounces into account (radiosity, global illumination, whatever you call it), and photon mapping is one way of attacking this problem to produce more realistic results.

Any solution that cannot do at least a few diffuse bounces from each simulated light photon (with a decent interpolation at that) is truly only useful for clinical CAD drawings, not for rendering any realistic scenery.

No. Diffuse light bounces still make it to your eye. Everything you see is because a photon made it to your eye.

Because ray tracing involves a log2 scale of the number of primitives, while rasterization is linear, it appears that highly complex scenes will render faster with ray tracing, but it turns out that the constant factors are so different that no dataset that fits in memory actually crosses the time order threshold.

Could you even begin to ballpark the amount of memory necessary for that tradeoff to start making sense?

I'm coming from an HPC background (solving I/O bottlenecks) so I'd be interested in any kind of scope you'd have, even scratched from the back of a napkin.

I've been working for film studios for the last 13 years and in that time, we've used a lot of high quality ray-tracers. My opinions pretty much mirror yours. We've got a long way to go before we even see a practical ray-tracing solution in hardware (with any good results).

I was a bit surprised that you didn't mention importance sampling when using direct lighting to get rid of noise. Not only do you need to sample the area lights, but you also need to sample the BRDF, PDF, and create a sampling function in order for the integration to converge faster. This is something that a lot of film studios are adopting because there's been this mass interest in going back to RT for film. Things get even more ugly when you want to implement GI on fur. Ouch..
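For readers who haven't met the term: importance sampling means drawing samples from a distribution shaped like the integrand, so the Monte Carlo estimate converges with far less noise for the same ray count. A toy one-dimensional comparison (ours, unrelated to any studio's code):

```python
import math, random, statistics

N = 20000
# Integrate f(x) = cos^2(x) over [0, pi/2]; the true value is pi/4 ~= 0.785.

# Uniform sampling: pdf = 2/pi, so the estimator is f(x) * (pi/2).
uni = [math.cos(random.uniform(0, math.pi / 2)) ** 2 * (math.pi / 2)
       for _ in range(N)]

# Importance sampling with pdf(x) = cos(x) (inverse CDF: x = asin(u)),
# chosen because it resembles the integrand: estimator f(x)/pdf(x) = cos(x).
imp = [math.cos(math.asin(random.random())) for _ in range(N)]

for name, s in (("uniform", uni), ("importance", imp)):
    print(f"{name}: estimate {statistics.fmean(s):.4f}, "
          f"per-sample std {statistics.stdev(s):.4f}")
```

Both estimators land on the same answer, but the importance-sampled one gets there with visibly lower per-sample variance, which is the whole point when every sample is an expensive ray.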

Fake demo? Has anyone else noticed that the demo machine appears to be an iMac, which of course has no PCI-E expansion slots, and in the latest version is glued shut. The only way to use an R2500 with an iMac is to use a Thunderbolt PCI chassis, which seems unlikely, or to use the iMac as a Thunderbolt display for another machine, which also seems unlikely.

Hi, I've been demoing and testing Visualizer for Imagination/Caustic for the last few months so I can provide some concrete (but unofficial) answers to a number of user comments.

SCdF wrote:

Quote:

A ray tracing algorithm will track rays of light from a light source to an object. Once the light hits that object, the algorithm can account for how much light will be absorbed by the surface, how much will be reflected or refracted by the surface, and how that reflected and refracted light interacts with other surfaces, among other things.

I could be getting completely mixed up here, but if I remember correctly you're actually describing photon mapping, not ray tracing. Ray tracing is a step below that: instead of tracing where light goes from the perspective of the light sources and building up a 'map' of what's lit, it starts at the 'eye' and works backwards, working out for each given light whether that light manages to make it to the portion of the view you're currently rendering and, if so, what colour you should end up with.

also in response to a couple other since these are all tied in together...

drojf wrote:

For some idea of what is possible in terms of pathtracing (not raytracing)..

Gary Patterson wrote:

Ray tracing starts at the eye..

Chuckstar wrote:

Path tracing is much more computationally intensive than ray tracing.

'Ray Tracing' is all that the name implies: rays traveling through a scene and intersecting with objects. Photon mapping is a form of 'forward raytracing' followed by 'backward raytracing'. Forward raytracing is where you trace rays from the light and then their paths throughout the scene. This is very inefficient for single frames or dynamic scenes since the photons would have to strike the virtual 'image sensor' which is a very very tiny area of the scene. It's very efficient though in that you can calculate all of the scene once and then render multiple frames and viewpoints after the expensive pre-calculation when combined with a camera driven sample of the forward traced lighting in the scene. Backwards raytracing is typically used by path tracers and is efficient for interactive situations since you're only calculating the light as seen by the camera and not an entire scene including areas out of view. The Caustic card simply is a raytracing accelerator, and how you use its raytracing is up to the application developer--it doesn't impose one way or the other. It doesn't however support KNN lookup yet which means it doesn't yet accelerate photon map shading.
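A toy simulation makes the forward-tracing inefficiency concrete: fire a million photons from a point light in uniformly random directions and count how many happen to pass through a small sensor. The geometry below is made up, but the ratio is the point:

```python
import math, random

# Fire photons from a point light in random directions and count how many
# happen to pass through a small "image sensor" -- the reason forward
# ray tracing wastes almost all of its work on a single frame.
SENSOR_AREA = 0.01 * 0.01                 # a 1 cm^2 sensor...
DISTANCE = 2.0                            # ...two meters from the light

# For a uniform random direction, the chance of hitting the sensor is just
# its solid angle over the full sphere, so we can sample that directly
# instead of intersecting each ray with the sensor geometry.
p_hit = SENSOR_AREA / (4 * math.pi * DISTANCE ** 2)

N = 1_000_000
hits = sum(1 for _ in range(N) if random.random() < p_hit)
print(f"{hits} of {N:,} photons reached the sensor")
```

On a typical run only a photon or two out of a million lands on the sensor, which is why forward tracing only pays off when the result is cached and reused across many frames.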

As to pathtracing vs raytracing: pathtracing is a form of raytracing; it's just a specific backward raytracing sampling method. Many "biased" rendering algorithms use multiple samples. So you shoot a ray out into the scene, it intersects an object and then sends out a number of sample rays for shadowing and global illumination, and then sends out yet more depending on the noisiness of the returned results. Each of those global illumination samples also emits many more sample rays and pretty soon you have an exponentially large number of samples all from one camera ray. On a GPU you just don't have sophisticated enough programmability (or enough memory per core) yet to handle tracking which ray belongs where. It's also very slow for feedback to the artist. You don't necessarily need to know exactly what a single perfect pixel is going to look like, you want a rough overall view of the entire frame. So path tracing works a little differently: it sends a single ray, which then emits a single ray, which then emits a single ray. It's very inefficient but it's also incredibly simple which makes it easy to implement on a GPU. Path tracing on a GPU is doing a very inefficient thing very very quickly thanks to the thousands of cores. It also reduces the time from when the artist makes a change to when an entire frame is displayed which is good for refining work. The downside is that you might need to fire way more rays than you would with a traditional biased renderer. Another advantage of a more brute-force path tracing approach is that you have very few settings to change. That means you don't have to know what hundreds of spinners and settings in a renderer do to optimize. The disadvantage is that you don't have any spinners to potentially optimize your render performance if you know what you're doing.
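The control-flow difference described above is easier to see in code than prose. In this sketch (ours, with the actual scene and shading stubbed out), the branched estimator fans out exponentially with depth while the path tracer follows one ray per bounce and averages many cheap paths instead:

```python
import random

MAX_DEPTH = 3

def sample_bounce():
    """Stub: pretend to trace one ray and return the light it found."""
    return random.random()

def branched(depth=0, samples=16):
    """Biased/branched sampling: every hit spawns `samples` new rays,
    so the ray count grows as samples ** depth."""
    if depth == MAX_DEPTH:
        return 0.0
    total = 0.0
    for _ in range(samples):
        total += sample_bounce() + branched(depth + 1, samples)
    return total / samples

def path_trace(depth=0):
    """Path tracing: one ray per bounce, so one camera ray costs only
    MAX_DEPTH traces -- inefficient per sample, trivial to parallelize."""
    if depth == MAX_DEPTH:
        return 0.0
    return sample_bounce() + path_trace(depth + 1)

# One branched camera ray does 16 + 16*16 + 16*16*16 traces; one path does 3.
# A path tracer makes up the difference by averaging many cheap paths:
estimate = sum(path_trace() for _ in range(4096)) / 4096
print(estimate)
```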

The Caustic card supports both approaches: biased rendering and path traced rendering (you just set the ray samples to 1 across the board). You can use it as a pathtracer for very fast feedback, or you can rely on its adaptive sampling and biased rendering for quicker final rendering if you want to dig into the settings.

Doerge wrote:

Apparently they have a novel ray-tracing algorithm that uses some form of database-like search structure (just like this one: http://www.youtube.com/watch?feature=pl ... KUuUvDSXk4) but it sounds like you would have to put your entire data structure into their database format to make it work.

I also fail to see the ground-breaking news in that demo. VRay RT has been able to do that for years. This is a renderer running on GPU and is widely used in the feature, commercial, and architectural industries.

The YouTube link is memory intensive for different reasons. It's an octree voxel renderer (which uses raytracing to sample the voxels). Except for being raytracers they're very dissimilar. The "database" to which they refer is for managing and bundling rays. In raytracing you have lots of rays. For an image without any reflections it is very easy to keep track of which ray came from where. But once you have something like global illumination you can have one ray spawn 20 global illumination rays all traveling in different directions and sampling completely different areas of the room. This makes a raytracer very inefficient since each of those 20 rays could be sampling a different material. You have to load that material into memory, you have to sample it and return the results to the original object as a global illumination sample. Then you have to do that 20 more times. And 20 more times for each of those first 20 times in the case of second bounces of the light ray. The "database approach" to which they refer (and the 'selling point') is that lots of these 'incoherent' rays when viewed as a whole (as opposed to as single pixel sources) can be packaged together so that 100 samples of, say, a red wall can all be processed as a batch. GPUs aren't designed to do that so they just brute force it and shoot one ray at a time completely independently of one another. That works really well for a GPU since they have thousands of cores and can therefore shoot thousands of independent rays. The Caustic card bundles up huge packets of rays and hands them over to the CPU so that your 4 CPU cores can do the work of 1,000 GPU cores. The advantage of using the CPU (or the way they can package rays for a GPU hypothetically) is that you don't have to use the GPU memory. A system with 32GB of system memory can be had for a couple hundred dollars (plus a couple hundred dollars for a Caustic card). A system with just 6GB of GPU memory (per GPU; some have 2x GPUs with 3GB of memory, which still limits you to 3GB) will cost you about $6,000.
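Imagination hasn't published its bundling scheme, but the general idea of batching incoherent rays can be sketched in a few lines: bucket rays by the object (and hence material) they hit, then shade each bucket in one coherent pass so the expensive material fetch happens once per batch instead of once per ray. The names and costs here are made up:

```python
from collections import defaultdict

# Each secondary ray has already been intersected, so we know what it hit.
# (object_id, hit_data) pairs arriving in scattered, incoherent order:
rays = [("red_wall", 0.7), ("chrome_ball", 0.2), ("red_wall", 0.9),
        ("floor", 0.4), ("red_wall", 0.1), ("chrome_ball", 0.8)]

def load_material(object_id):
    """Stub for the expensive, cache-hostile step of fetching material data."""
    return {"red_wall": 1.0, "chrome_ball": 2.0, "floor": 0.5}[object_id]

def shade_batch(object_id, hits):
    """Load the material once, then shade every ray that touched it."""
    material = load_material(object_id)
    return [material * h for h in hits]

# Bucket rays by what they hit, then shade one coherent batch per material.
buckets = defaultdict(list)
for object_id, hit in rays:
    buckets[object_id].append(hit)
for object_id, hits in buckets.items():
    print(object_id, shade_batch(object_id, hits))
```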

VRay RT is an example of a renderer which runs very well on a GPU. But if you have more than 2GB of textures you can't load your scene. As an artist that means I can usually only open one production model in VRayRT GPU at a time (The CPU implementation of VrayRT doesn't have this limitation but it also isn't nearly as fast as the GPU VrayRT or the Caustic accelerated renderer on the CPU). Another big selling point of using the CPU for shading is that you aren't limited in how complex your shaders can be. VrayRT on the GPU has very limited shading options. iRay, another similar GPU renderer also doesn't support very complex arbitrary shaders. Neither support user programmable shaders on their GPU renderers. The Caustic technology is far more flexible.

Here's a demo of one semi-modern Nvidia card running at 30+ fps, with refractions and reflections and no noise, unlike the demo. At this point the Nvidia card would cost less than half the cost of even the low-end Caustic card.

The big difference, by the way, between a "workstation" card and the average GPU a gamer might have is software. Nvidia and AMD just disable CAD acceleration on the normal cards and then charge three times as much for the "workstation" cards, mostly because they can.

In that YouTube video they are using a hard light source and very very simple shading. It's not really an apples to apples comparison. A GPU is very efficient at very basic raytracing. The Caustic card becomes advantageous with very large models (e.g. 120,000,000 polygons on the R2500) and very complex lighting/shading models. For instance the car shader in the video has several reflection layers and complex paint flecks interacting with multiple diffuse light sources and global illumination. Similarly trying to process procedural textures and shaders on a GPU would drop performance well below 30fps. It's kind of like comparing Battlefield 3 to Quake 2 and saying that Quake 2 runs at 100 FPS on the CPU while Battlefield 3 only runs at 30 FPS on the GPU.

Workstation cards do offer some advantages over gaming GPUs these days. For one they have better cooling. I know a lot of people who have burned out their GTX 580s from raytracing for several hours on end. Generally games run a GPU at variable usage. A raytracer though will peg a GPU at 100% solid for hours on end. Also some workstation GPUs have additional memory. You'll be hard pressed to find a gaming card with more than 3GB of memory. You can get a Quadro though with 6GB for $6k. Lastly they also sometimes have specialty outputs such as SDI for color accurate reference displays.

davida303 wrote:

Fake demo? Has anyone else noticed that the demo machine appears to be an iMac, which of course has no PCI-E expansion slots..

Andrew, since you were there, perhaps you can confirm the demo setup.

Demo setup was a high-end PC workstation with 16GB of RAM. The "iMac" in the photo is just an Apple display.

Excellent post, clearing up the forward/backward thing in particular was quite informative.

I've always sort of thought the best approach for RTRT (and a variety of other things GPUs aren't good at) would be to axe dedicated hardware, and switch over to large numbers of simple, SIMD-focussed CPUs (e.g. Cell, Larrabee, TILE64…) in a multiprocessor arrangement.

Andrew Cunningham / Andrew has a B.A. in Classics from Kenyon College and has over five years of experience in IT. His work has appeared on Charge Shot!!! and AnandTech, and he records a weekly book podcast called Overdue.