September 27, 2012

7 Problems Raytracing Doesn't Solve

I see a lot of people get excited about extreme concurrency in modern hardware bringing us closer to the magical holy grail of raytracing. It seems that everyone thinks that once we have raytracing, we can fully simulate entire digital worlds, everything will be photorealistic, and graphics will become a "solved problem". This simply isn't true, and in fact highlights several fundamental misconceptions about the problems faced by modern games and other interactive media.

For those unfamiliar with the term, raytracing is the process of rendering a 3D scene by tracing the path of a beam of light after it is emitted from a light source, calculating its properties as it bounces off various objects in the world until it finally hits the virtual camera. At least, you hope it hits the camera. You see, to be perfectly accurate, you have to cast a bajillion rays of light out from the light sources and then see which ones end up hitting the camera at some point. This is obviously a problem, because most of the rays don't actually hit the camera, and are simply wasted. Because this brute force method is so incredibly inefficient, many complex algorithms (such as photon-mapping and Metropolis light transport) have been developed to yield approximations that are thousands of times more efficient. These techniques are almost always focused on attempting to find paths from the light source to the camera, so rays can be cast in the reverse direction. Some early approximations actually cast rays out from the camera until they hit an object, then calculated the lighting information from the distance and angle, disregarding other objects in the scene. While highly efficient, this method produced extremely inaccurate results.
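To make the camera-first shortcut concrete, here is a minimal sketch of that early approach: one ray per pixel is cast from the camera, and the only question asked is whether it hits anything. The scene (a single sphere), the helper names `ray_sphere_hit`, `normalize`, and `render`, and the tiny ASCII "image" are all assumptions made for illustration, not any particular renderer's code. There are no bounces, shadows, or materials here, which is exactly why this kind of approximation is fast and inaccurate.

```python
import math

def normalize(v):
    """Scale a 3D vector to unit length."""
    length = math.sqrt(sum(x * x for x in v))
    return tuple(x / length for x in v)

def ray_sphere_hit(origin, direction, center, radius):
    """Distance along a unit-length ray to the nearest sphere hit, or None.

    Simplification: only the nearer intersection is tested, so a ray that
    starts inside the sphere is reported as a miss.
    """
    # Vector from the sphere center to the ray origin.
    oc = tuple(o - c for o, c in zip(origin, center))
    # Solve |origin + t*direction - center|^2 = radius^2 for t.
    # The direction is unit length, so the quadratic's leading coefficient is 1.
    b = 2.0 * sum(d * v for d, v in zip(direction, oc))
    c = sum(v * v for v in oc) - radius * radius
    disc = b * b - 4.0 * c
    if disc < 0.0:
        return None  # the ray misses the sphere entirely
    t = (-b - math.sqrt(disc)) / 2.0
    return t if t > 0.0 else None

def render(width=16, height=8):
    """Cast one ray per 'pixel' from a camera at the origin looking down -z."""
    camera = (0.0, 0.0, 0.0)
    center, radius = (0.0, 0.0, -3.0), 1.0  # hypothetical one-sphere scene
    rows = []
    for j in range(height):
        row = ""
        for i in range(width):
            # Map the pixel center to a point on an image plane at z = -1.
            x = (i + 0.5) / width * 2.0 - 1.0
            y = 1.0 - (j + 0.5) / height * 2.0
            direction = normalize((x, y, -1.0))
            hit = ray_sphere_hit(camera, direction, center, radius)
            row += "#" if hit is not None else "."
        rows.append(row)
    return rows

if __name__ == "__main__":
    print("\n".join(render()))
```

Everything a real raytracer adds (secondary rays, light sampling, material models) is layered on top of this one intersection test, and that is where the cost comes from.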

It is with a certain irony that raytracing is touted as being a precise, super-accurate rendering method when all raytracing is actually done via approximations in the first place. Pixar uses photon-mapping for its movies. Most raytracers operate on stochastic sampling approximations. We can already do raytracing in realtime if we get approximate enough; it just looks boring and is extremely limited. Graphics development doesn't just stop when someone develops realtime raytracing, because there will always be room for a better approximation.

1. Photorealism

The meaning of photorealism is difficult to pin down, in part because the term is inherently subjective. If you define photorealism as being able to render a virtual scene such that it precisely matches a photo, then it is almost impossible to achieve in any sort of natural environment where the slightest wind can push a tree branch out of alignment.

This quickly gives rise to defining photorealism as rendering a virtual scene such that it is indistinguishable from a photograph of a similar scene, even if they aren't exactly the same. This, however, raises the issue of just how indistinguishable it needs to be. This seems like a bizarre concept, but there are different degrees of "indistinguishable" due to the differences between people's observational capacities. Many people will never notice a slightly misaligned shadow or a reflection that's a tad too bright. For others, they will stand out like sore thumbs and completely destroy their suspension of disbelief.

We have yet another problem in that the entire concept of "photorealism" has nothing to do with how humans see the world in the first place. Photos are inherently linear, while humans experience a much more dynamic, log-based lighting scale. This gives rise to HDR photography, which actually has almost nothing to do with the HDR implemented in games. Games simply change the brightness of the entire scene, instead of combining the brightness of multiple exposures to brighten some areas and darken others in the same photo. If all photos are not created equal, then exactly which photo are we talking about when we say "photorealistic"?

2. Complexity

Raytracing is often cited as allowing an order of magnitude more detail in models by being able to efficiently process many more polygons. This is only sort of true in that raytracing is not subject to the same computational constraints that rasterization is. Rasterization must render every single triangle in the scene, whereas raytracing is only interested in whether or not a ray hits a triangle. Unfortunately, it still has to navigate through the scene representation. Even if a raytracer could handle a scene with a billion polygons efficiently, this raises completely unrelated problems involving RAM access times and cache pollution that suddenly become actual performance bottlenecks instead of micro-optimizations.

In addition, raytracing approximation algorithms almost always take advantage of rays that degrade quickly, such that they can only bounce 10-15 times before becoming irrelevant. This is fine and dandy for walking around in a city or a forest, but what about a kitchen? Even though raytracing is much better at handling reflections accurately, highly reflective materials cripple the raytracer, because now rays are bouncing hundreds of times off a myriad of surfaces instead of just 10. If not handled properly, it can absolutely devastate performance, which is catastrophic for game engines that must maintain constant render times.
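A rough back-of-the-envelope sketch shows why reflective scenes blow past that bounce budget. It assumes a deliberately crude model in which every bounce multiplies a ray's remaining energy by a single reflectivity factor, and the ray is dropped once its energy falls below a cutoff. The function name `bounces_until_negligible`, the cutoff of 0.001, and the reflectivity values are all illustrative assumptions, not numbers from any real renderer.

```python
def bounces_until_negligible(reflectivity, threshold=0.001, max_bounces=10_000):
    """Count bounces until a ray's remaining energy drops below threshold.

    Assumes each bounce keeps a fixed fraction (reflectivity) of the energy.
    """
    energy = 1.0
    bounces = 0
    while energy >= threshold and bounces < max_bounces:
        energy *= reflectivity  # each surface absorbs part of the light
        bounces += 1
    return bounces

if __name__ == "__main__":
    # Dull surfaces (half the energy lost per bounce) versus polished ones.
    for reflectivity in (0.5, 0.9, 0.98):
        print(reflectivity, bounces_until_negligible(reflectivity))
```

Under these assumptions a surface that keeps half the light per bounce needs 10 bounces before the ray stops mattering, while one that keeps 98% needs a few hundred. The same cutoff that makes a forest cheap makes a room full of polished metal very expensive.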

3. Scale

How do you raytrace stars? Do you simply wrap a sphere around the sky and give it a "star" material? Do you make them all point sources infinitely far away? How does this work in a space game, where half the stars you see can actually be visited, and the other half are entire galaxies? How do you accurately simulate an entire solar system down to the surface of a planet, as the Kerbal Space Program developers had to? Trying to figure out how to represent that kind of information in a meaningful form with only 64 bits of precision, if you are lucky, is a problem completely separate from raytracing, yet of increasingly relevant concern as games continue to expand their horizons more and more. How do we simulate an entire galaxy? How can we maintain meaningful precision when faced with astronomical scales, and how does this factor into our rendering pipeline? These are problems that arise in any rendering pipeline, regardless of what techniques it uses, due to fundamental limitations in our representations of numbers.
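The precision problem is easy to put numbers on. The sketch below measures the gap between adjacent representable floating-point values at astronomical distances, assuming positions are stored in meters from a single origin. `math.ulp` needs Python 3.9 or later; `float32_spacing` is a hypothetical helper written for this example, and the two distance constants are approximate.

```python
import math
import struct

AU_METERS = 1.495978707e11     # roughly one astronomical unit
LIGHT_YEAR_METERS = 9.4607e15  # roughly one light-year

def float32_spacing(x):
    """Gap between x (rounded to float32) and the next larger float32.

    Assumes x is positive, finite, and within float32 range.
    """
    # Reinterpret the float32 bit pattern as an unsigned integer.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    current = struct.unpack("<f", struct.pack("<I", bits))[0]
    # Adding 1 to the bit pattern yields the next representable float32.
    next_up = struct.unpack("<f", struct.pack("<I", bits + 1))[0]
    return next_up - current

if __name__ == "__main__":
    for name, d in (("1 AU", AU_METERS), ("1 light-year", LIGHT_YEAR_METERS)):
        print(name, "float32 step:", float32_spacing(d), "m;",
              "float64 step:", math.ulp(d), "m")
```

At one AU from the origin, 32-bit floats can only place objects about 16 km apart, while 64-bit floats still resolve roughly 0.03 mm. At one light-year even 64-bit positions snap to a 2 m grid, so a galaxy-scale game cannot simply store everything in one coordinate system, whatever renderer it uses.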

4. Materials

Do you know what methane clouds look like? What about writing an aerogel shader? Raytracing, by itself, doesn't simply figure out how a given material works; you have to tell it how each material behaves, and its accuracy is wholly dependent on how accurate your description of the material is. This isn't easy, either: it requires advanced mathematical models and heaps of data collection. In many places we're actually still trying to figure out how to build physically correct material equations in the first place. Did you know that DreamWorks had to rewrite part of their cloud shader[1] for How To Train Your Dragon? It turns out that getting clouds to look good when your character is flying directly beneath them with a hand raised is really hard.

This is just for common lighting phenomena! How are you going to write shaders for things like pools of magic water and birefringent calcite crystals? How about trying to accurately simulate circular polarizers when most raytracers don't even know what polarization is? Does being photorealistic require you to simulate the Tyndall Effect for caustics in crystals and particulate matter? There are so many tiny little details all around us that affect everything from the color of our iris to the creation of rainbows. Just how much does our raytracer need to simulate in order to be photorealistic?

5. Physics

What if we ignored the first four problems and simply assumed we had managed to make a perfect, magical photorealistic raytracer? Congratulations, you've managed to co-opt the entirety of your CPU for the task of rendering a static 3D scene, leaving nothing for the physics. All we've managed to accomplish is taking the "interactive" out of "interactive media". Being able to influence the world around us is a key ingredient to immersion in games, and this requires more and more accurate physics, which are arguably just as difficult to calculate as raytracing is. The most advanced real-time physics engine to date is the Lagoa Multiphysics, and it can only just barely simulate a tiny scene in a well-controlled environment before it completely decimates a modern CPU. This is without any complex rendering at all. Now try doing that for a scene with a radius of several miles. Oh, and remember our issue with scaling? This applies to physics too! Except with physics, it's an order of magnitude more difficult.

6. Content

As many developers have been discovering, procedural generation is not magic pixie dust you can sprinkle on problems to make them go away. Yet, without advances in content generation, we are forced to hire armies of artists to create the absurd amounts of detail required by modern games. Raytracing doesn't solve this problem; it makes it worse. In any given square mile of a human settlement, there are billions of individual objects, ranging from pine cones, to rocks, to TV sets, to crumbs, all of which technically have physics, and must be kept track of, and rendered, and even more importantly, modeled.

Despite multiple attempts at leveraging procedural generation, the content problem has simply refused to go away. Until we can effectively harness the power of procedural generation, augmented artistic tools, and automatic design morphing, the advent of fully photorealistic raytracing will be useless. The best graphics engine in the world is nothing without art.

7. AI

<Patrician|Away> what does your robot do, sam
<bovril> it collects data about the surrounding environment, then discards it and drives into walls
— Bash.org quote #240849

Of course, while we're busy desperately trying to raytrace supercomplex scenes with advanced physics, we haven't even left any CPU time to calculate the AI! The AI in games is so consistently terrible it's turned into its own trope. The game industry spends all its computational time trying to render a scene, leaving almost nothing for the AI routines, forcing them to rely on techniques from 1968. Think about that - we are approaching the point where AI in games comes down to a 50-year-old technique that was considered hopelessly outdated before I was even born. Oh, and I should also point out that Graphics, Physics, Art, and AI are all completely separate fields with fundamentally different requirements that all have to work together in a coherent manner just so you can shoot headshots in Call of Duty 22.

I know that raytracing is exciting, sometimes simply as a demonstration of raw computational power. But it always disheartens me when people fantasize about playing amazingly detailed games indistinguishable from real life when that simply isn't going to happen, even with the inevitable development[2] of realtime raytracing. By the time it becomes commercially viable, it will simply be yet another incremental step in our eternal quest for infinite realism. It is an important step, and one we should strive for, but it alone is not sufficient to spark a revolution.

[1] Found on the special features section of the How To Train Your Dragon DVD.

[2] Disclaimer: I've been trying to develop an efficient raytracing algorithm for ages and haven't had much luck. These guys are faring much better.

40 comments:

Oh look, a rendering method doesn't "solve" an ill-defined term which means "looks substantially better than today," it doesn't unlock rendering at the galactic scale, and it doesn't solve five topics that have nothing to do with graphics.

My point is that it's not instant photorealism, it is approaching photorealism. The entire point is to address the perception of the public that raytracing is equal to photorealism. Raytracing is a STEP towards photorealism, which is obvious to anyone who knows what they're talking about, but not the general public. You can have raytracing and not actually have very good photorealism.

I think it really depends on what you call raytracing: renderers like Arnold or Maxwell using Bidirectional Path Tracing (which is raytracing) easily achieve a picture quality where you cannot distinguish the generated from the real.

I guess your definition of raytracing is very restrictive.

You should talk about a specific implementation instead of speaking about raytracing in general because for instance Photon Mapping is based on raytracing.

Yeah but this is raytracing, an implementation of it, and it does produce photorealistic results (along with other implementations), that's why I don't get the "7 Problems Raytracing Doesn't Solve: 1) Photorealism".

"For those unfamiliar with the term, raytracing is the process of rendering a 3D scene by tracing the path of a beam of light after it is emitted from a light source, calculating its properties as it bounces off various objects in the world until it finally hits the virtual camera. At least, you hope it hits the camera. You see, to be perfectly accurate, you have to cast a bajillion rays of light out from the light sources and then see which ones end up hitting the camera at some point."

I thought it was the other way around, i.e. raytracers cast rays from the virtual camera through the scene all the way back to the lights?

Early raytracers were in fact ray*casters*, i.e. they stopped the trace when hitting an object, and used that info just to calculate shading, assuming no obstructions of light sources (that is, no shadows). You could call that an approximation, but it is not the kind of ray tracing Pixar does or ever did.

Sorry, I wasn't clear. What I meant is that your article implies (whether you meant to or not) that Pixar used 'approximative algorithms' - which they didn't. I mentioned ray-casting only because it was the most common algorithm originally used to approximate ray-tracing.

Your article isn't clear about that, but Photon-mapping is not an approximation, nor an alternate algorithm - it is a refinement of one of the aspects of the ray-tracing model.

As an engineer, what really caught my attention was the fluid behavior in that demonstration. What exactly is going on under the hood of that? It did *look* incredibly realistic.

I've worked with actual simulations of fluid dynamics via the physics-based Navier Stokes equations, over finite volume grids and other discretization schemes, and doing that requires insane amounts of number crunching *for each timestep*.

And yet, I've seen fluid-ish fluid-like behavior produced via things like a bunch of squishy particles, etc. I'm wondering if, even if it's not possible to tie it back to the physics, if for whatever reason the approximations you graphics guys are using capture enough of the right behavior to do things like lift/drag prediction far faster than the normal means of crunching PDEs.

I've seen many different ways of simulating fluids, but most of them are using various particle methods. I'd bet what they're doing there basically amounts to a whole bunch of tiny particles, plus some optimizations so that they can increase effective particle size for parts of the water that aren't moving much. Keep in mind that looking realistic and being realistic are vastly different things. There is no way they were actually calculating any PDEs or anything close to it; they're just treating it as an n-body problem.

Great article! Your point about physics reminded me of what might eventually be recognized as another limitation in current rendering technology: the physics of photons themselves, such as interference patterns. Modeling the optical behaviors of, say a several-micron-thick oil sheen on a puddle isn't just about statistical reflecting properties of surfaces, it's also about the colors produced when the photons reflected from the surface itself interfere positively and negatively with those reflected from the oil/water interface producing vivid color. And I think this is not just a "photo-realism" nit, but a whole different aspect of the physics of light that might become important with simulations and machine sensing.

Sorry, but none of these problems are really at the doorstep of ray-tracing. These are common problems of game design and visualization. And you as a game designer have the choice to use one or another algorithm, depending on which solves your problems, or doesn't.

Ray Tracing is an algorithm that constructs images based on a scientific model of cameras, lighting, a scene, and the optics involved. Depending on the complexity of that model, Ray Tracing takes a very long time to compute, and modern hardware so far has been incapable of mastering that complexity to a sufficient degree to create real-time animations. That's all there is to say about it.

That said, while ray tracing is horribly complex to compute, saying that "ray tracing will never solve the problem of this complexity" is just as silly as blaming the physicists for the complexity of quantum physics. It is what it is. If you cannot afford the complexity, go find another solution.

I'd suggest to check out Ray *casting*, and Octree, if simple polygonal rendering is not sufficient for you. But then I haven't been working in this area for 20 years, my advice may be outdated.

Where did I say "never"? I simply said that raytracing itself does not intrinsically solve photorealism. It CAN solve photorealism, but only if you are using an approximation that is sufficiently accurate. Please stop reading your own opinions into my text.

Well, your headline says "raytracing doesn't solve". I may be reading too much into it, but as a mathematician I consider "doesn't solve" a pretty unambiguous and final statement. For me, this implies "cannot be solved, ever", which equates to "never". It is certainly not *my* opinion, just what I interpreted from your statements. Maybe I was wrong.

In any case, IMHO you're extending the term photo-realism way too far. It only means an image that is sufficiently detailed that an uninformed observer couldn't tell if it's a photo or not. Pixar pretty much achieved this 20 years ago, or maybe 10, depending on the sharpness of the observer.

The HDR is a problem of photos, not of ray-tracing.

After reading Nathan's comment below I have to agree: you're going into way too much detail for any practical purposes, and not for any good reason. Any object not in the focus of the viewer or too far away to be seen clearly can be handled with a simplified model of physics and graphics (and AI, if applicable). You can't simulate the entire universe. Compromise!

By "doesn't solve" I meant that raytracing, BY ITSELF, does NOT solve photorealism - and it doesn't, you need better materials no matter how fast your raytracing is, and if you consider better materials as part of raytracing, you are over-extending the term "raytracing". Of course, you then proceed to say that *I* am overextending the word "photorealistic", but what I'm trying to do is point out that "photorealism" is actually *more* subjective than most people think. If you don't think this, good for you.

If you think I am going into too much detail you simply don't realize how much those details contribute to the experience. I'm sure someone would have said the same thing about parallax mapping and relief mapping years ago.

Well, 'by itself', ray tracing is just a rendering algorithm, and it is well known what it solves: the mathematically exact rendering of complex geometrical bodies with nonplanar surfaces, including hidden-surface problem, reflections and transparency. That's about it. Like you said it doesn't solve materials. And it doesn't solve shading, or highlighting. It doesn't care about movement, animation, or any kind of physics. Nor AI.

Nor photo-realism! Ray tracing is no more about that than one artificial sweetener is about tasting the same as another artificial sweetener - the point is to emulate the real thing, not another artificial one!

Mapping is about glossing over the details. It can turn a smooth surface into an apparently gravelly path. Would you prefer to simulate each individual gravel stone? IMHO that is just a waste of effort, unless there is a very good reason why the player's attention is even turned to that path (e.g. trying to follow a trace of footsteps, or digging in the dirt). It would be easier to just have a hailstorm and let each hail stone jump in random directions upon hitting the path, to give the impression of an uneven surface.

Mapping is simply one way of solving a problem. It is not the only way. If we are going to increase the amount of detail in games, we must stop thinking only in terms of techniques that have been around for ages and start thinking outside the box. Simply coming up with progressively clever ways to map something is not the only possible direction, or even the most correct - it is simply the one that's the most popular.

I am convinced there are better ways to do this that do not involve impossible amounts of information or processing power.

I agree with a lot of the points in this article, such as the increase in the complexity of the content, but also disagree about several things. Personally, I believe we can achieve very close to photo-realism using our current rasterization methods while incurring a relatively low cost. That would obviously exclude things like real-time reflections on non-planar objects.

First off, almost every visual effects shot you've ever seen, and didn't realize you saw it, came from ray-tracing. You probably see hundreds of them every day and don't even notice it, so obviously the inaccuracies are not all that inaccurate.

Some things I disagree with in your article are things like where you make it seem like physics and AI have to suffer because of ray-tracing. I have been professionally working on games and other interactive media for several years now and worked professionally in the digital animation and visual effects field for a few years prior to that. Physics and AI have never put much burden on my CPU except when I was intentionally trying to push it to its limit. These two things hardly put a burden on the CPU in most games. Why? Because I can shut off physics and AI calculations for objects beyond a reasonable distance from my camera's position and turn them back on when I'm getting close to visible distance. Let's say I have a fairly dense urban level: no photo-realism has to suffer if I decide anything beyond (let's just say) 1500 meters isn't going to be seen anyway, so why compute the physics and AI on it?

You said, "In any given square mile of a human settlement, there are billions of individual objects, ranging from pine cones, to rocks, to TV sets, to crumbs, all of which technically have physics, and must be kept track of, and rendered, and even more importantly, modeled."

This is true, but...

1) In most cases, they actually don't have to be rendered beyond a certain distance from the camera, because their contribution at that range is completely negligible, even in ray-tracing. They could at some distance be completely removed from the render solution.

2) You can still use LODs for both materials and models. So you can replace a fully reflective material with a specular or cubemapped reflection at a distance where the player won't notice.

3) You don't have to model each individual rock, pine cone, etc... this is what instances and procedural textures are for (and possibly procedural displacement?) You can model a handful of these items and then place/paint them into your scene.

4) You don't have to have physics on every object, only for those objects that actually need it, i.e. background items don't need physics. Small rocks and other debris don't need physics. Let's say you have a chain link fence which apparently has a ton of polygons, you still only need a primitive box collider on it which computes very quickly.

5) AI for any actor only needs to be computed when that actor is within a certain distance. And things like paths don't need to be computed every frame.

6) Occlusion culling techniques can still be employed. Albeit, it would take a little longer for the developer to run the calculation due to the recursion of reflections and such, but it would make it run way faster on the gamer's end.

7) You can still use things like lightmapping if your lighting is consistent.

...) I could probably go on but, people should get the point.

Again, I'm perfectly happy with the rasterization methods we use now and don't think we're ready for ray-tracing quite yet, but with the advent of many-core CPUs that have already come and are coming into the market, sparing 4 or 5 cores for my physics and AI, while using a large chunk of them to do the rendering, and a few others for whatever other logic, animation, scene management, cleanup, etc, doesn't really sound all that unreasonable to me.

Physics has always been a problem for me. You can't just shut off physics for objects beyond a certain distance because you can often still see the objects in question. You can reduce the load, but again that requires a specialized physics solution that, AFAIK, does not currently exist. AI is only not a problem because the techniques we are using deliberately don't use very much CPU. You are making rash assumptions about the fundamental complexity of a given calculation based entirely off current approximations that seem "good enough".

I am specifically interested in trying to push the boundaries of what really is "good enough", and these almost invariably require massive increases in the computational complexity for physics. AI I do not know enough about to comment.

1. Doesn't matter. The problem is not rendering speed, it's information.

2. Gives minimal benefit to raytracing.

3. Actually, you do. I have specifically noticed the flat, boring, obviously fake textures used by modern games on the ground. While this is fine for most cases, the ground right next to the character needs to be fully modeled and interacted with, which is doable in terms of rendering, but introduces massive problems in terms of information storage and physics. This problem has not yet been solved by procedural generation in a satisfactory way, which is why we need to work on it more.

4. Very, very wrong. The entire fact that background objects "don't need" physics is what makes most games seem boring as shit, because nothing in the background ever moves. This is a PROBLEM, not a solution. You are free to not believe me simply because no one has managed to do this yet because it's too expensive, but I promise I will simply prove you wrong in another 5 years.

5. Completely wrong. Try simulating an entire global economy with millions of interacting NPCs in a space MMO like EVE. Have fun!

6. This doesn't work for raytracing, or not in any real way that wouldn't be absorbed into the approximation algorithm itself.

7. If your lighting is consistent, your game is boring.

I personally believe many of these problems can be solved by rasterization, and that's one of the reasons I wrote the article, but you are looking at this from the perspective of the current game industry and what is currently accepted as "good enough". It won't be good enough when someone else makes something better.

I agree that "good enough" is relative. But I've never heard a player complain about a game being boring because 'nothing is happening in the distance'. They do complain however if not enough is happening in their immediate vicinity! Your time and effort would be better spent in distracting the player from the fact that distant objects are mostly uninteresting.

Make the near view graphics better and no one will care whether the distant background is unicolor grey or a complex particle system simulation of volcanic eruptions viewed through clouds of haze and ash. Well, ok, maybe not unicolor, but it's infinitely less expensive to just project a prerecorded movie of said simulation on the background...

No I don't realize that, and if you're of that opinion, then mine differs. There are lots of known problems that would be rewarding to solve. I'm sure there are lots of unknown ones that would be rewarding to solve too, but I doubt they're as numerous, or, in general, equally rewarding.

Steve Jobs had the rare quality of recognizing such problems and the courage to invest in them. I do not. If you feel you're up to something then I wish you the best of luck in your project. I've shared my piece of advice already. Maybe I'm wrong and you're right to ignore it.

I'm sorry, I was paying too much attention to your suggested solutions rather than the problem at hand: improving (game) graphics. You said it yourself: ray tracing doesn't solve it. So you need different solutions, and rather than criticize your ideas I should have suggested other options.

In any case, your observations are correct: there is definitely room for improvement. I've had similar ideas myself over the past 3 decades. But so far I've never managed to come up with an original idea that hadn't been invented and implemented already. Maybe you can be more successful than that.

3D gaming is quite old now. The first game presenting pseudo-realistic graphics was released almost 20 years ago (check out "The 7th Guest"). And when I look at today's graphics, little has changed since then, at least at first glance.

When I compare my memories of that game to today's games I realize I was wrong in some of my earlier statements: The only things that notably changed are the improved physics, and increased number and detail of objects. And it sure looks better! But look at the cost: our computers are a thousandfold more powerful than those of 1993, and still the animations appear to be lacking.

Yes, more details would improve the graphics, but that's what has been done for 20 years, and it comes at the cost of computing power. If you want something game-changing (literally), you should either think of some alternate algorithms that don't cost as much as the current ones, or find other aspects of the game (graphics or other) that would be worth improving.

It may be worth checking out current developments at Pixar: they're working on an incredibly complex physical model of curly hair in 'Brave'. If you can think of something simpler that could be introduced in real time graphics for gaming that would be something! It *is* about details too, just not the far off ones that Nathan and I suggested can be ignored for the most part. Instead it's the detail right in front of your camera.

The key is not coming up with an idea that has not been invented already, it's making the idea work. This is the underlying insight that I am relying on for my 2D graphics engine. What I'm doing with it has been done before, but it has never been *practical*. I figured out a way to take several existing ideas and combine them to make it practical.

A little slow on this one, 3 months down the line though, but I had to reply:

I was actually looking for an article I read on how rasterization (and various "faking it" shaders) is(are) effectively indistinguishable from raytracing when I came across this. Put in that light, I'd say your points about not being able to trace to very deep depths, perfectly simulate the effects of distant stars, and provide adequate material descriptions are moot.

But really, it seems to me when you wrote that you were just warming up, and the point you are trying to push home is something along the lines of "rendering isn't everything". A point which I wholeheartedly agree on. I would love to see proper physically simulated materials in game, not just ghetto-destruction-algorithms [more accurate non-real time algorithms do exist, which excite me about future possibilities] but also chemical interaction. I want to see wood walls burn, and see acid eat through metal floors [I guess this would require voxel based rendering though?]

What really made me want to reply though, was your point on game AI. I think when you wrote "rely on techniques from 1968" you meant "rely on A*" because to see ANY other algorithm, even a dated one, is an absolute blessing. Not that I'm knocking A*, it's a great solution to path-finding, but its roles are limited. I would LOVE to see more use of proper AI, machine learning, neural networks, computer vision (kinect was a godsend here). It seems every (nearly, there are exceptions) game "AI" I see though, is nothing but a simple precompiled script of "actions to take". Not something that even deserves the title of "AI".

I got that sense too. It seems that the thesis of this blog post is that raytracing only helps do a few things, it presents its own computational and design challenges, and doesn't save the software designers from having to create actual content.