A blog by Michael Abrash

I was going to start this post off with a discussion of how we all benefit from sharing information, but I just got an email from John Carmack that nicely sums up what I was going to say, so I’m going to go with that instead:

Subject: What a wonderful world…

Just for fun, I was considering writing a high performance line drawing routine for the old Apple //c that Anna got me for Christmas. I could do a pretty good one off the top of my head, but I figured a little literature review would also be interesting. Your old series of articles comes up quickly, and they were fun to look through again. I had forgotten about the run-length slice optimization.

What struck me was this paragraph:

First off, I have a confession to make: I’m not sure that the algorithm I’ll discuss is actually, precisely Bresenham’s run-length slice algorithm. It’s been a long time since I read about this algorithm; in the intervening years, I’ve misplaced Bresenham’s article, and have been unable to unearth it. As a result, I had to derive the algorithm from scratch, which was admittedly more fun than reading about it, and also ensured that I understood it inside and out. The upshot is that what I discuss may or may not be Bresenham’s run-length slice algorithm—but it surely is fast.

The notion of misplacing a paper and being unable to unearth it again seems like a message from another world from today’s perspective. While some people might take the negative view that people no longer figure things out from scratch for themselves, I consider it completely obvious that having large fractions of the sum total of human knowledge at your fingertips within seconds is one of the greatest things to ever happen to humanity.

Hooray for today!

But what’s in it for me?

Hooray for today indeed – as I’ve written elsewhere (for example, the last section of this), there’s huge value to shared knowledge. However, it takes time to write something up and post it, and especially to answer questions. So while we’re far better off overall from sharing information, it seems like any one of us would be better off not posting, but rather just consuming what others have shared.

This appears to be a classic example of the Prisoner’s Dilemma. It’s not, though, because there are generally large, although indirect and unpredictable, personal benefits. There’s no telling when they’ll kick in or what form they’ll take, but make no mistake, they’re very real.

For example, consider how the articles I wrote over a ten-year stretch – late at night after everyone had gone to sleep – opened up virtually all the interesting opportunities I’ve had over the last twenty years.

In 1992, I was writing graphics software for a small company, and getting the sense that it was time to move on. I had spent my entire career to that point working at similar small companies, doing work that was often interesting but that was never going to change the world. It’s easy to see how I could have spent my entire career moving from one such job to another, making a decent living but never being in the middle of making the future happen.

However, in the early 80’s, Dan Illowsky, publisher of my PC games, had wanted to co-write some articles as a form of free advertising. There was nothing particularly special about the articles we wrote, but I learned a lot from doing them, not least that I could get what I wrote published.

Then, in the mid-80’s, I came across an article entitled “Optimizing for Speed” in Programmer’s Journal, a short piece about speeding up bit-doubling on the 8088 by careful cycle counting. I knew from optimization work I’d done on game code that cycle counts weren’t the key on the 8088; memory accesses, which took four cycles per byte, limited almost everything, especially instruction fetching. On a whim, I wrote an article explaining this and sent it off to PJ, which eventually published it, and that led to a regular column in PJ. By the time I started looking around for a new job in 1992, I had stuff appearing in several magazines on a regular basis.

One of those articles was the first preview of Turbo C. Borland had accidentally sent PJ the ad copy for Turbo C before it was announced, and when pressed agreed to let PJ have an advance peek. The regular C columnist couldn’t make it, so as the only other PJ regular within driving distance, I drove over the Santa Cruz Mountains on zero notice one rainy night and talked with VP Brad Silverberg, then wrote up a somewhat breathless (I wanted that development environment) but essentially correct (it really did turn out to be that good) article.

In 1992, Brad had moved on to become VP of Windows at Microsoft, and when I sent him mail looking for work, he referred me to the Windows NT team, where I ended up doing some of the most challenging and satisfying work of my career. Had I not done the Turbo C article, I wouldn’t have known Brad, and might never have had the opportunity to work on NT. (Or I might have; Dave Miller, who I worked with at Video Seven, referred me to Jeff Newman, who pointed me to the NT team as well – writing isn’t the only way opportunity knocks!)

I was initially a contractor on the NT team, and I floundered at first, because I had no experience with working on a big project. I would likely have been canned after a few weeks, were it not for Mike Harrington, who had read some of my articles and thought it was worth helping me out. Mike got me set up on the network, showed me around the development tools, and took me out for dinner, giving me a much-needed break in the middle of a string of 16-hour workdays.

After a few years at Microsoft, I went to work at Id, an opportunity that opened up because John Carmack had read my PJ articles when he was learning about programming the PC. And a few years later, Mike Harrington would co-found Valve, licensing the Quake source code from Id, where I would be working at the time – and where I would help Valve get the license – and thirteen years after that, I would go to work at Valve.

If you follow the thread from the mid-80’s on, two things are clear: 1) it was impossible to tell where writing would lead, and 2) writing opened up some remarkable opportunities over time.

It’s been my observation that both of these points are true in general, not just in my case. The results from sharing information are not at all deterministic, and the timeframe can be long, but generally possibilities open up that would never have been available otherwise. So from a purely selfish perspective, sharing information is one of the best investments you can make.

The unpredictable but real benefits of sharing information are part of why I write this blog. It has brought me into contact with many people who are well worth knowing, both to learn from and to work with; for example, I recently helped Pravin Bhat, who emailed me after reading a blog post and now works at Valve, optimize some very clever tracking code that I hope to talk about one of these days. If you’re interested in AR and VR – or if you’re interested in making video games, or Linux, or hardware, or just think Valve sounds like a great place to work (and you should) – take a look at the Valve Handbook. If, after reading the Handbook, you think you fit the Valve template and Valve fits you, check out Valve’s job openings or send me a resume. We’re interested in both software and hardware – mechanical engineers are particularly interesting right now, but Valve doesn’t hire for specific projects or roles, so I’m happy to consider a broad range of experience and skills – but please, do read the Handbook first to see if there’s likely to be a fit, so you can save us both a lot of time if that’s not the case.

The truth is, I wrote all those articles, and I write this blog, mostly because of the warm feeling I get whenever I meet someone who learned something from what I wrote; the practical benefits were an unexpected bonus. Whatever the motivation, though, sharing information really does benefit us all. With that in mind, I’m going to start delving into what we’ve found about the surprisingly deep and complex reasons why it’s so hard to convince the human visual system that virtual images are real.

How images get displayed

There are three broad factors that affect how real – or unreal – virtual scenes seem to us, as I discussed in my GDC talk: tracking, latency, and the way in which the display interacts perceptually with the eye and the brain. Accurate tracking and low latency are required so that images can be drawn in the right place at the right time; I’ve previously talked about latency, and I’ll talk about tracking one of these days, but right now I’m going to treat latency and tracking as solved problems so we can peel the onion another layer and dive into the interaction of head-mounted displays with the human visual system, and the perceptual effects thereof. More informally, you could think of this line of investigation as: “Why VR and AR aren’t just a matter of putting a display an inch in front of each eye and rendering images at the right time in the right place.”

In the next post or two, I’ll take you farther down the perceptual rabbit hole, to persistence, judder, and strobing, but today I’m going to start with an HMD artifact that’s both useful for illustrating basic principles and easy to grasp intuitively: color fringing. (I discussed this in my GDC talk, but I’ll be able to explain more and go deeper here.)

A good place to start is with a simple rule that has a lot of explanatory power: visual perception is a function of where and when photons land on the retina. That may seem obvious, but consider the following non-intuitive example. Suppose the eye is looking at a raster-scan display. Further, suppose a vertical line is being animated on the display, moving from left to right, and that the eye is tracking it. Finally, assume that the pixels on the display have zero persistence – that is, each one is illuminated very brightly for a very short portion of the frame time. What will the eye see?

The pattern shown on the display for each frame is a vertical line, so you might expect that to be what the eye sees, but the eye will actually see a line slanting from upper right to lower left. The reasons for this were discussed here, but what they boil down to is that the pattern in which the photons from the pixels land on the retina is a slanted line. This is far from unusual; it is often the case that what is perceived by the eye differs from what is displayed on an HMD, and the root cause of this is that the overall way in which display-generated photons are presented to the retina has nothing in common with real-world photons.
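The slanted-line result is easy to check numerically. Here’s a minimal sketch (my own illustration, not code from the original series, with display parameters I’ve picked arbitrarily) that computes where each scanline’s photons land relative to a tracking eye:

```python
# Eye-relative landing positions of a zero-persistence vertical line on a
# raster-scan display. Assumptions (mine): 60 Hz refresh, 1000 scanlines
# swept top to bottom over the frame, line and tracking eye both moving
# left to right at 600 pixels/second.

FRAME_TIME = 1.0 / 60.0   # seconds per frame
ROWS = 1000               # scanlines per frame
EYE_VELOCITY = 600.0      # pixels/second, eye tracking the moving line

def eye_relative_x(row):
    """Eye-relative x at which this row's photons land on the retina.

    Every row of the line is drawn at the same display x, but each row is
    lit at a different time as the raster sweeps down; meanwhile the eye
    keeps moving, so later (lower) rows land farther to the left relative
    to the eye.
    """
    t = FRAME_TIME * row / ROWS   # time at which this row is lit
    return -EYE_VELOCITY * t      # display x is fixed; the eye has moved on

print(eye_relative_x(0), eye_relative_x(ROWS - 1))
# top row lands where expected; bottom row lands ~10 px to the left,
# so the perceived line slants from upper right to lower left
```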

Real-world photons are continuously reflected or emitted by every surface, and vary constantly. In contrast, displays emit fixed streams of photons from discrete pixel areas for discrete periods of time, so photon emission is quantized both spatially and temporally; furthermore, with head-mounted displays, pixel positions are fixed with respect to the head, but not with respect to the eyes or the real world. In the case described above, the slanted line results from eye motion relative to the pixels during the time the raster scan sweeps down the display.

You could think of the photons from a display as a three-dimensional signal: pixel_color = f(display_x, display_y, time). Quantization arises because pixel color is constant within the bounds defined by the pixel boundaries and the persistence time (the length of time any given pixel remains lit during each frame). When that signal is projected onto the retina, the result for a given pixel is a tiny square that is swept across the retina, with the color constant over the course of a frame; the distance swept per frame is proportional to the distance the eye moves relative to the pixel during the persistence time. The net result is a smear, unless persistence is close to zero or the eye is not moving relative to the pixel.
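As a back-of-the-envelope version of that rule, the smear length is just the eye’s velocity relative to the pixel times the persistence time. This tiny sketch (my numbers, purely for illustration) makes the point:

```python
# Smear length painted on the retina by one pixel. Assumption (mine):
# the pixel stays lit at one fixed spot for the whole persistence window.

def smear_degrees(eye_velocity_deg_per_s, persistence_s):
    """Angular length of the smear: how far the eye moves relative to
    the pixel while the pixel is lit."""
    return eye_velocity_deg_per_s * persistence_s

# Full persistence at 60 Hz, eye moving 120 deg/s relative to the display:
print(smear_degrees(120.0, 1.0 / 60.0))   # 2 degrees of smear per frame
# Near-zero persistence (0.5 ms) all but eliminates it:
print(smear_degrees(120.0, 0.0005))       # 0.06 degrees
```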

The above description is a simplification, since pixels aren’t really square or uniformly colored, and illumination isn’t truly constant during the persistence time, but it will suffice for the moment. We will shortly see a case where it’s each pixel color component that remains lit, not the pixel as a whole, with interesting consequences.

The discrete nature of photon emission over time is the core of the next few posts, because most display technologies have significant persistence, which means that most HMDs have a phenomenon called judder, a mix of smearing and strobing (that is, multiple simultaneous perceived copies of images) that reduces visual quality considerably, and introduces a choppiness that can be fatiguing and may contribute to motion sickness. We’ll dive into judder next time; in this post we’ll establish a foundation for the judder discussion, using the example of color fringing to illustrate the basics of the interaction between the eye and a display.

The key is relative motion between the eye and the display

Discrete photon emission produces artifacts to varying degrees for all display- and projector-based technologies. However, HMDs introduce a whole new class of artifacts, and the culprit is rapid relative motion between the eye and the display, which is unique to HMDs.

When you look at a monitor, there’s no situation in which your eye moves very rapidly relative to the monitor while still being able to see clearly. One reason for this is that monitors don’t subtend a very wide field of view – even a 30-inch monitor would be less than 60 degrees at normal viewing distance – so a rapidly-moving image would vanish off the screen almost as soon as the eye could acquire and track it. In contrast, the Oculus Rift has a 90-degree FOV.

An even more important reason why the eye can move much more rapidly relative to head-mounted displays than to monitors is that HMDs are attached to heads. Heads can rotate very rapidly – 500 degrees per second or more. When the head rotates, the eye can counter-rotate just as fast and very accurately, based on the vestibulo-ocular reflex (VOR). That means that if you fixate on a point on the wall in front of you, then rotate your head as rapidly as you’d like, that point remains clearly visible as your head turns.

Now consider what that means in the context of an HMD. When your head turns while you fixate on a point in the real world, the pixels on the HMD move relative to your eyes, and at a very high speed – easily ten times as fast as you can smoothly track a moving object. This is particularly important because it’s common to look at a new object by first moving the eyes to acquire the target, then remaining fixated on the target while the head turns to catch up. This VOR-based high-speed eye-pixel relative velocity is unique to HMDs.

Let’s look at a few space-time diagrams that help make it clear how HMDs differ from the real world. These diagrams plot x position relative to the eye on the horizontal axis, and time advancing down the vertical axis. This shows how two of the three dimensions of the signal from the display land on the retina, with the vertical component omitted for simplicity.

First, here’s a real-world object sitting still.

I’ll emphasize, because it’s important for understanding later diagrams, that the x axis is horizontal position relative to the eye, not horizontal position in the real world. With respect to perception of images on HMDs it’s eye-relative position that matters, because that’s what affects how photons land on the retina. So the figure above could represent a situation in which both the eye and the object are not moving, but it could just as well represent a situation in which the object is moving and the eye is tracking it.

The figure would look the same for the case where both a virtual image and the eye are not moving, unless the color of the image was changing. In that case, a real-world object could change color smoothly, while a virtual image could only change color once per frame. However, matters would be quite different if the virtual image was moving and the eye was tracking it, as we’ll see shortly.

Next let’s look at a case where something is moving relative to the eye. Here a real-world object is moving from left to right at a constant velocity relative to the eye. The most common case of this would be where the eye is fixated on something else, while the object moves through space from left to right.

Now let’s examine the case where a virtual image is moving from left to right relative to the eye, again while the eye remains fixated straight ahead. There are many types of displays that this might occur on, but for this example we’re going to assume we’re using a color-sequential liquid crystal on silicon (LCOS) display.

Color-sequential LCOS displays, which are (alas, for reasons we’ll see soon) often used in HMDs, display red, green, and blue separately, one after another, for example by reflecting a red LED off a reflective substrate that’s dynamically blocked or exposed by pixel-resolution liquid crystals, then switching the liquid crystals and reflecting a green LED, then switching the crystals again and reflecting a blue LED. (Many LCOS projectors actually switch the crystals back to the green configuration again and reflect the green LED a second time each frame, but for simplicity I’ll ignore that.) This diagram below shows how the red, green, and blue components of a moving white virtual image are displayed over time, again with the eye fixated straight ahead.

Once again, remember that the x axis is horizontal motion relative to the eye. If the display had an infinite refresh rate, the plot would be a diagonal line, just like the second space-time diagram above. Given actual refresh rates, however, something quite different happens.

For a given pixel, each color displays for one-third of each frame. (It actually takes time to switch the crystals, so each color displays for more like 2 ms per frame, and there are dark periods between colors, but for ease of explanation, let’s assume that each frame is evenly divided between the three colors; the exact illumination time for each color isn’t important to the following discussion.) At 60 Hz, the full cycle is displayed over the course of 16 ms, and because that interval is shorter than the time during which the eye integrates incident light, the visual system blends the colors for each point together into a single composite color. The result is that the eye sees an image with the color properly blended. This is illustrated in the figure below, which shows how the photons from a horizontal white line on an LCOS display land on the retina.

Here the three color planes are displayed separately, one after another, and, because the eye is not moving relative to the display, the three colored lines land on top of each other to produce a perceived white line.

Because each pixel can update only once a frame and remains lit for the persistence time, the image is quantized to pixel locations spatially and to persistence time temporally, resulting in stepped rather than continuous motion. In the case shown above, that wouldn’t produce noticeable artifacts unless the image moved too far between frames – “too far” being on the order of five or ten arc-minutes, depending on the frequency characteristics of the image. In that case, the image would strobe; that is, the eye would perceive multiple simultaneous copies of the image. I’ll talk about strobing in the next post.

So far, so good, but we haven’t yet looked at motion of the eye relative to the display, and it’s that case that’s key to a number of artifacts. As I noted earlier, the eye can move relative to the display, while still being able to see clearly, either when it’s tracking a moving virtual image or when it’s fixated on a static virtual image or real object via VOR while the head turns. (I say “see clearly” because the eye can also move relative to the display by saccading, but in that case it can’t see clearly, although, contrary to popular belief, it does still acquire and use visual information.) As explained above, the VOR case is particularly interesting, because it can involve very high relative velocities between the eye and the display.

So what happens if the eye is tracking a virtual object, exactly one pixel in size, that’s moving from left to right? (Assume that the image lands squarely on a pixel center each frame, so we can limit this discussion to the case of exactly one pixel being lit per frame.) The color components of each pixel will then each line up differently with the eye, as you can see in the figure below, and color fringes will appear. (This figure also contains everything you need in order to understand judder, but I’ll save that discussion for the next post.)

Remember, the x position is relative to the eye, not the real world.

For a given frame, the red component of the pixel gets drawn in the correct location – that is, to the right pixel – at the start of the frame (assuming either no latency or perfect prediction). However, the red component remains in the same location on the display and is the same color for one-third of the frame; in an ideal world, the pixel would move continuously at the same speed as the image is supposed to be moving, but of course it can’t go anywhere until the next frame. Meanwhile, the eye continues to move along the path the image is supposed to be following, so the pixel slides backward relative to the eye, as you can see in the figure above. After one-third of the frame, the green component replaces the red component, falling farther behind the correct location, and finally the blue component slides even farther for the final one-third of the frame. At the start of the next frame, the red component is again drawn at the correct pixel (a different one, because the image is moving across the display), so the image snaps back to the right position, and again starts to slide. Because each pixel component is drawn at a different location relative to the eye, the colors are not properly superimposed, and don’t blend together correctly.
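To put rough numbers on that slippage, here’s a small sketch (my own model, with assumed values: a 60 Hz frame split evenly into R, G, and B fields, and an eye moving at 120 degrees/second relative to the display) that computes how far behind the correct position each color field lands, on average:

```python
# Average lag of each color field behind a tracking eye on a
# color-sequential display. Assumptions (mine): 60 Hz frame, evenly
# split R -> G -> B, eye-display relative velocity of 120 deg/s.

FRAME = 1.0 / 60.0   # seconds per frame
EYE_SPEED = 120.0    # degrees/second, eye moving relative to the display

def field_slip(field_index):
    """Eye-relative offset (degrees) of one color field, averaged over
    the third of the frame during which it is lit (0=R, 1=G, 2=B).
    The pixel is drawn at the correct spot at the frame start, then
    stays put while the eye keeps moving, so each later field lands
    farther behind the correct position."""
    start = field_index * FRAME / 3
    end = start + FRAME / 3
    mid = (start + end) / 2       # midpoint of this field's lit interval
    return -EYE_SPEED * mid

for name, i in (("R", 0), ("G", 1), ("B", 2)):
    print(name, round(field_slip(i), 3))
# R lags ~0.33 deg, G ~1.0 deg, B ~1.67 deg: the three fields no longer
# superimpose, which is exactly the color-fringing pattern described above
```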

Here’s how color fringing would look for eye movement from left to right – color fringes appear at the left and right sides of the image, due to the movement of the eye relative to the display between the times the red, green, and blue components are illuminated.

It might be hard to believe that color fringes can be large enough to really matter, when a whole 60Hz frame takes only 16.6 ms. However, if you turn your head at a leisurely speed, that’s about 100 degrees/second, believe it or not; in fact, you can easily turn at several hundred degrees/second. (And remember, you can do that and see clearly the whole time if you’re fixating, thanks to VOR.) At just 60 degrees/second, one 16.6ms frame is a full degree; at 120 degrees/second, one frame is two degrees. That doesn’t sound like a lot, but one or two degrees can easily be dozens of pixels – if such a thing as a head-mounted display that approached the eye’s resolution existed, two degrees would be well over 100 pixels – and having rainbows that large around everything reduces image quality greatly.
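The arithmetic in that paragraph is easy to check. Here’s a sketch; the 60-pixels-per-degree figure for a hypothetical eye-resolution display is my assumption, based on a common visual-acuity rule of thumb:

```python
# Angular slip accumulated over one frame at a given head-turn rate, and
# the equivalent pixel count on a hypothetical eye-resolution display.

REFRESH_HZ = 60.0          # frames per second
PIXELS_PER_DEGREE = 60.0   # rough foveal-acuity figure (my assumption)

def slip_per_frame(head_deg_per_sec):
    """Degrees the eye moves relative to the display in one frame during
    VOR counter-rotation, plus the pixel count that slip would span."""
    degrees = head_deg_per_sec / REFRESH_HZ
    return degrees, degrees * PIXELS_PER_DEGREE

print(slip_per_frame(60.0))    # (1.0, 60.0): one full degree per frame
print(slip_per_frame(120.0))   # (2.0, 120.0): "well over 100 pixels"
```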

Color-sequential displays in projectors and TVs don’t suffer to any significant extent from color fringing because there’s no rapid relative motion between the eye and the display involved, for the two reasons mentioned earlier: because projectors and TVs have limited fields of view, and because they don’t move with the head and thus aren’t subject to the high relative eye velocities associated with VOR. Not so for HMDs; color-sequential displays should be avoided like the plague in HMDs intended for AR or VR use.

Necessary but not sufficient

There are two important conclusions to be drawn from the discussion to this point. The first is that relative motion between the eye and a head-mounted display can produce serious artifacts, and the basic mechanism underlying them should now be clear. The second is that a specific artifact, color fringing, is a natural by-product of color-sequential displays, and that as a result AR/VR displays need to illuminate all three color components simultaneously, or at least nearly so.

Illuminating all three color components simultaneously is, alas, necessary but not sufficient. Doing so will eliminate color fringing, but it won’t do anything about judder, so that’s the layer we’ll peel off the perceptual onion next time.

Interesting question. It’s been tried in some forms, but without success. I guess my question would be what exactly you have in mind for a non-frame-based display pipeline? Personally, I don’t have any picture of what a potentially successful pipeline of that sort would look like.

It is true that using foveated rendering for rendering and transmission could reduce the load in those areas a great deal, but it would still be a frame-based pipeline.

Anyway, for my purposes this is, as you note, an academic question; the cost of developing a whole new hardware pipeline would be huge.

Much appreciated breakdown, as always. I’m curious about this particular point:

That doesn’t sound like a lot, but one or two degrees can easily be dozens of pixels – if such a thing as a head-mounted display that approached the eye’s resolution existed, two degrees would be well over 100 pixels – and having rainbows that large around everything reduces image quality greatly.

This seems to imply that color fringing actually becomes more problematic with higher resolution displays, which makes some sense at the level of individual pixels: as the size of a pixel shrinks compared to the angular distance that the display moves per frame relative to the eye, the width of the region where sequential pixel colors overlap (ie: the “white” region) shrinks as well, resulting in more pronounced RGB fringing.

However, is this a complicating factor for increased HMD display resolutions in practice, where image features are roughly the same size regardless of pixel count? Wouldn’t fringing be identical for a given image feature width (eg: 1-degree-wide white dot) whether it’s represented by 1, 50, or 100 pixels since fringing would be reduced for high density pixels by virtue of overlapping with the past positions of adjacent pixels? (this of course assumes that all pixels represent the same color value)

You’re correct that the perceived size of the fringes will be the same in both cases, so the perceptual effect of the fringes should be similar. However, as we’ll see next time, the same eye motion that produces color fringes also produces judder (unless persistence is really low), and the smear part of judder causes worse detail loss at higher resolutions simply because there’s more detail to lose.

Question for you regarding possible solutions: IF a game engine could predict “the next frame” given all it knows about the current input and screen-space velocities of the various visible objects, and IF displays were equipped with the capability to receive both the current frame and the predicted next frame — one to show immediately and then over the course of 16ms lerp toward the predicted frame, do you think that would solve the smearing problem while still staying within the world of frame-based rendering?

Wow, wild idea. The answer is, I don’t think so. Lerping wouldn’t be correct – what you want is to see the correct content at each pixel at all times, which is not necessarily closely related to a lerped value between one frame and the next. Suppose in frame N the pixel is blue and in frame N+1 it’s blue, but in the texture that’s sliding across the screen those two blue values are separated by white; then lerping from blue to blue would produce blue for the whole frame, when it should have become white, then gone back to blue. So high-frequency features would potentially blink in and out. And given that frame-to-frame movement can be several degrees with a wide FOV, the features wouldn’t even have to be that high-frequency. But a very creative idea!

We effectively already have this in regular (non-LCOS) LCD displays. The LCD elements take time to transition from the value in one frame to the value in the next. In some cases, they take more than a 17ms frame to make the transition! So you do get this blurring effect “for free”. It’s not a good thing – what you get is what you’d expect – lots of blurring. On the other hand – no juddering! But I’m pre-empting Michael’s next article here…

Would it be possible to use a mirror to reflect the photons from the display to the eye, and then reorient that mirror at a much faster rate than the screen is refreshing, causing the lit pixels to remain in the correct position relative to the eye during head movement?

My knowledge of the hardware is practically nil, so I’ve no idea if actuators both responsive and accurate enough exist, nor if the range of movement necessary would be prohibitive. I assume vibration and weight would be problems too, but an optical solution seems more realistic to me than a significant increase in refresh rate.

Cool idea, but I’m pretty sure there’d be no way to get the required tolerances with mechanical parts, particularly since the mirror would have to counter-slide in perfect synchrony with the frame rate; the jump back at the start of each frame seems really hard, since it involves reversing the mirror twice, with great accuracy, in a few milliseconds. I’ll check with the hardware people and post a correction if I’m wrong, but I’d be surprised.

Also, this would have to be done farther up the optical pipeline; you can’t just put a mirror in front of the eye and get a wide FOV. You’d have to project the reflected image onto something (in which case there are focal depth issues) or inject it into a waveguide, so there’d be additional complication.

Still, worth thinking about to see if there’s any way to implement it, because if it could be implemented, it would solve the problem.

I had the same idea as BJ, but I was thinking of a particular technology as a starting point: DLP.

In particular, I was thinking you might be able to combine DLP mirror technology with eye tracking to adjust the image based on where the person was looking “in the frame”. I think there might still be color artifacts but you might be able to fix motion artifacts. And of course I’m sure there would be new problems to consider.

I don’t think DLP tech is quite compact enough for this application (yet), but I think it could be modeled and simulated to see if it is an avenue worth pursuing.

As you say, DLP isn’t well suited to HMDs right now, so it’s not something I tend to think of. DLP mirrors don’t have any kind of fine adjustment capability that I know of; the mirrors just flip between two positions. So it would require a major redesign, if it’s even possible to have that level of angle control. And there’s always the problem of doing very low-latency, high-accuracy eye-tracking. Finally, as noted previously, this would solve judder but introduce potential strobing issues (although I don’t know how big a problem that would be). It’s an interesting idea, but there are a lot of hurdles involved with it.

I should have mentioned that because the screen is flat but the distance from the pupil increases away from the center, a simple shift of the screen wouldn’t produce the correct results; it’d only be exactly right if the screen was a sphere. That’s probably not too bad at the center of the screen, but it could be quite noticeable 40 degrees out from the center.

Okay, after getting some sleep I realized that the photons land on the same place on the retina for the whole frame, so this is pretty much the zero-persistence case, which I’ll talk about sometime in the next two posts. The bottom line is that there would be strobing, and probably motion detection issues, although I don’t know how serious either one would be.

Yeah, sorry. When I said DLP I was really thinking of something like adaptive optics, like the Keck reflectors use, only in this case it would probably have to be a flexible mirror (or really tiny segments).

The whole point was really that a solution could be modeled and simulated to determine if either the hardware exists in some form or if it’s even feasible to build the hardware. That way, you might be able to get an idea if there is a solution there worth pursuing (which is related to the idea that models don’t provide answers, they provide insight).

Well, as I said, it’s a clever idea but it’s basically the same as zero persistence, and it would be a lot easier to build a zero-persistence HMD than to figure out how to build one that compensates for intra-frame motion, if that’s even possible with current technology.

I was discussing something similar with a cow-orker yesterday: how to handle eye movement in a HMD.

Currently you need to do the ‘Terminator Scan’ when you look around a virtual environment using an HMD. This is mostly because the display, even one with a wide FOV, is only to the front. However, making the display ‘wrap around’ would introduce a new problem in the sense that the image stretching at the edges (to make the FOV look right) would fail when your eyes went off-center.

We were speculating that future HMDs might have eye tracking cameras and graphics drivers that compensated the image for each eye based on where the eye was pointed. (Although physical differences in the eyes could make that a very hard problem. For example, what about people with lazy eyes? Do you compensate for the lazy eye or do you compensate based on the dominant eye as in the real world?)

We haven’t noticed any problem with the distance the pupils move. It’s only comfortable out to 25-30 degrees each way, and everything looks fine within that range, anyway.

As for applying eye tracking to judder and fringing, alas, that wouldn’t help. Judder and fringing result from each pixel staying the same color for the persistence time (assuming reasonably long persistence), and nothing that tracking or the driver could do would change that.
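A rough sketch of why persistence alone sets the artifact size (the numbers are illustrative, not from any particular panel):

```python
def smear_degrees(eye_velocity_dps, persistence_ms):
    """A pixel that holds one color for persistence_ms sweeps this many
    degrees across the retina when the eye moves at eye_velocity_dps
    degrees/second relative to the display.  Nothing tracking or the
    driver does can change this, since it happens within a single frame."""
    return eye_velocity_dps * persistence_ms / 1000.0
```

At full persistence on a 60 Hz panel (about 16.7 ms) and a modest 100 deg/s eye movement, that works out to roughly 1.7 degrees of smear, which is many pixels on any current HMD panel.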

Color-sequential displays in projectors and TVs don’t suffer to any significant extent from color fringing because there’s no rapid relative motion between the eye and the display involved, for the two reasons mentioned earlier: because projectors and TVs have limited fields of view, and because they don’t move with the head and thus aren’t subject to the high relative eye velocities associated with VOR.
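The dependence on relative eye velocity can be made concrete with a small estimate; the 60 Hz, three-field timing here is an assumption for illustration, not any specific display's spec:

```python
def color_field_offset_deg(eye_velocity_dps, frame_hz=60, fields=3):
    """Angular separation on the retina between successive color fields
    of a field-sequential display when the eye moves at eye_velocity_dps
    relative to it.  For a fixed TV or projector the relative velocity is
    near zero, so the fields land on top of each other and no fringe shows."""
    return eye_velocity_dps / (frame_hz * fields)
```

At 100 deg/s of relative eye motion the red, green, and blue fields land more than half a degree apart; at the near-zero relative velocities typical of fixed screens, the separation is negligible.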

This is very slightly not true. VOR is essentially not a factor for fixed displays (unless you purposely move your head to match a moving scene while fixating a ‘stationary’ object relative to the screen within the scene). However, with larger displays, especially ‘home cinema’ setups becoming more common (and in actual cinemas, especially IMAX), colour fringing, judder, and smearing are all visible effects, down mainly to the endemic use of 24fps. At small screen sizes and/or low-speed pans or objects, the effect is not visible, but with fast pans/fast objects, even at 60i, you can notice motion artefacts when seated at the correct distance from an HDTV.

Nowhere near the sort of problem that it is in an HMD, but for film and, to a lesser extent, HDTV, you need to take into account maximum panning speed and maximum object speed.

I knew I should have talked about how TV and movie content is authored to avoid judder (“judder” is a cinematographer term), but the post was already long, and I figured I’d discuss it next time, when judder was the focus. You’re absolutely right, of course.

I have an idea: create a virtual reality for every roller coaster. Make it all go in sync with the roller coaster as it rolls while a person is riding it. It would be possible to add all kinds of surroundings in the virtual reality. This would make riding a roller coaster a lot different and probably more fun.

Thanks! Of course, this makes John’s point very nicely. 20 years ago you would have sent me a letter with the citation, or if I was very lucky you would have sent a copy of the paper. Either way, I wouldn’t have gotten any feedback for months after I wrote the article, and unless you sent me a copy I would have had to go to a library and put in an interlibrary request and wait days or weeks more.

I just wanted to say thanks for posting the first part of this article, some time ago I was writing a series about a language interpreter I started writing for a *nix game engine I made for the fun of it during college. I stopped after posting the first few articles on my blog because I thought no one was interested in my thoughts on the matter, after reading this I’m starting to think it’s worth it to give it one more try and see if I can get them somewhere.
If I ever manage to finish it, I promise to give the first game I make with it for free in return.

Actually, less than 30%, but I take your point. Then again, the previous poster thanked me for the opening, so opinions vary. I thought about making a post with just the opening, but I don’t feel comfortable having posts with no technical meat.

Of course, you can just skip over the openings; it’s always clear where the opening ends and the technical discussion begins. But I agree this one was long, and I will keep your comment in mind in the future.

Color fringing could be solved with current display technology by rapid micro-oscillation of the display synchronized to the speed of eye movement. E.g., for the time a pixel is displayed, have the screen move in the direction the eye is tracking, then shift it back during a blank period. Note the speed would need to precisely match eye movement. It seems it would be tremendously difficult to have such precise eye tracking, though?

Yes, that came up in an earlier post, and is a clever idea, but has potential drawbacks that I’ll discuss when I get to zero persistence and strobing. And as you note it would be hard to actually implement this.

The majority of people do seem to acclimate over time. Some don’t need any acclimation, and some never do acclimate, but most get comfortable eventually.

However, there’s another level of VR, and that’s true immersion, to the extent that you feel like you’re in an actual place. I can’t describe it, but you know it when you experience it. It requires really good tracking and wide FOV as a foundation, and lack of judder and high resolution help a lot. Unfortunately, there’s nothing you can buy that really comes close to delivering on the required functionality.

Once acclimatized on a non-true-immersion device, is it then on par with ‘true immersion’? (E.g. the Rift; a lot of anecdotes mention that people feel as if they are in the room.)
If you try a VR HMD that is technically better, is ‘true immersion’ just a relative experience (relative to the best VR hardware)?
Or is true immersion just something that you don’t need to be acclimated to?

also, is ‘true immersion’ contingent on vr environment being lifelike?

Great question. Opinions vary wrt acclimatizing. Mark Bolas, who has decades of experience, says the research shows people just get used to it and become immersed. And obviously a lot of people are experiencing immersion with the Rift. However, my experience is that an HMD that is technically better – with tracking that supports translation, for starters – is considerably more immersive. Pretty clearly there’s a spectrum of immersion; the really interesting question is whether there’s a knee in the curve where the experience jumps to another level. I don’t know the answer to that yet.

Also in my experience, a lifelike environment is not a requirement for a strong sense that you are in a “real” place.

Would this be visually similar to the effect I see through the edges of my glasses? I am fairly heavily myopic, and when I look at high-contrast things (whether images on a computer screen or sunlight and shadow) out of the extremities of my lenses, I can see some red/blue smearing. I hardly notice it anymore, but when I first jumped up to a prescription/medium that made it evident, it was often quite distracting.

I’m not sure what you’re referring to – color fringing? The post you replied to didn’t give me any clue. Assuming it’s color fringing, yes, it’s somewhat similar to the effect you refer to, but it only happens when the eyes move relative to the display, and then it happens to all the pixels on the screen.

I think a closer approximation to this would be when you go to a movie theater and around white lettering you see a blue fringe on one side and red on the other. In a movie scenario, you notice this effect around bright letters/edges. In a VR situation, you would see it around objects trying to stay “static” in contrast to your moving head/eyes.

It’s hard to get a wide FOV with them in a comfortably wearable package, but they’re a possibility. We did a prototype with one that in some ways was the most realistic VR I’ve seen, although that implementation had many limitations. They are zero persistence, which solves judder completely, but introduces strobing issues, as I’ll discuss in the next post or two.

Hello again Michael, have you considered mechanically moving the display using eye tracking so that the display is always oriented to the position the eye is looking at (the center of each display always aligned with the center of each eye)? You could get away with smaller displays that way, but they would need to be closer to the eye and have a higher resolution. Plus anything that requires mechanical moving parts would be prone to malfunction; I still can’t find a reliable printer, for example.

That would be very cool, but the mechanical part seems really hard. Not to mention it could produce odd sensations as the HMD moved around, plus noises to match. Also, getting the screen closer means a shorter focal length lens, which creates more optical aberrations.

Abrash, I have a post for you to read: http://www.oculusrift.com/viewtopic.php?f=2&t=1904 It is critical of you, but I hope you can see the constructive criticism. I really do believe in the sharing of information that you claim is very important, and I am really saddened by Palmer and Iribe that they took an OPEN SOURCE HMD closed. You have censored me before, but maybe you can do some good putting Carmack in touch with someone.

I just met with the former director of NASA, Kennedy Space Center, and made sure to give him an earful of Carmack’s recent tweets about NASA corruption, the “orphans of Apollo”, and the recent silliness at NTRS. We were together in Shanghai complaining about the problems of China not working with the US space program. He said he was no NASA fanboy even though he was a former director, and to take the problems up with the current State Department; they were the ones tying the hands of himself and NASA and other aerospace companies. Kerry lost the presidency because of his “global test” debate talking points, so certainly the current State Department could do better things with Kerry at the helm.

http://www.weneedourspace.com/ Jim Kennedy, his personal email is JimandBernie@cfl.rr.com I think you should pass that along to Carmack and maybe those 2 can get together and help bring about some needed change. Kennedy is a progressive guy, good friends with Les Johnson at Marshall, I am sure he would love to talk to Carmack about changing some policies. Share some information dude! LOL!


I remember in the early days of MTBS3D, looking over Paul Bourke’s data, some people were wondering about small spherical displays over each eyeball. Perhaps if you move away from flat display screens to spherical displays like Bourke uses, but instead of a large 9-foot dome use 2 small domes that fit over each eye, it could help with some issues. I think I remember cyberreality talking about pico projectors projecting the image into 2 small spherical bowls that would fit over the eyes. The pico projectors would have tiny fisheye lenses to match up with the 2 half domes you would be placing over each eyeball.

A big question is how you get a high-quality image onto the spheres. Front projection is difficult and would have to be off-axis, and rear projection would require a translucent surface with image quality better than any I’ve seen.

They are using a special LCD that alters the luminance channel, making everything much clearer and the colors much better when projected on curved or other exotic surfaces.

Also the technology helps if you have a very short throw projector and sit very close and want to take care of the “screen door” effect that you see now in most settings. They slightly defocus the image to remove the “screen door”, then using their technology get most of the clarity and resolution back, giving you an image almost as good as the original but without any screen door! Practical application of this: think about the Gundam POD game pictures posted earlier in this thread, a short throw projector, and the user sitting very close to the dome screen; this will make it all so much more wonderfully beautiful!

I’m a bit confused about what limitations are specific to raster-scan displays.
If I were to break one such display into multiple smaller ones (each rendering a fraction of the frame buffer simultaneously), what would improve?
At the limit, all the pixels would be rendering at once.
That wouldn’t solve the color fringing issue in any way, right?

PS: This is off-topic.
Since I got the Rift, I’ve been wondering about using VR goggles to render static stereoscopic 3D views from the real world, taken from a fixed position but with 360 head-tracking intact (rotations and tilt).
It’s easy to take a 360 spherical 2D image of the world and map it in Rift, but then it’s flat (a bit like this implementation of StreetView on Rift: http://oculusstreetview.eu.pn/?lat=44.301996&lng=9.211584000000016&q=3&s=false&heading=0 ).
Maybe one solution could be to take a lot of 2D stereo photos (dual camera setup) at different angles (yaw, pitch) covering 360 degrees, and interpolate between them in real time based on head position to compute an approximation of each eye’s point of view?

You are correct, that wouldn’t help color fringing at all. Color fringing is a temporal per-pixel artifact.

360 rendering in general is an interesting possibility. If you don’t move, there shouldn’t be any problem with flatness, and that case seems readily implementable. However, the case where your head moves and the view changes accordingly is definitely hard.
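One minimal way to structure the interpolation the commenter proposes is to pick the two captured headings that bracket the current head yaw and cross-fade between them. This is only a sketch, assuming evenly spaced captures; a real implementation would need per-pixel reprojection, not just image blending:

```python
def nearest_pair(head_yaw_deg, capture_yaws_deg):
    """Pick the two captured camera headings that bracket the current head
    yaw, plus a blend weight, so a renderer could cross-fade between the
    two stereo photo pairs as a rough approximation of the in-between view.
    Assumes captures are evenly spaced around the full 360 degrees."""
    yaws = sorted(capture_yaws_deg)
    step = 360.0 / len(yaws)
    h = head_yaw_deg % 360.0
    i = int(h // step) % len(yaws)
    j = (i + 1) % len(yaws)
    w = (h - yaws[i]) / step  # 0 at yaws[i], approaching 1 at yaws[j]
    return yaws[i], yaws[j], w
```

For example, with captures every 90 degrees, a head yaw of 10 degrees blends the 0-degree and 90-degree pairs with a small weight toward the latter.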

Abrash, it seems from what I have read about Carmack’s warping in his Oculus prototype tests that he was just extending the FOV in the way that strlen criticizes.
Can you share some information and illuminate us: how robust is your warping code, how adaptive is it, etc.?

Yes, rendering four 90-degree perspectives and combining them will create a lot of work for the hardware to do (and may be beyond hardware for the next few years), but if you guys will think to the future, what about all of us that want to play full-FOV Valve games in home domes? Or people who want to play them projected onto a wall, the corner of a room, a panoramic view, or a full sphere? Or any non-standard shape or surface? Don’t send us down a path that limits our options to use your software or hardware on whatever display solutions we want to use.

If you don’t set the trend now to make sure developers are on board with this kind of flexibility, then we may have to reinvent the wheel all over again. I already see developers hardcoding the warp to the Oculus, so inflexible and going against what strlen and Bourke warned about.

I’m not sure what the question is. When you display a scene on an HMD, you have to put the pixels in the right place. Period. You don’t get to choose between fisheye or not, or different FOVs – it just has to match the real world. Which is what the warping does.

The question was how are you doing your warping? Are you just extending the FOV or are you doing it the way Strlen says:

Strlen here says you must take six 90-degree-FOV renderings and combine them; Bourke says you can do it with four combined views. (That is why it would take massive amounts of hardware power.)

http://strlen.com/gfxengine/fisheyequake/index.html

How?
This version of quake renders 6 views of exactly fov 90 in each direction, then uses a table to transform these pixels to a single view according to fisheye projection. I initially made this hack using the quake 1.01 source code for linux (illegally obtained), and later ported it to the GPL-ed win32 code released by id software.

Why bother?
Personally I just love fisheye and high fovs for the speed of movement sensation they give and the great graphics, the cheating aspect of it has never interested me. But in general a high fov can help your game, as you can see more, and in fights you are disoriented less. Sadly, the standard flat projection in quake gives a “stretching” distortion that gets unbearable as fov gets really high. Fisheye projection doesn’t have this problem. See this comparison chart to see why: one of the most stunning conclusions you can draw from it is that, if you’re used to playing with fov 120, you essentially get fov 180 fisheye for “free”, as objects in the middle of your screen (what you aim at) are the same size still (!).
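The cubemap-to-fisheye lookup the quote describes boils down to computing, for each output pixel, which 3D direction to sample from the rendered 90-degree views. A sketch using an equidistant fisheye model (the original hack bakes this into a precomputed table for speed):

```python
import math

def fisheye_dir(px, py, half_size, fov_deg=180.0):
    """For an output fisheye pixel (px, py) measured from the image center,
    return the 3D view direction it samples, under an equidistant fisheye
    projection.  A renderer would then look this direction up in the 4-6
    rendered 90-degree views (a cubemap), as the fisheye Quake hack does."""
    r = math.hypot(px, py) / half_size          # 0 at center, 1 at the rim
    theta = r * math.radians(fov_deg) / 2.0     # angle off the view axis
    phi = math.atan2(py, px)                    # direction around the axis
    sin_t = math.sin(theta)
    return (sin_t * math.cos(phi), sin_t * math.sin(phi), math.cos(theta))
```

The center pixel looks straight ahead; at 180-degree FOV, pixels on the rim look exactly 90 degrees to the side, which a single linear-perspective view could never cover.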

For me, say I buy half life 4 or whatever, and you guys have developed it with the rift in mind, so FOV 110, but say I also have a 270 degree FOV device like the wide 5 HMD (which palmer also has) It is going to look AWFUL to just extend the FOV of your half life 4 game to 270 FOV unless I do the rendering of 4 views as bourke says or 6 views as Strlen says. Or say I have a dome, I get tired of playing the rift at 110 FOV, and decide I want to finish the game on my dome at 180 degree FOV, if you aren’t thinking outside the box, you will make this hard on me.

http://www.elumenati.com/products/software/omnity/ Here are people doing non standard display cameras with a unity3d plugin (for projection onto domes or any non standard display – should mesh well with rift 110 fov as well. I am not sure if they are doing what strlen or bourke advocated though with rendering 4 or 6 views and combining them though.)

I still don’t get it. Every pixel in an HMD has to map to the right place relative to the real world. There’s no reason to do multiple views; just render one view that covers the FOV plus a little, then warp it to correct for the lens distortion. If there was no lens distortion, no warping would be needed. You can render at a higher resolution and filter down in the warp if desired to provide better filtering. Next frame you render again. As far as I can see, that’s the whole deal.
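The warp step can be sketched as a radial polynomial remap per pixel. The coefficients below are made-up illustrative values, not any headset's calibration, and the direction of the mapping (render target to source texture) is one common convention:

```python
def warp_radial(u, v, k1=0.22, k2=0.24):
    """Map a lens-centered output coordinate (u, v) to the source-texture
    coordinate to sample, using a polynomial radial-distortion model of
    the kind commonly used for HMD lens correction.  k1 and k2 are
    illustrative values only."""
    r2 = u * u + v * v
    scale = 1.0 + k1 * r2 + k2 * r2 * r2
    return u * scale, v * scale
```

The center of the lens is untouched, and the remap grows with distance from the center, which is what cancels the pincushion distortion a simple magnifier lens introduces.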

How would you go about rendering extreme FOVs in a single view with a linear perspective transformation? Nearing 180 degrees seems problematic, and over 180 impossible, as far as I know.
I don’t really see this being useful for HMDs since there’s no point in extending the FOV beyond the reach of the eyes. But there might be indeed other immersive devices, such as the mentioned domes, that do have a use for this.
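The divergence is easy to quantify: the image plane a linear-perspective projection needs grows as tan(FOV/2), which blows up at 180 degrees. A minimal sketch:

```python
import math

def image_plane_halfwidth(fov_deg):
    """Half-width of the image plane at unit focal distance for a linear
    perspective projection with the given horizontal FOV.  It diverges as
    the FOV approaches 180 degrees, which is why single-view rendering
    breaks down there and multi-view or fisheye projections take over."""
    return math.tan(math.radians(fov_deg) / 2.0)
```

Going from 90 degrees (half-width 1.0) to 170 degrees multiplies the required plane width by more than eleven, and the pixel density at the edges becomes wildly non-uniform well before that.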

Sure, that’s true. I’ll be happy to worry about that when we get 180-degree HMDs. Any such HMD would have to have multiple screens or a curved screen, and in that case, the same principle would still hold – draw every pixel in the right place relative to the real world – so whatever warping would produce that would be the correct one.

“just render one view that covers the FOV plus a little, then warp it to correct for the lens distortion.”

Ok, but my worry is that this will be inflexible in future games; the developers will make a few choices and leave us end users out in the cold. Let’s say you design your game for a 110-FOV Rift and several resolutions on a flat screen monitor or equivalent, but I own a 180-FOV dome, a 270-FOV specialized HMD, and a 360-FOV Barco sphere system. I want to play your game on all devices, but if the game is designed to only render one view at 110 and warp for the Rift, then how do you suggest I make it work for my 180-FOV dome or 270-FOV custom HMD or 360-FOV Barco sphere? (Without turning to all these piecemeal 3rd-party solutions where things don’t work so well?) Will I be able to go into the Half-Life 4 display options and choose which FOV I want and then choose which warping I require? Or is that something that I will not be able to select because you only have one FOV and warp option? This was the issue I had with the iz3d people; they kept telling me there was no reason to develop their HMD drivers to do anything extra, because there were no dome systems out there or anyone with 270-FOV custom HMDs, so it was the chicken/egg problem. I am certain Valve is way ahead of those guys, though, and will give the users the flexibility to choose what FOV they need and also what warping they need to play the games on anything from a flat screen, to a wide-FOV HMD, to a dome, and even perhaps that Barco 360-FOV sphere system. http://forum.iz3d.com/viewtopic.php?t=1367&highlight=dome

Agreed that this would need to be solved for 180+ FOV domes. However, it’s just not a factor at any FOV achievable in a consumer HMD, so there’s no reason to solve it. It’s not even clear yet whether a real consumer VR HMD market will emerge in the near future or not, and it’d be hard for a game developer to justify investing a lot of time in that; there’s no way any game developer could justify investing time in adding a capability that’s only useful for domes or 180+ HMDs, a market for which isn’t even on the horizon. Anyone can take the Doom 3 source code and modify it to render however they want, so there’s nothing stopping dome enthusiasts from solving this for themselves.

“it’d be hard for a game developer to justify investing a lot of time in that; there’s no way any game developer could justify investing time in adding a capability that’s only useful for domes or 180+ HMDs, a market for which isn’t even on the horizon. Anyone can take the Doom 3 source code and modify it to render however they want, so there’s nothing stopping dome enthusiasts from solving this for themselves.”

Very true, and I see your points, but here are mine in return, from the nobody I am (except for leading Palmer to Eric Howlett 5 years ago, when you and Carmack were doing what?).

There was NO content for a Rift 110-FOV device; Palmer made the device, Carmack came to Palmer, and here we all are today. You and I are not going to live forever. I would like to see as rapid a technical revolution as possible so we can all enjoy it, and the resources in money or whatever you could do to foster this tech, which relatively speaking has to be not too much for a big company like Valve, could provide content options that would spur millions of people/developers/engineers, if not more, to develop the hardware to make your wide-FOV software options playable. Frankly, and people like you and Carmack are heroes to many of us, I find your negative answer about chicken/egg issues beneath your vision and capability, and that same kind of thinking is why IZ3D is nothing now. Just do it! Abrash, build it and they will come.

http://www.youtube.com/watch?v=3eb1w7nYZQs#t=02m03s Here is an Oliver Stone series about VR from 1993, Wild Palms, starring Belushi and other famous actors; William Gibson of Neuromancer also appears in a cameo. This particular clip at 2m03s has Belushi trying on his VR for the first time and getting nausea. How prescient! LOL! It was a fascinating series about the development and control of VR and AR type stuff. It seems you can watch it all for free on YouTube.

An Alan Watts piece on VR that would probably make a good adaptation to an HMD device, even though this clip was originally made for a full-FOV dome.

“So from a purely selfish perspective, sharing information is one of the best investments you can make”

I LOVE the duplicity in this statement. Truly, having the plethora of monolithic volumes of information at our very fingertips ultimately leads to a loss of appreciation for the effort in putting in the legwork yourself to reach those same conclusions. It is often said that we stand on the shoulders of those who have come before us. To hold onto information that may be exclusively known to us simply cuts the heels of that information’s potential.

As Valve is commonly known for, when that information can be shared among others and run through the cognitive gauntlet that the collaborative effort affords us, the return on our investment in the long run is usually exponential. It’s not that one chooses to ignore tradition; in fact, we use tradition as the basis. But to adopt an out-of-the-box mentality for your thought process, to allow yourself to approach solutions from not only the vectors you can think of, but the collective sum that a team can come up with; it really is no wonder why the saying “Two heads are better than one” rings as true today as the day it was coined.

From my position, I can only see, from the outside in, on how such a mindset makes returns on your dividends. Sadly, I cannot say that under my current employ that we look to making any out-of-the-box analysis on even the most pedantic of day to day problems. Instead, management seems complacent with traditional thinking and, as such, has left us with a resolve as effective as it is innovative. Be we mired down in politics, red tape, or any other verbiage which equates to little more than an excuse, we are doomed to complacency and a reactive problem-solving process rather than a proactive one.

I had spoken with Al Farnsworth a short while back through email after hearing and seeing Valve’s web-related postings on their official website. How paradoxically and unequivocally mind-shattering it must be to take a career one is genuinely interested in, and then to break all preconceived notions of how that field operates simply by applying a collaborative, out-of-the-box work ethic and mentality to it.

They say the grass is always greener on the other side. But from where I’m standing, Valve’s lawn must be introducing new spectrums of green day in and day out. And as for those feelings of pride you get from hearing that others have learned from the knowledge you have procured, I think that is as humble and definitive a pat on the back as one can expect. None of us can do it alone, but what we can achieve when we share information and work together really is both impressive and exponential.

Coming in from a more philosophical angle on VR – as long as we are aware we are in VR, it’ll never seem wholly real to us.

Of course, in the next few decades, as VR development progresses to its (hopefully) inevitable zenith, this will simply mean artifacts created by technology that isn’t quite up to snuff being telltale signs of the fact that we’re not experiencing a real thing (i.e. the visual motion artifacts discussed in this article).

But even when we progress to a matrix quality VR, where without knowledge, we would otherwise think we’re in reality*, as long as we maintain the ability to transition between VR and reality in a relatively smooth manner (i.e. taking off headset/jack, or using an exit VR simulation menu command or some such), we simply won’t quite be able to replicate the sensation of fear and stress (and indeed, all the positive emotions such as relief that would stem from that) that exists in more real world situations.

Having said that… I wonder if there’s a way of determining the efficacy of the problems we’re working on in terms of their immersion factor. We may identify many problems that have an impact on the perceptual reality of VR, but which problem is most pressing? And how far can we push it before we reach a point of diminishing returns, whereupon it becomes more efficient to move on to the next problem?

I wonder how much a high-end VR system that could simulate lower-end VR systems, for the explicit purpose of this sort of testing, would cost. Is such a thing even physically achievable at this point in time with our technology? Massive resolution and FOV, low latency and high refresh, etc.

And certainly, there’s only so far we can push the visual side of it before we start to bump sharply into diminishing returns. I mean, when immersion can purportedly be increased simply by standing in a VR experience where the avatar is also standing (or in an upright position)… you know that there’s a proverbial immersion gold mine to explore when coming at the overall problem of VR from that trajectory.

*On a side note – if low quality VR was all our brains were ever aware of – it would seem as though that was indeed the nature of reality itself. I mean, that’s what the whole allegory of the cave was about anyway.

I'll post here whenever there's something about what I'm doing or about Valve that seems worth sharing. The initial post is an unusual one - it's long, my attempt to distill the experience of my first year and a half at Valve - but I think it's well worth reading to understand what I'm doing, why I'm doing it, and the context in which it's happening, and just to understand more about Valve in general.

Michael Abrash is the author of several books, including Zen of Code Optimization and Michael Abrash's Graphics Programming Black Book, and has written columns on graphics and performance programming for several magazines, including Dr. Dobb's Journal and PC Techniques. He was the GDI programming lead for the original version of Windows NT, coauthored Quake at Id Software with John Carmack, and worked on the first two versions of Xbox. He is currently working on R&D projects, including wearable computing, at Valve. He can be reached here.