Stereo vs Mono 360° Video for VR

Submitted by Matt Rowell on Mon, 02/09/2015 - 1:25pm

Since we began giving demonstrations of our content on the Samsung Gear VR, we’ve noticed that a lot of people don’t really understand the difference between what’s stereoscopic 3D and what isn’t. Because of the nature of our content, everything in our library to date is monoscopic 360 video. In many situations, stereoscopic 3D in VR can add a lot of coolness factor to a production, but for some use cases it doesn’t work very well.

What is the Difference?

A standard 360 video is just a flat equirectangular video displayed on a sphere. Think of it like the face of a world map on a globe, except that in VR your head is on the inside of the globe, looking at the inner surface. As you move, the head tracking on your device moves the view with you, giving you the feeling that you are inside the scene.
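That globe mapping can be written down directly. As an illustrative sketch (not code from any particular player), here is how an equirectangular texture coordinate maps to a direction on the viewing sphere:

```python
import math

def equirect_to_direction(u, v):
    """Map equirectangular texture coordinates (u, v in [0, 1]) to a
    unit 3D direction on the viewing sphere. u spans longitude
    (-180 deg to +180 deg), v spans latitude (+90 deg down to -90 deg)."""
    lon = (u - 0.5) * 2.0 * math.pi   # -pi .. +pi
    lat = (0.5 - v) * math.pi         # +pi/2 .. -pi/2
    x = math.cos(lat) * math.sin(lon)
    y = math.sin(lat)
    z = math.cos(lat) * math.cos(lon)
    return (x, y, z)

# The centre of the frame (u=0.5, v=0.5) maps straight ahead:
print(equirect_to_direction(0.5, 0.5))  # -> (0.0, 0.0, 1.0)
```

The player effectively runs this in reverse for every pixel of the display, which is why the flat video feels like a scene wrapped all the way around you.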

Stereoscopic 3D can add another level of immersion by adding depth between the foreground and background. Your favorite 3D blockbuster films are typically shot with two lenses side by side, giving each eye a slightly different vantage point. Like any production technique, this can look strange if poorly implemented, or absolutely amazing if done right.

The Challenge of Stereoscopic 3D in VR

With stereoscopic 3D in VR, that depth information has to be overlaid and mapped to a sphere. Because of parallax between cameras, this can be especially challenging. Any minor flaws or “stitch seams” in the footage are magnified in 3D, and sometimes anomalies occur in different places per eye - which makes it uncomfortable to watch.

Poorly implemented 3Dx360 video footage can cause a great deal of discomfort to the viewer, including headaches, eye strain or nausea. Beyond the physical, there can also be production quality issues. Objects and people in poorly implemented stereoscopic 3D can look like crude cardboard cutouts, and chromatic aberration becomes very apparent. Chromatic aberration is that magenta or green “fringe” you see on the edges of objects through some lenses; it’s an optical flaw that we find is much more noticeable in stereoscopic 3D.

Perfecting 3D for VR by Limiting Variables

Flaws in stereoscopic 3Dx360 video can be avoided by shooting in a controlled environment. If your actors or subjects are instructed to remain a specific distance from the camera and stay in the same quadrant of the shot, you avoid having them cross over the stitching seams. If the camera doesn’t move, a static scene can serve as a backdrop for the 360 itself, while the 3D subjects are composited into the footage. And of course you could also rotoscope every single frame to remove anomalies.

So typically the best 3D video content we see for VR is very controlled and static. Our favorite stereoscopic 3D for VR came with the release of the Samsung Gear VR, from Felix & Paul Studios. The majority of those shots are much the same, but they are very compelling - the best, most comfortable 3Dx360 we’ve seen in VR. Could new camera technology help make shots like these dynamic, with motion?

Because of these limitations, in my opinion 3Dx360 is not an ideal fit for news gathering, live events, sports, extreme sports or any situation with lots of variables and moving parts. You’re going to make people uncomfortable trying to watch content like this in 3D. Some producers simply limit the FOV by shooting a 180x120 shot with what appear to be two forward-facing cameras with 180° lenses. In this case, you have some limited head tracking to look around, but you can very easily find the boundaries.

A lot of the content in 360 Labs’ video library would not be possible in stereoscopic 3D, so we focus on making our monoscopic 360 shots as high quality as possible - at least until we can perfect a shot with motion and variables in quality 3D. But perhaps this won’t be good enough until the next generation of displays comes out.

The Case for Monoscopic

With a monoscopic video, you’re getting more resolution out of the device because you’re not having to stack the left and right channels (or top and bottom channels) within the phone’s limited resolution. Oculus CTO John Carmack explains, “People that are resolution-picky will probably prefer monoscopic videos, which can have twice the resolution of stereo videos. The stereo effect may not be worth anything to you if you can't get past the blurring.”
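Carmack’s “twice the resolution” point is simple arithmetic. As a sketch, assume a hypothetical 3840x1920 video frame with top-bottom stereo packing (the frame size is an example, not a spec of any particular device):

```python
# Illustrative arithmetic only: the frame size below is a hypothetical
# example, not the spec of any particular phone or headset.
frame_w, frame_h = 3840, 1920   # total pixels the device can decode

# Monoscopic: the whole frame is one equirectangular image.
mono_per_eye = frame_w * frame_h

# Top-bottom stereo: left and right eyes each get half the height.
stereo_per_eye = frame_w * (frame_h // 2)

print(mono_per_eye // stereo_per_eye)  # -> 2, i.e. twice the pixels per eye
```

Side-by-side packing halves the width instead of the height, but the per-eye pixel budget works out the same.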

As I’ve already mentioned, a lot of our viewers get confused about the difference between true stereoscopic 3D and monoscopic footage. Often, they’re fooled into thinking they’re looking at a 3D video. In a way, it is 3D because it’s projected onto a sphere, but it has no depth information between background and foreground. But after hundreds of tests and demos, I don’t think the general public even cares.

Most of the time, our goal at 360 Labs with virtual reality is to capture real life experiences. But how real is the experience when every shot is completely static and you have to instruct your subjects on where they can stand and where they can move? This doesn’t happen in the middle of a raft on the Colorado River in the Grand Canyon. This doesn’t happen on the back of a kiteboarder, or speed flying down a mountain.

While I’m sure that we’ll continue to research and test stereoscopic live action 3D footage for VR, for now it really doesn’t make sense for the majority of our projects. We want to see the beautiful places we go to in as much resolution as possible. But who knows what the future holds; displays, resolutions, processing power and bandwidth will only get better. We look forward to testing the future of VR!

Comments

To get the high quality needed, 360° S3D rigs need to have genlocked cameras every time. With GoPro it is not possible to genlock the Hero4 cameras; you can only genlock pairs of Hero3 cameras.

If you move an S3D camera rig you need to make sure the rig stays level. The left and right cameras must stay horizontal relative to each other.

It is hard to get a good stitch with 360° video, and many times harder to get a good stitch in stereo.

Where S3D shines over mono is when the camera cannot move. By shooting in stereo, the scene is more immersive; without a moving camera, the sense of depth in the mono version would be limited. Some of the problems of not having genlocked cameras also diminish when the camera rig is stationary.

Jim, as always, you make so much sense. Please write a book :) Although you're probably too busy making another cool camera invention. You have been one of the most helpful and generous people in the 360 Video sphere (Yes, that was a pun.) Love your work.

At present both 2D and 3D spherical video suffer from technical defects due to imperfect camera sync and incomplete parallax correction during stitching -- the defects are just a lot more noticeable in 3D. When these problems are overcome, there should be no reason to restrict camera or actor movements.

Hi, cool thread! I agree with every part of it; 3D is not necessary for many shots, and 180 is enough for many other kinds of videos, so we can have comfortable stereography.
I'm trying to understand what kind of rig is used for these videos:
virtualrealporn
They are beautiful, and they seem to be made with something like the RED stereoscopic setup shown in your article. What do you guys think about those?
Thanks! Ciao!

I think most likely they are using a stereo rig like the one in the picture with the REDs. I believe NextVR had a similar setup in a project they did with Coldplay that's available via Samsung Gear VR. It's a 180° FOV with a decent-sized nadir and zenith cap.

Good article, that pretty much sums up the issues.
Shameless self promotion ahead...
We have released a demo of a musical performance that uses 180-3D video: www.sensorium.xyz
We consistently get asked why we don't let the viewer see what is behind them. There are lots of answers, but the most compelling is that we want to create narratives for the technology, and this will always involve directing the view toward the point of interest. Also, looking behind yourself at a concert is interesting for a second, then you want to focus on the band - unless you're that creepy guy looking at my girlfriend.
The other reason is bandwidth, as the view behind, which is largely redundant, doubles the file size and the number of pixels you need to push.
Another reason to first use 180-3D is that, after considerable research and experimentation, 360-3D video is just really difficult to watch, even if the blending is almost perfect, as the eyes need to work very hard to focus on elements in the scene.
With 180-3D we can estimate what part of the view the audience will be focusing on and make that the most comfortable zone to look at.
You don't need to release 4K 360-3D video clips to create a totally engaging and immersive VR experience.

Why do people blame genlock for bad 3D? In the professional S3D world, genlock is a given parameter; you don't shoot with a camera that doesn't have genlock capability. What is always missing in all the information available on S3D for 360 video is that it's practically impossible to do 360 S3D live video with anything currently available. The simple reason is cornerstoning: the wider the FOV, the more prominent the cornerstoning will be.
Each stereo camera pair has a sweet spot for perfect 3D; the rest of the surroundings are flat or distorted. So when you stitch all the S3D pairs rigged for 360, you will have sweet spots for 3D only in the center of each pair. Simply pairing stereo cameras facing in all directions cannot make a 360 S3D image, let alone video.

Firstly, with current 360 setups there are always weird seams at the overlapped stitch, because all the cameras are placed away from the nodal point of the sensor, causing parallax between each camera. No one has made any rig that perfects the edge blending without any post-production need, except Silicon Imaging with their 360 mirrored rig. Each camera is placed exactly at the centre using mirrors, creating a truly "seamless" video that doesn't need edge blending or seam correcting.
To summarise: seamless 360 S3D live video is not possible with any of the physical camera rigs available. But yes, it's possible to do in the CGI world, where a stereo camera renders everything in real time on the fly. It's possible in simulators, games and 3D engines.

With only a pair of cameras, you are right, and there is only one sweet spot. However, when you have a stereo pair every 40-60 degrees, you create something called omnistereo (your multiple sweet spots), and that works very well. Theoretically it isn't correct, but it probably never will be. However, it works so well in our cylindrical projection system that nobody notices that the stereo is not 100% correct inside the overlapping region for the different projectors.

Current graphics hardware is very capable of correcting basic lens distortions in real time to help generate good live video.

Stitching can be very good, but it has its limitations. But it is perhaps worth mentioning that you don't have to stitch the video. If the sender knows in which direction you are looking, it can just send you the video stream of the pair of cameras that most closely matches your viewing direction, provided that camera pair's field of view covers at least the field of view of your head-mounted display. Or you could send all the data, non-stitched, and let the receiver figure out which of the camera pairs to use. That would increase the bandwidth substantially, but I could envision a special, very efficient compression, as the overlapping areas show almost the same content. Switching viewing pairs shouldn't be an instant action, but blended a little in time. So rotating your head might show a tiny blur for a very short moment, but once rotated, the stereo will be crisp and clear without stitching errors. At least that's my idea.
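The pair-selection step of that idea is easy to sketch. This is an illustrative Python sketch assuming a hypothetical rig with one stereo pair every 60° of yaw; the function name and rig layout are assumptions, not any real streaming protocol:

```python
# Hypothetical rig: six stereo pairs, one facing every 60 degrees of yaw.
PAIR_YAWS = [i * 60.0 for i in range(6)]   # 0, 60, ..., 300 degrees

def best_pair(head_yaw_deg):
    """Pick the index of the camera pair whose facing direction is
    closest to the viewer's yaw, measuring distance around the circle
    so that 350 degrees is treated as near 0 degrees."""
    def ang_dist(a, b):
        d = abs(a - b) % 360.0
        return min(d, 360.0 - d)
    return min(range(len(PAIR_YAWS)),
               key=lambda i: ang_dist(head_yaw_deg, PAIR_YAWS[i]))

print(best_pair(10.0))    # -> 0 (pair facing 0 degrees)
print(best_pair(350.0))   # -> 0 (wrap-around case)
print(best_pair(95.0))    # -> 2 (pair facing 120 degrees)
```

A real system would add the time-blended switching the comment describes, so the stream doesn't pop when the selected pair changes.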

You mentioned a 360 mirrored rig. I can understand that this creates a seamless 360 image/video, but how could that be used to create stereo in all directions? You need the parallax for 3D.

A perfect 360 S3D will definitely be more immersive than 2D 360 or a normal 3D projection. But the challenge is to achieve it. The only possible way is to use a perfectly node-centered depth-sensing camera for seamless 360 capture, use the depth information to create a 3D spherical mesh with the depth map applied and the 360 texture projected onto it, then use a real-time CGI stereo camera to render the S3D image of whatever FOV the viewer wants to see. This will surely be computationally heavy work. Hopefully some software company has started working in that direction.
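The mesh-building step of that proposal can be sketched in a few lines. This is illustrative Python under the comment's assumptions (a depth value per latitude/longitude sample); the function name and grid sizes are hypothetical:

```python
import math

def displaced_sphere_vertices(depth_map, rows, cols):
    """Build mesh vertices for a sphere whose radius at each point comes
    from a depth map (in metres), as the comment proposes.
    depth_map[r][c] is the scene depth sampled at that lat/long."""
    verts = []
    for r in range(rows):
        lat = math.pi * (0.5 - r / (rows - 1))        # +90 deg .. -90 deg
        for c in range(cols):
            lon = 2.0 * math.pi * (c / cols - 0.5)    # -180 deg .. +180 deg
            d = depth_map[r][c]
            verts.append((d * math.cos(lat) * math.sin(lon),
                          d * math.sin(lat),
                          d * math.cos(lat) * math.cos(lon)))
    return verts

# A constant depth map of 2 m gives an ordinary sphere of radius 2:
verts = displaced_sphere_vertices([[2.0] * 8 for _ in range(5)], 5, 8)
```

Two virtual cameras offset by an eye distance would then render this textured mesh, which is what lets the viewer get correct parallax in any direction they look.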

Your node-centred depth-sensing camera (a laser Z cam, for example) would only work perfectly for one CGI camera (eye). The camera for the other eye would be slightly off and would see around the edges a bit, where no data would have been recorded. Even two Z cams would not solve that problem, because the orientation of the Z cams would not usually match the orientation of the CGI cameras. Good idea, though.

These days, many 360-degree cameras do not provide a full 360-degree view, even though they are advertised as capturing 360 videos or photos. So people get frustrated with those things and forget about the main point you explain here. I like your words.

I'll just have to agree to disagree. Until platforms can display stereoscopic 360 without cutting the resolution in half, I'm just not interested in going that route. Until there's a stereoscopic 360 setup I can strap to an athlete, get wet, get dirty... I don't see the payoff. For most of the footage we shoot - outdoors, action, nature - it doesn't make sense, because so many objects are far away and don't really benefit from stereo 3D. Or everything is moving by so fast you wouldn't notice anyway. It's also worth mentioning that most 360 video content being consumed today is on mobile devices without a headset. I suppose if we did more narratives in a studio or indoors, we'd be more concerned about it. What is your preferred stereo 360 setup?

I got some Google Cardboard-compatible "glasses" (I insert my iPhone, etc.), but it seems I can only view 3D videos or play 3D VR games. I can't use them to view MONO things like 2D 360 videos or play around with augmented reality. If I try to view 2D content (a simple iPhone screen), my eyes strain because the glasses are made to give the two eyes separate, different images. Is this normal?
What glasses should I buy to view plain, simple 2D 360 videos and - that's what I want - play around with augmented reality (for now I can only see AR through my iPhone or iPad)?
As far as I understand, I need glasses that let you focus very near BUT don't expect a double L/R image, just one. Any ideas?

Don't confuse 3D content with the means of viewing VR content. In both cases each eye will expect to see its own version of the video. If you are using Cardboard to view a monoscopic video, you must make sure your player goes into "cardboard" mode, where it splits the screen and shows exactly the same thing in both halves. You shouldn't be trying to view a single movie image displayed across the full width of your phone through two separate lenses. Adjusting how the VR player interprets the video is a standard feature of most players.

I need to shoot a real estate tour of a house with a realtor touring the place. The camera will be on a tripod, and the pivot point will NOT walk with the realtor but will be moved to certain points in the house. The realtor will be looking into the camera and will have slight movement, as humans do.
I was planning to use a GoPro Omni for this, but after reading this article, it seems I should buy an S3D setup.
The users will be viewing on flat screens vs. VR headsets at about a 4:1 ratio.
The budget is under $6000
Any recommendations?

The difficulty with stereo 3D 360 for realty is going to be stitching in tight areas. Although if the majority of the house is static, with the realtor only talking in one quadrant, you'll have a lot of leeway for painting out seams without getting too time-intensive with VFX. Personally, I'm liking 6 genlocked GoPros, either Hero3+ or Hero4, in a tight circle. All in with cameras, mount and sync hardware, it should be around $3000. Consider it like a small GoPro Odyssey camera with full 180° floor-to-ceiling coverage. Minimum distance is going to be around 1.5 to 2 meters for objects near the seams. Something like a Jaunt 1 or a Nokia OZO wouldn't be cost effective to rent and cloud-stitch every single time there's a new property to shoot.

Dear Matt, thank you for your explanation; however, I'm still a little bit in the dark about the difference between stereoscopic and monoscopic visuals. I'm working on a master's thesis on various effects of 360 video, and I'm using the New York Times' "Fight for Falluja" video (it's on YouTube). I would love to know whether you consider this monoscopic or stereoscopic, and why. I think it is monoscopic 360, but I am not sure.

Thanks for commenting. The NY Times' "Fight for Falluja," as it's uploaded to YouTube, is not a stereoscopic 3D 360 video. If it were, it would have a 3D indicator in the quality setting, like this video: https://www.youtube.com/watch?v=a1OoOdTNiUM&t=4s

I will have to agree with Firefox. I have seen a good number of monoscopic 360 VRs. Boring! I would guide you all to download my friend Steve's Android app called "Mars is a Real Place." You can either get it on Google Play or buy it through Oculus. It starts off with interesting Mars hyperstereo views. Hyperstereo means the base distance between the two cameras is greater than 70 mm; hypostereo is less than 70 mm; regular stereo is 65-70 mm apart. If you choose to download this app, it is the very last 360 stereoscopic VR that is worth everything. Talking about such visual impact is worthless - you simply have to experience it :) It is not a video, so it's a still 360 stereo VR.

I have done stereoscopic imagery for educational purposes since 1980. I have collected enough educational stereoscopic imagery to create what I believe is the world's largest database of its type. It's called SEDC (Stereoscopic Educational Database Commission). I founded SEDC in 1996 so that there would be 3D stereoscopic imagery for teachers and students. I'm on Facebook; use SEDC3D in a search. OK, enough of a blurb about the database.

As an educator, I am not about most VR games, etc. I also do not see using 3D stereoscopic 360 VR all the time in classes. However, there are times when I will want the students to look around the scene and take it all in. It might well be just a static scene, but I also want students to be able to look completely around and see, in 3D, things and people moving - i.e., 3D stereoscopic 360 VR video. The reason I use stereoscopic imagery is that every single 3D image gets jammed deep into the brain, so the imagery goes into long-term memory, which is how we get faster and more complete recall, from which we form original ideas and concepts. Nothing else works like static stereoscopic imagery except moving 3D imagery, i.e. 3D videos; and if you want even more, allow the students to look around a scene - 3D 360 VR would have the highest impact.

OK, so there are stitching problems and 3D sweet-spot problems. So what? We still need to march forward and solve them no matter what. Again, when I look at anything flat, like pictures in a textbook or a video (which is also flat), it is boring visually; for educational purposes it is not a complete waste of time, but pretty close :) I can't begin to use the written word here to convince anyone reading this. I have seen incredible things from the past, for instance. You have to see them in 3D.

I will be doing some experiments using two Samsung Gear 360 (2017) cameras. I have found a way to sync them well enough to be basically locked. Now comes the question of whether one camera will see the other - and, if so, can I crop it out, or is it acceptable? Then comes the issue of how to edit the left camera's 360 as well as the right, and also compositing both scenes into a side-by-side format that allows for use in VR headsets and lets the students look around - that is, if there is a strict educational advantage in doing so. Otherwise, if I am doing a frog dissection in my biology class, I only need to show still 3D imagery of the heart, intestine, lungs, kidney, etc. My goal is the acquisition of this imagery into long-term memory, not to have them look around the room and see other students moving around instead of concentrating on the frog :) But there are times when I want them to look around at moving scenes, like a jungle biome for instance, so they can pick up all the nuances - like animals moving in trees.

I am also using the dual Gear 360 camera setup because they only cost $49 each when I bought my two S8s. There is a world of what I don't know. This setup may not work. I have no idea what particular software to use, etc. It's all a big learning curve, but so what? I simply don't have the ability to pay $30k for a multiple-lens or GoPro multiple-camera rig. If two cheap Gear 360s will allow me to even start down this path, either to success or failure, then I will just jump in :) I would therefore rather see messed-up scenes with lower resolution than waste my time with higher-resolution mono 360s. Flat is flat, period!

Thanks for sharing, Andrew! I can certainly see you are passionate about 3Dx360. I certainly agree it's worth experimenting with, we do ourselves quite a bit. I know we'll get to a point where live action stereo 3D 360 will be comfortable for most audiences and easily accessible.

I haven't worked with it, but I have continued to check out new samples as the project progresses at various trade shows. Unfortunately, at the moment it's still giving me a great deal of eye strain when I try to watch. Personally, I'm more impressed with stereo from the Z-cam V1, Jaunt 1 and the Yi Halo. The Kandao Obsidian R and S are showing some promise too.

Hi, I have started using photogrammetry (I used it for a second-year VFX assignment on my degree course), and I had not realised the issues in generating 3D video. Hmmm, genlock? Gosh, I haven't used one of those since I used an Amiga 3000 for video titling, lol (age giveaway). I presume static 3D would be simpler - capturing an interior environment from a fixed location - or would that not give you the room to move about in the VR space?