Binocular Vision

HTC introduced the HTC One (M8) this past week, and there’s been a lot of discussion about its camera, which is quite different from other smartphone cameras. As I’ve written before, we’ll see more of this clever use of multiple sensors, but today let’s discuss what HTC is doing and how it works.

The M8 has two imaging sensors (dark circles in image at right). The primary one is 4mp and provides the image data for the photo. The secondary one is 2.1mp, is offset from the first, and is used only to supplement that image data. What additional data would that be? A binocular-type data set.

Imagine for a moment that only one of your eyes provided the image your brain sees. That’s actually closer to the truth than you might think, as we humans all tend to have a dominant eye. So your brain is processing the image data from that eye. The other eye is providing a second set of data that is offset from the first, and your brain is using that to create depth information. This is binocular vision.

And that’s what the M8 is doing: 4mp of image data is supplemented with 2.1mp of offset data, and the differences between the two views can then be used to calculate a map of depth relationships.
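As a minimal sketch of how offset data can become depth (this is the textbook stereo triangulation relation, not HTC’s actual method, which isn’t described here): for two aligned cameras, a point’s depth is inversely proportional to how far it shifts between the two images. The function name and the focal length and baseline values below are assumptions chosen purely for illustration.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Depth in meters from the horizontal shift (disparity) of a point
    between two aligned cameras: depth = focal_length * baseline / disparity.

    All parameter values used with this function here are hypothetical;
    they are not the M8's real specifications.
    """
    if disparity_px <= 0:
        # No measurable shift: the point is effectively at infinity.
        return float("inf")
    return focal_length_px * baseline_m / disparity_px


if __name__ == "__main__":
    # Assumed values: 1000 px focal length, sensors 2 cm apart.
    for shift in (40, 20, 10, 5):
        print(shift, "px shift ->", depth_from_disparity(shift, 1000, 0.02), "m")
```

Nearby objects shift a lot between the two views and distant ones barely move, which is why a small second sensor is enough to rank parts of the scene by distance.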

The problem I see with the M8 is that the number of pixels used is too small. Of course, if you increase the amount of pixel data you end up needing a lot more computing power, so the choice may simply have been to stay within reasonable bounds for the smartphone’s current processor.

So here’s what the M8 and its software do: they apply progressive blur to areas determined to be farther from the focus point. The problem with the results I’ve seen so far is this: there are not enough zones, and there’s little subtlety in the transition between them. Instead of the circle of confusion growing slowly at a consistent rate, it tends to jump up to the next level in ways that the eye detects, so the results look a little false. Again, I suspect the problem is that they just don’t have enough pixels, data, or computing horsepower to do more at the moment. This technique should improve as all of those things increase.
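To make the "zones" complaint concrete, here is a rough sketch comparing a blur radius that grows continuously with distance from the focus point against one snapped to a few discrete levels. This is an illustration only, not the M8’s algorithm; the function names, the linear ramp, and the default values (12 px maximum radius, 3 m falloff, 3 zones) are all assumptions.

```python
import math


def smooth_blur_radius(depth_m, focus_m, max_radius_px=12.0, falloff_m=3.0):
    """Blur radius that grows continuously with distance from the focus point.

    The linear ramp and default values are illustrative assumptions.
    """
    t = min(abs(depth_m - focus_m) / falloff_m, 1.0)
    return max_radius_px * t


def zoned_blur_radius(depth_m, focus_m, zones=3, max_radius_px=12.0, falloff_m=3.0):
    """Same ramp, but rounded up to one of a few discrete blur levels."""
    t = min(abs(depth_m - focus_m) / falloff_m, 1.0)
    level = math.ceil(t * zones)  # 0 at the focus point, up to `zones`
    return max_radius_px * level / zones


if __name__ == "__main__":
    focus = 1.0  # assumed focus distance in meters
    for depth in (1.0, 1.5, 2.0, 2.5, 3.0, 4.0):
        print(depth, smooth_blur_radius(depth, focus), zoned_blur_radius(depth, focus))
```

With only a few zones, two objects at nearly the same distance can land on opposite sides of a boundary and receive visibly different blur, which is the kind of jump the eye picks up on.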

Meanwhile, Apple is apparently using a binocular approach in a different way: they’ve patented a two-sensor design that separates luminosity from color information. This would be as if your right eye had all rods and your left eye all cones. I suppose you could still create depth data from this, by converting color information to luminosity information, but it appears Apple’s main goal is to increase dynamic range while reducing color artifacts.
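For what "converting color information to luminosity information" could look like in practice, here is a small sketch using the standard Rec. 601 luma weights. Nothing here comes from Apple’s patent; the function name is hypothetical and the choice of Rec. 601 is an assumption made for illustration.

```python
def luma(r, g, b):
    """Approximate luminosity of an RGB pixel using Rec. 601 weights.

    Green counts most and blue least, roughly matching how bright each
    primary appears. Using these weights here is an assumption, not a
    detail of any particular two-sensor design.
    """
    return 0.299 * r + 0.587 * g + 0.114 * b


if __name__ == "__main__":
    for name, rgb in (("red", (255, 0, 0)), ("green", (0, 255, 0)), ("blue", (0, 0, 255))):
        print(name, luma(*rgb))
```

Once the color sensor’s output is reduced to luminosity this way, it can be compared against the luminosity-only sensor in the same manner as any other pair of offset images.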

We’re going to see more and more of this multiple sensor approach coming to the smartphone realm.