If your eyes had pixels, how many would they have? Part 2

In the first part of this series we covered spatial resolution and whether we really need multi-K cameras. This time Dimitris Stasinos covers resolutions greater than 8K, and how our eyes really perceive things.

What about resolutions greater than 8K?

All I have to say about this is that camera manufacturers must understand that every generation of filmmakers desperately needs a universal framework with “collectively defined industry standards” to work with (including codecs). The discussion I am trying to trigger here has nothing to do with budgets.

I don’t mind if I personally can’t afford an 8K camera right now. I will enjoy what the “BIG” studios produce with those wonderful REDs and future ARRIs. But doubling resolutions every two years from now on could also be very disorienting for present and future filmmakers, because different resolutions need totally different mastering techniques. Can you imagine what would happen to the audio engineering community if the music tech industry doubled sample rates every two years? In audio production we can now safely state (after many controversies) that 24-bit, 192 kHz capture is “enough” for most recording tasks (I am leaving sound design out of this because of its different sampling needs).

How many megapixels would be just “enough” for a movie to be considered a technically future-proof piece of art? Let me ask the question again in different words: “Supposing that 2D cinema survives as an art into the distant future, at what point will future generations start noticing insufficient resolution in movies produced from this day onwards?” Have we reached that point, have we passed it, or is it a pale dot on the horizon?

Foveal Vision & Peripheral Vision

This is critical before we can proceed to motion. Our eyes permit optimal visual acuity only in a small fraction of our field of view (2 degrees around the point where both our eyes focus). In the case of my monitor, that would be a circle with a diameter of 36mm, centred on the point I am looking at in any given moment. Again using simple math, we can estimate the optimal spatial resolution within this circle, and that would be 7 megapixels. The rest of the screen is blurred, as it falls into our peripheral vision.
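For readers who want to try the geometry on their own setup, here is a minimal sketch of how the size of that circle relates to viewing distance. It assumes a flat screen perpendicular to the line of sight and treats the 2 degrees as the full angle of the foveal cone; both are assumptions of this sketch, as are the function names, and it does not attempt to reproduce the 7 megapixel figure.

```python
import math


def foveal_circle_diameter_mm(viewing_distance_mm: float,
                              full_angle_deg: float = 2.0) -> float:
    """Diameter of the circle a viewing cone cuts out on a flat screen.

    Assumes the screen is perpendicular to the line of sight and that
    full_angle_deg is the full opening angle of the cone (an assumption).
    """
    half_angle = math.radians(full_angle_deg / 2.0)
    return 2.0 * viewing_distance_mm * math.tan(half_angle)


def viewing_distance_for_diameter_mm(diameter_mm: float,
                                     full_angle_deg: float = 2.0) -> float:
    """Inverse of the function above: distance that yields a given diameter."""
    half_angle = math.radians(full_angle_deg / 2.0)
    return diameter_mm / (2.0 * math.tan(half_angle))


if __name__ == "__main__":
    # Which viewing distance gives a 36mm circle for a 2-degree cone?
    distance = viewing_distance_for_diameter_mm(36.0)
    print(f"Viewing distance: {distance:.0f} mm")
    # Round trip back to the circle diameter on the screen.
    print(f"Circle diameter: {foveal_circle_diameter_mm(distance):.1f} mm")
```

Under these assumptions a 36mm circle corresponds to a viewing distance of roughly one metre. If the 2 degrees are instead read as the half angle (pass full_angle_deg=4.0), the same circle corresponds to roughly half a metre, so the interpretation of the angle matters.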

This is one of the reasons why a shallow depth of field is desirable in films: the audience will concentrate on a specific part of the screen and pay attention to the details that are critical for keeping up with the story. In other words, a shallow depth of field mimics our foveal vision on the big screen. But in the case of a landscape shot with little or no action at all, a deeper depth of field is just as critical, because our eyes may focus anywhere on the screen; the whole frame is a critical part of the story, not just a detail or an actor’s expression. But what about perceived motion?

Perceived Motion

Our two eyes move together (binocularly, in most cases), each driven by six muscles. As these muscles move the circle of our foveal vision, our brain “stitches” the incoming frames together as pieces of a greater puzzle, which is of course our entire field of view. In the case of a static image, our brain needs some time to process the entire frame while the incoming snapshots are collected. Any recognisable objects or living beings in the frame are perceived not as raw pixel data but mainly as ideas, familiar or unfamiliar, by pairing them with silhouettes that already exist in our memories.

This process takes time, even in a relatively peaceful state of mind and when our field of view contains little to no action at all. But what about moving pictures? With moving pictures, our eyes don’t have enough time to capture every fine detail by sweeping our foveal field of view across the whole frame, so they focus on critical parts of the image. At the same time, our brain does its best to collect critical information, which will help us to perceive every nuance of the unfolding story.

A good storyteller / filmmaker / director can make this process less intensive for our brain by using the ideal framing, composition and depth of field for each scene. As for the frame rate, it is widely known that we can process more than 60 frames per second. But, as has been discussed extensively, 24 fps has been the standard frame rate for cinema since the good old days of film, and anything above that, well, doesn’t look like cinema (these are not my own words).

This is strange, though… 50 or 60 fps were supposed to boost the perceived image quality by giving us video content closer to real life. And the truth is that 50 and 60 fps video content produces less strain for our eyes than 24, 25 and 30 fps content, as it is closer to the continuous motion we experience in real life. And maybe this is the problem. Cinema, as an art, is supposed to give the audience a window to a parallel reality, not to the one that is behind the screen. So let’s say that “realistic” content is not necessarily “cinematic” content, and vice versa. But then, why do we reject motion that is closer to real life and yet embrace high resolution that is also closer to real life? This, again, has to do with film.

Cinema has a longer history than digital media, and Super 35mm film was a recording medium with an astonishing capacity for capturing fine detail and up to 15 stops of dynamic range. So what we refer to these days as Full HD or even 2K (and in some cases even more) was always a part of cinema. Then digital video arrived in the early 90s, and that pixelated look, paired with large amounts of digital noise, became the modern filmmaker’s nightmare… But hey, those days are gone, look at us now…