Is Improved Scaling the Key to the 8K Content Void?

I have been thinking more about 8K recently. Many will agree that the increase from 4K to 8K resolution will be of marginal benefit for those watching 65” TVs at typical viewing distances. It will help a bit on larger displays, and a lot if you sit very close to the screen – but that is not how people watch TV.

But I think 8K TVs will come anyway. There are two driving forces. One is the 2020 Olympics in Japan where NHK has been working steadily to produce, distribute and display in 8K. Their efforts have created a full 8K production ecosystem. Granted, there are limited suppliers with very expensive equipment, but it is there.

Satellite distribution is being used, requiring an enormous 80-100 Mbps. That bandwidth demand will mean very limited distribution, but set-top boxes capable of decoding 8K are available.

In displays, Sharp has announced it will now offer 70” 8K displays on a worldwide basis with more sizes to follow. LG is planning an 8K OLED TV for 2019 and others are likely to be announced at CES 2018. These will be expensive TVs, I am sure.

The other big factor that I think will drive 8K is the massive investment in 10.5G LCD and OLED fabs. At least eight are already planned, with rollouts starting soon and running into 2021. Many are Chinese, and these are expected to have an impact in the 2019 time frame and beyond. This gen size is optimized for 65” and 75” screen sizes. You can assume that most of this capacity will be dedicated to 4K TV panels, but I think a percentage will be devoted to 8K as well. Production on this massive scale will drive down prices.

I said before that the resolution increase will offer little improvement in picture quality (these 8K TVs will undoubtedly be HDR/WCG too), but I think they will sell. The first and maybe the most significant market will be China, where the simple notion that 8K means more pixels, and therefore a better image than 4K, will resonate. We saw this dynamic in the roll-out of 4K, and history will repeat as 4K becomes decidedly mainstream in a few years. The US and Europe will lag, but 8K will hold a place as an aspirational and premium category.

What about content?

But what about 8K content? That will be very hard to come by for a variety of reasons that are not likely to be resolved in the next few years. As a result, consumers will have to rely on scaling in the TV or a home-based device to create the 8K image.

So the next question is, are today’s scalers up to the task – especially if asked to scale from 2K/FHD content to 8K? I don’t really know the answer, as I have not tested them to see how they would perform. However, I suspect many would be somewhat adequate working from native 4K content, but not very good working from 1080p content. That is a lot of scaling, and it requires the creation of a lot of new pixels.
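The arithmetic makes the point clear. A short Python sketch (resolution figures are the standard FHD/UHD/UHD-2 pixel dimensions) shows how many pixels a scaler must invent:

```python
# Pixel counts for common TV resolutions, illustrating how much new
# information a scaler must synthesize when upscaling to 8K.
resolutions = {
    "1080p (FHD)": (1920, 1080),
    "4K (UHD)":    (3840, 2160),
    "8K (UHD-2)":  (7680, 4320),
}

target_w, target_h = resolutions["8K (UHD-2)"]
target_pixels = target_w * target_h

for name, (w, h) in resolutions.items():
    pixels = w * h
    factor = target_pixels / pixels
    print(f"{name}: {pixels:,} pixels -> 8K is {factor:.0f}x the pixels")
```

Going from 4K to 8K, the scaler creates three new pixels for every original one; from 1080p, it must create fifteen.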

A recent academic paper on super-resolution upscaling offers a glimpse of where scaling may be headed. The authors point out that most scaling algorithms seek to “minimize the pixel-wise mean squared error (MSE) between the HR [high resolution] ground truth image and its reconstruction from the LR [low resolution] observation, which has however been shown to correlate poorly with human perception of image quality.”
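To make the MSE criterion concrete, here is a minimal sketch (my own toy example, not from the paper) of the pixel-wise error the authors are critiquing. Note that MSE only measures average per-pixel difference; it says nothing about whether textures or edges look plausible to a human viewer, which is exactly the authors' complaint:

```python
import numpy as np

def pixelwise_mse(hr, recon):
    """Pixel-wise mean squared error between a ground-truth
    high-resolution image and its reconstruction (same shape)."""
    hr = np.asarray(hr, dtype=np.float64)
    recon = np.asarray(recon, dtype=np.float64)
    return float(np.mean((hr - recon) ** 2))

# Toy 2x2 grayscale example: a uniform shift of 2 gray levels
# in every pixel yields an MSE of 2^2 = 4.
hr = [[100, 110], [120, 130]]
recon = [[102, 112], [122, 132]]
print(pixelwise_mse(hr, recon))  # -> 4.0
```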

In academic terms, the team says that, “Using a fully convolutional neural network architecture, we propose a novel modification of recent texture synthesis networks in combination with adversarial training and perceptual losses to produce realistic textures at large magnification ratios. The method works on all RGB channels simultaneously and produces sharp results for natural images at a competitive speed. Trained with suitable combinations of losses, we reach state-of-the-art results both in terms of PSNR and using perceptual metrics.”

I won’t pretend to understand all of their work, but I think I get the gist. They start with high-resolution ground-truth images, downsample them to low resolution, and then reconstruct them back to high resolution. The reconstructed images are compared to the ground-truth originals using several tools (PSNR and perceptual metrics) to gauge the fidelity of the upscaling.
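PSNR (peak signal-to-noise ratio) is a standard fidelity score derived directly from the MSE between the reconstruction and the ground truth. A small sketch, assuming 8-bit images with a peak value of 255:

```python
import math

def psnr(mse, max_value=255.0):
    """Peak signal-to-noise ratio in dB for 8-bit images.
    Higher dB means the reconstruction is closer to the ground truth."""
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)

print(psnr(4.0))    # small error -> high PSNR (~42 dB)
print(psnr(100.0))  # larger error -> lower PSNR (~28 dB)
```

The paper's point is that a method can score well on PSNR while still producing images that look soft or artificial, which is why the authors also evaluate with perceptual metrics.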

Their method does not rely on conventional scaling such as bi-cubic interpolation; instead, it identifies the texture in each part of the image, then tries to find a matching texture in a reference database. This database is developed with a machine-learning neural network. The selected textures are then inserted into the image.

The 19-page paper has a lot more details for those who are interested. Figure 5 below from the paper nicely summarizes various methods along with their best method (ENet-PAT) vs. the ground truth (IHR).

The method is not perfect – they point out one failure case – but this certainly seems like a step in the right direction. – CC