This is exactly why interpolation of a linear image (at least with spline-based algorithms) is a bad idea. Interpolating after a perceptually even gamma has been applied (approximately 2) gives much better results.

I get your point, sometimes working in the "perceptual space" is better. My concern is that in a gamma space, 2+2 does not equal 4, and depending on the operation that could cause visible distortion (like a harmonic generator).
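The "2+2 does not equal 4" point can be made concrete with a couple of lines of arithmetic. A minimal sketch (my own illustration, assuming a simple power-law gamma of 2; the function names are just for this example):

```python
# Sketch: averaging a black and a white pixel in linear space vs in
# gamma-2 space. The two averages decode to different luminances,
# which is the distortion being described.

def to_gamma(v, g=2.0):
    """Encode a linear value with a simple power-law gamma."""
    return v ** (1.0 / g)

def to_linear(v, g=2.0):
    """Decode a gamma-encoded value back to linear."""
    return v ** g

black, white = 0.0, 1.0

# Average in linear space: the true photometric midpoint.
linear_avg = (black + white) / 2                                # 0.5

# Average in gamma space, then decode: the perceptual midpoint.
gamma_avg = to_linear((to_gamma(black) + to_gamma(white)) / 2)  # 0.25

print(linear_avg, gamma_avg)
```

The same operation (a 50/50 average) lands on a different luminance depending on the working space, so any filter built from sums of pixel values behaves differently in the two spaces.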

Some time ago I was inspired by Bart's down-sampling page to try down-sampling in linear, and I recall seeing better results. Do you have some examples?

If one used a simple linear filter to interpolate the color channels, resolution would be limited by each channel's sampling frequency, and so would be particularly poor for the R and B channels of a Bayer RGGB array. However, image data is correlated between the color channels, and if those correlations are used one can achieve much higher resolution with much less artifacting. No good demosaic does a plain linear interpolation; the better ones use the correlations between the R, G, and B data to achieve resolution near the Nyquist limit of the full array rather than of the individual color-subsampled arrays.
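A toy 1-D sketch of the idea (my own illustration, not any particular demosaicking algorithm): when R and G share an edge, interpolating the color difference R − G and adding G back beats interpolating the sparse R samples directly.

```python
import numpy as np

# Toy example: a luminance edge shared by correlated R and G channels.
x = np.arange(16)
edge = (x >= 8).astype(float)      # edge common to all channels
g_true = 0.9 * edge + 0.05         # G known at every position (for simplicity)
r_true = 0.8 * edge + 0.10         # correlated with G: R - G is nearly flat

r_samples = r_true[::2]            # R sampled at half the rate, like Bayer R

# (a) Interpolate R directly: the edge is smeared across the missing samples.
r_direct = np.interp(x, x[::2], r_samples)

# (b) Interpolate R - G, then add the dense G back: the edge survives,
#     because the *difference* signal is smooth and easy to interpolate.
r_corr = np.interp(x, x[::2], r_samples - g_true[::2]) + g_true

err_direct = np.abs(r_direct - r_true).max()   # large error at the edge
err_corr = np.abs(r_corr - r_true).max()       # much smaller
print(err_direct, err_corr)
```

The gain comes entirely from the assumed correlation; if R and G were uncorrelated (or R were already densely sampled), the difference trick would add distortion rather than remove it, which is exactly the concern raised below.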

Cliff, thanks for the vote of confidence.

Emil,

Thanks for your thoughtful response about the spatial correlations between channels. While I do not dispute anything you wrote, it does raise more questions in my mind. For example, if the red and blue channels were oversampled, relying on the partial correlations would add distortion. As the sampling of the red and blue channels improves without limit, the need to rely on the correlation diminishes. Certainly using the correlation was much more important when we had 6 MP cameras than now at 24 MP. The degree of correlation must also depend on the scene. How is that taken into account? What are the criteria for including the correlation?

Interpolation in gamma 2:

Interpolation in linear, with nearest-neighbor for reference:

The ringing effects in the dark area are much more noticeable in the linear image, and most unforgivably, the rendering of the white dot on black background is not the perceptual inverse of the black dot on white. The gamma 2 interpolation is better on both counts.

Those are some funky colors! I'm seeing different results than you - for the gamma version there is a dark gray band between the green and magenta patches, and the little black square looks noticeably larger.

It could be a display problem - either yours, mine, or both. Maybe it's because channels are clipped.

Try looking at them in Photoshop side by side with a calibrated/profiled monitor. Comparisons on an uncalibrated monitor aren't going to be very useful. I see a light band on the edge of the green that fades into a narrower dark band on the edge of the magenta in the linear, as opposed to a wider dark band between green and magenta in the gamma-2 image. For real-world images, the gamma-2 is generally better because of the perceptual uniformity of sharpening/ringing artifacts, and black on white more closely matches a negative image of white on black.

Think of it this way: when transitioning from black (0) to white (1) in a series such as 0, 0, 0, 1, 1, 1, the middle interpolated value is going to be 0.5, and 0.5 should correspond perceptually to "halfway between white and black". In a linear image, 0.5 is only one stop below clipped white, and thus is on the brighter end of the luminance scale perceptually. As a result, light colors bleed into dark colors, as shown by the second image I posted, causing bright objects on a dark background to become somewhat larger than dark objects on a light background with the same pixel dimensions.
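Running the numbers with the standard CIE L* formula backs this up (the arithmetic below is my own sketch, not taken from the post):

```python
# CIE 1976 L* from relative luminance Y in [0, 1] (standard formula).
def cie_lstar(Y):
    return 116.0 * Y ** (1.0 / 3.0) - 16.0 if Y > 0.008856 else 903.3 * Y

# A midpoint of 0.5 taken in linear space IS luminance 0.5:
print(round(cie_lstar(0.5)))       # ~76: well above the perceptual middle

# A midpoint of 0.5 taken in gamma-2 space decodes to luminance 0.25:
print(round(cie_lstar(0.5 ** 2)))  # ~57: much closer to L* = 50
```

So the linear-space midpoint lands at roughly L* 76, biased strongly toward white, while the gamma-2 midpoint sits near the perceptual halfway point.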

I would think that linear would better emulate the additive mixing of light in the eye.

Cliff

Why not a perceptually uniform space such as Lab? I thought that would be the space where one would want to interpolate, since color differences of adjacent pixels in the source image are, well, perceptually uniform.

That's what I did. And I confirmed that L* goes from 88 in the green, dips to 49 in the seam, then goes up to 60 in the magenta.

I don't dispute the existence of the dark band between the green and magenta. Try counting how many pixels it takes to get to L* 50 from the edge of the clipped part of the black square vs. the white square, in the linear image vs. the gamma-2 image. IMO, accurately rendering the perceptual transition between white and black is more important than perfectly handling oddball hard-edged color transitions like the green/magenta example, which only rarely occur in real-world images.

Converting to LAB before resampling would solve both issues simultaneously, though.

It depends on what you want to do. If you want to reconstruct the pattern of light that fell on the focal plane, I think you want to stay linear.

If you want to make a perceptual enhancement of some kind, use a perceptual space. I just think that the perceptual tweaks should occur later in the pipeline.

OK, I think I know what's happening. My guess is that you are viewing your monitor in a dark room. I've been looking at my monitor in a more "average surround" environment, with a CRT with a low black point. That's enough to explain the differences in what we are seeing, since contrast perception is affected by the surround.

When interpolating in a linear space, ringing artifacts are accentuated in dark areas and somewhat de-emphasized in bright areas, perceptually speaking. Interpolating with a non-linear gamma evens out the visibility of the artifacts so that the artifacts are perceptually similar in highlights and shadows. As a result, a lesser degree of mathematical gymnastics is needed to reduce ringing artifacts to the point where they are no longer visible.
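The asymmetry is easy to quantify with CIE L* (a sketch of my own, assuming a ringing overshoot of 5% of full scale next to black):

```python
# CIE 1976 L* from relative luminance Y in [0, 1] (standard formula).
def cie_lstar(Y):
    return 116.0 * Y ** (1.0 / 3.0) - 16.0 if Y > 0.008856 else 903.3 * Y

ring = 0.05  # assumed ringing overshoot: 5% of full scale, next to black

# Linear-space interpolation: the overshoot is 5% of full *luminance*.
print(cie_lstar(ring))       # ~27 L* above black: clearly visible

# Gamma-2 interpolation: 5% in code values decodes to 0.25% luminance.
print(cie_lstar(ring ** 2))  # ~2 L* above black: barely visible
```

The same numerical overshoot is more than ten times larger in L* terms when the interpolation is done in linear space, which is why the shadow-side ringing dominates there.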

I would try it with more usual colors like those on a Color Checker.

The ringing artifacts in the black area of the linear image (between the white dot and magenta square) have peak RGB values between 25 and 35. The same area in the gamma-2 image peaks between 10 and 14. That's a pretty significant difference, IMO, not just monitor viewing conditions.

In the sample I provided earlier (post 5), there is no room in the FT beyond 64 x Sqrt(2) distance from DC. Here are the image, its top right corner zoomed in, and the FFT (the yellow rectangle in the next image is 64x64 pixels):

Of course there is more room in the FT, but we also know what happens with the image when we push detail to the extreme (the image turns gray at given frequencies).

You have room to go right out to the edge of the frequency space. Here's an example of what you get when you approach the diagonal corners. First, what was created in the frequency domain:

Next, what you get after inverse FT to spatial domain:

If you look closely, it's all checkerboard with just the 3 white lines.

Amazingly, the above reconstructs to the following (1.5 interpolated, then zoomed 2x, center crop):

So you can reconstruct up to sqrt(2) higher resolution, super-fine detail along the diagonals. Wasn't this what Fuji exploited in their Super CCD sensor?
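The checkerboard claim is easy to verify numerically (a sketch, assuming a plain 0/1 checkerboard): all of its spectral energy sits at the diagonal corner of frequency space, i.e. at Nyquist in both directions.

```python
import numpy as np

# Build an N x N 0/1 checkerboard and locate its single FT component.
N = 64
y, x = np.mgrid[0:N, 0:N]
checker = ((x + y) % 2).astype(float)

# Remove DC, then take the 2-D FFT (unshifted, so Nyquist is at N//2).
F = np.fft.fft2(checker - checker.mean())
peak = np.unravel_index(np.abs(F).argmax(), F.shape)
print(peak)  # (N//2, N//2): the pure diagonal Nyquist component
```

A checkerboard is exactly cos(pi(x+y)) up to offset and sign, so its transform is a single spike at the sqrt(2)-distant corner, the "super-fine detail along the diagonals" described above.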

The apparent difference in gamma between average surround viewing and dark surround viewing is about a factor of 1.5. If you're in a dark surround, you have to increase the gamma to 1.5 for the image to appear the same as in an average surround.

Try it in PS Levels or Exposure - set gamma to 1/1.5 = .67 (PS uses reciprocal gamma) to simulate what I see in an average surround.

If I do the opposite and set (reciprocal) gamma to 1.5 in my average surround, I can see exactly what you are describing.

I think your test target is too sensitive to viewing contrast. Plus it responds strangely to changes in Levels, I guess because channels are maxed out. You might end up optimizing your routines so the results look good only under limited conditions.

You can't send a crop of the RAW file; all Phocus is doing is setting some metadata tags indicating the preferred cropping parameters when converting. Try converting the file to losslessly compressed DNG; that should at least bring down the file size considerably. If there are permission/usage issues with the subject, you'll simply have to get the appropriate permissions, use a different RAW that shows the same problem, or send a TIFF crop of the problem area instead of the RAW.

There is a quick and dirty way to try reconstruction in Photoshop - no Matlab required. You just need a plugin that does forward and inverse FFTs. I have the freeware KamLex FFT Plugin (for Windows), that works well.

Here is the procedure:

1. Do the forward FFT.
2. Increase the canvas size on all sides. The ratio of the new size to the original size is your interpolation factor.
3. Do the inverse FFT.

That's it! The color of the increased canvas area will affect edge effects and rippling. There are standard methods to minimize edge effects by mirroring, etc.
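For anyone without the plugin, the same three steps can be sketched in NumPy (zero-padding the shifted spectrum plays the role of enlarging the canvas; the function name is mine, and this ignores the edge-mirroring refinements):

```python
import numpy as np

def fft_interpolate(img, factor):
    """Upsample a 2-D grayscale image by zero-padding its spectrum."""
    F = np.fft.fftshift(np.fft.fft2(img))               # 1. forward FFT
    pad = [(s * (factor - 1)) // 2 for s in img.shape]
    F = np.pad(F, [(pad[0], pad[0]), (pad[1], pad[1])]) # 2. grow the "canvas"
    out = np.fft.ifft2(np.fft.ifftshift(F)).real        # 3. inverse FFT
    return out * factor ** 2                            # renormalize amplitude

img = np.random.rand(8, 8)
big = fft_interpolate(img, 2)
print(big.shape)  # (16, 16)
```

Padding the spectrum with zeros is exact sinc interpolation, so this is the same ideal band-limited reconstruction discussed earlier in the thread, with ringing at hard edges included.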

There are limitations to the plugin, but actually I think this might work as well or better than the Yaroslavsky Matlab code.