This topic has caused me to think about something that I’d considered settled. As often happens, this kind of thinking has made me question what I thought I knew, and raised as many issues as it’s settled.

I’ve created a list of file types in order of increasing processing required for rendering, and also in order of decreasing determinism – the types later in the list tend to have a wider variety of acceptable renderings. The two qualities are not perfectly correlated, and I’ve had to apply subjective weightings to, if you will, turn a vector ranking into a scalar one. The distinctions are rough, as there are so many file formats that, once processed, result in images.

A. Raster-arranged files in the color space of the intended output device. Examples are gray gamma 2.2 for a particular monitor, sRGB, one of the SWOP CMYK standards. In the case where an offset press is the output device, the data in the file is halftoned. In the case where an inkjet printer is the output device, the data is also halftoned – this image form is hardly ever saved on a disk except for spooling.

B. Raster-arranged files with tags enabling appropriate processing for a range of output devices. Examples are PSD, many TIFF variants with ICC profiles attached, and raw files. Raw files sometimes need data that’s not in the tags for acceptable processing.

C. Quasi-raster-arranged files with non-raster data included. Examples are the discrete cosine transform coding of the original JPEG, the wavelet coding of JPEG 2000, and, I believe, all of the MPEG video formats.

I’ve created a list of file types in order of increasing processing required for rendering …

Simply make a distinction between "decoding" and "interpretation":

A: fully decoded and interpreted, ready for output

B: already decoded, needs interpretation for output

C: needs decoding and interpretation for output
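The decode/interpret split above can be made concrete with a toy sketch. Everything here is hypothetical, not a real codec: "decoding" stands in for turning a container's coded data into a raster of numbers, and "interpretation" for mapping those numbers onto a specific output device.

```python
# Toy sketch of the A/B/C split (all names hypothetical, not a real codec).

def decode(coded_bytes):
    # stand-in for real decoding (e.g. JPEG's inverse DCT):
    # here, trivially, one byte per pixel
    return list(coded_bytes)

def interpret(raster, gamma=2.2):
    # stand-in for device mapping: normalize and apply a display gamma
    return [(v / 255) ** (1 / gamma) for v in raster]

def render(data, category):
    if category == "A":      # fully decoded and interpreted: pass through
        return data
    if category == "B":      # already decoded: needs interpretation only
        return interpret(data)
    if category == "C":      # needs decoding, then interpretation
        return interpret(decode(data))
    raise ValueError(category)
```

For example, `render(bytes([0, 128, 255]), "C")` decodes the bytes and then interprets them, while a category A raster is passed through untouched.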

RAW files can be considered a form of lossy encoding, and hence would fall in category C. (This would also hold true in your categories imo).

Strictly speaking, all file formats fall into category C, as they are only containers for raster data, which needs to be extracted to be displayed. More important is the idea of "compositing" as distinct, since PSD, TIFF, PDF, and PostScript can contain data for several images. A composition obviously needs to be interpreted (rendered) before display is possible.

You can obviously make higher levels by including "encoding" steps and "interpretation" tags; that is, a text file can be encoded as PostScript with color profiles, which can then be decoded and interpreted by a RIP.

I’ve created a list of file types in order of increasing processing required for rendering, and also in order of decreasing determinism – the types later in the list tend to have a wider variety of acceptable renderings. …

F. Plain text files, or files that do not uniquely specify the ultimate raster image. Examples are TXT files or early HTML. HTML has been heading in the direction of category D of late.

I think most people would agree that category A files are image files. I think most people would agree that category F files are not.

Where do you all think the line should be drawn? Or is it enough to agree that, in this context, image file is not a binary term?

The mind of an older person such as myself flashes back to those "computer-generated" images formed from text-only files that, from a distance, looked like Marilyn Monroe, President Lincoln or even oneself.

So perhaps an 'image file' is simply one that produces a likeness when rendered by whatever means?

That would lead to the conclusion that any file type can be an image file provided that software and equipment exists to render it to our eyes. Thus I could invent a file type ".bit" (assuming it doesn't already exist) that arranged the bits in the bytes, taken sequentially, to form a black and white image when suitably decoded and rendered.
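The ".bit" idea is concrete enough to sketch a decoder for. This assumes, hypothetically, that the image width would come from a header, since the bit stream itself does not fix it:

```python
def decode_bit_file(data: bytes, width: int):
    """Unpack bytes, most significant bit first, into rows of 0/1 pixels.

    Hypothetical '.bit' format: the raster is just the file's bits taken
    sequentially; `width` would have to be supplied by a header.
    """
    bits = []
    for byte in data:
        for shift in range(7, -1, -1):
            bits.append((byte >> shift) & 1)
    # chop the bit stream into rows, dropping any incomplete final row
    return [bits[i:i + width] for i in range(0, len(bits) - width + 1, width)]
```

So `decode_bit_file(bytes([0b10101010]), 4)` yields two rows of alternating black and white pixels — any file at all becomes an "image" once such a decoder exists.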

The mind of an older person such as myself flashes back to those "computer-generated" images formed from text-only files that, from a distance, looked like Marilyn Monroe, President Lincoln or even oneself.

Ted, I'd call those category A, since they are rasterized and only produce the intended result on a specific (or a narrow set of compatible) output devices. The characters are basically halftoning glyphs.

Are you putting them in the same category as other lossy file formats like JPEG? That would be misleading. Raw files of the appropriate form (losslessly or perceptually coded) with appropriate metadata come closest to the image's information entropy: no other format is more efficient at storing the full captured information.


I was referring to raw Bayer data, which inherently lacks information.

Clearly, monochrome and Sigma-type captures could potentially be considered complete and might only require basic color interpretation.

Color interpretation is not necessary for something to be considered an image... it may be necessary for a "beautiful" image.

I understand what you're saying, and that is exactly what the word "might" means in English.

But you should also consider the act of sending image data straight to a display as a color interpretation. That is: the primaries of the image data are then assumed to equal the display primaries (or B&W response). Whether that is actually true or intentional is completely irrelevant for the purpose of categorisation.

I was referring to raw Bayer data, which inherently lacks information.

Clearly, monochrome and Sigma-type captures could potentially be considered complete and might only require basic color interpretation.

There are upper bounds on the amount of information that can be present in a file of finite size. I am no expert on the entropy of the physical world, but I would guess that any scene can be expected to contain a "virtually infinite" amount of information, at least in the practical sense that the power of an elephant seems infinite compared to that of a mouse.

Any camera does some serious information reduction. Fine spatial information is bundled into "pixels" after being smeared by lens and filters. Fine spectral information is bundled into "r"/"g"/"b" (or some poor substitute in the case of Foveon/achromatics). Intensity information is partly buried in read noise or clipped, then discretized into a finite number of ADC codes. The 3D scene is rendered into a 2D representation where occluded objects are forever lost (except in wishful Hollywood movies). Information about true movement is either smeared into a blur, or never even recorded because the photographer hit the button a bit late.
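The finite-file bound is easy to put numbers on. This sketch uses illustrative figures (a 24 MP sensor with a 14-bit ADC), not any particular camera:

```python
# Upper bound on the information a raw capture can carry: every photosite
# contributes at most its ADC word length. Illustrative numbers only.
pixels = 24_000_000      # hypothetical 24 MP sensor
adc_bits = 14            # hypothetical 14-bit ADC

max_info_bits = pixels * adc_bits
max_info_mb = max_info_bits / 8 / 1_000_000
print(max_info_mb)       # 42.0 MB, before any compression
```

Read noise, clipping, and inter-pixel correlation push the true entropy well below this ceiling, which is one reason lossless raw compression works at all.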

I don't really see what this has to do with the previous claims that it is not possible/sensible to base histograms on raw files because raw files are not really files (now, that is a self-contradictory statement). Whatever terminology we choose, raw (or raw-like) information does tell us a lot about the relationship between the current scene, camera settings, and camera sensor limitations, which we can use to select better camera settings. Isn't that the core issue?


+1 on this topic concerning the non-file status of raw files – that was a meaningless diversion. Clipping can be introduced by processing of a raw file, during white balance for example. However, this complication can be eliminated by using WB multipliers of less than unity. By the same token, clipping can be produced by editing of a JPEG file.


Bill

Yes, and clipping can also be introduced in the Raw file by ISO via analog and/or digital gain.
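A toy sketch of that mechanism, with hypothetical numbers (real sensors apply gain before or inside the ADC chain):

```python
ADC_MAX = 16383                  # toy 14-bit ceiling

def apply_gain(signal, gain):
    # gain from the ISO setting, applied before the ADC ceiling clips
    return min(round(signal * gain), ADC_MAX)

print(apply_gain(9000, 1.0))     # 9000: fits within the ADC range
print(apply_gain(9000, 4.0))     # 16383: same light, higher ISO, clipped
```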

Clipping can be introduced by processing of a raw file, during white balance for example. However, this complication can be eliminated by using WB multipliers of less than unity. By the same token, clipping can be produced by editing of a JPEG file.

Bill

Using WB multipliers of less than one is fine unless you have clipping in the raw values. In that case, the resulting values will not be clipped, so you might end up with a color cast in those highlights, and they will not be considered by recovery algorithms.

Even if you don't get clipped values after WB, you might get clipping because of color space encoding (red flowers anyone?). It is not always easy to know if your clipping is because of overexposure/white balance or out of gamut color. What I think is wrong is to underexpose to compensate for out of gamut issues.
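The effect described above can be shown numerically. This is a toy sketch with made-up values, and a naive threshold check standing in for a real converter's clipping detection:

```python
RAW_MAX = 16383                          # toy 14-bit clipping ceiling

def apply_wb(rgb, multipliers):
    # per-channel white-balance scaling, clipped at the raw ceiling
    return [min(v * m, RAW_MAX) for v, m in zip(rgb, multipliers)]

def looks_clipped(rgb):
    # naive stand-in for a recovery algorithm's clipping detection
    return any(v >= RAW_MAX for v in rgb)

clipped_raw = [16383, 16383, 12000]      # R and G hit the ceiling
after_wb = apply_wb(clipped_raw, [0.8, 0.9, 0.7])

print(looks_clipped(clipped_raw))        # True
print(looks_clipped(after_wb))           # False: the clipping is now hidden,
                                         # and the distorted R:G ratio shows
                                         # up as a color cast in highlights
```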

Using WB multipliers of less than one is fine unless you have clipping in the raw values. In that case, the resulting values will not be clipped, so you might end up with a color cast in those highlights, and they will not be considered by recovery algorithms.

Yes, that is true. One should avoid exposing to the right to the extent of incurring channel clipping. However, many highlights are nearly neutral, and recovery algorithms that recover to neutral can often avoid a color cast.

Even if you don't get clipped values after WB, you might get clipping because of color space encoding (red flowers anyone?). It is not always easy to know if your clipping is because of overexposure/white balance or out of gamut color. What I think is wrong is to underexpose to compensate for out of gamut issues.

I agree that one should not underexpose to avoid saturation clipping; rather, one should render into a wider color space. Unfortunately, most cameras do not allow ProPhoto RGB. As I mentioned earlier, UniWB is useful to avoid clipping due to WB.

It is not always easy to know if your clipping is because of overexposure/white balance or out of gamut color. What I think is wrong is to underexpose to compensate for out of gamut issues.

This is an interesting comment, worth spending some time on imo. It entails understanding and visualizing color spaces (camera, colorimetric and human) in 3D, something most people (including myself) haven't done much of. Can you elaborate Francisco?


Jack

Totally irrelevant, because during the capture stage "overexposure" and "out-of-gamut" are exactly the same thing. This is only relevant in post-processing, where the potential damage of incorrect exposure is already done.


This would be true if you consider the camera color space as the working color space, which is usually not known. I am referring to working color spaces such as ProPhoto RGB, Adobe RGB, etc.

It is possible that without having any clipped raw channel, you get out of gamut colors in a standard working space. This issue increases as you reduce the volume of the color space, for example Adobe RGB or sRGB. Even if you use UniWB, you might encounter this issue when checking the histogram or blinkies in the camera, since the largest color space available is Adobe RGB.
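A toy matrix conversion makes the point. The matrix below is hypothetical, merely shaped like a camera-to-working-space conversion (rows roughly summing to one, with strong negative off-diagonal terms):

```python
# Hypothetical camera -> working-space matrix; not from any real camera.
CAM_TO_WORK = [
    [ 1.8, -0.6, -0.2],
    [-0.3,  1.6, -0.3],
    [ 0.0, -0.5,  1.5],
]

def convert(rgb, matrix):
    # plain 3x3 matrix multiply, one output channel per matrix row
    return [sum(m * v for m, v in zip(row, rgb)) for row in matrix]

camera_rgb = [0.9, 0.2, 0.1]          # saturated red, no channel clipped
working = convert(camera_rgb, CAM_TO_WORK)
print(working)                        # first channel lands above 1.0:
                                      # out of gamut without any raw clipping
```

Every camera channel here is below its ceiling, yet the converted red channel exceeds 1.0, so the in-camera histogram would blink even though the raw data is fine.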

It is possible that without having any clipped raw channel, you get out of gamut colors in a standard working space.

Yeah, LDO…! That is the entire point of the whole RAW histogram discussion. What's the use of stating the obvious?

You should also be aware that the camera's output-colorspace rendition is usually a perceptual rendering, not the usual matrix conversion, so the clipping indication may still be accurate regardless of colorspace.