Brief summary of archiving

Requirements

There is a need to be able to archive raw image files. This may be so that photographers and their descendants can retrieve images of personal importance in future, or so that images of wider historical importance can be retrieved. Archiving is not the same as backing up, although the two have some factors in common. A back-up is typically a reserve or substitute, often kept for relatively short-term resilience or performance reasons. After retrieval, it may be exploited in the same context as the original file; for example, it may be accessed by the same software products. An archive, by contrast, is intended to be exploited in a relatively unknowable context, for example where its origins may be in doubt, and by very different software products.

“Future proof”

For photographs to be “future proof”, you will need to be able to find the images, understand their ownership and subject matter, and present them, using your future choice of workflow and tools.

Key unique characteristics of DNG

Some key requirements that DNG is intended to satisfy include:

Longevity and “critical mass”.
It obviously makes the archivists’ task much easier to have relatively few formats to cater for, each able to cater for a large range of points of origin (such as cameras), over a long time. DNG is designed to evolve and to cater for a vast range of camera and sensor characteristics and technologies.

An openly-documented self-contained format.
Presumably the archivists’ need for open documentation is obvious! “Self contained” means that the absolute minimum of additional information, other than the DNG file itself, is needed for the image to be retrieved. This especially means that separate knowledge of the characteristics of the camera that captured the image should not be needed, and certainly means that retrieval should not rely on specific software products from particular companies, Adobe or otherwise. DNG is intended to satisfy this, and each file contains the camera details – more than a colour profile. (Note that Adobe supply a DNG SDK comprising more than 100 C++ source files to optionally complement the documentation).

Comprehensive metadata (including preview).
Ideally, any object of historical importance should carry its own historical context with it. Many of the challenges faced by archeologists, historians, archivists, and librarians arise from loss of the context for the object, or its separation from that context. Obviously, it is important that the metadata itself is held in an openly-documented format, and this applies to XMP. (Note that Adobe supply an XMP SDK to optionally complement the documentation).

How does DNG’s main archival feature work?

DNG has several archival features (see later), but in my opinion this is the most innovative, and probably also the least understood.

In order to do its work, a raw converter (or other raw image processor) needs more than the raw image data. It also needs to know how to interpret the data. What do the numbers mean? What are the dimensions (in pixels) of the sensor? Which of the numbers correspond to pixels with “red” filters, which to “green”, which to “blue”? Or are there other colours too? What is the relationship between those colours and a known colour space? What are the other values that need to be taken into account, such as the strength of the anti-alias filter? (Plus answers to several other questions). This set of parameters is referred to here as “camera details”: a description of the camera in sufficient detail for the raw image data to be interpreted.
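The questions above can be pictured as fields in a single record. The following is a minimal sketch only: the class name, field names, and example values are all hypothetical, chosen to mirror the questions in the text, and they are not the names or layout that DNG itself uses.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class CameraDetails:
    """Hypothetical record of what a raw converter needs to know about a camera."""
    model: str                    # identification of the camera model
    sensor_width: int             # sensor dimensions, in pixels
    sensor_height: int
    cfa_pattern: Tuple[str, ...]  # filter colours of one repeating tile, row by row
    cfa_tile_width: int           # number of columns in that repeating tile
    colour_matrix: Tuple[Tuple[float, ...], ...]  # relates camera colours to a known colour space
    anti_alias_strength: float    # relative strength of the anti-alias filter

    def filter_at(self, row: int, col: int) -> str:
        """Return the filter colour over the pixel at (row, col)."""
        tile_height = len(self.cfa_pattern) // self.cfa_tile_width
        r = row % tile_height
        c = col % self.cfa_tile_width
        return self.cfa_pattern[r * self.cfa_tile_width + c]


# Invented example values for an imaginary camera with a 2x2 colour filter tile.
example = CameraDetails(
    model="Example Camera X1",
    sensor_width=4000,
    sensor_height=3000,
    cfa_pattern=("R", "G", "G", "B"),
    cfa_tile_width=2,
    colour_matrix=((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)),
    anti_alias_strength=1.0,
)
```

With a record like this, "which of the numbers correspond to pixels with red filters" becomes a simple question of position: `example.filter_at(0, 0)` gives the filter colour over the top-left pixel.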

Rarely do two current camera models have the same camera details. Typically there are significant differences from one camera model to another, even from the same manufacturer. Therefore, raw converters need to be able to select the correct camera details to process the raw image data contained in a raw file they have been given. Raw converters have camera details for many current and past camera models built into them, (typically derived from examining one or more of each of those cameras), and choose the correct one according to the identification of the camera model contained within the raw file itself.
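As a rough illustration of that selection step, a converter's built-in knowledge can be thought of as a table keyed by camera model. This is a sketch under stated assumptions: the table contents, the function name `select_built_in_details`, and the exception `UnsupportedCameraError` are all invented for the example and do not describe any real product.

```python
# Hypothetical built-in table: camera model identification -> camera details.
BUILT_IN_DETAILS = {
    "Example Camera X1": {"cfa_pattern": "RGGB", "width": 4000, "height": 3000},
    "Example Camera X2": {"cfa_pattern": "GRBG", "width": 6000, "height": 4000},
}


class UnsupportedCameraError(LookupError):
    """Raised when the converter has no built-in details for a camera model."""


def select_built_in_details(model_from_raw_file: str) -> dict:
    """Pick camera details using the model identification found in the raw file."""
    try:
        return BUILT_IN_DETAILS[model_from_raw_file]
    except KeyError:
        raise UnsupportedCameraError(
            f"no built-in camera details for {model_from_raw_file!r}"
        ) from None
```

The weakness is visible in the failure branch: a camera model released after the table was compiled cannot be processed until the converter itself is updated.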

Every DNG file holds comprehensive camera details. One method of creating a DNG file is to use a DNG Converter. This is often one from Adobe (including ACR or Lightroom, which can also act as DNG Converters), but several others exist too. The DNG Converter needs to have built-in camera details (typically derived from examining one or more of each of the cameras it supports), which it then embeds in the resultant DNG file.

Ideally, when raw converters read the raw image data from a DNG file, they read the camera details from that file too, at least as a user option. This is the intended, and most powerful, exploitation of DNG. Such converters can support DNG files of camera models of which they have no prior knowledge.

Another method of creating a DNG file is for the camera to use DNG as its native raw file format. The Ricoh GR Digital camera is one example. The camera knows its own details, (built-in by the camera manufacturer), and embeds these in the DNG file together with the raw image data. No DNG conversion is needed, and most raw converters can process the images without needing to be updated to recognise the camera. So no software needs to recognise the camera: this is truly “future proof”, but is also “now proof”!

Some raw converters can read the raw image data from a DNG file, but can’t read the camera details from that file. They still have to use built-in camera details, (typically derived from examining one or more of each of those cameras). Therefore they can’t support DNG files of camera models they don’t otherwise have support for. Other raw converters can read the camera details from the DNG file, but if they have their own knowledge of the camera concerned they may choose to use that instead of the camera details from the DNG file.
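The different behaviours described in this section can be summarised as three strategies for choosing camera details. The sketch below is illustrative only: the strategy names, the function `choose_camera_details`, and the dictionaries standing in for camera details are hypothetical, and real raw converters are not claimed to be structured this way.

```python
from enum import Enum
from typing import Optional


class Strategy(Enum):
    BUILT_IN_ONLY = "built-in only"    # cannot read camera details from the DNG file
    EMBEDDED_FIRST = "embedded first"  # the ideal: use the details held in the DNG file
    BUILT_IN_FIRST = "built-in first"  # prefers own knowledge, falls back to the file


def choose_camera_details(
    strategy: Strategy,
    model: str,
    embedded: Optional[dict],
    built_in: dict,
) -> Optional[dict]:
    """Return the camera details a converter would use, or None if unsupported."""
    known = built_in.get(model)  # None when the converter does not know the camera
    if strategy is Strategy.BUILT_IN_ONLY:
        return known
    if strategy is Strategy.EMBEDDED_FIRST:
        return embedded if embedded is not None else known
    # Strategy.BUILT_IN_FIRST
    return known if known is not None else embedded


# Invented data: the converter knows one camera; the DNG file describes another.
built_in_table = {"Example Camera X1": {"source": "built-in"}}
details_in_file = {"source": "embedded"}
```

For a camera model the converter has never seen, only the first strategy fails: `choose_camera_details(Strategy.BUILT_IN_ONLY, "New Camera", details_in_file, built_in_table)` returns `None`, while the other two strategies return the details embedded in the file.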

Other features

There are several other archival features in DNG. Examples include the publicly available specifications, full use of XMP, previews of various sizes held within the file, an MD5 digest of the raw image data, the source-code based SDK (which can also act as a supplement to the specification if necessary), use of standards wherever possible, and Adobe’s submission of DNG to ISO for standardisation.
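To illustrate how a digest helps an archivist, the following minimal sketch checks raw image data against a stored MD5 value using Python's standard `hashlib` module. It assumes the raw image bytes and the stored digest have already been extracted from the file by some other means. The DNG specification defines exactly which bytes are hashed and in what order, and this sketch does not attempt to reproduce that. The function names are hypothetical.

```python
import hashlib


def raw_digest(raw_image_data: bytes) -> bytes:
    """Compute a 16-byte MD5 digest over the supplied raw image bytes."""
    return hashlib.md5(raw_image_data).digest()


def digest_matches(raw_image_data: bytes, stored_digest: bytes) -> bool:
    """Report whether the data still matches the digest recorded at creation."""
    return raw_digest(raw_image_data) == stored_digest


# Invented stand-in for raw image data, with its digest recorded at "archive time".
original = bytes(range(256)) * 4
recorded = raw_digest(original)

# Simulate one corrupted byte after years in storage.
damaged = bytearray(original)
damaged[10] ^= 0xFF
```

Here `digest_matches(original, recorded)` is true and `digest_matches(bytes(damaged), recorded)` is false, so damage to the raw image data can be detected without any reference copy of the file.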

The reason why these are not given greater prominence in this article is that they are relatively easy to understand, and some (such as use of XMP) are features of other file formats too. The above description of how DNG holds embedded camera details is less well understood, and even denied by many people! There are few descriptions of it using diagrams in this way.