PNG (Portable Network Graphics) Specification, Version 1.2

This chapter gives some recommendations for encoder behavior.
The only absolute requirement on a PNG encoder is that it produce
files that conform to the format specified in the preceding chapters.
However, best results will usually be achieved by following these
recommendations.

When encoding input samples that have a sample depth that cannot be
directly represented in PNG, the encoder must scale the samples up to a
sample depth that is allowed by PNG. The most accurate scaling method
is the linear equation

output = ROUND(input * MAXOUTSAMPLE / MAXINSAMPLE)

where the input samples range from 0 to MAXINSAMPLE and the outputs
range from 0 to MAXOUTSAMPLE (which is 2^sampledepth - 1).
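
As a minimal sketch of the linear equation, the rounding can be done
entirely in integer arithmetic. The function name scale_linear and its
parameter names are illustrative only and do not come from this
specification.

```python
def scale_linear(sample: int, max_in: int, max_out: int) -> int:
    """Compute ROUND(sample * max_out / max_in) with integer arithmetic.

    max_in and max_out play the roles of MAXINSAMPLE and MAXOUTSAMPLE.
    """
    if max_in <= 0 or max_out <= 0:
        raise ValueError("maximum sample values must be positive")
    if not 0 <= sample <= max_in:
        raise ValueError("sample is outside the input range")
    # Adding half the divisor before floor division rounds to nearest.
    return (sample * max_out + max_in // 2) // max_in


# A 5-bit sample of 27 scaled to 8 bits.
scaled = scale_linear(27, 31, 255)  # 222
```

Because the function takes the maximum values rather than bit depths,
the same sketch also covers source ranges that are not a power of 2.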

A close approximation to the linear scaling method can be achieved by
"left bit replication", which shifts the valid bits so that they begin
in the most significant bit and repeats the most significant bits into
the open low-order bits. This method is often faster to compute than
linear scaling.
As an example, assume that 5-bit samples are being scaled up to 8 bits.
If the source sample value is 27 (in the range from 0-31), then the
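original bits are:

   4 3 2 1 0
   ---------
   1 1 0 1 1

Left bit replication gives a value of 222:

   7 6 5 4 3  2 1 0
   ----------------
   1 1 0 1 1  1 1 0
   |=======|  |===|
       |      Leftmost bits repeated to fill open bits
       |
   Original bits

which matches the value computed by the linear equation. Left bit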
replication usually gives the same value as linear scaling and is never
off by more than one.
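
Left bit replication might be sketched as follows. The function name
and the loop structure are one possible arrangement, chosen here so
that the pattern keeps repeating when the output depth is more than
twice the input depth; they are not prescribed by this specification.

```python
def scale_left_bit_replication(sample: int, in_bits: int, out_bits: int) -> int:
    """Scale a sample up by repeating its bit pattern from the top down."""
    if not 0 < in_bits <= out_bits:
        raise ValueError("need 0 < in_bits <= out_bits")
    if not 0 <= sample < (1 << in_bits):
        raise ValueError("sample does not fit in in_bits")
    result = 0
    shift = out_bits - in_bits
    # Place copies of the sample at successively lower positions until
    # every output bit has been filled.
    while shift > -in_bits:
        if shift >= 0:
            result |= sample << shift
        else:
            # Final partial copy: only the leftmost bits still fit.
            result |= sample >> -shift
        shift -= in_bits
    return result


# 5-bit 27 (11011) becomes 8-bit 11011110.
replicated = scale_left_bit_replication(27, 5, 8)  # 222
```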

A distinctly less accurate approximation is obtained by simply
left-shifting the input value and filling the low order bits with
zeroes. This scheme cannot reproduce white exactly, since it does not
generate an all-ones maximum value; the net effect is to darken the
image slightly. This method is not recommended in general, but it does
have the effect of improving compression, particularly when dealing
with greater-than-eight-bit sample depths. Since the relative error
introduced by zero-fill scaling is small at high sample depths, some
encoders may choose to use it. Zero-fill must not
be used for alpha channel data, however, since many decoders will
special-case alpha values of all zeroes and all ones. It is important
to represent both those values exactly in the scaled data.
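
A short sketch of zero-fill scaling makes the darkening effect
concrete. The function name is illustrative only.

```python
def scale_zero_fill(sample: int, in_bits: int, out_bits: int) -> int:
    """Left-shift the sample and fill the low-order bits with zeroes."""
    if not 0 < in_bits <= out_bits:
        raise ValueError("need 0 < in_bits <= out_bits")
    if not 0 <= sample < (1 << in_bits):
        raise ValueError("sample does not fit in in_bits")
    return sample << (out_bits - in_bits)


# The maximum 5-bit value falls short of the 8-bit maximum of 255,
# which is why this method is unsuitable for alpha samples.
white = scale_zero_fill(31, 5, 8)  # 248
```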

When the encoder writes an sBIT chunk, it is required to
do the scaling in such a way that the high-order bits of the stored
samples match the original data. That is, if the sBIT chunk
specifies a sample depth of S, the high-order S bits of the stored data
must agree with the original S-bit data values. This allows decoders to
recover the original data by shifting right. The added low-order bits
are not constrained. Note that all the above scaling methods meet this
restriction.
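
The recovery step that this requirement enables is a single right
shift. The sketch below uses a hypothetical helper name and takes the
stored sample depth and the sBIT depth S as plain integers.

```python
def recover_original(stored: int, stored_bits: int, sbit: int) -> int:
    """Recover the original S-bit value from the high-order S bits."""
    if not 0 < sbit <= stored_bits:
        raise ValueError("need 0 < sbit <= stored_bits")
    return stored >> (stored_bits - sbit)


# Both the replicated value 222 and the zero-filled value 216 carry
# the original 5-bit sample 27 in their high-order bits.
from_replicated = recover_original(222, 8, 5)  # 27
from_zero_fill = recover_original(216, 8, 5)   # 27
```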

When scaling up source data, it is recommended that the low-order
bits be filled consistently for all samples; that is, the same source
value should generate the same sample value at any pixel position. This
improves compression by reducing the number of distinct sample values.
However, this is not a requirement, and some encoders may choose not to
follow it. For example, an encoder might instead dither the low-order
bits, improving displayed image quality at the price of increasing file
size.

In some applications the original source data may have a range that
is not a power of 2. The linear scaling equation still works for this
case, although the shifting methods do not. It is recommended that an
sBIT chunk not be written for such images, since sBIT suggests that the
original data range was exactly 0..2^S - 1.

Encoders capable of full-fledged color management [ICC]
will perform more sophisticated
calculations than those described here, and they may choose to
use the iCCP chunk. Encoders that know that their image
samples conform to the sRGB specification [sRGB]
should use the sRGB chunk and not perform
gamma handling. Otherwise, this section applies.

The encoder has two gamma-related decisions to make. First, it must
decide how to transform whatever image samples it has into the image
samples that will go into the PNG file. Second, it must decide what
value to write into the gAMA chunk.

The rule for the second decision is simply to write whatever
value will cause a decoder to do what you want. See Recommendations
for Decoders: Decoder gamma handling.

The first decision depends on the nature of the image samples
and their precision. If the samples represent light intensity in
floating-point or high-precision integer form (perhaps from a computer
image renderer), then the encoder may perform "gamma encoding"
(applying a power function with exponent less than 1) before quantizing the data
to integer values for output to the file. This results in fewer banding
artifacts at a given sample depth, or allows smaller samples while
retaining the same visual quality. An intensity level expressed as a
floating-point value in the range 0 to 1 can be converted to a file
image sample by

sample = intensity ^ encoding_exponent
integer_sample = ROUND(sample * (2^bitdepth - 1))

If the intensity in the equation is the desired display output
intensity, then the encoding exponent is the gamma value to be written to
the file, by the definition of gAMA (see the gAMA chunk specification).
But if the intensity
available to the encoder is the original scene intensity, another
transformation may be needed. Sometimes the displayed image should have
higher contrast than the original image; in other words, the end-to-end
transfer function from original scene to display output should have an
exponent greater than 1. In this case,

gamma = encoding_exponent / end_to_end_exponent

If you don't know whether the conditions under which the original
image was captured (or calculated) warrant such a contrast change, you
may assume that display intensities are proportional to original scene
intensities; in other words, the end-to-end exponent is 1, so gamma and
the encoding exponent are equal.
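
The two steps discussed above, gamma encoding of a floating-point
intensity and deriving the gamma value from the encoding and
end-to-end exponents, can be sketched as follows. The function names
and the default bit depth of 8 are assumptions made for illustration.

```python
def gamma_encode(intensity: float, encoding_exponent: float, bit_depth: int = 8) -> int:
    """Apply the encoding power function, then quantize to an integer sample."""
    if not 0.0 <= intensity <= 1.0:
        raise ValueError("intensity must be in the range 0 to 1")
    max_sample = (1 << bit_depth) - 1
    sample = intensity ** encoding_exponent
    # Adding 0.5 before truncation rounds a non-negative value to nearest.
    return int(sample * max_sample + 0.5)


def file_gamma(encoding_exponent: float, end_to_end_exponent: float = 1.0) -> float:
    """gamma = encoding_exponent / end_to_end_exponent."""
    if end_to_end_exponent <= 0:
        raise ValueError("end-to-end exponent must be positive")
    return encoding_exponent / end_to_end_exponent


# With an end-to-end exponent of 1, gamma equals the encoding exponent.
exponent = 1 / 2.2
mid_gray = gamma_encode(0.5, exponent)
gamma = file_gamma(exponent)
```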

If the image is being written to a file only, the encoder is free to
choose the encoding exponent. Choosing a value that causes the gamma
value in the gAMA chunk to be 1/2.2 is often a reasonable
choice because it minimizes the work for a decoder displaying on a
typical video monitor.

Some image renderers may simultaneously write the image to a PNG file
and display it on-screen. The displayed pixels should be appropriate
for the display system, so that the user sees a proper representation of
the intended scene.

If the renderer wants to write the displayed sample values to the PNG
file, avoiding a separate gamma encoding step for file output, then the
renderer should approximate the transfer function of the display system
by a power function, and write the reciprocal of the exponent into the
gAMA chunk. This will allow a PNG decoder to reproduce what
the file's originator saw on screen during rendering.

However, it is equally reasonable for a renderer to compute displayed
pixels appropriate for the display device, and to perform separate
gamma encoding for file storage, arranging to have a value in the
gAMA chunk more appropriate to the future use of the image.

Computer graphics renderers often do not perform gamma encoding,
instead making sample values directly proportional to scene light
intensity. If the PNG encoder receives intensity samples that have
already been quantized into integers, there is no point in doing gamma
encoding on them; that would just result in further loss of information.
The encoder should just write the sample values to the PNG file. This
does not imply that the gAMA chunk should contain a gamma
value of 1.0, because the desired end-to-end transfer function from
scene intensity to display output intensity is not necessarily linear.
The desired gamma value is probably not far from 1.0, however. It may
depend on whether the scene being rendered is a daylight scene or an
indoor scene, etc.

When the sample values come directly from a piece of hardware, the
correct gamma value can in principle be inferred from the transfer
function of the hardware and the lighting conditions of the scene.
In the case of video digitizers ("frame grabbers"), the samples
are probably in the sRGB color space, because the sRGB specification was
designed to be compatible with video standards. Image scanners are less
predictable. Their output samples may be proportional to the input
light intensity because CCD (charge coupled device) sensors themselves
are linear, or the scanner hardware may have already applied a power
function designed to compensate for dot gain in subsequent printing (an
exponent of about 0.57), or the scanner may have corrected the samples
for display on a monitor. The device documentation might describe
the transformation performed, or might describe the target display or
printer for the image data (which might be configurable). You can also
scan a calibrated target and use calibration software to determine the
behavior of the device. Remember that gamma relates file samples to
desired display output, not to scanner input.

File format converters generally should not attempt to convert
supplied images to a different gamma. Store the data in the PNG file
without conversion, and deduce the gamma value from information in the
source file if possible.
Gamma alteration at file conversion time causes re-quantization of
the set of intensity levels that are represented, introducing further
roundoff error with little benefit. It's almost always better to just
copy the sample values intact from the input to the output file.

If the source file format describes the gamma characteristic of the
image, a file format converter is strongly encouraged to write a PNG
gAMA chunk. Note that some file formats specify the exponent
of the function mapping file samples to display output rather than the
other direction. If the source file's gamma value is greater than
1.0, it is probably a display system exponent, and you should use its
reciprocal for the PNG gamma. If the source file format records the
relationship between image samples and something other than display
output, then deducing the PNG gamma value will be more complex.
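
A converter might apply the rule of thumb above as in the following
sketch. The helper names are hypothetical. The second function relies
on the gAMA chunk storing gamma times 100000 as a four-byte unsigned
integer in network byte order, as defined in the gAMA chunk
specification.

```python
import struct


def png_gamma_from_source(source_gamma: float) -> float:
    """Treat a source value above 1.0 as a display exponent and invert it."""
    if source_gamma <= 0:
        raise ValueError("gamma must be positive")
    if source_gamma > 1.0:
        return 1.0 / source_gamma
    return source_gamma


def gama_chunk_data(gamma: float) -> bytes:
    """Encode gamma as the 4-byte data field of a gAMA chunk."""
    return struct.pack(">I", round(gamma * 100000))


# A source file that records 2.2 most likely means a display exponent.
data = gama_chunk_data(png_gamma_from_source(2.2))
```

This is only a heuristic: it cannot distinguish a genuine file gamma
above 1.0 from a display exponent, which is why the text says
"probably".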

Regardless of how an image was originally created, if an encoder
or file format converter knows that the image has been displayed
satisfactorily using a display system whose transfer function can be
approximated by a power function with exponent display_exponent,
then the image can be marked as having the gamma value:

gamma = 1 / display_exponent

It's better to write a gAMA chunk with an approximately
right value than to omit the chunk and force PNG decoders to guess at an
appropriate gamma.

On the other hand, if the encoder has no way to infer the gamma
value, then it is better to omit the gAMA chunk entirely. If
the image gamma has to be guessed at, leave it to the decoder to do the
guessing.

Gamma does not apply to alpha samples; alpha is always represented
linearly.

Encoders capable of full-fledged color management [ICC]
will perform more sophisticated
calculations than those described here, and they may choose to
use the iCCP chunk. Encoders that know that their image
samples conform to the sRGB specification [sRGB]
are strongly encouraged to use the sRGB chunk.
Otherwise, this section applies.

If it is possible for the encoder to determine the chromaticities of
the source display primaries, or to make a strong guess based on the
origin of the image or the hardware running it, then the encoder is
strongly encouraged to output the cHRM chunk. If it does so,
the gAMA chunk should also be written; decoders can do little
with cHRM if gAMA is missing.

Video created with recent video equipment probably uses the
CCIR 709 primaries and D65 white point [ITU-R-BT709],
which are:

        R       G       B       White
x     0.640   0.300   0.150   0.3127
y     0.330   0.600   0.060   0.3290

An older but still very popular video standard is SMPTE-C [SMPTE-170M]:

        R       G       B       White
x     0.630   0.310   0.155   0.3127
y     0.340   0.595   0.070   0.3290

The original NTSC color primaries have not been used in decades.
Although you may still find the NTSC numbers listed in standards
documents, you won't find any images that actually use them.
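
The chromaticity values tabulated above could be packed into cHRM
chunk data roughly as follows. The dictionary layout and function name
are illustrative. The sketch assumes the cHRM layout given in the
chunk specification: white point, red, green, and blue, each as x then
y, stored as four-byte unsigned integers equal to the value times
100000.

```python
import struct

# (x, y) pairs copied from the tables in this section.
CCIR_709 = {
    "white": (0.3127, 0.3290),
    "red": (0.640, 0.330),
    "green": (0.300, 0.600),
    "blue": (0.150, 0.060),
}
SMPTE_C = {
    "white": (0.3127, 0.3290),
    "red": (0.630, 0.340),
    "green": (0.310, 0.595),
    "blue": (0.155, 0.070),
}


def chrm_chunk_data(chromaticities: dict) -> bytes:
    """Build the 32-byte data field of a cHRM chunk."""
    values = []
    for name in ("white", "red", "green", "blue"):
        x, y = chromaticities[name]
        values.append(round(x * 100000))
        values.append(round(y * 100000))
    return struct.pack(">8I", *values)


chrm = chrm_chunk_data(CCIR_709)
```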

Scanners that produce PNG files as output should insert the filter
chromaticities into a cHRM chunk.

In the case of hand-drawn or digitally edited images, you have to
determine what monitor they were viewed on when being produced. Many
image editing programs allow you to specify what type of monitor
you are using. This is often because they are working in some
device-independent space internally. Such programs have enough
information to write valid cHRM and gAMA chunks, and
should do so automatically.

If the encoder is compiled as a portion of a computer image renderer
that performs full-spectral rendering, the monitor values that were
used to convert from the internal device-independent color space to
RGB should be written into the cHRM chunk. Any colors that
are outside the gamut of the chosen RGB device should be clipped or
otherwise constrained to be within the gamut; PNG does not store
out-of-gamut colors.

If the computer image renderer performs calculations directly in
device-dependent RGB space, a cHRM chunk should not be written
unless the scene description and rendering parameters have been adjusted
to look good on a particular monitor. In that case, the data for that
monitor (if known) should be used to construct a cHRM chunk.

There are often cases where an image's exact origins are unknown,
particularly if it began life in some other format. A few image
formats store calibration information, which can be used to fill in
the cHRM chunk. For example, all PhotoCD images use the CCIR
709 primaries and D65 white point, so these values can be written into
the cHRM chunk when converting a PhotoCD file. PhotoCD also
uses the SMPTE-170M transfer function. (PhotoCD can store colors
outside the RGB gamut, so the image data will require gamut mapping
before writing to PNG format.) TIFF 6.0 files can optionally store
calibration information, which if present should be used to construct
the cHRM chunk. GIF and most other formats do not store any
calibration information.

It is not recommended that file format converters
attempt to convert supplied images to a different RGB color space.
Store the data in the PNG file without conversion, and record the source
primary chromaticities if they are known. Color space transformation
at file conversion time is a bad idea because of gamut mismatches and
rounding errors. As with gamma conversions, it's better to store the
data losslessly and incur at most one conversion when the image is
finally displayed.

The alpha channel can be regarded either as a mask that temporarily
hides transparent parts of the image, or as a means for constructing a
non-rectangular image. In the first case, the color values of fully
transparent pixels should be preserved for future use. In the second
case, the transparent pixels carry no useful data and are simply there
to fill out the rectangular image area required by PNG. In this case,
fully transparent pixels should all be assigned the same color value for
best compression.

Image authors should keep in mind the possibility that a decoder will
ignore transparency control. Hence, the colors assigned to transparent
pixels should be reasonable background colors whenever feasible.

For applications that do not require a full alpha channel, or
cannot afford the price in compression efficiency, the tRNS
transparency chunk is also available.

If the image has a known background color, this color should
be written in the bKGD chunk. Even decoders that ignore
transparency may use the bKGD color to fill unused screen area.

If the original image has premultiplied (also called "associated")
alpha data, convert it to PNG's non-premultiplied format by dividing
each sample value by the corresponding alpha value, then multiplying by
the maximum value for the image bit depth, and rounding to the nearest
integer. In valid premultiplied data, the sample values never exceed
their corresponding alpha values, so the result of the division should
always be in the range 0 to 1. If the alpha value is zero, output black
(zeroes).
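
The conversion described above can be sketched for integer samples as
follows. The function name is illustrative, and the sketch assumes
that the sample and alpha values share the same bit depth.

```python
def unpremultiply(sample: int, alpha: int, bit_depth: int = 8) -> int:
    """Convert one premultiplied sample to PNG's non-premultiplied form."""
    max_value = (1 << bit_depth) - 1
    if not 0 <= alpha <= max_value:
        raise ValueError("alpha is outside the range for this bit depth")
    if alpha == 0:
        # Fully transparent: output black.
        return 0
    if not 0 <= sample <= alpha:
        raise ValueError("not valid premultiplied data: sample exceeds alpha")
    # ROUND(sample / alpha * max_value) in integer arithmetic.
    return (sample * max_value + alpha // 2) // alpha


half_covered = unpremultiply(100, 200)  # 128
```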

Suggested palettes can appear as sPLT chunks in any PNG
file, or as a PLTE chunk in truecolor PNG files. In either
case, the suggested palette is not an essential part of the image
data, but it may be used to present the image on indexed-color display
hardware. Suggested palettes are of no interest to viewers running on
truecolor hardware.

When sPLT is used to provide a suggested palette, it is
recommended that the encoder use the frequency fields to indicate the
relative importance of the palette entries, rather than leave them
all zero (meaning undefined). The frequency values are most easily
computed as "nearest neighbor" counts, that is, the approximate usage
of each RGBA palette entry if no dithering is applied. (These counts
will often be available for free as a consequence of developing the
suggested palette.) Because the suggested palette includes transparency
information, it should be computed for the uncomposited image.
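
One way to obtain nearest-neighbor counts is sketched below. The
squared Euclidean distance in RGBA space and the scaling of the counts
so that the largest one becomes 65535 are choices made for this
example, not requirements; the function names are hypothetical.

```python
def nearest_neighbor_counts(pixels, palette):
    """Count, for each palette entry, the pixels closest to it in RGBA space."""
    counts = [0] * len(palette)
    for pixel in pixels:
        distances = [
            sum((p - q) ** 2 for p, q in zip(pixel, entry)) for entry in palette
        ]
        counts[distances.index(min(distances))] += 1
    return counts


def scale_frequencies(counts, max_frequency=65535):
    """Scale raw counts so that the largest maps to max_frequency."""
    peak = max(counts, default=0)
    if peak == 0:
        return [0] * len(counts)
    return [count * max_frequency // peak for count in counts]


# Uncomposited RGBA pixels and a two-entry suggested palette.
pixels = [(255, 0, 0, 255)] * 3 + [(0, 0, 250, 255)]
palette = [(255, 0, 0, 255), (0, 0, 255, 255)]
counts = nearest_neighbor_counts(pixels, palette)
frequencies = scale_frequencies(counts)
```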

Even for indexed-color images, sPLT can be used to define
alternative reduced palettes for viewers that are unable to display all
the colors present in the PLTE chunk.

An older method for including a suggested palette in a truecolor
PNG file uses the PLTE chunk. If this method is used, the
histogram (frequencies) should appear in a separate hIST chunk.
Also, PLTE does not include transparency information, so for
images of color type 6 (truecolor with alpha channel), it is recommended
that a bKGD chunk appear and that the palette and histogram
be computed with reference to the image as it would appear after
compositing against the specified background color. This definition
is necessary to ensure that useful palette entries are generated for
pixels having fractional alpha values. The resulting palette will
probably be useful only to viewers that present the image against the
same background color. It is recommended that PNG editors delete or
recompute the palette if they alter or remove the bKGD chunk in
an image of color type 6.

For images of color type 2 (truecolor without alpha channel),
it is recommended that PLTE and hIST be computed
with reference to the RGB data only, ignoring any transparent-color
specification. If the file uses transparency (has a tRNS
chunk), viewers can easily adapt the resulting palette for use with
their intended background color. They need only replace the palette
entry closest to the tRNS color with their background color
(which may or may not match the file's bKGD color, if any).

If PLTE appears without bKGD in an image of color
type 6, the circumstances under which the palette was computed are
unspecified.

For providing suggested palettes, sPLT is more flexible than
PLTE in the following ways:

With sPLT, there can be multiple suggested palettes. A
decoder may choose an appropriate palette based on name or number of
entries.

In an RGBA (color type 6) PNG, PLTE represents a palette
already composited against the bKGD color, so it is useful only
for display against that background color. The sPLT chunk
provides an uncomposited palette, which is useful for display against
backgrounds of the decoder's choice.

Since sPLT is a noncritical chunk, a PNG editor can add
or modify suggested palettes without being forced to discard unknown
unsafe-to-copy chunks.

Whereas sPLT is allowed in PNG files of color types 0, 3,
and 4 (grayscale and indexed), PLTE cannot be used to provide
reduced palettes in these cases.

More than 256 entries can appear in sPLT.

An encoder that uses sPLT may choose to write a
PLTE/hIST suggested palette as well, for backward
compatibility with decoders that do not recognize sPLT.

For images of color type 3 (indexed color), filter type 0 (None) is
usually the most effective. Note that color images with 256 or fewer
colors should almost always be stored in indexed color format; truecolor
format is likely to be much larger.

Filter type 0 is also recommended for images of bit depths less than
8. For low-bit-depth grayscale images, it may be a net win to expand
the image to 8-bit representation and apply filtering, but this is rare.

For truecolor and grayscale images, any of the five filters may prove
the most effective. If an encoder uses a fixed filter, the Paeth filter
is most likely to be the best.

For best compression of truecolor and grayscale images, we recommend
an adaptive filtering approach in which a filter is chosen for each
scanline. The following simple heuristic has performed well in early
tests: compute the output scanline using all five filters, and select
the filter that gives the smallest sum of absolute values of outputs.
(Consider the output bytes as signed differences for this test.) This
method usually outperforms any single fixed filter choice. However, it
is likely that much better heuristics will be found as more experience
is gained with PNG.
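
The heuristic above might be sketched as follows, assuming the five
filter types behave as defined in the filter algorithms chapter. The
function names are illustrative, bpp stands for the number of bytes
per complete pixel (rounded up to 1), and the previous scanline is
passed as all zeroes for the first row.

```python
def paeth_predictor(a: int, b: int, c: int) -> int:
    """Pick whichever of left (a), above (b), upper left (c) is nearest a + b - c."""
    p = a + b - c
    pa, pb, pc = abs(p - a), abs(p - b), abs(p - c)
    if pa <= pb and pa <= pc:
        return a
    if pb <= pc:
        return b
    return c


def filter_scanline(filter_type: int, raw: bytes, prior: bytes, bpp: int) -> bytes:
    """Apply one of filter types 0 through 4 to a scanline."""
    if len(prior) != len(raw):
        raise ValueError("prior scanline must have the same length as raw")
    out = bytearray(len(raw))
    for i, value in enumerate(raw):
        a = raw[i - bpp] if i >= bpp else 0    # left
        b = prior[i]                           # above
        c = prior[i - bpp] if i >= bpp else 0  # upper left
        if filter_type == 0:
            predicted = 0
        elif filter_type == 1:
            predicted = a
        elif filter_type == 2:
            predicted = b
        elif filter_type == 3:
            predicted = (a + b) // 2
        elif filter_type == 4:
            predicted = paeth_predictor(a, b, c)
        else:
            raise ValueError("unknown filter type")
        out[i] = (value - predicted) & 0xFF
    return bytes(out)


def choose_filter(raw: bytes, prior: bytes, bpp: int):
    """Return (filter_type, filtered_bytes) with the smallest absolute sum."""
    def cost(filtered: bytes) -> int:
        # Interpret each output byte as a signed difference.
        return sum(v if v < 128 else 256 - v for v in filtered)

    candidates = [(t, filter_scanline(t, raw, prior, bpp)) for t in range(5)]
    # On a tie, min() keeps the lowest-numbered filter type.
    return min(candidates, key=lambda candidate: cost(candidate[1]))


# A smooth ramp on the first scanline favors the Sub filter (type 1).
best_type, best_bytes = choose_filter(bytes([10, 20, 30, 40]), bytes(4), 1)
```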

Filtering according to these recommendations is effective on
interlaced as well as noninterlaced images.

A nonempty keyword must be provided for each text chunk (iTXt, tEXt,
or zTXt). The generic keyword "Comment" can be used if no better
description of the text is available. If a user-supplied keyword is
used, be sure to check that it meets the restrictions on keywords.
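
A keyword check might look like the sketch below. The rules it encodes
(1 to 79 bytes, only Latin-1 character codes 32-126 and 161-255, no
leading, trailing, or consecutive spaces) are restated here from the
text chunk definitions, not from this chapter, so consult that section
as the authority. The function name is hypothetical.

```python
def is_valid_keyword(keyword: str) -> bool:
    """Check a text chunk keyword against the restrictions on keywords."""
    try:
        data = keyword.encode("latin-1")
    except UnicodeEncodeError:
        # Characters outside Latin-1 cannot appear in a keyword.
        return False
    if not 1 <= len(data) <= 79:
        return False
    if any(not (32 <= byte <= 126 or 161 <= byte <= 255) for byte in data):
        return False
    if data.startswith(b" ") or data.endswith(b" "):
        return False
    if b"  " in data:
        return False
    return True
```

An encoder could fall back to the generic keyword "Comment" when a
user-supplied keyword fails this check, or ask the user to correct it.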

Text stored in tEXt or zTXt chunks is expected
to use the Latin-1 character set.
Encoders should provide character code remapping if the local
system's character set is not Latin-1.
Encoders wishing to store characters not defined in Latin-1 should use
the iTXt chunk.

Encoders should discourage the creation of single lines of text
longer than 79 characters, in order to facilitate easy reading.

It is recommended that text items less than 1K (1024 bytes) in size
should be output using uncompressed text chunks. In particular,
it is recommended that the text associated with basic title and author
keywords should always be output with uncompressed chunks. Lengthy
disclaimers, on the other hand, are ideal candidates for compression.
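
The size rule could be applied as in this sketch, which returns the
chunk type and chunk data for a Latin-1 text item. The function name
and the threshold parameter are illustrative. The sketch assumes the
tEXt and zTXt layouts from the chunk specification, with compression
method 0 (deflate in a zlib stream) for zTXt.

```python
import zlib


def build_text_chunk(keyword: str, text: str, threshold: int = 1024):
    """Return (chunk_type, chunk_data), compressing only large text items."""
    key = keyword.encode("latin-1")
    body = text.encode("latin-1")
    if len(body) < threshold:
        # tEXt: keyword, null separator, uncompressed text.
        return b"tEXt", key + b"\x00" + body
    # zTXt: keyword, null separator, compression method byte, compressed text.
    return b"zTXt", key + b"\x00" + b"\x00" + zlib.compress(body)


title_chunk = build_text_chunk("Title", "Hello")
disclaimer_chunk = build_text_chunk("Disclaimer", "x" * 2000)
```

An encoder following the recommendation for title and author keywords
would bypass the threshold for those keywords and always write tEXt.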

Placing large text chunks after the
image data (after IDAT) can speed up image display in some
situations, since the decoder won't have to read over the text to get to
the image data. But it is recommended that small text chunks,
such as the image title, appear before IDAT.

Applications can use PNG private chunks to carry information that
need not be understood by other applications. Such chunks must be given
names with lowercase second letters, to ensure that they can never
conflict with any future public chunk definition. Note, however, that
there is no guarantee that some other application will not use the same
private chunk name. If you use a private chunk type, it is prudent to
store additional identifying information at the beginning of the chunk
data.

Use an ancillary chunk type (lowercase first letter), not a critical
chunk type, for all private chunks that store information that is not
absolutely essential to view the image. Creation of private critical
chunks is discouraged because they render PNG files unportable. Such
chunks should not be used in publicly available software or files.
If private critical chunks are essential for your application, it is
recommended that one appear near the start of the file, so that a
standard decoder need not read very far before discovering that it
cannot handle the file.
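
The letter-case conventions mentioned above can be tested with a
simple bit check, since lowercase ASCII letters have bit 5 (value 32)
set. The helper below is a hypothetical sketch, and "prVt" is a
made-up chunk name used only as an example.

```python
def chunk_properties(name: bytes) -> dict:
    """Report the ancillary and private properties of a chunk type name."""
    if len(name) != 4 or not name.isalpha():
        raise ValueError("chunk type must be four ASCII letters")
    return {
        "ancillary": bool(name[0] & 0x20),  # lowercase first letter
        "private": bool(name[1] & 0x20),    # lowercase second letter
    }


def is_recommended_private_name(name: bytes) -> bool:
    """True for private chunks that are also ancillary, as recommended."""
    properties = chunk_properties(name)
    return properties["private"] and properties["ancillary"]


example = chunk_properties(b"prVt")
```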

If you want others outside your organization to understand a chunk
type that you invent, contact the maintainers of the PNG specification
to submit a proposed chunk name and definition for addition to the list
of special-purpose public chunks (see Additional chunk types).
Note that a proposed public chunk name
(with uppercase second letter) must not be used in publicly available
software or files until registration has been approved.

If an ancillary chunk contains textual information that might be
of interest to a human user, you should not create a
special chunk type for it. Instead use a text chunk and define
a suitable keyword. That way, the information will be available to
users not using your software.

Keywords in text chunks should be reasonably
self-explanatory, since the idea is to let other users figure out what
the chunk contains. If of general usefulness, new keywords can be
registered with the maintainers of the PNG specification. But it is
permissible to use keywords without registering them first.

This specification defines the meaning of only some of the possible
values of some fields. For example, only compression method 0 and
filter types 0 through 4 are defined. Numbers greater than 127 must be
used when inventing experimental or private definitions of values for
any of these fields. Numbers below 128 are reserved for possible future
public extensions of this specification. Note that use of private type
codes may render a file unreadable by standard decoders. Such codes are
strongly discouraged except for experimental purposes, and should not
appear in publicly available software or files.