Pixel Loss from Saving Images in JPEG format

Michael D. Sullivan

The JPEG file interchange format[1] is frequently used both as a digital
camera output medium and as a final form for saving images. The JPEG
format is a "lossy" format, meaning that it does not save precise pixel
values, but instead saves data that allows reconstruction of a close
approximation of the original.[2] It's a very good approximation for
many types of images, having been developed to compress and reconstruct
photographic images. JPEG compression algorithms provide for variable
compression levels. In practice, this means that devices and
applications creating JPEG files can give the user the ability to
choose between greater image quality with a larger file size and lower
quality with a smaller file size.

Because JPEG is a lossy format, it does not permit the exact
reproduction of the original image even when the highest quality
setting is used. This means that "pixels are lost" every time an image
is saved in JPEG format; more precisely, the original RGB values of
pixels are discarded and not fully recovered. Every pixel is an
approximation of the original. The change in value of pixels from one
generation to the next is greatest, at any given compression level, in
areas of high contrast, particularly sharp edges and lines. The
approximations are least noticeable in areas of smooth color or
luminosity change.

Because of this, JPEG is not a file format that is well suited for
incremental saves of images being edited. Every time a JPEG file is
saved and then reopened, it becomes a slightly less accurate
reproduction of what has gone before.
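The save-and-reopen loss described above can be imitated on a single 8x8 block. The sketch below is not a real JPEG codec: it applies a discrete cosine transform, rounds every coefficient to a multiple of one uniform step (real encoders use per-frequency quantization tables, color conversion, and entropy coding), and then reconstructs the pixels. The step size of 16 and the two test blocks are assumptions chosen only for illustration.

```python
import math

N = 8  # JPEG processes the image in 8x8 blocks ("tiles")

def _scale(k):
    """Orthonormal DCT scaling factor for frequency index k."""
    return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

def _basis(i, k):
    """Cosine basis value for sample position i and frequency k."""
    return math.cos((2 * i + 1) * k * math.pi / (2 * N))

def dct_2d(block):
    """Forward 2-D DCT of an 8x8 block; result is indexed [u][v]."""
    return [[_scale(u) * _scale(v)
             * sum(block[x][y] * _basis(x, u) * _basis(y, v)
                   for x in range(N) for y in range(N))
             for v in range(N)] for u in range(N)]

def idct_2d(coeffs):
    """Inverse 2-D DCT; result is indexed [x][y]."""
    return [[sum(_scale(u) * _scale(v) * coeffs[u][v]
                 * _basis(x, u) * _basis(y, v)
                 for u in range(N) for v in range(N))
             for y in range(N)] for x in range(N)]

def lossy_round_trip(block, step):
    """Simulate one save/reopen: DCT, quantize, inverse DCT, clamp to 0..255."""
    coeffs = dct_2d([[p - 128 for p in row] for row in block])
    # Quantization is the lossy step: each coefficient snaps to a multiple of step.
    quantized = [[round(c / step) * step for c in row] for row in coeffs]
    restored = idct_2d(quantized)
    return [[min(255, max(0, round(v + 128))) for v in row] for row in restored]

def count_changed(a, b):
    """Number of pixel positions whose values differ between two blocks."""
    return sum(p != q for ra, rb in zip(a, b) for p, q in zip(ra, rb))

# Hypothetical test blocks: a smooth (constant) area and a sharp vertical edge.
flat = [[130] * N for _ in range(N)]
edge = [[40] * 4 + [200] * 4 for _ in range(N)]

flat_saved = lossy_round_trip(flat, step=16)
edge_saved = lossy_round_trip(edge, step=16)
edge_resaved = lossy_round_trip(edge_saved, step=16)

print("flat block, pixels changed:", count_changed(flat, flat_saved))
print("edge block, pixels changed on first save:", count_changed(edge, edge_saved))
print("edge block, further changes on resave:", count_changed(edge_saved, edge_resaved))
```

In this toy model the constant block survives the round trip unchanged, while the block containing a sharp edge does not, which is consistent with the observation that high-contrast areas change the most.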

One question that is frequently asked is how significant the loss of
pixel accuracy is, particularly with repeated saves. I conducted an
experiment to analyze this.

Next, I opened the JPEGs in Photoshop as new layers atop the original
image. Then, for each JPEG, I created a new layer depicting the
difference between the original and the JPEG. I did this by turning off
the visibility of all other layers, setting the JPEG's blending mode to
"difference", using a threshold adjustment layer to turn all pixels that
had changed by at least a specified threshold amount[4] solid white, and
then saving the result as a new layer. The thresholds used were
1, 2, 5, and 10 (where appropriate).
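The difference-and-threshold procedure can be approximated outside Photoshop. The following is a minimal sketch, assuming images are plain lists of (R, G, B) tuples and that the threshold is applied to a weighted luminosity of the difference. The Rec. 601 weights used here are an assumption; Photoshop's exact formula is not documented in this article. The function names and the four-pixel sample data are hypothetical.

```python
def luminosity(rgb):
    """Weighted luminosity of an (R, G, B) triple (Rec. 601 weights, assumed)."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b

def difference_mask(original, jpeg, threshold):
    """Mark each pixel True ("white") if its difference reaches the threshold."""
    mask = []
    for p, q in zip(original, jpeg):
        # "Difference" blending: per-channel absolute difference.
        diff = tuple(abs(a - b) for a, b in zip(p, q))
        mask.append(luminosity(diff) >= threshold)
    return mask

def percent_changed(mask):
    """Share of white pixels, as the histogram count would give it."""
    return 100.0 * sum(mask) / len(mask)

# Hypothetical four-pixel "images" for illustration only.
original = [(10, 10, 10), (200, 100, 50), (0, 0, 0), (255, 255, 255)]
jpeg     = [(10, 10, 10), (202, 101, 50), (0, 0, 1), (249, 249, 249)]

for threshold in (1, 2, 5, 10):
    mask = difference_mask(original, jpeg, threshold)
    print(f"threshold {threshold:>2}: {percent_changed(mask):.1f}% of pixels changed")
```

Note that the third pixel differs from the original in its blue channel but is not counted at threshold 1, because the weighted luminosity of that difference is below 1.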

The results are fairly astounding. When the image above is saved at
maximum quality (JPEG100), all of the pixels appearing white below
represent pixels whose RGB value changed by at least 1 unit:

The histogram palette, set to Luminosity, provides a count of the
pixels that are white, representing the pixels that had changed by at
least the specified threshold. Here are the results in tabular form
(click on a given result to see the full-sized depiction of the changed
pixels):

As expected, the higher quality levels had much smaller changes in
pixel value than the lower quality levels. What is more surprising is
the level of further change when the JPEG files are resaved at the same
quality level or one level higher or lower, and then reopened. Again,
click on a given result to see the full-sized depiction of the changed
pixels; no image files are included for the cumulative percentages of
pixels changed from the original.

Percentage of pixels changing value by ≥1 when resaved as same or
different JPEG quality, from the starting JPEG image, and the
cumulative change from the original image

As the table indicates, the percentage of pixels changing in a
second-generation save to the same JPEG quality level is much lower
than the percentage that changed when the file was first saved to JPEG,
but resaving the file at the next lower quality level or the next higher quality level
results in a much higher percentage of pixels changing value from one
generation to another than if the same quality level is used.

Although a relatively high number of pixels change value when a JPEG
is resaved, the number of pixels cumulatively changed from the original
image, surprisingly, does not increase significantly. This suggests
that the vast majority of the pixel loss in a second-generation save at
the same quality level affects the same pixels that had already changed
in the first save as a JPEG file. The relatively few portions of the
image that did not undergo any significant change when first saved as a
JPEG largely remain unchanged when resaved. The pixels that are
undergoing the repeated alterations upon resaving may be drifting
farther and farther from the original image, but it is beyond the scope
of this experiment to evaluate this. It is noteworthy, however, that
the pixel change patterns appear to be "clumpier" upon a resave, which
is evidence of "JPEG artifacts"; this is particularly evident when the
JPEG60 image is resaved at the same quality.
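The distinction between the per-generation change and the cumulative change can be made concrete with a small sketch. The pixel values below are invented single-channel samples, not data from the experiment, and the helper names are hypothetical. The point is only that a resave can alter many pixels without enlarging the set of pixels that differ from the original.

```python
def changed(a, b, threshold=1):
    """Per-pixel mask: True where the two images differ by at least threshold."""
    return [abs(x - y) >= threshold for x, y in zip(a, b)]

def percent(mask):
    """Percentage of True entries in a mask."""
    return 100.0 * sum(mask) / len(mask)

# Hypothetical single-channel pixel values for three generations.
original = [10, 20, 30, 40, 50, 60, 70, 80]
gen1     = [10, 22, 30, 41, 50, 57, 70, 80]  # first save as JPEG
gen2     = [10, 23, 30, 40, 50, 55, 70, 81]  # gen1 resaved and reopened

first_save = changed(original, gen1)   # change introduced by the first save
resave     = changed(gen1, gen2)       # generation-to-generation change
cumulative = changed(original, gen2)   # total change relative to the original

# Pixels altered by the resave that the first save had already altered.
overlap = sum(r and f for r, f in zip(resave, first_save))

print(f"first save: {percent(first_save):.1f}%")
print(f"resave:     {percent(resave):.1f}%")
print(f"cumulative: {percent(cumulative):.1f}%")
print(f"resave changes landing on already-changed pixels: {overlap} of {sum(resave)}")
```

In this made-up example half of the pixels change on the resave, yet the cumulative figure stays where it was after the first save, because most of the resave changes hit pixels that had already been altered and one pixel drifts back to its original value.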

Finally, the following table shows the relative file sizes[5] of the
first- and second-generation JPEG files containing this image,
including the amount of compression relative to the original 1,700,000
bytes of image data (measured as the reduction in size):

          Original JPEG      Resaved 100        Resaved 80         Resaved 60
          Bytes    Comp.     Bytes    Comp.     Bytes    Comp.     Bytes    Comp.
JPEG100   641,324  62.28%    642,178  62.22%    296,362  82.57%    N/A      N/A
JPEG80    295,996  82.59%    405,120  76.17%    295,954  82.59%    161,026  90.53%
JPEG60    166,812  90.19%    293,430  82.74%    204,175  87.99%    166,810  90.19%
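The compression figures are the reduction in size relative to the 1,700,000 bytes of uncompressed image data, that is, 100 × (1 − JPEG bytes / original bytes). The short sketch below recomputes them from the byte counts in the table; the function name and dictionary layout are illustrative choices, not part of the original experiment.

```python
ORIGINAL_BYTES = 1_700_000  # uncompressed image data, as stated in the text

def reduction_percent(jpeg_bytes, original_bytes=ORIGINAL_BYTES):
    """Size reduction relative to the uncompressed data, in percent."""
    return round(100.0 * (1 - jpeg_bytes / original_bytes), 2)

# Byte counts copied from the table (None marks the N/A entry).
sizes = {
    "JPEG100": {"original": 641_324, "resaved 100": 642_178,
                "resaved 80": 296_362, "resaved 60": None},
    "JPEG80":  {"original": 295_996, "resaved 100": 405_120,
                "resaved 80": 295_954, "resaved 60": 161_026},
    "JPEG60":  {"original": 166_812, "resaved 100": 293_430,
                "resaved 80": 204_175, "resaved 60": 166_810},
}

for quality, row in sizes.items():
    for label, size in row.items():
        shown = "N/A" if size is None else f"{reduction_percent(size):.2f}%"
        print(f"{quality:8} {label:12} {shown}")
```

For example, the first-generation JPEG100 file works out to 100 × (1 − 641,324 / 1,700,000) ≈ 62.28%.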

For persons wishing to examine the data in more detail, a
RAR-compressed copy of the Photoshop file, containing all layers, is
available by emailing me, using my initials (mds) at this domain as the
email address.

Notes

1. Strictly speaking, JPEG is not a file format but a compression
algorithm developed by the Joint Photographic Experts Group for the
ISO. It was approved as ITU Recommendation T.81 in 1992 and
subsequently published by the ISO as ISO/IEC 10918-1; it's available in
Adobe Acrobat (PDF) format from the ITU. JPEG compression is used in
several file formats.
The most common file format using JPEG compression is the JPEG File
Interchange Format (JFIF), which was created by Eric Hamilton of C-Cube
Microsystems for storing single images that have been JPEG compressed.
The current version of the JFIF specification is available in PDF
format from the W3C. JFIF files are commonly known as JPEG images or
JPEG files and typically have a .jpg or .JPG extension. For further
information about the JPEG file format, see the W3C page on JFIF, the
JPEG FAQ, and Wikipedia on JPEG.

2. In the simplest terms, JPEG compression stores parameters for
formulas that allow the construction of images that look much like the
original to the human eye, instead of storing pixel values. The image
is broken down into "tiles" (blocks of 8×8 pixels) that are represented
by formulas indicating the relative brightness and color of different
regions of each tile. Wikipedia provides a good summary of the
techniques used.

3. There are no standardized names or categories for the various JPEG
quality/compression/file size settings. As a result, every application
or device uses its own names for its own proprietary settings.
Photoshop even uses different terms in the main program and in Save for
Web. I settled on the terms used in Save for Web in Photoshop CS2.

4. Photoshop does not specify what units are used in the determination
of thresholds; it appears to be luminosity, which is determined by
combining R, G, and B values in accordance with a weighted formula.
Thus, the use of the threshold adjustment layer with a threshold value
of 1 will not necessarily detect all changed pixels, but only those
whose difference has a luminosity greater than or equal to 1; if a
pixel's R, G, and B levels change in such a way that the weighted
luminosity of the difference is less than 1, the pixel will remain
black even though it has changed. A more accurate determination of
pixel change would require creating separate threshold adjustment
layers for the red, green, and blue channels, and then combining them.
The method employed here is sufficient to illustrate the magnitude of
the pixel changes, but does not produce a mathematically rigorous
result.

5. The JPEG files were saved without profiles or metadata. Inclusion of
profiles and metadata would make the files significantly larger.
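The limitation described in note 4 can be illustrated with a short sketch comparing a luminosity-based test with a per-channel test. The luminosity weights (Rec. 601) are an assumption, since Photoshop's formula is not specified here, and both function names are hypothetical.

```python
def changed_by_luminosity(p, q, threshold=1):
    """True if the weighted luminosity of the difference reaches the threshold."""
    dr, dg, db = (abs(a - b) for a, b in zip(p, q))
    return 0.299 * dr + 0.587 * dg + 0.114 * db >= threshold  # assumed weights

def changed_by_channel(p, q, threshold=1):
    """True if any single channel differs by at least the threshold."""
    return any(abs(a - b) >= threshold for a, b in zip(p, q))

before = (100, 100, 100)
after = (100, 100, 101)  # only the blue channel changed, by 1

print(changed_by_luminosity(before, after))  # the luminosity test misses it
print(changed_by_channel(before, after))     # the per-channel test catches it
```

A per-channel test of this kind corresponds to the separate red, green, and blue threshold layers that note 4 suggests for a more rigorous count.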