A few scanning tips

www.scantips.com

What does JPG Quality Losses mean? What are JPG artifacts?

Lossless compression ensures that we read back out of the file exactly the same data that we wrote into it. Lossless is a necessary feature in software like our bank checking account program, or in our Word or Excel or text editor program. Zip files are always lossless, and some image formats are lossless (PNG and TIF LZW), but JPG is not lossless.
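As a minimal sketch of what "lossless" means in practice, the Python snippet below compresses some bytes with the standard zlib module (the same Deflate family of compression that Zip and PNG use) and checks that exactly the same bytes come back out. The sample data is made up for illustration, it is not a real image file.

```python
import zlib

# Made-up sample data: one RGB color repeated, like a featureless sky.
original = bytes([90, 140, 220]) * 10_000   # 30,000 bytes

compressed = zlib.compress(original, 9)     # 9 = strongest compression level
restored = zlib.decompress(compressed)

print(len(original), "bytes in,", len(compressed), "bytes stored")
print("identical after round trip:", restored == original)
```

The stored size is much smaller, yet the round trip is exact. No JPG Quality setting can make that same promise.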

JPG is lossy compression, meaning that to accomplish this truly heroic feat of shrinking the file data so extremely, the process has to take liberties with the data. Lossy compression means we don't get exactly the same thing back out of the file that we thought we put in. When taking the image data out of the JPG file (uncompressing), we always get back the same count of pixels, but those pixels may have slightly different colors (color is the only property of a pixel). One way the data differs is that JPG compression may lump an 8x8 pixel block of similar pixels (sky for example) together as all the same color now, which were not really all the same color before, simply because it is very efficient to do so. Another type of JPG artifact commonly seen is fringing around sharp edges (look around the added text in the larger examples below). Subsequently resampling the image smaller may help hide some of it. It may still be "good enough" quality, if we don't go too far (stick with Fine Quality). A pixel is just a color definition, the pixel data is RGB color data, and any image detail we see is just the difference in the colors of pixels. So this compression data loss is an image quality loss, and it can cause visible artifacts in the image (evidence of processing, which really should not be there).

Digital cameras usually have options for two very different ways to create smaller JPG files. One way is to resample the pixels, creating an image of smaller dimensions with fewer pixels (fewer megapixels). These image size choices are usually named Large, Medium, Small.

And for any of these selected image sizes, then there are different JPG Quality options named Fine, Normal, Basic which are file compression choices. The only purpose of these JPG Quality choices is to create smaller file size of lower image quality (of still the same selected image size). The better the JPG quality choice, the larger the file. Less JPG quality is a smaller file (of lower image quality). The camera's default is Large and Fine, which is the biggest image of best quality, the most versatile for the widest use, and very highly recommended. You cannot change your mind for a better quality image after you have already taken the picture. Maybe if you are certain you won't ever consider cropping or printing them, maybe only then maybe consider Medium size, maybe. But do keep Fine Quality.

Consider a 24 megapixel image (to have a number). In these RGB images, there are three bytes (JPG is 24-bit color) of red, green, blue data for every pixel, so this one is 24 million pixels x 3 bytes RGB per pixel = 72 million bytes (uncompressed, in computer memory). That is simply how large this RGB data is, 24 megapixels and 72 million bytes, which is about 68 MB.

When stored in the JPG file, camera JPG Fine Quality compression squeezes this Large 68 MB data down to about 12 MB file size (cameras can vary, this is from the Nikon D5200 manual, page 241, at Memory Card Capacity). This is an average size, JPG compression varies some with individual image detail content (more below).
Anyway, this Fine Quality JPG compression reduces this 24 megapixel JPG file size to 12MB/68MB = 18% of the original uncompressed data size. This is a drastic file size reduction, but Fine Quality is usually still considered acceptable, good-enough JPG quality.

The camera JPG compression named Normal Quality squeezes the file down to 6 MB, which is 6MB/68MB = 9% of original size. This is pretty small, sub-normal really, it has had the dickens squeezed out of it, and the image quality suffers more. Some people apparently don't care, but it depends on how critical your use is. And this same Large image as camera Basic Quality is 3 MB, or 4% of original data size. This is for when storage or transmission concerns are obviously greater than concerns about the image quality we actually have to look at. There is a little of it left, but a bigger JPG file is better quality, and smaller is ... not.
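The arithmetic for the 24 megapixel example can be sketched in a few lines of Python. The file sizes are the 12, 6 and 3 MB figures quoted above, and the function names are only illustrative. Like the text, the sketch rounds the data size down to a whole 68 MB (counting 1 MB as 1,048,576 bytes).

```python
BYTES_PER_PIXEL = 3  # 24-bit RGB: one byte each for red, green, blue

def uncompressed_bytes(megapixels):
    """Size of the RGB data in memory, in bytes."""
    return int(megapixels * 1_000_000) * BYTES_PER_PIXEL

def percent_of_data(file_mb, data_mb):
    """JPG file size as a percentage of the uncompressed data size."""
    return 100.0 * file_mb / data_mb

data_bytes = uncompressed_bytes(24)
data_mb = data_bytes // (1024 * 1024)   # whole megabytes, rounded down
print(data_bytes, "bytes is about", data_mb, "MB")

for name, file_mb in [("Fine", 12), ("Normal", 6), ("Basic", 3)]:
    print(name, round(percent_of_data(file_mb, data_mb)), "% of data size")
```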

The next images below (a small crop from a snapshot on a cruise ship) are saved with JPG compression varying from 20 to 100 JPG Quality, and also one PNG (lossless compression). TIF would look the same as PNG (both are lossless), but browsers typically don't show TIF files. The JPG files were saved with Irfanview 4.33 menu File - Save As - JPG. IMO, Faststone 4.6 gives the same results and sizes, with no differences I can notice - guessing that they may possibly share the same code library. However, the numbers are not absolute or unique, and since there are several parameters involved with JPG compression, different programs typically use different option parameters. The details are not just a single number, so the same numerical value is not necessarily the same exact effect in different programs. Irfanview JPG Quality runs 1..100, but Adobe Photoshop has two scales, 0..12 in Save As (more conservative, less compression), and another 0..100 in Save For Web, which is different again. Generally if using JPG for your good photo images, do start at a high value, like Quality 9 or 90.

To know and understand JPG losses, we of course must learn to recognize JPG artifacts. There are a couple of types of JPG artifacts. One is 8x8 pixel blocks seen in smooth featureless areas (pixels made to be the same one color in the block, which compresses better). Another is harsh artifacts around sharp edges. Due to these artifacts, some pixels now have different color values. Viewing at 3x actual size helps to learn to see JPG artifacts. They are real, regardless if you are aware or not.

This image is a small 300x300 pixel 100% crop (actual size) from a 4288x2848 12-megapixel RAW image (Nikon D300). All are of course identical copies of the same one pristine image (from RAW, it had never seen JPG before), just with different text and JPG compression added. A PNG image is added for comparison, because PNG uses lossless compression, no JPG artifacts (and browsers can show PNG).

Most of the images above may look decent enough at first glance. Repeated next are the SAME seven files, shown a second time from your browser's cache (not copies, but exactly the same small image files), now with browser instructions to enlarge them 3x for a better look at the artifacts (this interpolation is necessarily a blurring operation, but all suffer the same).

Starting off at compression 20, you can easily see the 8x8 pixel blocks in smooth areas, and the false fringing around sharp edges. These are two types of JPG artifacts (due to JPG compression, and which are not a good thing). This is the meaning of JPG losses... quality losses. All of the detail we see in images is color variations of the pixels. Artifacts cause a different color (to be visible), so data losses are image quality losses.

This is how the images came out of the JPG files (but enlarged now, by interpolation, to recognize it easier). All were the same one image going into the files. The differences (between what we put in, and what we get out), is called "losses", due to JPG artifacts caused by lossy compression. Lossy compression means the process is allowed to change the data a little (the RGB data is the color of each pixel) to make compression more effective, simply ignoring difficult details - and the file can become quite small that way. There are of course diminishing differences with higher Quality, but the PNG just looks sort of "pure" (lossless). The added text is all the same font and the same identical operation for all, but PNG just comes out more pure. In contrast, lossless compression cannot be as effective making a tiny file, because it must honor higher standards, and absolutely ensure no pixel color value is altered in any way - what comes out of the file is an exact copy of the data that went into the file.
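The idea of "changing the data a little to make compression more effective" can be shown with a toy Python sketch. This is NOT the real JPG algorithm (JPG quantizes frequency coefficients of each 8x8 block, not pixel values directly). It only assumes the general principle: rounding values to a coarser step makes the data more repetitive, so it compresses smaller, but the original values cannot be recovered. The row of values is made up.

```python
import zlib

def quantize(values, step):
    """Round each 0..255 value to the nearest multiple of step (a lossy operation)."""
    return [min(255, round(v / step) * step) for v in values]

# Made-up row of 256 "sky" values: a slow gradient with small ripples.
row = [180 + (i % 7) + i // 16 for i in range(256)]
coarse = quantize(row, 16)

changed = sum(1 for a, b in zip(row, coarse) if a != b)
print("values altered by quantizing:", changed, "of", len(row))
print("compressed size, original row: ", len(zlib.compress(bytes(row))))
print("compressed size, quantized row:", len(zlib.compress(bytes(coarse))))
```

The quantized row compresses smaller, but most of its values are no longer what went in. That difference is the "loss".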

The examples above show what is meant by JPG losses. It is about losses of quality - meaning, the pixels that come out of the file are simply not as good to look at as the pixels that went into the file. The image still has the same count of pixels, and the size is still three bytes per pixel when opened again, etc, but now some pixels have altered colors, called JPG artifacts... like dirty pixels, which is quality losses. Remember, artifacts continue to accumulate, getting worse, every time you edit and Save the file as JPG again. Each Save is lossy compression again. And this loss is not repairable, so be aware, don't do this repeated times. There are better plans.

Make no mistake - JPG 100 is NOT 100%, Not a percentage of anything. It is always still JPG, and JPG Quality 100 is NOT a perfect copy. Above is the pixel difference in the original JPG 100 image and the PNG image, as shown by the Adobe Photoshop CS5 Calculations tool, when comparing only the blue channels (red channels look essentially the same as this, but the green channels are pretty dark in this one). This is enlarged some, but is shown at default brightness, the actual difference, no fudged adjustments. Seems to me the differences are in the anti-aliasing on the edges of objects (almost like a line drawing). These were all the same image with the same pixels, until compression, so my notion is that JPG apparently didn't consider saving all of these infrequent colors as important. Faststone has a menu that counts colors in images, and reports the PNG image has 35,046 colors, and the JPG 100 image has 23,368 colors (only 2/3 as many now.) There is more color count in the lower JPG values, which are the artifacts probably. Photoshop JPG does the same reduction of colors, in about the same degree. The JPG 100 file, when saved again now as PNG, is nearly 5% smaller than the original PNG file, because its modified data is easier to compress now. JPG 100 is a relative number, it does NOT mean 100%, and it is NOT lossless (but yes, 100 is still pretty good).
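The two measurements mentioned here (counting distinct colors, and looking at the per-pixel difference in one channel) are simple to sketch in Python. This is not how Faststone or Photoshop do it internally. The pixel lists are tiny made-up examples, and in real use they would come from an image library, which is not shown.

```python
def count_colors(pixels):
    """Number of distinct (R, G, B) values in a sequence of pixels."""
    return len(set(pixels))

def channel_difference(pixels_a, pixels_b, channel):
    """Absolute per-pixel difference in one channel (0=red, 1=green, 2=blue)."""
    return [abs(a[channel] - b[channel]) for a, b in zip(pixels_a, pixels_b)]

# Made-up data: four pixels before and after a hypothetical lossy save.
before = [(10, 20, 30), (10, 20, 31), (200, 10, 10), (201, 10, 12)]
after = [(10, 20, 30), (10, 20, 30), (200, 10, 10), (200, 10, 10)]

print(count_colors(before), "colors before,", count_colors(after), "after")
print("blue channel difference:", channel_difference(before, after, 2))
```

Any nonzero value in the difference list means that pixel did not come back exactly as it went in.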

If you start your comparisons from an original JPG image, you may not see this much difference remaining (you have no base of reference then), but the image here started out as RAW, it had never seen JPG before. This may be minor visually, but there is no way we can argue that they are still identical images. We might argue "good enough", and Quality 90 probably is good enough (few artifacts still very visible). However, if this is the master copy of our prize image, we may not be willing to consider any argument at all about quality. But JPG Quality 100 is pretty good, and the file sure is small. I would have no concern about sending the JPG 90 or 100 images out to be printed (printers skip over small details too), but not to keep as my archived master copy, which might be edited or resampled again. The actual worry is about creating more differences next time it is saved as JPG. JPG artifacts accumulate at every Save as JPG (compressed again at every Save).

The only point here is that even JPG Quality 100 is definitely always still JPG, and it is never going to be lossless compression. In contrast, lossless means every pixel's data value (color) comes back out of the file exactly the same as what went in - no artifacts. That's always a good thing, if quality is a major concern. If you have read this far, I assume you are concerned about image quality.

Speaking on a 1..10 scale (Adobe), JPG Quality 4 might sometimes be "good enough" for some unimportant web page images, especially back in dial-up days. We could debate it, but Quality 10 may Not be good enough to archive your prized best pictures, at least not for mine. At bare minimum, for your good stuff, 1) you should always use a high value of JPG Quality, if not 10, at least 9. And 2) you should minimize any occurrence of additional Saves as JPG (each Save compresses again, and losses accumulate). When emailing pictures of the kids to Grandma, sure, resample a small copy for video, and JPG 8 is good enough for email. That is an expendable copy, but archives should be the good stuff.

The degree of JPG compression varies with the saving program, and varies with the JPG Quality factor selected, and also varies with individual image scene detail. Sort a large folder of several dozen camera images by file size (many varied JPG image scenes, but all with the same camera image size and quality settings), and the file sizes will probably vary over at least a 2:1 range (possibly much more, depending on scene content). Large blank featureless areas, like sky and walls, compress much smaller than extremely detailed areas, like a tree full of leaves.
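If you would rather let the computer do the sorting, here is a small Python sketch using only the standard library. The function names are made up for illustration, and it assumes all the JPG files in the folder came from the same camera settings.

```python
from pathlib import Path

def jpg_sizes(folder):
    """Return (size_in_bytes, name) for each .jpg/.jpeg file, smallest first."""
    files = [p for p in Path(folder).iterdir()
             if p.is_file() and p.suffix.lower() in (".jpg", ".jpeg")]
    return sorted((p.stat().st_size, p.name) for p in files)

def size_range_ratio(sizes):
    """Largest file size divided by smallest (2.0 means a 2:1 range)."""
    if not sizes:
        raise ValueError("no JPG files found")
    return sizes[-1][0] / sizes[0][0]
```

Calling jpg_sizes on your own folder of camera images and passing the result to size_range_ratio shows how wide the range is for your scenes.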

For an extreme example, here are four images, all the same size in pixels and all saved with the same high JPG Quality, but with a large difference in JPG file size due to detail.
All are 450x300 x 3 bytes per pixel = 405,000 bytes = 395 KB (uncompressed when opened in memory), but the same JPG Quality varies from 2.9:1 to 21:1 compression ratio in the JPG file.
So... image detail level has effect, and JPG file size reduction is not necessarily always a meaningful measure of JPG image quality.

Much small detail everywhere, 140 KB JPG file, JPG 2.9:1

Fog, almost devoid of detail, 19 KB JPG file, JPG 21:1

Varied detail, a more "average" scene, 94 KB JPG file, JPG 4.3:1

Varied detail, a more "average" scene, 80 KB JPG file, JPG 5:1
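The compression ratios in these captions are just the uncompressed data size divided by the JPG file size. A rough Python sketch follows. It assumes the caption sizes mean thousands of bytes (the exact byte counts are not given), which lands close to the quoted ratios.

```python
def compression_ratio(width, height, jpg_bytes, bytes_per_pixel=3):
    """Uncompressed RGB data size divided by the JPG file size."""
    return (width * height * bytes_per_pixel) / jpg_bytes

# File sizes from the captions, in KB (assumed here to mean 1000 bytes).
examples = [("much small detail", 140), ("fog", 19),
            ("varied detail", 94), ("varied detail", 80)]

for label, kb in examples:
    ratio = compression_ratio(450, 300, kb * 1000)
    print(f"{label}: {kb} KB, about {ratio:.1f}:1")
```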

That said (different scenes vary in file size, and programs vary too), and so speaking very generally, and barring extremes, an "average" JPG file size ought to be at least 10% of the data size to be "decent", but 20% is better quality (we should like higher quality). I'd suggest JPG Quality 9 for your good stuff. I would consider High Quality as being 20% of data size (24-bit data size is three bytes per pixel). Repeated JPG saves are bad news, and if you ever saved it ONCE at lower quality, then lower is the best you've got now. The larger you can make the JPG file, the better quality the image will be (and the largest JPG is still tiny, compared to the size of the data). And BTW, excessive USM sharpening is another factor that can cause false edges, and sharpening aggravates the JPG artifacts around sharp edges. If you want the smallest file with fewer artifacts, blur it the slightest amount before saving it. Routine sharpening enhances detail, and can make the JPG file a little larger.
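This 10% / 20% rule of thumb is easy to check for any file. Here is a hedged Python sketch. The function name and the labels are made up, and the 6000x4000 dimensions and 12,000,000 byte file are only an assumed example of a 24 megapixel image.

```python
def rate_jpg_size(width, height, jpg_bytes, bytes_per_pixel=3):
    """Classify a JPG file size against the 10% / 20% rule of thumb."""
    data_bytes = width * height * bytes_per_pixel
    percent = 100 * jpg_bytes / data_bytes
    if percent >= 20:
        label = "high"
    elif percent >= 10:
        label = "decent"
    else:
        label = "below decent"
    return percent, label

# Assumed example: a 6000x4000 (24 megapixel) image stored as a 12,000,000 byte JPG.
percent, label = rate_jpg_size(6000, 4000, 12_000_000)
print(f"{percent:.1f}% of data size: {label}")
```

Remember the "barring extremes" part: a scene with almost no detail (like fog) can fall well under 10% even at high JPG Quality, so treat the label as a rough guide, not a measurement of quality.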

JPG Quality 10 is pretty good indeed, but it is still JPG. JPG is used where small file size is more important than absolute image quality, like web pages or email, or small memory cards. Conversely, if absolute image quality is important, why consider JPG at all? JPG Quality 8 to 10 may be "good enough" for most "viewing" uses, EXCEPT there is always the distinction that JPG is not good for editing, involving saving repeated times - which accumulates and compounds the JPG artifacts each time saved as JPG. Your images are either important or not, and maybe absolute quality is not your biggest concern, but if we take liberties, experience knows the time will always come when it will matter. We cannot undo JPG damage.

There are always ifs and buts, difficult to quantify. There are lossless methods to rotate or flip JPG images, rearranging and rotating the 8x8 pixel blocks without uncompressing and recompressing (Irfanview plugins offer this), so there are no additional losses in those special lossless cases. Applications like Photoshop take heroic pains to try not to recompress image areas with no change (when possible, and it still has to be the same Quality level too). But bottom line, saving JPG again is a little like Russian Roulette - every time may not get you, but there certainly are risks, and when it does bite us, it always seems at the worst possible time.

Some people really never notice much difference, so then about anything is "good enough" (until it bites them bad, the only copy of their prized image is now full of JPG artifacts - from repeated edits and saves, and/or due to specifying a Quality factor too low). There are other more critical photographers (we all probably have been bitten before, once, at first) thinking that even the very best quality is barely sufficient for THEIR images. We cannot reason with either group. :) Certainly we cannot have the same discussion with both. Risk of course depends on how important your images are, and there are times and places for both approaches, but I lean towards the latter group. Photography is too much time and money invested trying for greater quality, so I really don't see any reason to intentionally select less than maximum quality. $200 for a huge disk is nothing compared to the (tens of) thousands of dollars of photo gear (plan on disk backup too). Certainly there are always a few exceptions, but regarding my more important images, I'm not interested in any debate at all about how little it might hurt. Why accept any risk? Internet transmission speed certainly is a size consideration, but which does not involve any of my archived master copies. Today, inexpensive disk space rules out any reason to compromise my image quality, certainly not my original copies. Indeed, there are better ways to go about it.

The Best Plan When We Need JPG

For those who critically care about their images, the best plan, and actually, the easy way, is first, to always keep and archive your unedited original image as the best master you will ever have. Otherwise, you can never get it back, so never write over it - always keep that original intact for just in case (whatever it is, especially if it is a JPG from the camera - and the camera should of course be set to create the finest possible image it can.)

Then second, when editing (saving only to a copy with a different file name, always preserving the original intact), always save your in-work image ONLY to a lossless format (TIF or PNG), EVERY time, UNTIL your last necessary FINAL one JPG save (at high quality level). For example, Photoshop, and the free editors Irfanview and Faststone (see Google), have batch modes to copy many files from JPG to TIF in one easy operation.
This would be pointless unless you are intending to edit them, because this TIF step will NOT remove any existing JPG artifacts, the data will of course still contain those original JPG artifacts - but subsequent TIF Saves will not add any more.
Computers and disks are big and fast and inexpensive today, so this larger file size is a small issue (and that is in fact simply how big your data is). Then as lossless TIF files, you can edit away, red eye and color and contrast adjustments, straightening and cropping and resampling, and saving again with wild abandon (just NOT as JPG), until finished, at that needed One Final JPG Save. So if it was JPG previously, now the total is only two saves as JPG (original camera save, and this one final save, is two), which is worse than one, but much better than six. Using camera RAW images would also eliminate the first JPG, and offer other advantages too.

And then third, if any further need to edit it again comes up (resample size, whatever, anything requiring another Save As JPG), discard this second JPG (as an expendable copy), and start over from your better archived master you kept. Avoid using any image saved repeated times as JPG. JPG is lossy, which means we do not get back the same quality we put in. There are more losses every time it is saved as JPG. With lossless formats (PNG or TIF LZW), it does not matter if you save a jillion times. But it matters if saving to JPG. An extra unthinking save as JPG is never a good plan. And if it overwrites your only original copy, it is a terrible plan, you can never get it back.

Batch processing: Speaking of the philosophy of expendable JPG, and assuming you keep a high quality lossless archive, then there are easy ways to run off batches of expendable JPG for one-use viewing, temporary JPG copies sized for monitor viewing, or for HDTV viewing, or to upload to be printed, etc. Several programs have batch modes to do this.

Irfanview editor has its menu File - Batch Conversion/Rename. Free from the internet.

Of these two free programs, IMO, Faststone is the better editor (more conventional and complete), but I like Irfanview as the better viewer (less cluttered).

Here is the trick: We always resample using "Preserve Aspect Ratio", so generally, if you want say 1800x1200 pixel size (to print 6x4 inches), you can specify the larger target dimension twice, 1800x1800, and then regardless of whether the batch contains mixed landscape or portrait shapes, the largest dimension will be 1800, as appropriate. Aspect ratio controls the smaller dimension, which can be however you prepared it.

Or the Photoshop batch has its Resize to Fit option, and the Faststone batch has its Switch Width with Height option to do this same trick more overtly. Computer speed today makes it trivial to just run off whatever size you want at the moment, and then discard those expendable JPG after that one use. You always have your high quality lossless archive copy, and can do this at will.
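The square-box trick described above can be sketched in Python as a plain calculation of the output size. This is only the arithmetic, not how any of these programs implement their batch resize, and the 6000x4000 source dimensions are just an assumed example.

```python
def fit_within(width, height, box_w, box_h):
    """Scale (width, height) to fit inside the box, preserving aspect ratio.

    Note: as written, this would also enlarge an image smaller than the box.
    """
    scale = min(box_w / width, box_h / height)
    return round(width * scale), round(height * scale)

# Same 1800x1800 target for both orientations of an assumed 3:2 camera image.
print("landscape:", fit_within(6000, 4000, 1800, 1800))
print("portrait: ", fit_within(4000, 6000, 1800, 1800))
```

Either way the long side comes out as 1800 pixels, and the aspect ratio sets the short side.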