New cjpeg features

The -quality option has been extended to support separate quality settings for
luminance and chrominance (or, in general, for every provided quantization table slot).
This feature is useful for high-quality applications which cannot accept the damage
done to color data by coarse subsampling settings. You can now reduce the amount of
color data more smoothly and with finer control, without separate subsampling.
The resulting file is fully compliant with standard JPEG decoders.

A new -scale option is provided with cjpeg which complements the corresponding djpeg
-scale option. The supported scaling factors are 8/N for all N=1...16.
This means you can now easily alter the nominal spatial resolution of a given source
image while compressing to JPEG, without additional resampling.
For example, if you have an image sensor providing an effective capture resolution of
2268x1512 pixels (HI resolution), you can now directly generate a MED resolution of
1512x1008 pixels (-scale 2/3) and a LOW resolution of 1134x756 pixels (-scale 1/2)
from the sensor source resolution with the library during compression.
(An efficient 12x12 FDCT is used in the -scale 2/3 case, and an efficient 16x16 FDCT
is used in the -scale 1/2 case instead of the standard 8x8 FDCT inside the library,
ensuring high-quality downscaled results - the resulting file is fully compliant with
standard JPEG decoders.)
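
The arithmetic behind this example can be checked with a few lines of C. This is
only a minimal sketch and not library code: the helper names scaled_dim() and
fdct_size_for_scale() are hypothetical, and rounding fractional results up is an
assumption.

```c
#include <stdio.h>

/* Hypothetical helper: image dimension after scaling by num/denom.
 * Assumption: fractional results are rounded up. */
unsigned long scaled_dim(unsigned long dim, unsigned num, unsigned denom)
{
    return (dim * num + denom - 1) / denom;
}

/* Hypothetical helper: FDCT input block size N for a scale factor
 * num/denom that equals 8/N with N in 1..16; returns 0 otherwise. */
unsigned fdct_size_for_scale(unsigned num, unsigned denom)
{
    unsigned n;

    if (num == 0 || (8u * denom) % num != 0)
        return 0;                       /* not expressible as 8/N */
    n = 8u * denom / num;
    return (n >= 1 && n <= 16) ? n : 0;
}

/* Print the MED and LOW resolutions derived from the HI resolution. */
void print_scaled_sizes(void)
{
    printf("-scale 2/3: %lux%lu, %ux%u FDCT\n",
           scaled_dim(2268, 2, 3), scaled_dim(1512, 2, 3),
           fdct_size_for_scale(2, 3), fdct_size_for_scale(2, 3));
    printf("-scale 1/2: %lux%lu, %ux%u FDCT\n",
           scaled_dim(2268, 1, 2), scaled_dim(1512, 1, 2),
           fdct_size_for_scale(1, 2), fdct_size_for_scale(1, 2));
}
```

Both factors reduce to the 8/N form (2/3 = 8/12, 1/2 = 8/16), which is where the
12x12 and 16x16 block sizes come from.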

Application notes

Note that the -quality ratings refer to the quantization table slots, and that the last
value is replicated if there are more q-table slots than parameters. The default q-table
slots are 0 for luminance and 1 for chrominance with default tables as given in the
JPEG standard. This is compatible with the old behaviour when only one parameter
is given, which is then used for both luminance and chrominance (slots 0 and 1).
Additional or custom quantization tables can be set with -qtables and assigned to
components with the -qslots parameter.
CAUTION: You must explicitly add -sample 1x1 for efficient separate color
quality selection, since the default value used by the library is 2x2!
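
The replication rule for the ratings can be illustrated as follows. This is a
sketch under stated assumptions, not the code from rdswitch.c: the helper name
expand_quality_ratings() is hypothetical, NUM_SLOTS is assumed to match the
library's NUM_QUANT_TBLS, and the fallback rating of 75 for an empty list is an
assumption as well.

```c
#include <stddef.h>

#define NUM_SLOTS 4   /* assumed to match NUM_QUANT_TBLS in the library */

/* Hypothetical helper: spread `count` quality ratings over all q-table
 * slots, repeating the last rating for slots without a rating of their own. */
void expand_quality_ratings(const int *ratings, size_t count,
                            int slots[NUM_SLOTS])
{
    size_t i;

    for (i = 0; i < NUM_SLOTS; i++) {
        if (count == 0)
            slots[i] = 75;   /* assumed fallback when no rating is given */
        else
            slots[i] = ratings[i < count ? i : count - 1];
    }
}
```

With the ratings 90,70 this yields 90 for slot 0 and 70 for slots 1 to 3; with a
single rating, every slot receives the same value, which is the old behaviour.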

In library use, you set (after the usual jpeg_set_defaults() call) the new
q_scale_factor[] fields of the jpeg_compress_struct and then call the new API function
jpeg_default_qtables(cinfo, force_baseline) to use the default tables with the given
q_scale_factor[] values.
For custom q-tables you use jpeg_add_quant_table() with scale parameters as before.
See also the new rdswitch.c and cjpeg.c for examples.
Note that the q_scale_factor[] fields are the "linear" scales, so you have to convert
from user-defined ratings via jpeg_quality_scaling().
Here is example code which corresponds to cjpeg -quality 90,70:
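
The following is a hedged sketch rather than the original listing. The library
call sequence is given in the comment, using only the names mentioned above. The
function quality_to_linear_scale() is a hypothetical stand-in for
jpeg_quality_scaling() so that the conversion to "linear" scales can be tried
without the library; its mapping is assumed to mirror the library's, and the
library function remains authoritative.

```c
#include <stdio.h>

/* Hypothetical stand-in for jpeg_quality_scaling(): maps a 1..100 rating
 * to a percentage ("linear") scale factor for the default tables.
 * Assumption: this mirrors the library's mapping. */
int quality_to_linear_scale(int quality)
{
    if (quality <= 0)
        quality = 1;
    if (quality > 100)
        quality = 100;
    return (quality < 50) ? 5000 / quality : 200 - quality * 2;
}

/* In library use, the sequence described above would read:
 *
 *   jpeg_set_defaults(cinfo);
 *   cinfo->q_scale_factor[0] = jpeg_quality_scaling(90);    luminance
 *   cinfo->q_scale_factor[1] = jpeg_quality_scaling(70);    chrominance
 *   jpeg_default_qtables(cinfo, force_baseline);
 */
void print_linear_scales(void)
{
    printf("luminance   rating 90 -> scale %d\n", quality_to_linear_scale(90));
    printf("chrominance rating 70 -> scale %d\n", quality_to_linear_scale(70));
}
```

As with the command line, the chroma sampling factors should also be set to 1x1
for the separate chrominance rating to be efficient, since the library default
is 2x2.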

For cjpeg -scale you just set the scaling factor as with djpeg, but with the
complementary range 8/N (N=1...16). In library use, you set it via the new
scale_num/scale_denom fields of the jpeg_compress_struct:

cinfo->scale_num = 8;
cinfo->scale_denom = N;

Note the importance of always calling jpeg_set_defaults() first, since this assigns
defaults to the new fields in case you don't set them explicitly (1/1 scaling here).

Note that the scaling options of cjpeg and djpeg are complementary in that cjpeg
-scale M/N followed by djpeg -scale N/M gives the original image resolution
(with possible rounding deviations).
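
The rounding deviations can be made concrete with a small sketch. The helper
names are hypothetical, and rounding up on both the compress and the decompress
side is an assumption made for illustration.

```c
/* Hypothetical helper: scale a dimension by num/denom.
 * Assumption: fractional results are rounded up. */
unsigned long scale_round_up(unsigned long dim, unsigned num, unsigned denom)
{
    return (dim * num + denom - 1) / denom;
}

/* Compress-side scale num/denom followed by the reciprocal
 * decompress-side scale denom/num. */
unsigned long scale_round_trip(unsigned long dim, unsigned num, unsigned denom)
{
    return scale_round_up(scale_round_up(dim, num, denom), denom, num);
}
```

Under these assumptions a width of 1512 survives the 2/3 and 3/2 round trip
unchanged (1512 -> 1008 -> 1512), while a width of 100 comes back as 101
(100 -> 67 -> 101), because 100 is not divisible by 3.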

The downscaling options of cjpeg can thus be thought of as a further compression
option/parameter in that you can now directly choose between reduction (quantization)
in the DCT domain and reduction (resolution downscale) in the spatial domain.
It is remarkable that all that works with the standard 8x8 DCT JPEG system.

Note that the high cjpeg scalings 8/1 and 4/1 are interesting from one particular
standpoint:
Since the 1x1 and 2x2 FDCTs/IDCTs used here do not involve any transcendent or
fractional multiplication (with appropriate quantization), these modes can be used to
simulate a true lossless coding scheme within the standard DCT JPEG system!
In particular, cjpeg -scale 8/1 just sets each source sample as DC value
(properly scaled) in the 8x8 DCT block and all ACs zero.
While not very interesting for scaling purposes, you can decode that image
with djpeg -scale 1/8 (which just derives each output sample from the
DC value) and get an exact copy of the source image (with appropriate
quantization, which should be 8 for the DC in this case, and avoiding
YCbCr/RGB conversion).
This should also work under similar conditions (q-value=2 for the DC and 3 ACs)
with the cjpeg -scale 4/1 and djpeg -scale 1/4 case, because the 2x2
FDCT/IDCT are just simple sum/diff schemes without transcendent or
fractional multiplication.
It remains to be investigated how the entropy coding parameters can be adapted in these
cases for efficient lossless image coding...
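
The reason why these two cases can be exact is easy to see in a small model.
The following is an illustrative sketch only and not the library's FDCT/IDCT
code: the function names are hypothetical, and the normalization (DC equal to
8 times the level-shifted sample in the 1x1 case, and coefficients reduced to
plain sums and differences after quantization in the 2x2 case) is an assumption.

```c
/* 1x1 case: the sample becomes the DC value (assumed scaled by 8),
 * is quantized and dequantized with q = 8, and decoded again at 1/8. */
int dc_round_trip(int sample)
{
    int dc = 8 * (sample - 128);   /* level shift and assumed DC scaling */
    int level = dc / 8;            /* quantization with q = 8, exact */
    return (level * 8) / 8 + 128;  /* dequantize, 1/8 decode, undo shift */
}

/* 2x2 case, forward: sums and differences of the block
 *   a b
 *   c d                                                   */
void sumdiff_forward(const int in[4], int out[4])
{
    int a = in[0], b = in[1], c = in[2], d = in[3];

    out[0] = a + b + c + d;   /* DC term */
    out[1] = a - b + c - d;   /* horizontal difference */
    out[2] = a + b - c - d;   /* vertical difference */
    out[3] = a - b - c + d;   /* diagonal difference */
}

/* 2x2 case, inverse: every numerator is an exact multiple of 4,
 * so the integer division loses nothing. */
void sumdiff_inverse(const int in[4], int out[4])
{
    int s = in[0], h = in[1], v = in[2], d = in[3];

    out[0] = (s + h + v + d) / 4;
    out[1] = (s - h + v - d) / 4;
    out[2] = (s + h - v - d) / 4;
    out[3] = (s - h - v + d) / 4;
}

/* Returns 1 if a 2x2 block of 8-bit samples is reproduced exactly. */
int sumdiff_round_trip_ok(int a, int b, int c, int d)
{
    int in[4], coef[4], out[4], i;

    in[0] = a - 128; in[1] = b - 128; in[2] = c - 128; in[3] = d - 128;
    sumdiff_forward(in, coef);
    sumdiff_inverse(coef, out);
    for (i = 0; i < 4; i++)
        if (out[i] != in[i])
            return 0;
    return 1;
}
```

No step in either path multiplies by a non-integer constant, so nothing is lost
as long as the quantization step divides the coefficients exactly.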

Implementation notes

The -quality extension adds an array q_scale_factor[NUM_QUANT_TBLS] to the
jpeg_compress_struct for scale factor definition per quantization table slot.
rdswitch.c adds a function set_quality_ratings() for processing the new quality-ratings
parameter string.
The library API adds a function jpeg_default_qtables() for setting up default
quantization tables with separate scaling factors.

Adding -scale support to cjpeg was a major effort since there was no provision
for scaled DCT in the compress part of the library compared to djpeg and decompress.
I think that I've managed to get this together.

A new encoder capability option DCT_SCALING_SUPPORTED has been added in jmorecfg.h to
enable the new scaling support. Note it requires DCT_ISLOW_SUPPORTED.

The core implementation was done by extending the file jfdctint.c.
This file is HUGE now (160 KB, beware) and contains a lot of optimized FDCT
routines with various input sample block sizes.
See the comment at the top of this file for more details about method and implementation.

The interface (prototype) of the core FDCT functions had to be changed for support of
variable input sample block sizes, and the DCT manager module now controls a private
array of separate FDCT function pointers per component.
Much of the logic is similar (complementary) to the decompress part.

Note that the new 16x16 FDCT routine is also used for efficient resolving of the
common 2x2 chroma subsampling case without additional spatial downsampling during
normal (1/1) encoding in the library.
Separate spatial downsampling for those kinds of files is now only necessary for the
-scale 8/N cases with N>8.

Furthermore, separate FDCT routines are provided for direct resolving of the
common asymmetric subsampling cases (2x1 and 1x2) without additional resampling
and complementing the corresponding IDCT methods for decoding.
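
How the per-component FDCT sizes relate to the sampling factors can be sketched
with a simplified model. This is an assumption-laden illustration, not the
library's selection logic: the function name component_fdct_size() is
hypothetical, and the model only reproduces the cases named above.

```c
#define MAX_FDCT_SIZE 16   /* largest FDCT block size mentioned in the text */

/* Hypothetical model: FDCT input size along one axis for one component.
 *   base      - block size N implied by -scale 8/N (8 for 1/1 encoding)
 *   max_samp  - largest sampling factor of all components on this axis
 *   comp_samp - sampling factor of this component on this axis
 * Returns the FDCT size, or 0 if the subsampling cannot be folded into
 * the FDCT and separate spatial downsampling would be needed. */
unsigned component_fdct_size(unsigned base, unsigned max_samp,
                             unsigned comp_samp)
{
    unsigned size;

    if (comp_samp == 0 || max_samp % comp_samp != 0)
        return 0;
    size = base * (max_samp / comp_samp);
    return (size <= MAX_FDCT_SIZE) ? size : 0;
}
```

In this model, 1/1 encoding with 2x2 chroma subsampling gives 8x8 for luminance
and 16x16 for chrominance, the asymmetric 2x1 case gives a 16x8 chrominance
block, and any base size above 8 (for example 12 for -scale 2/3) leaves the
subsampled chrominance to separate downsampling.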

Due to additions in the compress master record, the new library is not binary compatible
with the old one. Otherwise, provision was made to keep the library API compatible
with existing applications.