Edge Thresholding

Edge thresholding, as performed by the
VST-1000 chip, produces more precisely formed characters with
accurate stroke widths that are more completely separated from
their background and from neighboring characters. These
characteristics hold even for lower-contrast characters
or higher-contrast backgrounds.

These benefits accrue to every system that uses the VST-1000 chip
to produce binary images, with no additional work
required of the OCR vendor or systems integrator.

Several scanner manufacturers use these chips in their
scanners. Buying such a scanner is the lowest-cost solution. Ask your
scanner vendor if they use Picture Elements technology for their binary
images.

Other scanners, if they have grayscale outputs, can be
attached to the Picture Elements ISE Board to produce binary
images thresholded by the VST-1000 chip.

The ISE Board can accept the D3MST Daughterboard, which uses
three VST-1000 chips to simultaneously produce three
binary images, called Normal (N), High (H) and Low (L).

The Normal image, at normal contrast
sensitivity, is suitable for 95 to 99 percent of an
average document flow.

The High image, at a higher contrast
sensitivity, produces better rendering of characters
created with low-contrast marking devices, such as
faint pencil or light-colored pens.

The Low image, at a lower contrast
sensitivity, produces a less cluttered image for
documents where the background is of high contrast.

This permits the OCR engine to request the High and Low
sensitivity images from the scanning subsystem (on an exception
basis) for documents where statistically lower confidence levels
are seen in the recognition results from the Normal sensitivity
image. If this is done for only 1% to 5% of all documents, the
increase in needed OCR capacity or network throughput is small
(2% to 10%, since each exception document adds two images). This
is the preferred method.
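
The exception-based flow described above can be sketched in a few lines. This is only an illustration under stated assumptions: the names fetch, recognize and min_confidence are hypothetical, the confidence scale of 0 to 1 and the 0.90 cutoff are arbitrary, and nothing here represents an actual ISE Board or OCR engine interface.

```python
from typing import Callable, Tuple

# Hypothetical interfaces, assumed for illustration only.
ImageFetcher = Callable[[str], bytes]             # "N", "H" or "L" -> binary image
Recognizer = Callable[[bytes], Tuple[str, float]]  # image -> (text, confidence 0..1)


def recognize_with_exceptions(fetch: ImageFetcher,
                              recognize: Recognizer,
                              min_confidence: float = 0.90) -> Tuple[str, str, float]:
    """Return (sensitivity, text, confidence) for the best available result."""
    # Always start with the Normal sensitivity image.
    text, conf = recognize(fetch("N"))
    best = ("N", text, conf)
    if conf >= min_confidence:
        return best  # the common case: no extra images are requested

    # Exception case: request the High and Low images and keep the best result.
    for level in ("H", "L"):
        text, conf = recognize(fetch(level))
        if conf > best[2]:
            best = (level, text, conf)
    return best
```

Because the High and Low images are fetched only when the Normal result falls below the cutoff, the extra OCR and network load scales with the exception rate rather than with the total document count.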

Alternatively, all three binary images can always be sent on
to OCR. While this results in a need for triple the OCR and
network throughput, it could result in somewhat better aggregate
OCR results, since a greater number of marginal cases will see
improvement.
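
The load figures for the two strategies follow from simple arithmetic, shown here as a small sketch. It assumes that every image costs the same to transmit and recognize and that each exception document adds exactly two images (High and Low); the function name extra_load is illustrative.

```python
def extra_load(exception_rate: float, extra_images: int = 2) -> float:
    """Fractional increase in OCR/network load relative to Normal-only processing."""
    return exception_rate * extra_images


# An exception rate of 100% corresponds to always sending all three images.
for rate in (0.01, 0.05, 1.0):
    print(f"{rate:.0%} exceptions -> +{extra_load(rate):.0%} load")
# 1% exceptions -> +2% load
# 5% exceptions -> +10% load
# 100% exceptions -> +200% load
```

The last line is the "always send all three" case: a 200% increase, which is triple the original throughput.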

The ISE Board can accept the DJPEG Daughterboard, which
provides hardware support for JPEG grayscale or color
compression, offering full resolution capture of grayscale
images simultaneous with the production of binary images
from the same data. Both images can be captured to disk on the
scanning workstation.

This permits sophisticated OCR algorithms to normally use the
binary data, but to request the use of the compressed grayscale
for a given document, whenever a particular binary image gives a
low confidence level.

This can enable three possible techniques for
exploiting the grayscale data.

The OCR engine (or an alternative engine for use on
these exception items) can directly use the grayscale
to segment the characters in the region of
the low confidence characters, reverting to the
binary image for subsequent steps. For a description
of a grayscale segmentation algorithm, see Seong-Whan
Lee et al., "A New Methodology for Gray-Scale
Character Segmentation and Recognition," IEEE
Trans. PAMI, Vol. 18, No. 10, Oct. 1996, pp. 1045-1050.
After segmentation in grayscale, the corresponding
binary data can be used for recognition, since the
registration between the gray and binary images is
perfect.

The OCR engine can request, on an exception basis,
that the ISE Board in the scanning workstation produce
new binary images (or snippets of binary
images) by re-processing the grayscale. Several of these
can be produced, and the recognition result with the
highest confidence used.

The OCR engine can (for low-confidence exception
items) directly use the grayscale image for
the recognition step, not just for
segmentation. Unfortunately, some vendors claim to be
able to do this, but in fact apply a software
thresholding algorithm (perhaps iteratively) on the
grayscale data before working with it. The
sophisticated VST-1000 algorithm, operating at
several billion operations per second, will give much
better results in such a case (when using the
technique of re-processing binary images as described
in the previous paragraph).
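
The second technique above (re-processing several binary images from the grayscale and keeping the most confident result) can be sketched as follows. This is a hedged illustration only: a plain global threshold stands in for the thresholder and is not the VST-1000 algorithm, whose details are not given here. The names global_threshold, best_rethreshold and recognize, and the three threshold levels, are assumptions.

```python
from typing import Callable, List, Sequence, Tuple

Gray = List[List[int]]    # rows of 8-bit pixels, 0 = black, 255 = white
Binary = List[List[int]]  # rows of bits, 1 = ink, 0 = background


def global_threshold(gray: Gray, level: int) -> Binary:
    """Stand-in thresholder: pixels darker than `level` become ink."""
    return [[1 if px < level else 0 for px in row] for row in gray]


def best_rethreshold(gray: Gray,
                     recognize: Callable[[Binary], Tuple[str, float]],
                     levels: Sequence[int] = (96, 128, 160)) -> Tuple[int, str, float]:
    """Re-threshold at each level; return (level, text, confidence) of the best."""
    results = []
    for level in levels:
        text, conf = recognize(global_threshold(gray, level))
        results.append((level, text, conf))
    # Keep the recognition result with the highest confidence.
    return max(results, key=lambda r: r[2])
```

In a real system the re-thresholding would be done by the hardware in the scanning workstation, and only the resulting binary images or snippets would travel to the OCR engine.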

Note that the key to this architecture is the use of
compressed grayscale. Uncompressed grayscale (being perhaps 80
times larger than a Group 4 compressed binary image) cannot be
written to disk at full production scanner speeds, forcing the
scanner to slow down drastically as soon as memory is full. Compressed
grayscale (at, say, 20:1 compression) is only a few times larger
than the compressed binary image (about four times at these
figures), requiring a much less heroic effort to increase disk
throughput.
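
The size comparison can be checked with a short calculation. The sketch below takes the text's approximate figures (grayscale about 80 times the size of the Group 4 binary image, JPEG at about 20:1) as given; the function name disk_load and its units are illustrative.

```python
def disk_load(gray_to_binary: float = 80.0, jpeg_ratio: float = 1.0) -> float:
    """Data written per page (binary plus grayscale), in units of one
    Group 4 compressed binary image. jpeg_ratio=1.0 means uncompressed."""
    return 1.0 + gray_to_binary / jpeg_ratio


print(disk_load())                 # uncompressed grayscale: 81.0
print(disk_load(jpeg_ratio=20.0))  # 20:1 JPEG grayscale: 5.0
print(disk_load(jpeg_ratio=10.0))  # 10:1 JPEG grayscale: 9.0
```

Under these assumptions, capturing both images with compressed grayscale requires roughly five to nine times the disk throughput of binary-only capture, instead of about 81 times with uncompressed grayscale.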

Picture Elements has shown that the thresholding results
achievable from 10:1 to 20:1 JPEG compressed grayscale are
indistinguishable from those produced from raw scanner data.
Therefore we call this grayscale image the Surrogate Original
(tm), since it can stand in for the original paper document
to support re-processing without physical re-scanning. It is
the key to the No Re-Scan Scanning (tm) architecture,
which yields significant scanning labor savings and improved
OCR performance.

Note: No Re-Scan Scanning and Surrogate Original
are trademarks of Picture Elements, Inc.