Next to
keypunching, Optical Character Recognition is the oldest data
entry technique in existence. Long before the first key-to-disk
system or CRT was used, Optical Character Readers were entering data in
commercial and government EDP installations.

The popularity of
OCR has been increasing each year, as fast microprocessors have
made vastly improved recognition techniques practical. For
example, OCR wands now read print that, more than 10 years ago,
large batch readers would have rejected. There have also been
tremendous improvements in both effective read rates and accuracy.

Data entry through OCR is faster, more accurate, and
generally more efficient than keystroke data entry. Desktop OCR
scanners can read typewritten data into a computer at rates up to
2400 words per minute!

How Does OCR Work?

There are two
basic methods used for OCR: Matrix matching and feature
extraction. Of the two ways to recognize characters, matrix
matching is the simpler and more common.

Matrix Matching
compares what the OCR scanner sees as a character with a library
of character matrices or templates. When an image matches one of
these prescribed matrices of dots within a given level of
similarity, the computer labels that image as the corresponding
ASCII character.
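
The comparison described above can be sketched in a few lines of Python. This is only an illustration, not how any particular OCR product works: the 5x5 templates, the `similarity` and `match_character` names, and the 0.9 threshold are all assumptions made for the example.

```python
# Hypothetical 5x5 dot-matrix templates ('#' = ink, '.' = background).
TEMPLATES = {
    "I": ["#####", "..#..", "..#..", "..#..", "#####"],
    "L": ["#....", "#....", "#....", "#....", "#####"],
    "T": ["#####", "..#..", "..#..", "..#..", "..#.."],
}


def similarity(image, template):
    """Return the fraction of cells on which image and template agree."""
    cells = [
        (a, b)
        for image_row, template_row in zip(image, template)
        for a, b in zip(image_row, template_row)
    ]
    return sum(a == b for a, b in cells) / len(cells)


def match_character(image, templates=TEMPLATES, threshold=0.9):
    """Return the best-matching character, or None if nothing is close enough."""
    best_char, best_score = None, 0.0
    for char, template in templates.items():
        score = similarity(image, template)
        if score > best_score:
            best_char, best_score = char, score
    # Reject the image when even the best template is below the threshold.
    return best_char if best_score >= threshold else None


# An "L" with one dot missing still matches; a solid block is rejected.
noisy_l = ["#....", "#....", "#....", "#....", "####."]
solid_block = ["#####"] * 5
print(match_character(noisy_l))
print(match_character(solid_block))
```

Raising the threshold makes the reader reject more marginal print; lowering it accepts more print at the risk of substituting the wrong character.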

Feature Extraction
is OCR without strict matching to prescribed templates. Also
known as Intelligent Character Recognition (ICR), or Topological
Feature Analysis, this method varies by how much "computer
intelligence" is applied by the manufacturer. The computer
looks for general features such as open areas, closed shapes,
diagonal lines, line intersections, etc. This method is much more
versatile than matrix matching.

Matrix
matching works best when the OCR encounters a limited repertoire
of type styles, with little or no variation within each style.
Where the characters are less predictable, feature, or
topological, analysis is superior.
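
As a rough illustration of one such feature, the sketch below counts the closed shapes (enclosed background areas) in a character image. The glyph grids and the `count_closed_shapes` name are assumptions for the example; a real reader would combine many features, not this one alone.

```python
def count_closed_shapes(glyph):
    """Count background regions fully enclosed by ink ('#') in a glyph."""
    rows, cols = len(glyph), len(glyph[0])
    # Pad with a background border so all outside background is one region.
    ink = [[False] * (cols + 2)]
    ink += [[False] + [ch == "#" for ch in row] + [False] for row in glyph]
    ink += [[False] * (cols + 2)]
    seen = [[False] * (cols + 2) for _ in range(rows + 2)]

    def flood(start_r, start_c):
        """Mark every background cell 4-connected to the start cell."""
        stack = [(start_r, start_c)]
        seen[start_r][start_c] = True
        while stack:
            r, c = stack.pop()
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                inside = 0 <= nr < rows + 2 and 0 <= nc < cols + 2
                if inside and not ink[nr][nc] and not seen[nr][nc]:
                    seen[nr][nc] = True
                    stack.append((nr, nc))

    flood(0, 0)  # everything reachable from the border is "outside"
    closed = 0
    for r in range(rows + 2):
        for c in range(cols + 2):
            if not ink[r][c] and not seen[r][c]:
                closed += 1  # a background region the outside cannot reach
                flood(r, c)
    return closed


# Hypothetical glyphs: two different styles of zero, an eight, and a one.
ROUND_ZERO = [".###.", "#...#", "#...#", "#...#", ".###."]
SQUARE_ZERO = ["#####", "#...#", "#...#", "#...#", "#####"]
EIGHT = [".###.", "#...#", ".###.", "#...#", ".###."]
ONE = ["..#..", ".##..", "..#..", "..#..", ".###."]

for name, glyph in [("round 0", ROUND_ZERO), ("square 0", SQUARE_ZERO),
                    ("8", EIGHT), ("1", ONE)]:
    print(name, count_closed_shapes(glyph))
```

Both styles of zero yield the same count even though their dot patterns differ, which is the kind of tolerance to type style variation that a fixed template lacks.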

OCR Fonts

What is a font? A
font is the term given to a set of characters, usually 0 - 9, A
through Z, and a few special characters. Each character within a
font will have a defined reproducible size and shape. For OCR,
these are defined by ANSI, the American National Standards
Institute.

OCR fonts, or
characters, that can be read by the lower speed, lower cost
systems we are discussing here require well defined character
shapes that are very reproducible and designed to be both machine
and human readable. These unique and well defined character sets
allow for greater accuracy.

Text input devices
are page readers or document scanners that scan entire documents
or large portions of documents. The source data is entered with
the intention of someone editing it during or after it is
scanned. Text input devices vary in their degree of automation,
from hand-fed units to those with automatic feeding, reading,
sorting, and stacking capabilities.

Data Capture
devices are designed to capture repetitive data and to perform
formatting functions on the data as it is being entered. Because
this data is entered without the intention of being edited later,
the data delivered from the scanner to the computer must be more
accurate than is required for text input.

Elements of a Successful OCR Application

The elements of a
successful OCR installation include:

Proper Media

Forms Design

Data Integrity and Output Processing

OCR Reader

Reasons for Using OCR

There are a number
of reasons for choosing OCR scanning over other methods of data
entry. Some of the more
significant include:

To Reduce Data Entry Errors

To Consolidate Data Entry

To Handle Peak Loads

Human Readable

Can Be Used with Many Printing Techniques

Scanning Corrections

When is OCR Preferred over Bar Code?

OCR is better
suited for data entry in a controlled environment for any number
of characters. An example is remittance processing, where data on
utility bills or other turnaround documents needs to be entered
into a system.

Some OCR scanlines
may contain more than 40 characters and a variety of valuable
information such as the date the bill is due, account number,
amount owed, type of service, etc.
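
To make the idea concrete, here is a minimal sketch of splitting a scanline into fields once it has been read. The fixed-width layout, the field names, and the sample line are invented for illustration (and kept shorter than 40 characters); real turnaround documents define their own layouts.

```python
from datetime import date

# Hypothetical fixed-width layout: (field name, start column, end column).
LAYOUT = [
    ("account", 0, 10),
    ("due_date", 10, 18),      # YYYYMMDD
    ("amount_cents", 18, 27),  # amount owed, in cents, zero padded
    ("service", 27, 29),       # two-letter service code
]
SCANLINE_LENGTH = 29


def parse_scanline(line):
    """Split a scanline into typed fields, rejecting malformed input."""
    if len(line) != SCANLINE_LENGTH:
        raise ValueError(f"expected {SCANLINE_LENGTH} characters, got {len(line)}")
    raw = {name: line[start:end] for name, start, end in LAYOUT}
    due = raw["due_date"]
    return {
        "account": raw["account"],
        # date() and int() raise ValueError on misread characters.
        "due_date": date(int(due[:4]), int(due[4:6]), int(due[6:])),
        "amount_cents": int(raw["amount_cents"]),
        "service": raw["service"],
    }


# Sample line built from its parts so the field boundaries are visible.
EXAMPLE = "0012345678" + "19950315" + "000012550" + "EL"
print(parse_scanline(EXAMPLE))
```

Because the length and numeric checks fail loudly, a misread line can be routed to an operator for correction instead of being posted to the wrong account.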

Bar code is best
suited where the primary function is to identify parts or items
in harsh environments or where the media is to be used over and
over again and consists of relatively few characters. An example
is identifying and tracking passenger luggage in the airline
industry. Bar codes are very tolerant of rough handling
and harsh environments, but require much more space on a label or
document than OCR. Inch for inch, OCR can hold 6 times more
information than a standard bar code.