Dexter

The following is an excerpt from a poster presented at the
American Astronomical Society's 2000 Summer meeting in Rochester, NY:

ADS' roughly 1,000,000 scanned pages contain numerous diagrams
and figures for which the original data sets are lost or inaccessible.
Having scans for the figures invites digitizing the data points to recover
at least a part of these data. Performing this digitization automatically
is still beyond the capabilities of current OCR systems, but the computer
can ease this process for a human.

This was the starting point for Dexter, a Java applet that runs in the
user's browser and provides an interface for selecting the part of the
page that is of interest. On that selection, coordinate axes, points and
error bars can be marked and, of course, corrected. [...]
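
To illustrate why the axes have to be marked before the points are of any
use, here is a minimal sketch of the conversion from pixel positions to data
values. It is not taken from the Dexter sources: the class AxisCalibration,
its constructor arguments and the method toData are hypothetical names chosen
for this example. It assumes that each axis runs parallel to the pixel grid
(no rotation of the scan) and that the user has marked two positions on the
axis whose data values are known.

```java
// Hypothetical helper for illustration only; not part of Dexter.
// Maps a pixel coordinate along one axis to a data value, given two
// marked positions on that axis with known data values.
final class AxisCalibration {
    private final double pixel0;
    private final double value0;
    private final double pixel1;
    private final double value1;
    private final boolean logarithmic;

    AxisCalibration(double pixel0, double value0,
                    double pixel1, double value1,
                    boolean logarithmic) {
        if (pixel0 == pixel1) {
            throw new IllegalArgumentException(
                "the two calibration marks must be at different pixels");
        }
        if (logarithmic && (value0 <= 0 || value1 <= 0)) {
            throw new IllegalArgumentException(
                "a logarithmic axis needs positive calibration values");
        }
        this.pixel0 = pixel0;
        this.value0 = value0;
        this.pixel1 = pixel1;
        this.value1 = value1;
        this.logarithmic = logarithmic;
    }

    // Interpolates (or extrapolates) between the two calibration marks.
    // For a logarithmic axis the interpolation is done in log10 space.
    double toData(double pixel) {
        double t = (pixel - pixel0) / (pixel1 - pixel0);
        if (logarithmic) {
            double log0 = Math.log10(value0);
            double log1 = Math.log10(value1);
            return Math.pow(10.0, log0 + t * (log1 - log0));
        }
        return value0 + t * (value1 - value0);
    }
}
```

With an x axis calibrated as new AxisCalibration(100, 0.0, 500, 10.0, false),
a point marked at pixel column 300 lies halfway between the two marks and
maps to the data value 5.0. Since image rows are usually counted from the
top, a y axis simply gets its larger data value assigned to the smaller
pixel row; the same formula then handles the flip.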

In the future, we plan to implement some recognition algorithms that would,
e.g., trace a line for the user or automatically search for markers.

Some recognition capabilities are present in the SourceForge release,
though you'd better not look at the implementation :-)

In the release on SourceForge, there is a rudimentary standalone version
of Dexter called Debuxter (the name should indicate that it was written to
ease development) and a more refined one called goucho. Read HOWTO.standalone
in the documentation or run make test.

If all you want is to run Dexter on an image or PDF of your own,
consider using Standalone Dexter at the GAVO data center.

In terms of documentation, there's the HTML help file used at ADS
included with the distribution. Also, we've written a paper on it
for the ADASS 2000 conference. It is available online, and
we also offer the original poster without
background.