> My primary consideration was, and still is, where might I make the
> best investment of development effort. I decided that any attempt
> to build recognizers that would handle all of the possibilities of
> TEI (or other SGML) DTDs would require a very significant development
> effort all by itself. So, rather than include what would effectively
> be a full SGML parser into the system, I decided that we would use
> existing SGML parsers (such as Jim Clark's) to reduce all of the
> variations to a small subset.

For the record, lest the above lead to confusion, all of James Clark's
released SGML parsers (the earlier sgmls, and the more recent nsgmls) are
to the best of my knowledge fully functional, where by "fully" I mean they
handle all features of the standard found in 99.999% of the documents out
there. The later 'nsgmls' is used extensively in work with structured text
here at Sun, as it is, I suspect, at many other large companies.
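For readers who have not used these tools, a typical invocation looks
roughly like the following. This is only a sketch: nsgmls ships with James
Clark's SP distribution, and the file name "doc.sgml" is a placeholder,
not a file mentioned in this message.

```shell
# Sketch of typical nsgmls usage (SP package); "doc.sgml" is a
# placeholder file name chosen for illustration.

SGML_FILE="doc.sgml"

if command -v nsgmls >/dev/null 2>&1 && [ -f "$SGML_FILE" ]; then
    # -s: validate only -- report markup errors, produce no output.
    nsgmls -s "$SGML_FILE"
    # Without -s, nsgmls emits the document as a flat, line-oriented
    # ESIS stream: the "small subset" that downstream tools consume
    # instead of parsing full SGML themselves.
    nsgmls "$SGML_FILE" > doc.esis
else
    echo "nsgmls or $SGML_FILE not available; the commands above are illustrative"
fi
```

The point of the ESIS output is exactly the design decision described
above: the parser absorbs all the variability of SGML (and of particular
DTDs), so that every other tool in the pipeline only ever sees one simple,
regular format.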

I can assure you moreover that most of the tools that I and my coworkers use
most of the time for work related to SGML and XML are also free. They are
for the most part the same tools that I used when I worked for an
academic organization that depended on grant funding. Those that are not
free are either available to the academic user at a substantial discount (e.g.
Omnimark, DynaWeb) or can be replaced by free, mostly GNU, equivalents. Even
the operating system.

The rate of development of such tools seems to have accelerated with the
advent of XML. Yeah!

- Gregory Murphy
Software Engineer
Solaris Software
Sun Microsystems

Note: The opinions expressed herein do not necessarily reflect those of Sun
Microsystems, its subsidiaries or its contractors.

> >> Of course, it's also possible that humanists simply care a lot less
> >> about reusability, sharing of resources, and the electronic
> >> preservation of the cultural heritage than was thought when the TEI
> >> was created.

I think it's safe to say that humanists (on this list, as well as more
generally) care very deeply about the preservation of the cultural
heritage. Still, there's no denying that humanities computing folk are
generally less enthusiastic about standards and open source efforts than
our less humanistically motivated colleagues.

Creating tools to deal with the entire TEI, to choose one example,
certainly involves an *enormous* amount of time and programming effort,
but I dare say that it is no more herculean a task than any of the open
source efforts currently under the GNU license banner. Would any TEI tool
demand more effort than what is required for creating and maintaining the
GNU C compiler? Or the Perl distribution? Or GNU Emacs? These projects
are monuments to collaborative effort and the "sharing of resources," and
they are largely undertaken by a volunteer army of people with the same
time and budgetary restrictions that we have. Indeed, it's these very
restrictions that make "going it alone" unthinkable.

There are many important exceptions to this rule, but I still see a lot of
humanities software people operating under a fading model of intellectual
property: proprietary formats, hidden code, and restrictive licenses. I
know of at least one program developed by one of our number that attempts
to ensure, in its license, that the product and its author are properly
cited in scholarly work because of the "original algorithms" included in
the code. I understand that Linux also contains some original
algorithms--all of which are visible to anyone who wants to see them. Or
better, improve upon them.

Willard, in his lovely birthday oration, called upon us to consider the
road ahead and the problems which need work. I, for one, hope that the
answers to those problems aren't pursued behind closed doors.