Re: HTML Pandora's box

Comments on Geoff's posting, with thanks for coming up with the very
descriptive title "HTML Pandora's box" which I wish I had thought of first!

> In this manner, we use the "open files in
> previous version's format" feature of the applications to do our
> maintenance for us. (This approach _will_ break down at some point,
> but as I noted, it's a reasonable interim solution.)

It might break down sooner than you think. We lived through the WordPerfect 5.1
to 6.0 debacle with one of our clients and a key corporate legal document of
more than 1,000 pages. I agree with you that it is worth trying, but the wp/dtp
manufacturers haven't been completely devoted to backward compatibility in
their major releases.

> so it should be possible to convert the
> files into HTML automatically ...

Yes, this is straightforward.

> ...if we use plain vanilla SGML features likely to be supported in HTML.

You don't even have to do that. HTML is such a simple tag set that the SGML you
design for your documents will almost certainly be far richer in terms of added
content. In that situation, it becomes a matter of writing conversion code to
'dumb down' your SGML element structure to the simpler set of HTML formatting
tags.
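
To make the 'dumb down' step concrete, here is a minimal sketch in Python. It
assumes the SGML instance is well-formed enough to be parsed as XML, and every
element name in TAG_MAP (manual, procedure, step, partnumber and so on) is a
hypothetical example of a richer tag set, not anything from a real DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from rich, content-oriented elements to plain HTML tags.
TAG_MAP = {
    "manual": "body",
    "title": "h1",
    "para": "p",
    "procedure": "ol",
    "step": "li",
    "warning": "strong",
    "partnumber": "code",
}


def dumb_down(source: str) -> str:
    """Rename every element to its HTML stand-in and drop its attributes."""
    root = ET.fromstring(source)
    for elem in root.iter():
        # Anything without a mapping falls back to a neutral <span>.
        elem.tag = TAG_MAP.get(elem.tag, "span")
        elem.attrib.clear()
    return ET.tostring(root, encoding="unicode", method="html")


if __name__ == "__main__":
    sample = (
        "<manual><title>Pump</title><procedure>"
        "<step>Close valve <partnumber>V-12</partnumber>.</step>"
        "</procedure></manual>"
    )
    print(dumb_down(sample))
```

Note that the mapping only runs one way: once a partnumber has become a code
tag, the HTML no longer says what it was, which is why the SGML stays the
source.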

> It should even be possible to do this in batch mode.

In fact, batch conversion is the only way that makes sense. You could use a
programming language like Omnimark (from Exoterica; an application development
language written specifically to exploit the structure in an SGML environment),
Perl or, as you note, you could even do it with something as simple as
WordBasic macros. Given that you are going *from* a more complex markup
structure *to* a simpler markup structure, that would work.
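
As a rough illustration of the batch idea, here is a sketch in Python rather
than Omnimark or Perl. It takes the macro-style approach: a plain pattern
substitution on tags, with no real SGML parsing. The tag names in TAG_MAP, the
.sgm suffix and the directory layout are all assumptions made for the example.

```python
import re
from pathlib import Path

# Hypothetical source-to-HTML tag mapping; a real one would come from the DTD.
TAG_MAP = {"para": "p", "warning": "strong", "step": "li"}

# Matches a start or end tag and captures the slash and the element name.
TAG_RE = re.compile(r"<(/?)([A-Za-z][\w.-]*)[^>]*>")


def rename_tags(text: str) -> str:
    """Swap each tag for its HTML stand-in, dropping attributes."""
    def repl(match):
        name = TAG_MAP.get(match.group(2).lower(), "span")
        return "<%s%s>" % (match.group(1), name)
    return TAG_RE.sub(repl, text)


def convert_batch(src_dir, out_dir, suffix=".sgm"):
    """Convert every source file in src_dir and write .html files to out_dir."""
    src, out = Path(src_dir), Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for path in sorted(src.glob("*" + suffix)):
        target = out / (path.stem + ".html")
        converted = rename_tags(path.read_text(encoding="utf-8"))
        target.write_text(converted, encoding="utf-8")
        written.append(target)
    return written
```

A call such as convert_batch("source", "web") would then regenerate the whole
HTML set in one pass whenever the mapping changes.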

> I'd suggest that we replace HTML with SGML if
> I thought we could convince the powers that be to make this change.

SoftQuad's Panorama browser, which is being bundled with the next release of
Spyglass Enhanced Mosaic, is a general-purpose SGML browser. So there is some
movement in this direction.

> 3. The final solution that I'm contemplating is using a full-text
> database that also stores "binary large objects" (BLOBs), such as
> graphics files.

There are several of these available now or soon. I'll have to scrounge around
for the list. Most will store SGML as a tree structure of elements (which
actually lets you do some cool stuff) and wp/dtp files, graphic files, etc. as
blobs. So you can create systems that do mix-n-match from existing source
materials. You could put a Web server application in front of a database like
this and have a very powerful and maintainable source of information. Actually,
to see just such an approach, check out Novell's home page and look at the
manuals. They are stored in SGML and delivered to the Web using a product
called DynaWeb. I believe that some, if not all, of the information is stored
in a database rather than in specific 'document' files.
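
To give a feel for what "a tree structure of elements plus blobs" can look like,
here is a small sketch using SQLite from Python. It is only an illustration
under my own assumptions: the table and column names are invented, and it says
nothing about how DynaWeb or any of the commercial products actually store
their data.

```python
import sqlite3


def open_store(path=":memory:"):
    """Create a store with one table for elements and one for binary assets."""
    db = sqlite3.connect(path)
    db.executescript("""
        CREATE TABLE element (
            id        INTEGER PRIMARY KEY,
            parent_id INTEGER REFERENCES element(id),
            position  INTEGER NOT NULL,
            tag       TEXT NOT NULL,
            text      TEXT
        );
        CREATE TABLE asset (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL,
            data BLOB NOT NULL
        );
    """)
    return db


def add_element(db, tag, text=None, parent_id=None):
    """Append an element as the last child of parent_id (None means a root)."""
    position = db.execute(
        "SELECT COUNT(*) FROM element WHERE parent_id IS ?", (parent_id,)
    ).fetchone()[0]
    cur = db.execute(
        "INSERT INTO element (parent_id, position, tag, text) VALUES (?, ?, ?, ?)",
        (parent_id, position, tag, text),
    )
    return cur.lastrowid


def add_asset(db, name, data):
    """Store a graphic or wp/dtp file as an opaque blob."""
    cur = db.execute("INSERT INTO asset (name, data) VALUES (?, ?)", (name, data))
    return cur.lastrowid


def find_by_tag(db, tag):
    """Pull the text of every element with this tag, across all documents."""
    rows = db.execute("SELECT text FROM element WHERE tag = ? ORDER BY id", (tag,))
    return [row[0] for row in rows]
```

The "cool stuff" comes from queries like find_by_tag: because each element is
its own row, you can pull every warning out of every manual without opening a
single document file.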

> most importantly that ASCII
> and UNICODE are eternal: this means no version control problems, and
> to maintain your files, all you need to do is edit the export filter,
> not the files themselves.