We are converting a large body of "legacy" HTML to XHTML,
using a combination of htdig, JTidy and our locally written
JChemTidy.

One element we have focused on in particular is <object>. We
replace all instances of <embed> with <object>, because

a) <embed> is not well formed (i.e. it should be <embed />)
b) it is not validatable. This is because the attributes of
<embed> are not defined by a DTD, but are instead implicit
in whatever attributes the plugin that <embed> resolves to
supports. Thus two users with different plugins may well
be running implicitly different DTDs for their document.
This is not good.

<object> solves both these problems.
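
To make the replacement concrete, here is a minimal sketch in
Python of one way an <embed> could be rewritten as an <object>.
It is not the JChemTidy code. The mapping is an assumption made
for illustration: src becomes data, a handful of attributes that
<object> declares itself are copied across, and every other
(plugin-specific) attribute becomes a <param> child. The class
and function names, and the file name in the usage note, are
hypothetical.

```python
from html import escape
from html.parser import HTMLParser


class EmbedToObject(HTMLParser):
    """Copy markup through, rewriting each <embed> as an <object>."""

    # Assumed set of attributes that are copied unchanged onto <object>;
    # anything else on <embed> is treated as plugin-specific.
    DIRECT = {"type", "width", "height", "title", "id", "class"}

    def __init__(self):
        # Keep entity and character references as written in the source.
        super().__init__(convert_charrefs=False)
        self.out = []

    def _embed_as_object(self, attrs):
        direct, params = ["object"], []
        for name, value in attrs:
            value = escape(value or "", quote=True)
            if name == "src":
                direct.append(f'data="{value}"')
            elif name in self.DIRECT:
                direct.append(f'{name}="{value}"')
            else:
                # Plugin-specific attribute: carry it as a <param> child.
                params.append(f'<param name="{name}" value="{value}" />')
        return "<" + " ".join(direct) + ">" + "".join(params) + "</object>"

    def handle_starttag(self, tag, attrs):
        if tag == "embed":
            self.out.append(self._embed_as_object(attrs))
        else:
            self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Covers the self-closed form, e.g. <embed ... />.
        self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        # Drop any stray </embed>; the <object> is already closed.
        if tag != "embed":
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")


def embed_to_object(html):
    parser = EmbedToObject()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

For example, embed_to_object('<embed src="mol.pdb" bgcolor="black">')
yields an <object data="mol.pdb"> element containing a single
<param name="bgcolor" value="black" />. The plugin-specific
attribute ends up as ordinary element content, so nothing
outside the DTD remains on the element itself.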

Our only problem is that htdig 3.2 does not parse <object>.

A long time ago, we hacked htdig 3.1 to parse <embed> and
<object>, but these mods do not appear to have been incorporated
into htdig 3.2.

If someone could rescue them, we would be very grateful.
On this point, if htdig could also be persuaded to index the
title attribute of elements such as <object> it would be a great
help. As part of the xhtml conversion process, we build a title
if none exists, and it would be nice to have htdig pick it up!
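
To illustrate the behaviour we are asking for (this is not htdig
code, which is C++, and not our actual conversion code), here is
a small Python sketch that collects the title attribute of every
<object> in a page. The fallback used when no title is present,
taking the file name from the data attribute, is purely a
hypothetical rule for the example, as are the class name and the
sample file names.

```python
from html.parser import HTMLParser
from posixpath import basename


class ObjectTitles(HTMLParser):
    """Collect one indexable title string per <object> element."""

    def __init__(self):
        super().__init__()
        self.titles = []

    def handle_starttag(self, tag, attrs):
        if tag != "object":
            return
        attrs = dict(attrs)
        title = attrs.get("title")
        if not title:
            # Hypothetical fallback: build a title from the data file name.
            title = basename(attrs.get("data") or "") or "untitled object"
        self.titles.append(title)


def object_titles(html):
    parser = ObjectTitles()
    parser.feed(html)
    parser.close()
    return parser.titles
```

Given a page with <object data="img/caffeine.mol" title="Caffeine">
and <object data="img/water.mol">, object_titles returns
['Caffeine', 'water.mol']. These are the strings we would like
the indexer to treat as searchable text.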