For a smooth migration, you need debiandoc-sgml version 1.2.20 or newer (the wheezy version), which supports debiandoc2dbk. If you are using a squeeze environment, installing the wheezy version directly onto your system is good enough (it is a Perl script).

How to convert DebianDoc SGML source into DocBook XML

Conversion from SGML to XML.

Let's assume you have the following files in your working directory:

manual.en.sgml manual.xx.po manual.yy.po funky.ent

Please note that you need to install the following packages:

docbook-xsl moreutils libxml2-utils

Step 0: Prepare source

Copy the example scripts in /usr/share/doc/debiandoc-sgml/examples to the working directory and make them executable.
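The copy-and-chmod step above can be sketched in Python as follows. This is only an illustration: the function name copy_examples is hypothetical, and the sketch assumes the example scripts are plain files sitting directly in the examples directory.

```python
import shutil
import stat
from pathlib import Path


def copy_examples(src_dir, work_dir):
    """Copy every regular file from src_dir into work_dir and mark it executable.

    Returns the sorted list of copied file names. Subdirectories are skipped.
    """
    copied = []
    for src in sorted(Path(src_dir).iterdir()):
        if not src.is_file():
            continue
        dst = Path(work_dir) / src.name
        shutil.copy(src, dst)
        # Add execute permission for user, group and others (like "chmod a+x").
        mode = dst.stat().st_mode
        dst.chmod(mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
        copied.append(dst.name)
    return copied


# Possible usage from the working directory (path taken from the text above):
# copy_examples("/usr/share/doc/debiandoc-sgml/examples", ".")
```

Doing the same by hand with cp and chmod is equally fine; the sketch only spells out what "copy and make executable" means.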

If it is not too much trouble, please fix such problems in the original PO files or DebianDoc SGML (non-en) sources. This improves the quality of the conversion, but is not critical if you do not mind losing these parts.

Step 5: Keep entities

The above conversion embeds all the entities into the converted DocBook XML files, and all the comments in the source are lost.

Here is a slightly more complicated way to convert, automated with ./debiandoc2dbkpo.

To preserve *.ent, create a touched-up version of it (or them) with the following:

This should work in most cases but may fail while creating the PO files (*.??.dbk.po and *.??.dbk.pox) and HTML files. Possible sources of problems are discussed later.

If this builds the PO and HTML files without errors, you have converted DocBook XML for English and for the translated languages (xx and yy), without comments.

Since the above PO creation uses msgtranslated to reset msgstr contents when they are the same as the msgid contents, the generated PO files contain untranslated strings. You may wish to run a PO file editor such as poedit to check and touch up the files.
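The reset rule described above can be illustrated with a small Python sketch. It is not the msgtranslated implementation and does not parse real PO files: it assumes entries are already available as (msgid, msgstr) pairs, and the function names are hypothetical.

```python
def reset_untranslated(entries):
    """Return (msgid, msgstr) pairs with msgstr emptied where it only repeats msgid."""
    return [(msgid, "" if msgstr == msgid else msgstr) for msgid, msgstr in entries]


def needs_review(entries):
    """Return the msgid of every entry whose msgstr is empty."""
    return [msgid for msgid, msgstr in entries if msgstr == ""]


# Illustrative data only: the first entry was never really translated.
entries = [("Introduction", "Introduction"), ("Chapter", "Chapitre")]
cleaned = reset_untranslated(entries)
print(needs_review(cleaned))  # ['Introduction']
```

Entries reported this way are the ones worth checking in a PO editor, since a string that is legitimately identical in both languages is reset as well.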

Step 6: Keep comments

The normal conversion process strips comments. The idea is to convert comments

<!--- comment ... --->

into

<p>=====COMMENT=====
comment ...
=====TNEMMOC=====</p>

so that they can be restored later. As long as comments are located between normal paragraphs, the example script ./debiandoc2dbk-wrap does a good enough job. You may need some manual edits before using it. To start with, all comments before <book> and after </book> need to be removed.
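The wrap-and-restore idea above can be sketched in Python as follows. This is not the ./debiandoc2dbk-wrap script itself, only a minimal illustration. It assumes comments sit between paragraphs, and that restored comments may use the plain <!-- ... --> form whatever the number of dashes in the source. The function names are hypothetical.

```python
import re

OPEN_MARK = "=====COMMENT====="
CLOSE_MARK = "=====TNEMMOC====="

# Matches <!-- ... --> as well as the <!--- ... ---> form shown above.
COMMENT_RE = re.compile(r"<!-{2,}(.*?)-{2,}>", re.DOTALL)

# Matches a paragraph produced by wrap_comments().
WRAPPED_RE = re.compile(
    r"<p>" + re.escape(OPEN_MARK) + r"\n(.*?)\n" + re.escape(CLOSE_MARK) + r"</p>",
    re.DOTALL,
)


def wrap_comments(text):
    """Turn each comment into a marked paragraph so that it survives conversion."""
    return COMMENT_RE.sub(
        lambda m: "<p>%s\n%s\n%s</p>" % (OPEN_MARK, m.group(1).strip(), CLOSE_MARK),
        text,
    )


def unwrap_comments(text):
    """Turn each marked paragraph back into a comment."""
    return WRAPPED_RE.sub(lambda m: "<!-- %s -->" % m.group(1), text)


source = "<p>Before.</p>\n<!--- comment ... --->\n<p>After.</p>"
wrapped = wrap_comments(source)
print(wrapped)
print(unwrap_comments(wrapped))
```

In a real conversion the restore step would run on the converted DocBook XML, where the paragraph tag is no longer <p>, so the second pattern would need adjusting to the converted markup.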

These [XXX_FIXME.*_XXX] markers can be recovered in the final DocBook XML and its PO files by manual touch-up. Since they are so common, ./debiandoc2dbk-ent can handle their recovery.
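Before touching up by hand, it can help to list where such markers remain. The following Python sketch does only that. It assumes the markers match the pattern [XXX_FIXME.*_XXX] literally. The sample marker names in it are made up for illustration, and the function name is hypothetical.

```python
import re

# Non-greedy, so that two markers on one line are reported separately.
MARKER_RE = re.compile(r"\[XXX_FIXME.*?_XXX\]")


def find_markers(text):
    """Return (line_number, marker) pairs for every marker found in text."""
    found = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in MARKER_RE.finditer(line):
            found.append((lineno, match.group(0)))
    return found


# Hypothetical sample input; real marker names depend on the conversion.
sample = "<para>a [XXX_FIXME_one_XXX] b [XXX_FIXME_two_XXX]</para>\n<para>clean</para>"
for lineno, marker in find_markers(sample):
    print(lineno, marker)
```

The same function can be run over the generated PO files to see which entries still need attention.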

Sometimes a translator places additional content within an additional <footnote>...</footnote>. For this case, the PO file proofing script makes a best effort to retain the translation by mangling the tags within <footnote>...</footnote>.