A model dtx file

In my previous post, I tried to give a very general overview of how the dtx file format comes about, from a combination of the syntax of DocStrip and ltxdoc. The problem with the bare details is that there are still lots of ways to actually use the ideas to construct a dtx. So here I’m going to detail a model dtx, which is ready to be filled in with real code and documentation. The entire file is available here as demopkg.dtx: get it now if you are impatient!

The idea of constructing a dtx file in the way I’ll describe is that it lets us achieve several things in one go:

All of the files for a package can be derived from a single source (unless you need a binary, of course).

The README is included in the dtx, with this useful information at the start.

Running (pdf)latex <name>.dtx does the extraction then typesets the documentation. This way, the documentation always has the latest code available, and users don’t need to worry about which method they use to get stuff extracted.

Most of the ideas here are not mine: Will Robertson came up with a lot of this. I’m just going to give some details of what is going on. I’m going to present the source in order, with a section of the source followed by some comments explaining what is going on. I’m going to call the demonstration package ‘demopkg’: something easy for search and replace. Wherever possible, \jobname is used in the source so that the file name changes automatically when moving from one package to another.

% \iffalse meta-comment
% !TEX program = pdfLaTeX

The file starts off with an \iffalse which will mean that ltxdoc will skip all of this code when typesetting the document. I use TeXworks as my editor, so I include the special !TEX program comment so that it defaults to pdfLaTeX with all of my files: this does no harm so may as well be there. The same comment is also recognised by TeXShop.

%<*internal>
\iffalse
%</internal>

There is then a guard called ‘internal’: this is never extracted out, but lets us have an uncommented \iffalse in the code, which means that the next section will be ignored by TeX initially. The idea here is that we are going to have some text (the README) that TeX would otherwise try to typeset. We don’t want that, so we need to skip it for the moment.

%<*readme>
----------------------------------------------------------------
demopkg --- description text
E-mail: you@your.domain
Released under the LaTeX Project Public License v1.3c or later
See http://www.latex-project.org/lppl.txt
----------------------------------------------------------------
Some text about the package: probably the same as the abstract.
%</readme>

This part is pretty obvious: the README file for the package, inside guards called ‘readme’. As you might expect, this will get extracted out later as the README file. In the initial TeX run, this text will be skipped (because of the \iffalse), but when DocStrip runs it will show up (as DocStrip will ignore the \iffalse, which is in a different set of guards).

Back with the special ‘internal’ guards, the \iffalse is ended and a check is made on the current format. For LaTeX, a group needs to be begun so that DocStrip can be loaded without later problems. For plain TeX, only the extraction is going to happen, so that is not an issue.
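The listing for this step is not shown above, so here is a minimal sketch of what it might look like. The helper name \nameofplainTeX is an assumption for illustration; the test relies on the plain format defining \fmtname as ‘plain’.

```latex
%<*internal>
\fi
% Assumed helper: holds the string that plain TeX stores in \fmtname
\def\nameofplainTeX{plain}
\ifx\fmtname\nameofplainTeX\else
  % Not plain TeX, so assume LaTeX: open a group to contain DocStrip
  \expandafter\begingroup
\fi
%</internal>
```

The \expandafter removes the \fi before \begingroup takes effect, so the conditional is finished cleanly before the group starts.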

%<*install>
\input docstrip.tex
\keepsilent
\askforoverwritefalse

The next section, inside ‘install’ guards, is the instructions for extracting the code out of the dtx. Later, this will also turn into a stand-alone ins file. DocStrip gets loaded, then we tell it to do its job without asking for any confirmation or printing too much stuff.

\preamble
----------------------------------------------------------------
demopkg --- description text
E-mail: you@your.domain
Released under the LaTeX Project Public License v1.3c or later
See http://www.latex-project.org/lppl.txt
----------------------------------------------------------------
\endpreamble
\postamble
Copyright (C) 2009 by You <you@your.domain>
This work may be distributed and/or modified under the
conditions of the LaTeX Project Public License (LPPL), either
version 1.3c of this license or (at your option) any later
version. The latest version of this license is in the file:
http://www.latex-project.org/lppl.txt
This work is "maintained" (as per LPPL maintenance status) by
You.
This work consists of the file demopkg.dtx
and the derived files demopkg.ins,
demopkg.pdf and
demopkg.sty.
\endpostamble

Some simple boiler-plate text, that DocStrip will add to the start and end of each extracted file. Of course, this can say what you like.

This section is the instruction to actually extract the LaTeX package file from the dtx. Each file to be extracted needs a line saying how to create it, so if there is a class to extract there would be a line for that, and so on. The \usedir instruction can be used to tell DocStrip how to lay files out: it is best to include it as some people use this. Normally, it will just specify tex/latex/<package>, but might change if there are lots of files to lay out in a structured way. For example, cfg files are often put in tex/latex/<package>/config.
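The extraction lines themselves are not reproduced above. A sketch of what they might look like for a single package file, assuming the guard for the code is called ‘package’ and the install directory is tex/latex/demopkg:

```latex
% Where the extracted file should end up in a TDS tree
\usedir{tex/latex/demopkg}
% One \file line per file to extract; \jobname expands to demopkg here
\generate{
  \file{\jobname.sty}{\from{\jobname.dtx}{package}}
}
```

A class would need a second \file line inside the same \generate, pointing at its own guard.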

%</install>
%<install>\endbatchfile

That ends what will get extracted into the ins file, so the install guard is closed. The second line is needed as the ins file needs to include \endbatchfile (for DocStrip), but we don’t want the same effect when the dtx is doing the extracting.

When extracting the dtx (with TeX or LaTeX), we need to generate the ins file and the README, which is done here. The ins file is quite simple: it uses the same process as the sty file. However, there are a couple of points about the README. First, we don’t want DocStrip to add any extra text, hence \nopreamble and \nopostamble. Second, DocStrip can only make files with extensions, so the file has to be called README.txt. (It can be renamed later: hopefully there is no loss of clarity.) If plain TeX is in use, that is the end of the processing, whereas for LaTeX the group containing DocStrip can be closed.
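As the listing is missing here, this is a sketch of how that block might read. The \usedir paths are assumptions, and \nameofplainTeX is assumed to have been defined as ‘plain’ at the earlier format check.

```latex
%<*internal>
\usedir{source/latex/demopkg}
\generate{
  \file{\jobname.ins}{\from{\jobname.dtx}{install}}
}
% No DocStrip boiler-plate in the README
\nopreamble\nopostamble
\usedir{doc/latex/demopkg}
\generate{
  \file{README.txt}{\from{\jobname.dtx}{readme}}
}
% \nameofplainTeX: assumed helper macro containing "plain"
\ifx\fmtname\nameofplainTeX
  \expandafter\endbatchfile
\else
  \expandafter\endgroup
\fi
%</internal>
```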

Next, we can use the fact that DocStrip will assemble a single file from blocks in different places in the source. This part of the package does not really need to be printed later on, and done this way the version number is included near the top of the source. Things don’t have to be done this way: this section can always be left out if you like.
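A sketch of what this early ‘package’ block might contain, assuming the date and version match the \changes entry used later in the file:

```latex
%<*package>
\NeedsTeXFormat{LaTeX2e}
\ProvidesPackage{demopkg}[2009/10/06 v1.0 description text]
%</package>
```

Because these lines sit outside the commented documentation part, they are extracted into the sty file but never typeset.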

The next block is the driver: this is the information used to typeset the code and documentation. I normally load the package I’m talking about so that I can use it in the documentation, and load a few refinements (modern fonts, hypdoc to get hyperlinks, and so on). There are a few ltxdoc-specific instructions here: they mean that we get a proper index and information linking macro use information to the code.
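The driver listing is not shown, so here is one plausible version. The particular font packages are an assumption; the three instructions before \begin{document} are the standard doc ones for cross-referencing, a code-line index and a change history.

```latex
%<*driver>
\documentclass{ltxdoc}
\usepackage[T1]{fontenc}
\usepackage{lmodern}
\usepackage{\jobname}   % the package being documented
\usepackage{hypdoc}     % hyperlinks for doc-based documentation
\EnableCrossrefs
\CodelineIndex
\RecordChanges
\begin{document}
  \DocInput{\jobname.dtx}
\end{document}
%</driver>
```

\DocInput reads the dtx again with % treated as ignorable, which is what makes the commented documentation visible to LaTeX.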

% \fi

This matches the \iffalse in the very first line of the file: it marks the beginning of material which will actually be typeset.

Here, the title is set up and printed. A few things to notice here. By using \GetFileInfo, the version and date information are picked up from the package itself: no repetition of the information is needed in the dtx. Also, we can’t use % as a comment character, and so ltxdoc sets up ^^A to do the job instead.
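Something along these lines would do the job; the wording of the title and footnotes is only a placeholder.

```latex
%\GetFileInfo{\jobname.sty}
%
%\title{^^A
%  \textsf{demopkg} -- description text\thanks{^^A
%    This file describes version \fileversion, last revised \filedate.^^A
%  }^^A
%}
%\author{^^A
%  You\thanks{E-mail: you@your.domain}^^A
%}
%\date{Released \filedate}
%
%\maketitle
```

\fileversion and \filedate are set by \GetFileInfo from the \ProvidesPackage line of the loaded package.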

%\changes{v1.0}{2009/10/06}{First public release}

General changes (not associated with any particular macro) are best listed somewhere early on. These will be used by the \PrintChanges macro to provide users with a change log.

%
%\DescribeMacro{\examplemacro}
% Some text about an example macro called \cs{examplemacro}, which
% might have an optional argument \oarg{arg1} and mandatory one
% \marg{arg2}.
%

This is where the documentation goes. I’ve included an example macro with a couple of arguments as reminders of the syntax.

%\StopEventually{^^A
% \PrintChanges
% \PrintIndex
%}
%

This macro marks the end of the user part of the documentation. The two functions in the argument will be used either here (if the code is not typeset) or after the code (if it is typeset). As the dtx file is now, the code will print. However, in the next blog post I’ll talk about printing only the documentation and missing the code out.

%    \begin{macrocode}
%<*package>
%    \end{macrocode}

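The lead off for the package code itself opens the guard for extracting the code. Normally, I like to have this on its own, as a reminder of what is going on. Note that the doc package only recognises the end of a macrocode environment when there are exactly four spaces between the % and \end{macrocode}.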

Here we have some code: separated out using the macrocode environment. As I described in the last post, the \begin{macro} … \end{macro} block indicates that this is where \examplemacro is defined: indexing needs to know this. The \changes given in the code block only get printed if the code is typeset. They are therefore best used for low-level information, rather than usage changes that users need to know about.
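A sketch of such a block, with placeholder text for the body of the macro and for the change entry:

```latex
%\begin{macro}{\examplemacro}
%\changes{v1.0}{2009/10/06}{Some change from the previous version}
% Some text about what is happening in the code.
%    \begin{macrocode}
\newcommand*\examplemacro[2][]{%
  Some code here, probably
}
%    \end{macrocode}
%\end{macro}
```

The [2][] here gives one optional and one mandatory argument, matching the \oarg and \marg used in the user documentation.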

%    \begin{macrocode}
%</package>
%    \end{macrocode}
%\Finale

The last part of the file: close the guard for the code, and call \Finale. This runs anything delayed from the earlier \StopEventually, so in this case the index will get printed here if the code is typeset.

10 thoughts on “A model dtx file”

Acknowledgements also to Scott Pakin who wrote the original “dtxtut”, which inspired most of my work here, and to Heiko Oberdiek from whom I adopted the “LaTeX for doc”, “TeX for ins” idea. None of this stuff is originally mine!

Thanks a lot for this writeup, Joseph. I particularly liked your explanation of the mechanics involved in using dtx/DocStrip combo. After reading “dtxtut” it wasn’t so clear to me how it all comes together.

One feature that I’m not so fond of in .dtx format is writing user documentation as comments. This is fine for documenting the implementation but quite awkward for longer stretches of text such as description of the user interface.

Anyway, I think that you have a lot of good information in this .dtx series already, enough for a nice tutorial. Together with your model .dtx file this could make a nice addition to Will’s dtxgallery (just a thought ;-).

There are other approaches that don’t require writing code as comments. The problem is that none of them are as widespread as the dtx format. (We had some discussion about the pros and cons of various approaches on the LaTeX3 mailing list. Some thought is probably needed to find the “best” way to create a successor to the current system.) I’d point out that a lot of editors (Emacs, TeXworks and WinEdt, at least) can automatically start each line correctly.

As I’ve said, I got the basic form of the dtx I’ve explained from Will! If he fancies adding the code and explanation to his gallery, I’d be quite happy.

By the way, any things that you’d like more on? I currently have two or three more posts in mind.

I know about codedoc class but I’m not sure whether it is any better. It’s certainly no less complicated. I finally settled on writing a standard document but with ltxdoc class to take advantage of the code description markup. Now I would like to add some code documentation, so I’m looking into the dtx thing again. One of the challenges for me is that I have a lua script in addition to regular LaTeX code.

After some experimentation I think I found a viable way of incorporating a regular document inside a .dtx file. The trick is to use the comment environment:

As for the successor to dtx, I would like it to be (re)usable beyond paper output. To give an example of what I mean by that: LEd editor has a very nice feature of displaying syntax hints for commands. This is accomplished by special definition files that have to be prepared for each and every package. If one could extract this kind of information directly from the package documentation, then the whole process could be automated and all packages would benefit from the same level of support without any extra work.

The idea of reusable information certainly came up. I think at the moment the LaTeX3 team have other things to think about, but it’s not forgotten. There’s probably a need to try to sketch out some very general “desirables” and then work from there, rather than starting with what is around and trying to modify it. Perhaps once we get a bit further on with the lower level LaTeX3 stuff, this might get revisited.

Nice howto, and nice link @karlberry. I hadn’t thought of tex and latex being used on the same file to get different results.

My tex IDE is TeXShop, and the trouble with editing large .dtx files with doc comments is that it doesn’t automatically tag the section commands that are commented out. Normally that’s a Good Thing but it makes it hard to navigate in this usage.