The docstrip::util package is meant for collecting various
utility procedures that are mainly useful at installation or
development time. It is separate from the base package to avoid
overhead when the latter is used to source code.

Like raw ".tcl" files, code lines in docstrip source files can
be searched for package declarations and corresponding indices
constructed. A complication is however that one cannot tell from the
code blocks themselves which will fit together to make a working
package; normally that information would be found in an accompanying
".ins" file, but parsing one of those is not an easy task.
Therefore docstrip::util introduces an alternative encoding
of such information, in the form of a declarative Tcl script: the
catalogue (of the contents in a source file).

The pkgProvide catalogue command declares that the code for a
package with name name and version version is made up from those
modules in the source file which are selected by the terminals list
of guard expression terminals. This code should preferably not
contain a package provide command for the package, as one
will be provided by the package loading mechanisms.

The pkgIndex catalogue command declares that the code for a package
is made up from those modules in the source file which are selected
by the listed guard expression terminals. The name and version of
this package are determined from package provide command(s) found
in that code (hence there must be such a command in there).

The fileoptions catalogue command declares the fconfigure options
that should be in force when
reading the source; this can usually be ignored for pure ASCII
files, but if the file needs to be interpreted according to some
other -encoding then this is how to specify it. The
command should normally appear first in the catalogue, as it takes
effect only for commands following it.

Other Tcl commands are supported too — a catalogue is
parsed by being evaluated in a safe interpreter — but they
are rarely needed. To allow for future extensions, unknown commands
in the catalogue are silently ignored.
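
As an illustration, a small catalogue using the three commands above might read as follows. This is only a sketch: the package names, versions and terminals are made up, and the argument form shown for pkgIndex (terminals given as separate words) is an assumption based on the description above.

```tcl
# Hypothetical catalogue; all names, versions and terminals are illustrative.

# Read the source as UTF-8. Placed first, so that it is in force for
# the entries that follow.
fileoptions -encoding utf-8

# foo::bar 1.0 is built from the modules selected by "foobar" and "load".
# Its code need not contain a [package provide] of its own.
pkgProvide foo::bar 1.0 {foobar load}

# The name and version of this package are taken from the
# [package provide] command found in the code selected by "baz".
pkgIndex baz
```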

To simplify distribution of catalogues together with their source
files, the catalogue is stored in the source file itself as
a module selected by the terminal 'docstrip.tcl::catalogue'.
This supports both the style of collecting all catalogue lines in one
place and the style of putting each catalogue line in close proximity
of the code that it declares.

Putting catalogue entries next to the code they declare may look as
follows:

% First there's the catalogue entry
% \begin{tcl}
%<docstrip.tcl::catalogue>pkgProvide foo::bar 1.0 {foobar load}
% \end{tcl}
% second a metacomment used to include a copyright message
% \begin{macrocode}
%<*foobar>
%% This file is placed in the public domain.
% \end{macrocode}
% third the package implementation
% \begin{tcl}
namespace eval foo::bar {
# ... some clever piece of Tcl code elided ...
% \end{tcl}
% which at some point may have variant code to make use of a
% |load|able extension
% \begin{tcl}
%<*load>
load [file rootname [info script]][info sharedlibextension]
%</load>
%<*!load>
# ... even more clever scripted counterpart of the extension
# also elided ...
%</!load>
}
%</foobar>
% \end{tcl}
% and that's it!

The index_from_catalogue command is a sibling of the standard pkg_mkIndex
command, in that it adds package entries to "pkgIndex.tcl"
files. The difference is that it indexes docstrip-style
source files rather than raw ".tcl" or loadable library files.
Only packages listed in the catalogue of a file are considered.

The dir argument is the directory in which to look for files
(and whose "pkgIndex.tcl" file should be amended).
The pattern argument is a glob pattern of files to look
into; a typical value would be *.dtx or
*.{dtx,ddt}. Remaining arguments are option-value pairs,
where the supported options are:

-recursein dirpattern

If this option is given, then the index_from_catalogue
operation will be repeated in each subdirectory whose name
matches the dirpattern. -recursein * will
cause the entire subtree rooted at dir to be indexed.

-sourceconf dictionary

Specify fileoptions to use when reading the catalogues of
files (and also for reading the packages if the catalogue does
not contain a fileoptions command). Defaults to being
empty. Primarily useful if your system encoding is very different
from that of the source file (e.g., one is a two-byte encoding
and the other is a one-byte encoding). ascii and
utf-8 are not very different in that sense.

-options terminals

The terminals is a list of terminals in addition to
docstrip.tcl::catalogue that should be held as true when
extracting the catalogue. Defaults to being empty. This makes it
possible to make use of "variant sections" in the catalogue
itself, e.g. guard some entries with an extra "experimental" guard
and thus prevent them from appearing in the index unless that is
generated with "experimental" among the -options.

-report boolean

If the boolean is true then the return value will be a
textual, probably multiline, report on what was done. Defaults
to false, in which case there is no particular return value.

-reportcmd commandPrefix

Every item in the report is handed as an extra argument to the
command prefix. Since index_from_catalogue would typically
be used at a rather high level in installation scripts and the
like, the commandPrefix defaults to "puts stdout".
Use list to effectively disable this feature. The return
values from the prefix are ignored.

The package ifneeded scripts that are generated contain
one package require docstrip command and one
docstrip::sourcefrom command. If the catalogue entry was
of the pkgProvide kind then the package ifneeded
script also contains the package provide command.

Note that index_from_catalogue never removes anything from an
existing "pkgIndex.tcl" file. Hence you may need to delete it
(or have pkg_mkIndex recreate it from scratch) before running
index_from_catalogue to update some piece of information, such
as a package version number.
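
A call to index_from_catalogue might be wrapped as in the following sketch. The helper name reindex_tree is hypothetical; the options are used as described above.

```tcl
package require docstrip::util

# Hypothetical helper: (re)index all docstrip sources in and below $dir.
proc reindex_tree {dir} {
    # -recursein * descends into every subdirectory, -report 1 makes the
    # command return its report, and -reportcmd list silences the
    # default per-item output on stdout.
    return [docstrip::util::index_from_catalogue $dir *.dtx \
        -recursein * -report 1 -reportcmd list]
}
```

An installation script could then do something like `puts [reindex_tree /path/to/sources]`, after deleting any stale "pkgIndex.tcl" files if version numbers have changed.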

The modules_from_catalogue command is an alternative to index_from_catalogue which
creates Tcl Module (".tm") files rather than
"pkgIndex.tcl" entries. Since this action is more similar to
what docstrip classically does, it has features for
putting pre- and postambles on the generated files.

The source argument is the name of the source file to
generate ".tm" files from. The target argument is the
directory which should count as a module path, i.e., this is what
the relative paths derived from package names are joined to. The
supported options are:

-preamble message

A message to put in the preamble (initial block of comments) of
generated files. Defaults to a space. May be several lines, which
are then separated by newlines. Traditionally used for copyright
notices or the like, but metacomment lines provide an alternative
to that.

-postamble message

Like -preamble, but the message is put at the end of the
file instead of the beginning. Defaults to being empty.

-sourceconf dictionary

Specify fileoptions to use when reading the catalogue of
the source (and also for reading the packages if the
catalogue does not contain a fileoptions command). Defaults
to being empty. Primarily useful if your system encoding is very
different from that of the source file (e.g., one is a two-byte
encoding and the other is a one-byte encoding). ascii and
utf-8 are not very different in that sense.

-options terminals

The terminals is a list of terminals in addition to
docstrip.tcl::catalogue that should be held as true when
extracting the catalogue. Defaults to being empty. This makes it
possible to make use of "variant sections" in the catalogue
itself, e.g. guard some entries with an extra "experimental" guard
and thus prevent them from contributing packages unless those are
generated with "experimental" among the -options.

-formatpreamble commandPrefix

Command prefix used to actually format the preamble. Takes four
additional arguments message, targetFilename,
sourceFilename, and terminalList and returns a fully
formatted preamble. Defaults to using classical_preamble
with a metaprefix of '##'.

-formatpostamble commandPrefix

Command prefix used to actually format the postamble. Takes four
additional arguments message, targetFilename,
sourceFilename, and terminalList and returns a fully
formatted postamble. Defaults to using classical_postamble
with a metaprefix of '##'.

-report boolean

If the boolean is true (which is the default) then the return
value will be a textual, probably multiline, report on what was
done. If it is false then there is no particular return value.

-reportcmd commandPrefix

Every item in the report is handed as an extra argument to this
command prefix. Defaults to list, which effectively disables
this feature. The return values from the prefix are ignored. Use
for example "puts stdout" to get report items
written immediately to the terminal.

An existing file of the same name as one to be created will be
overwritten.
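
A custom -formatpreamble command prefix could look like the following sketch. The procedure name short_preamble and the layout it produces are hypothetical. The text above does not say whether the returned preamble should end with a newline, so this sketch leaves it out. In the commented call, the order of the target and source arguments is an assumption to check against the command synopsis.

```tcl
# Hypothetical formatter for -formatpreamble. It receives the four
# additional arguments described above and returns the preamble text.
proc short_preamble {message targetFilename sourceFilename terminalList} {
    set lines [list]
    lappend lines "## [file tail $targetFilename], generated from [file tail $sourceFilename]"
    lappend lines "## with terminals: [join $terminalList {, }]"
    # The message may span several lines; comment out each of them.
    foreach line [split $message \n] {
        lappend lines "## $line"
    }
    return [join $lines \n]
}

# Possible use (hypothetical paths, assumed argument order):
#
#   package require docstrip::util
#   docstrip::util::modules_from_catalogue /path/to/modules foo.dtx \
#       -preamble "Placed in the public domain." \
#       -formatpreamble short_preamble
```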

The packages_provided command returns a list where every even index
element is the name of a package provided by text when that is
evaluated as a Tcl script, and the following odd index element is
the corresponding version. It is used to do package indexing of
extracted pieces of code, in the manner of pkg_mkIndex.

One difference to pkg_mkIndex is that the text gets
evaluated in a safe interpreter. package require commands
are silently ignored, as are unknown commands (which includes
source and load). Other errors cause
processing of the text to stop, in which case only those
package declarations that had been encountered before the error
will be included in the return value.

The setup-script argument can be used to customise the
evaluation environment, if the code in text has some very
special needs. The setup-script is evaluated in the local
context of the packages_provided procedure just before the
text is processed. At that time, the name of the slave
command for the safe interpreter that will do this processing is
kept in the local variable c. For example, a setup-script could
copy the contents of the ::env array to the safe interpreter by
evaluating a suitable array set command through that slave command.
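
A setup-script along those lines might look like the following sketch. The code text and package name are made up for the example, and the call assumes that the setup-script is passed as an optional second argument after the text.

```tcl
package require docstrip::util

# Setup script: evaluated inside packages_provided, where the local
# variable c holds the name of the safe interpreter's slave command.
set setup {
    $c eval [list array set env [array get ::env]]
}

# Hypothetical code text declaring one package.
set text {
    package require Tcl 8.5
    package provide foo::bar 1.0
}

# The result alternates package names and versions.
set provided [docstrip::util::packages_provided $text $setup]
foreach {name version} $provided {
    puts "$name $version"
}
```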

Unlike the previous group of commands, which would use
docstrip::extract to extract some code lines and then process
those further, the following commands operate on text consisting of
all types of lines.

The guards command returns information (mostly of a
statistical nature) about the ordinary docstrip guards that occur
in the text. The subcmd selects what is returned.

counts

List the guard expression terminals with counts. The format of
the return value is a dictionary which maps the terminal name to
the number of occurrences of it in the file.

exprcount

List the guard expressions with counts. The format of the return
value is a dictionary which maps the expression to the number of
occurrences of it in the file.

exprerr

List the syntactically incorrect guard expressions (e.g.
parentheses do not match, or a terminal is missing). The return
value is a list, with the elements in no particular order.

expressions

List the guard expressions. The return value is a list, with the
elements in no particular order.

exprmods

List the guard expressions with modifiers. The format of the return
value is a dictionary where each index is a guard expression and
each entry is a string with one character for every guard line that
has this expression. The characters in the entry specify what
modifier was used in that line: +, -, *, /, or space (for a guard
without modifier). This is the most primitive form of the
information gathered by guards.

names

List the guard expression terminals. The return value is a list,
with the elements in no particular order.

rotten

List the malformed guard lines (this does not include lines where
only the expression is malformed, though). The format of the return
value is a dictionary which maps line numbers to their contents.
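
The following sketch shows how some of these subcommands might be used on a small piece of text. The sample source is made up, and the call form (subcommand first, then the text) is an assumption based on the description above.

```tcl
package require docstrip::util

# A small hypothetical docstrip source, held as one string.
set text [join {
    {%<*foobar>}
    {namespace eval foo::bar {}}
    {%<*load>}
    {load [file rootname [info script]][info sharedlibextension]}
    {%</load>}
    {%<experimental>puts "experimental build"}
    {%</foobar>}
} \n]

# Terminal name -> number of occurrences.
dict for {terminal count} [docstrip::util::guards counts $text] {
    puts [format "%-14s %d" $terminal $count]
}

# Malformed guard lines, as a dictionary from line number to contents.
puts "rotten: [docstrip::util::guards rotten $text]"

# Syntactically incorrect guard expressions.
puts "bad expressions: [docstrip::util::guards exprerr $text]"
```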

The patch command tries to apply a diff file (for example a
contributed patch) that was computed for a generated file to the
docstrip source. This can be useful if someone has
edited a generated file, thus mistaking it for being the source.
This command makes no presumptions which are specific for the case
that the generated file is a Tcl script.

patch requires that the source file to patch is kept as a
list of lines in a variable, and the name of that variable in the
calling context is what goes into the source-var argument.
The terminals is the list of terminals used to extract the
file that has been patched. The diff is the actual diff to
apply (in a format as explained below) and the fromtext is
the contents of the file which served as "from" when the diff was
computed. Options can be used to further control the process.

The process works by "lifting" the hunks in the diff from
generated to source file, and then applying them to the elements of
the source-var. In order to do this lifting, it is necessary
to determine how lines in the fromtext correspond to elements
of the source-var, and that is where the terminals come
in; the source is first extracted under the given
terminals, and the result of that is then matched against
the fromtext. This produces a map which translates line
numbers stated in the diff to element numbers in
source-var, which is what is needed to lift the hunks.

The reason that both the terminals and the fromtext
must be given is twofold. First, it is very difficult to keep track
of how many lines of preamble are supplied some other way than by
copying lines from source files. Second, a generated file might
contain material from several source files. Both make it impossible
to predict what line number an extracted file would have in the
generated file, so instead the algorithm for computing the line
number map looks for a block of lines in the fromtext which
matches what can be extracted from the source. This matching is
affected by the following options:

-matching mode

How equal must two lines be in order to match? The supported
modes are:

exact

Lines must be equal as strings. This is the default.

anyspace

All sequences of whitespace characters are converted to single
spaces before comparing.

nonspace

Only non-whitespace characters are considered when comparing.

none

Any two lines are considered to be equal.

-metaprefix string

The -metaprefix value to use when extracting. Defaults
to "%%", but for Tcl code it is more likely that "#" or "##" had
been used for the generated file.

-trimlines boolean

The -trimlines value to use when extracting. Defaults to
true.
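
The four -matching modes can be pictured as reducing each line to a canonical form before comparing. The following sketch only illustrates the descriptions above; the procedure names are hypothetical and this is not how the package itself is implemented.

```tcl
# Hypothetical helper, not part of docstrip::util: reduce a line to the
# form that is compared under the given -matching mode.
proc normalise_for_matching {mode line} {
    switch -- $mode {
        exact    { return $line }
        anyspace { return [regsub -all {\s+} $line { }] }
        nonspace { return [regsub -all {\s+} $line {}] }
        none     { return {} }
        default  { error "unknown matching mode \"$mode\"" }
    }
}

# Two lines match under a mode if their reduced forms are equal strings.
proc lines_match {mode a b} {
    expr {[normalise_for_matching $mode $a] eq
          [normalise_for_matching $mode $b]}
}
```

With these definitions, "set  x 1" (two spaces) and "set x 1" do not match under exact but do match under anyspace, while "setx1" and "set x 1" match only under nonspace and none.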

The return value is in the form of a unified diff, containing only
those hunks which were not applied or were only partially applied;
a comment in the header of each hunk specifies which case is at
hand. It is normally necessary to manually review both the return
value from patch and the patched text itself, as this command
cannot adjust comment lines to match new content.
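
Put together, lifting a contributed patch might look like the following sketch. It relies on the thefile and import_unidiff commands described below. The helper name lift_patch is hypothetical, and the argument order used for patch (source-var, terminals, fromtext, diff) is an assumption that should be checked against the command synopsis.

```tcl
package require docstrip::util

# Hypothetical helper: apply a diff made against a generated file to
# the docstrip source it was generated from.
proc lift_patch {sourceFile terminals generatedFile diffFile} {
    # The source must be held as a list of lines in a variable.
    set source [split [docstrip::util::thefile $sourceFile] \n]
    set fromtext [docstrip::util::thefile $generatedFile]

    # Parse the unified diff, collecting any parsing warnings.
    set warnings ""
    set diff [docstrip::util::import_unidiff \
        [docstrip::util::thefile $diffFile] warnings]
    if {$warnings ne ""} {
        puts stderr "diff parsing warnings: $warnings"
    }

    # Assumption: the generated file was made with ## as metaprefix.
    set leftover [docstrip::util::patch source $terminals $fromtext $diff \
        -metaprefix ## -matching anyspace]

    # Return the patched source text and the hunks that need review.
    return [list [join $source \n] $leftover]
}
```

The second element of the result is the unified diff of hunks that were not, or only partially, applied, so it should be reviewed by hand together with the patched text.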

The thefile command opens the file filename, reads it to
end, closes it, and returns the contents (dropping a final newline
if there is one). The option-value pairs are
passed on to fconfigure to configure the open file channel
before anything is read from it.
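
The described behaviour corresponds roughly to the following sketch. The procedure name read_whole_file is hypothetical, the actual implementation may differ, and try/finally requires Tcl 8.6.

```tcl
# Hypothetical equivalent of the behaviour described above.
proc read_whole_file {filename args} {
    set chan [open $filename r]
    try {
        # Apply the option-value pairs, if any, before reading.
        if {[llength $args]} {
            fconfigure $chan {*}$args
        }
        # -nonewline drops a single trailing newline, if present.
        return [read -nonewline $chan]
    } finally {
        close $chan
    }
}
```

A call such as `read_whole_file foo.dtx -encoding utf-8` would then return the contents of a (hypothetical) file foo.dtx decoded as UTF-8.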

The import_unidiff command parses a unified (diff flags -U and
--unified) format diff into the list-of-hunks format
expected by docstrip::util::patch. The diff-text
argument is the text to parse and the warning-var is, if
specified, the name in the calling context of a variable to which
any warnings about parsing problems will be appended.

The return value is a list of hunks. Each hunk is a list of
five elements "start1 end1 start2 end2 lines". start1 and end1
are the line numbers in the "from" file of the first and last
lines of the hunk, respectively.
start2 and end2 are the corresponding line numbers in
the "to" file. Line numbers start at 1. The lines is a list
with two elements for each line in the hunk; the first specifies the
type of a line and the second is the actual line contents. The type
is - for lines only in the "from" file, + for lines
that are only in the "to" file, and 0 for lines that are
in both.
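
To make the hunk format concrete, the following sketch builds one hunk by hand, following the description above (it is not actual output from the command), and splits it back into its "from" and "to" sides.

```tcl
# A hand-written hunk: lines 1-3 of the "from" file correspond to
# lines 1-3 of the "to" file, with the middle line changed.
set hunk {1 3 1 3 {
    0 {set a 1}
    - {set b 2}
    + {set b 3}
    0 {set c 4}
}}

lassign $hunk start1 end1 start2 end2 lines

# Rebuild both sides of the hunk from the type/contents pairs.
set from {}
set to {}
foreach {type line} $lines {
    switch -- $type {
        - { lappend from $line }
        + { lappend to $line }
        0 { lappend from $line; lappend to $line }
    }
}

puts "from lines $start1-$end1:\n[join $from \n]"
puts "to lines $start2-$end2:\n[join $to \n]"
```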