Introduction

Current status

Please describe current status of osis2mod, including a list of any outstanding issues or unsolved difficulties.

History of Changes

The following outlines, in reverse chronological order, the major changes to osis2mod. When several changes were made over the span of a few days, they are lumped into the most recent date. Bug fixes are not mentioned.

Date

Revision

Feature

2012-03-24

r2693

Allow for XML comments to be in the document, but stripped from the module. This allows for large parts of a document to be commented out, which is especially important during development.

Allow for <p> elements to be in the <header> element but ignored. Previously, the p in a header, having been transformed to a div, was taken to be the start of the module's content.

2011-11-12

r2671

Restored pre-verse handling. Titles no longer need to be specially specified in OSIS.

2010-06-04

r2519

Removed pre-verse handling. Titles now have to be specially specified in OSIS. In OSIS there should also be no tags between verse elements except title and those marking book and chapter.

InterVerse Content refers to all content not contained by the verse element.[2]

Such content is divided between the prior and the current verse.

Content appended to the prior verse is not marked in any special way.

Content prepended to the current verse is marked with <div sID="pvX" type="x-milestone" subType="x-preverse"/>...<div eID="pvX" type="x-milestone" subType="x-preverse"/>.
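As a rough illustration of the marking described above, the following Python sketch wraps content in a matching pair of pre-verse milestones and finds it again. The function names and the integer counter used for X are assumptions made for this example; this is not osis2mod's own code.

```python
import re

# Matches one pre-verse span delimited by a matching sID/eID pair.
PREVERSE = re.compile(
    r'<div sID="(pv\d+)" type="x-milestone" subType="x-preverse"/>'
    r'(.*?)'
    r'<div eID="\1" type="x-milestone" subType="x-preverse"/>',
    re.DOTALL,
)


def wrap_preverse(content, n):
    """Wrap content in pre-verse milestones using the id "pv<n>".

    Assumption: X in pvX is a simple counter; the text above does not say
    how osis2mod chooses it.
    """
    marker = f"pv{n}"
    start = f'<div sID="{marker}" type="x-milestone" subType="x-preverse"/>'
    end = f'<div eID="{marker}" type="x-milestone" subType="x-preverse"/>'
    return f"{start}{content}{end}"


def extract_preverse(entry):
    """Return the content of every pre-verse span found in a verse entry."""
    return [m.group(2) for m in PREVERSE.finditer(entry)]
```

A front-end could use something like extract_preverse to show pre-verse titles separately from the verse text.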

Notes:

↑These transformations are all performed "under the hood" as it were. Tweaking OSIS XML files to fix problems with pre-verse titles, etc., was never intended to be done by module developers as part of the preprocessing before using osis2mod.

↑For OSIS files derived from USFM files, the implication of this requirement is the following rule: Do not place a title (or similar element) between a matching pair of verse milestones except where the translation places the title somewhere within the verse text in the USFM file.

Handling of Introductions, Titles and Inter-Verse Material

SWORD allows for module, testament, book and chapter introductory material. These introductions can have appropriate titles as well.
In SWORD 1.6.0 the handling of this material has changed.

Note:

In the following, the effects of the above transformations are not shown. The tagging of the pre-verse material is also not shown.

Module and Testament Introductions

At this time, osis2mod does not fully support module and testament introductions. A module introduction should be placed into testament 0, book 0, chapter 0, verse 0. A testament introduction should be placed into testament 1 or 2, book 0, chapter 0, verse 0. Currently, these are placed into Genesis 0:0 or Matthew 0:0.

Book Introductions and Titles

Book introductions and titles are straightforward. A book introduction includes the start of the book and everything following it up to, but not including, the start of the first chapter. See OSIS Bibles for best practices in marking up titles and introductions.

For example:

<div type="book" ...>
... introductory material ...
<chapter ...>

will put the following into the book introduction:

<div type="book" ...>
... introductory material ...
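The split can be pictured with a small Python sketch. It uses a naive string search for the first chapter start tag, which is an assumption made for brevity; osis2mod works on parsed tags rather than on raw strings, and split_book_intro is a hypothetical name.

```python
def split_book_intro(book_xml):
    """Split a book's XML into (introduction, rest).

    The introduction is everything before the first "<chapter" start tag.
    If no chapter is found, the whole text is treated as introduction.
    """
    idx = book_xml.find("<chapter")
    if idx == -1:
        return book_xml, ""
    return book_xml[:idx], book_xml[idx:]
```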

Chapter Introductions

Chapter introductions and titles are a bit problematic. Between the start of a chapter and its first verse, we could have a chapter title, a chapter introduction and/or the start of a section of verses or a titled verse. Osis2mod now handles this in a predictable fashion. Everything from the start of the chapter up to, but not including, a section div or a title whose type is not main, chapter or sub is chapter introduction. Everything after that is part of the verse.
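The boundary rule can be sketched in Python as a predicate over simplified (name, attributes) pairs. Two points are assumptions of this sketch rather than statements from the text: a title with no type attribute is kept in the introduction, and a verse start always ends the introduction.

```python
# Title types that remain part of the chapter introduction.
INTRO_TITLE_TYPES = {"main", "chapter", "sub"}


def ends_chapter_intro(name, attrs):
    """Return True if this element is the first one after the introduction."""
    if name == "div" and attrs.get("type") == "section":
        return True
    if name == "title":
        title_type = attrs.get("type")
        # Assumption: an untyped title stays in the introduction.
        return title_type is not None and title_type not in INTRO_TITLE_TYPES
    # Assumption: the first verse always ends the introduction.
    return name == "verse"


def chapter_intro_length(elements):
    """Count the leading elements that belong to the chapter introduction."""
    for i, (name, attrs) in enumerate(elements):
        if ends_chapter_intro(name, attrs):
            return i
    return len(elements)
```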

Specifically, the following list gives the possible first elements following the chapter introduction:

Note: The book files can be in any order. SWORD will order them correctly in the index.

Adding corrections to a Bible:

osis2mod /tmp/mymodule fixes.xml -a

Note: When fixes are put into the module they are appended to the data file and do not actually replace the verses. The index file is adjusted to point to the new place in the data file.
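The append behavior in the note can be modeled with a toy data file and index in Python. This is only a conceptual sketch: ToyModule and its layout are invented for illustration and do not reflect SWORD's on-disk format.

```python
class ToyModule:
    """Toy model of an append-only data file plus an index."""

    def __init__(self):
        self.data = bytearray()  # stands in for the data file
        self.index = {}          # osisID -> (offset, size)

    def write(self, osis_id, text):
        """Append the text and point the index entry at the new location."""
        encoded = text.encode("utf-8")
        offset = len(self.data)
        self.data += encoded
        self.index[osis_id] = (offset, len(encoded))

    def read(self, osis_id):
        """Read whatever the index currently points at."""
        offset, size = self.index[osis_id]
        return bytes(self.data[offset:offset + size]).decode("utf-8")
```

In this model the old text stays in the data but is no longer reachable through the index, which is why repeated fixes make the data file grow.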

-z|-Z
A SWORD Bible can be compressed with Zip (-z) or LZSS (-Z). All of SWORD's Bible modules are compressed with Zip. This saves significant space over an uncompressed module. Uncompressed modules are useful for debugging.

-b 2|3|4
This setting is only useful for a compressed module. The choice as to whether to use Verse (2), Chapter (3) or Book (4, the default) level compression depends upon the amount of data in the block. A typical Bible is best compressed book by book; a commentary, chapter by chapter. If the commentary is extensive and the amount of text per verse is very large, then verse compression might make sense.

All of SWORD's compressed Bible modules are compressed by book. Basically, all of the verses in a block are compressed together and appended to the data file. For this reason, the data file cannot be uncompressed by anything other than the SWORD and JSword libraries.

When creating the module by appending, it is important to do so by whole compression block. That is, if blockType is Chapter, then the osisDoc needs to contain one or more whole chapters.
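To make the block idea concrete, here is a Python sketch that groups verses by block type and compresses each block as a unit. It uses zlib as a stand-in for the Zip option, and block_key and compress_blocks are hypothetical names; the real module layout differs.

```python
import zlib
from itertools import groupby


def block_key(osis_id, block_type):
    """Return the block an osisID belongs to: 2 = verse, 3 = chapter, 4 = book."""
    depth = {4: 1, 3: 2, 2: 3}[block_type]
    return ".".join(osis_id.split(".")[:depth])


def compress_blocks(verses, block_type=4):
    """Compress (osisID, text) pairs block by block.

    Verses must already be in canonical order. Returns (key, bytes) pairs.
    """
    blocks = []
    for key, group in groupby(verses, key=lambda v: block_key(v[0], block_type)):
        joined = "".join(text for _, text in group).encode("utf-8")
        blocks.append((key, zlib.compress(joined)))
    return blocks
```

Because each block is compressed as one unit, adding a single verse to an existing block would mean rewriting the whole block, which is the reason for appending only whole blocks.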

-c cipherKey
This is typically 16 characters in length, having no leading or trailing spaces, consisting of alternating sets of 4 alpha and 4 numeric characters, such as Aduf0274PjNq0328. The key is case-sensitive.
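A key of the typical shape can be checked with a short Python sketch. The pattern encodes only the "typically" form described above, so a key that fails this check is not necessarily invalid; looks_like_typical_key is a hypothetical helper.

```python
import re

# Four letters, four digits, four letters, four digits (16 characters).
TYPICAL_KEY = re.compile(r"[A-Za-z]{4}[0-9]{4}[A-Za-z]{4}[0-9]{4}")


def looks_like_typical_key(key):
    """Return True if the key has the typical alternating 4+4+4+4 shape."""
    return TYPICAL_KEY.fullmatch(key) is not None
```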

-N
All OSIS modules should be UTF-8 and all that are UTF-8 should also be NFC. The default is to automatically detect the presence of Latin-1 (either cp1252 or iso8859-1), convert it to UTF-8, and normalize UTF-8 to NFC. This flag turns off this behavior and is useful for creating Latin-1 modules or for modules for which the source text is already UTF-8 and NFC.

Note: this was added late Feb 2008 and requires ICU support when compiling.
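The default behavior can be approximated in Python as follows. This is a sketch under assumptions: the detection here is simply "fall back to cp1252 when the bytes are not valid UTF-8", whereas osis2mod relies on ICU and its own detection logic.

```python
import unicodedata


def to_utf8_nfc(raw):
    """Return the input bytes as NFC-normalized UTF-8.

    Bytes that are not valid UTF-8 are assumed to be cp1252 (an assumption
    of this sketch); undefined cp1252 bytes are replaced.
    """
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        text = raw.decode("cp1252", errors="replace")
    return unicodedata.normalize("NFC", text).encode("utf-8")
```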

-s 2|4
A value of 2, the default, restricts raw, uncompressed modules to 64K bytes per entry. A value of 4 breaks this barrier. This is needed for Bibles that have large introductory materials and for commentaries with large entries. All compressed OSIS modules can handle large entries.

Note: this was added late Apr 2009 and will be part of the SWORD 1.6.0 release (formerly known as 1.5.11).
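A quick way to see whether a text needs the larger setting is to measure its entries. The Python sketch below assumes that the option value is the width in bytes of the entry size field, so that a value of 2 allows at most 65535 bytes; that reading is an assumption, and the function names are hypothetical.

```python
def max_entry_bytes(size_field_bytes):
    """Largest entry size representable in a size field of the given width."""
    return 2 ** (8 * size_field_bytes) - 1


def needs_large_entries(entries):
    """Return True if any entry exceeds the limit assumed for -s 2."""
    limit = max_entry_bytes(2)
    return any(len(text.encode("utf-8")) > limit for text in entries)
```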

-v v11n
By default, osis2mod uses the KJV versification. The practical implication of this is that only books in the KJV canon are allowed and any text in an allowed book is retained. However, if the verse reference of a supported book falls outside of the versification, its text is appended to the prior verse in the canon. This flag allows for an alternate versification.

Note: this was added late Apr 2009 and will be part of the SWORD 1.6.0 release (formerly known as 1.5.11). With that release, only the Leningrad Codex will be supported, with -v Leningrad.
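The placement rule can be sketched in Python with a toy versification table. The table covers only two chapters of Genesis and its structure is invented for this example; place_verse is a hypothetical helper, and the way it picks the "prior verse" is a simplification.

```python
# Toy versification (hypothetical structure): book -> {chapter: last verse}.
TOY_V11N = {"Gen": {1: 31, 2: 25}}


def place_verse(book, chapter, verse, v11n=TOY_V11N):
    """Return where a verse's text ends up, or None if the book is ignored."""
    chapters = v11n.get(book)
    if chapters is None:
        return None  # book not in the versification: content is ignored
    if chapter in chapters:
        # A verse past the end of the chapter goes to the chapter's last verse.
        return (book, chapter, min(verse, chapters[chapter]))
    earlier = [c for c in chapters if c < chapter]
    if not earlier:
        return (book, chapter, verse)  # e.g. chapter 0 material, left as is
    last = max(earlier)
    return (book, last, chapters[last])
```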

-d flags
The flag can be used more than once or the flags can be added together. For example,

-d 2 -d 4

is the same as

-d 6

To do verbose debugging use:

-d -1

For the most part these flags are not intended for debugging modules, but rather for debugging problems in osis2mod.
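The way the flag values combine can be shown with a small Python sketch that treats them as a bitmask. That the values are combined bitwise is an inference from the examples above, and the helper names are hypothetical.

```python
def combine_flags(values):
    """Combine several -d values into one mask, as repeated -d options would."""
    mask = 0
    for value in values:
        mask |= value
    return mask


def flag_enabled(mask, flag):
    """Return True if the given debug flag is set in the mask."""
    return (mask & flag) != 0
```

With this reading, -1 has every bit set, which is why it turns on all debugging output.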

The -d 2 flag produces no output but puts milestones into the module where verses start and end. The form of the milestone is:

<milestone resp="v" [attributes from verse] />

The milestone will contain the osisID from the verse and also a valid sID or eID. The sID/eID indicates the start or end of the verse.

Note: the -d 2 flag might change at any time, or may even be removed.

Messages

Osis2mod has robust, mind-boggling messages. These are provided here in the hope that they will help with problem diagnosis.

Exit Status

When an error occurs that causes osis2mod to exit without processing the entire input file, a non-zero exit status is supplied to the caller. Here are the codes that osis2mod uses:

This, like the other, indicates a versification problem, but shows where the text will be found. Osis2mod preserves all module content for supported books.

WARNING(V11N): New book is [ name ] and is not in [ v11n ] versification, ignoring

The name of the book was not recognized as belonging to the chosen versification. The book and all of its content are ignored.

INFO(WRITE): Appending entry: [ osisID ]: [ text so far ]

Osis2mod encountered text that needs to be appended to a verse that is already in the module. This could indicate that:

the reference is in the input twice. This typically indicates a problem.

more text was found that needs to be added to the prior verse.

osis2mod is being run in append mode to fix a verse in the module.

INFO(LINK): Linking [ osisID ] to [ osisID ]

An osisID such as "Gen.1.1 Gen.1.2 Gen.1.3" was used and the later references are linked to the first.
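The linking described here can be sketched in Python as splitting the attribute value on whitespace and pointing every later reference at the first. plan_links is a hypothetical helper; it only plans the links and does not write anything to a module.

```python
def plan_links(osis_id_attr):
    """Return (target, links) for a multi-valued osisID attribute.

    target is the first osisID; links is a list of (source, target) pairs
    for every later osisID.
    """
    ids = osis_id_attr.split()
    if not ids:
        return None, []
    first = ids[0]
    return first, [(other, first) for other in ids[1:]]
```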

ERROR(REF): Invalid osisID/annotateRef: [ invalid attribute value ]

This indicates that the SWORD library was unable to parse the osisID or annotateRef.

FATAL(NESTING): [ currentOsisID ]: tag expected

This indicates that the specified verse is not balanced with regard to its tags. Building a raw text module, looking in the module for the verse and pairing begin/end tags will help find the problem. Typically, this indicates an end tag that did not have a matching begin tag and all tags before it were properly paired.

This also indicates that the specified verse is not balanced with regard to its tags. Building a raw text module, looking in the module for the verse and pairing begin/end tags will help find the problem. It could be either a begin or an end tag problem.

-d 256
Osis2mod contains two stacks to validate proper nesting of BSP and BCV, respectively. This is an internal representation of the BCV stacks. It provides additional information to understand the diagnostic nesting messages.
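The general idea of stack-based nesting validation can be sketched in Python. This example is deliberately simplified: it uses a single stack rather than two, a regular expression rather than a real XML parser (so attribute values containing ">" would confuse it), and its own wording for problems rather than osis2mod's messages.

```python
import re

# Captures: closing slash, tag name, attributes, self-closing slash.
TAG = re.compile(r"<(/?)([A-Za-z][\w:.-]*)([^<>]*?)(/?)>")


def check_nesting(fragment):
    """Return a list of nesting problems found in an XML fragment."""
    stack = []
    problems = []
    for match in TAG.finditer(fragment):
        closing, name, _attrs, self_closing = match.groups()
        if self_closing:
            continue  # milestones and other empty elements need no pairing
        if not closing:
            stack.append(name)
        elif stack and stack[-1] == name:
            stack.pop()
        else:
            problems.append(f"unexpected </{name}>")
    # Anything left on the stack was opened but never closed.
    problems.extend(f"unclosed <{name}>" for name in reversed(stack))
    return problems
```

Running a check like this on the text of a single verse from a raw text module is one way to pair begin and end tags when chasing a nesting message.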