Common BookBuilder Tagging Errors (Part III)

This is the third in what’s turning out to be a series of articles on proper HTML syntax in BookBuilder and some common tagging problems we see in user-created books. I didn’t think this was going to be a trilogy when I started, but there seems to be no shortage of tagging issues we can discuss. Here are a few more.

Improper placement of <pb_sync> tags (part II)

Last time I wrote about misplaced <pb_sync> tags in commentaries. I should’ve gone on to add that the same rules apply to dictionaries and devotionals. That is, the <pb_sync> tag needs to be inside the section to which you want the user to go when synchronizing or doing a go-to operation by word, date, or verse. That means it must come physically after the heading tag that defines the table of contents entry you want to go to.

One of the situations in which this can easily come up is when you’re converting an old PocketBible 2 devotional to the new PocketBible/MyBible tagging format. In the old days, the <!datesync> tag just had to be on the page you wanted to go to. The common style looked like:

…
<!page>
<!datesync>
<h1 index>January 3</h1>
…

PocketBible 3 and MyBible 4 do not have “pages”. As a result, if you’re not careful the <pb_sync type=date /> tag will end up outside the section to which it applies.

One solution to this for those converting PocketBible 2 devotionals is to use the TagConverter program that comes with BookBuilder Pro. It will move <!datesync> tags so that they follow the next heading tag. It doesn’t do the same for commentaries and dictionaries, though, so you have to be careful with those and move the <pb_sync> tags manually.
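As a minimal sketch of the ordering described above, the fragment below shows a converted devotional entry with the sync tag first before and then after the heading. Only <pb_sync type=date /> and the heading text come from this article. Any attributes the heading or the sync tag need in a real book (for example, to mark the table of contents entry or to carry the date) are left out here rather than guessed, and the <!-- ... --> comments are annotations for this illustration only.

```html
<!-- Problem: the sync tag precedes the heading, so it falls in the previous section -->
…
<pb_sync type=date />
<h1>January 3</h1>
…

<!-- Corrected: the sync tag follows the heading, so it is inside the January 3 section -->
…
<h1>January 3</h1>
<pb_sync type=date />
…
```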

Mismatched number of columns in table rows

We have recently seen tables in which the first row might contain three columns but a subsequent row contains only one or two. This doesn’t necessarily present a problem when rendering with the current version of the software, but it could cause unexpected results in the future. It may even cause problems today: I haven’t looked extensively at what happens when rows with different numbers of columns are combined with <colgroup> tags, rowspan attributes, and various options for borders and rules.

(Since I mentioned the rowspan attribute I should point out that when you have a table data cell with a rowspan greater than one, subsequent rows will not contain a table data cell for that column. This is not a problem and isn’t the situation I’m talking about here.)
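The rowspan case in the note above might look like the following sketch. The cell contents are placeholders, and the comments are annotations for this illustration only. The second row legitimately has one fewer <td> because the first column is already occupied.

```html
<table>
  <tr>
    <!-- This cell occupies the first column of this row and of the next row -->
    <td rowspan="2">A</td>
    <td>B1</td>
    <td>C1</td>
  </tr>
  <tr>
    <!-- No cell for the first column here; the rowspan above fills it -->
    <td>B2</td>
    <td>C2</td>
  </tr>
</table>
```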

In general, tables require some complex algorithms for rendering and it’s a good idea to verify the appearance of every table in your book to make sure it looks right in the program.
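To illustrate the mismatch described in this section, here is a small sketch with placeholder cell contents. The first table has a short second row. The second table shows one possible fix, padding the short row with an empty cell. That fix is ordinary HTML practice rather than something this article prescribes, so check the result in the program as recommended above.

```html
<!-- Problem: the second row has only two cells in a three-column table -->
<table>
  <tr><td>A1</td><td>B1</td><td>C1</td></tr>
  <tr><td>A2</td><td>B2</td></tr>
</table>

<!-- One way to make the rows agree: add an empty cell for the missing column -->
<table>
  <tr><td>A1</td><td>B1</td><td>C1</td></tr>
  <tr><td>A2</td><td>B2</td><td></td></tr>
</table>
```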

Forgetting to use colspan for “headings” in tables

A common table format includes rows that contain a heading that spans all columns and serves as a heading for the data that follows. We’ve seen a variation on the problem mentioned above (mismatched number of columns in table rows) in which the author is trying to insert a header but forgets the colspan attribute.

Consider a table with four columns. The first two contain the name and date of reign for the kings of Israel; the third and fourth contain the name and date for the kings of Judah. The author adds a heading row at the top, but it contains only two cells. The headings have the correct content, but they will appear only over the first name and date columns of the second row instead of spanning the two columns each one describes.
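A hypothetical version of the table described above might look like this sketch, first with the problem and then with colspan added. The heading text and the “…” placeholder cells are illustrative, not taken from an actual book, and the comments are annotations for this illustration only.

```html
<!-- Problem: the heading row has two cells, so the headings sit over columns 1 and 2 only -->
<table>
  <tr><td>Kings of Israel</td><td>Kings of Judah</td></tr>
  <tr><td>Name</td><td>Date</td><td>Name</td><td>Date</td></tr>
  <tr><td>…</td><td>…</td><td>…</td><td>…</td></tr>
</table>

<!-- Corrected: each heading spans two columns, so the heading row also covers all four -->
<table>
  <tr><td colspan="2">Kings of Israel</td><td colspan="2">Kings of Judah</td></tr>
  <tr><td>Name</td><td>Date</td><td>Name</td><td>Date</td></tr>
  <tr><td>…</td><td>…</td><td>…</td><td>…</td></tr>
</table>
```

With colspan="2" on each heading cell, every row accounts for the same four columns.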

In its purest form (ignoring CSS and other “style” features) HTML is intended not so much as a page layout language but as a way to functionally tag the text. For example, the <p> tag marks the beginning of a paragraph, but says nothing about indenting, gaps between paragraphs, or any other layout issues that a browser might consider when rendering a paragraph. Similarly, tags like <li> indicate list items and can control some aspects of how the list item is marked, but they don’t specify how much (or if) it is indented, the exact character or symbol used for bulleted lists (beyond simple suggestions like “circle” or “disc”), or any attributes one might apply to the number if the list is ordered.

There are many aspects of rendering and layout that are left to the program designer. For example, the current implementations of PocketBible and MyBible do not indent paragraphs, but instead insert a little more line spacing between paragraphs than between the lines in a paragraph. The amount of indenting for <blockquote> and <li> tags is also set by the rendering algorithm, not by the author of the book.

It is important to remember when tagging books (or writing any HTML document, actually) that your book could eventually appear on a different device, a different screen size, and/or be rendered by a completely different algorithm than the one you’re using during your own testing. Furthermore, the user may prefer a different font, font size, or color scheme than you do. With this in mind you should never tag your book to look “exactly right” on your device.

Tagging a book to look “exactly right” on your own device causes problems in a couple of ways:

First, you may discover that at font size 3 on your HP Pocket PC, inserting an empty paragraph at certain places ensures that scrolling the screen up or down by pages never leaves a partial line at the top or bottom of the screen. Needless to say, if your readers use font size 4 or a different font, your plans will be ruined. In fact, your document could look really odd when rendered on a different screen or with different options selected.

Similarly, you may find that inserting a non-breaking space or a <br> tag in a certain place causes the word wrapping to break up a sentence in a way you find more pleasing. Again, changes in device capabilities or user-configurable options in the program could cause the user to not only see something different, but to see something strange.
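The two habits just described might look like the sketch below, followed by the plainer tagging this article recommends. The paragraph text is placeholder content, and the comments are annotations for this illustration only.

```html
<!-- Avoid: an empty paragraph added only so pages break cleanly at one font size -->
<p>First paragraph of the reading.</p>
<p></p>
<p>Second paragraph of the reading.</p>

<!-- Avoid: a forced break and a non-breaking space added to tune wrapping on one screen -->
<p>A sentence that happened to wrap badly<br>on the author’s&nbsp;device.</p>

<!-- Prefer: tag the structure only and let the reader program handle layout -->
<p>First paragraph of the reading.</p>
<p>Second paragraph of the reading.</p>
<p>A sentence that happened to wrap badly on the author’s device.</p>
```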

The second way this problem can show up in your books is if you embed some knowledge about the current PocketBible or MyBible program into your book. You might, for example, include instructions on what buttons to push in the program to do certain things with your book. You may think you’re clever because you include descriptions of both PocketBible and MyBible, but what happens when we release a reader for another platform, or we issue an upgrade to MyBible that changes the appearance or location of some user interface elements?

Another example would be to embed comments in the book such as “Please wait while the following table is loaded…”. While it may be true that a large table takes some time to load, what will you do next year when a new device comes out that is significantly faster than anything available today, or we make a change to the program that speeds up this operation?

The goal is to design your book in such a way that it can be displayed anywhere without the user thinking the book was designed for a different device.

The only exception to this rule I’d consider is that, in general, it’s better to think of your book being rendered on a “portrait mode” screen rather than a “landscape mode” screen. Wide content is generally harder to deal with than tall content. Given a choice between spreading material out horizontally or vertically, choose vertical.

2 Responses

First of all, thanks for the three-part blog series on common BookBuilder mistakes. I’ve experienced most of these myself, as most probably have others who have attempted to design books themselves. (Although the project I’ve just wrapped up is for personal use only, as the publisher of this series has their own much lesser-known PPC Bible software.)

My comments/thoughts today center more on another project that I’ve been working on and how it relates to my experience with BookBuilder. I’ve been cleaning up a couple of websites I manage to bring them into compliance with XHTML. For those not familiar with XHTML, it is an HTML format that adds XML structure requirements to force clean HTML code, as defined by a definition file known as a DTD. The similarities to a BookBuilder file are eerie.

Now to the question. With how close BB files are to XML structure requirements, wouldn’t it be useful to write a DTD for BookBuilder that can be applied to enforce compliance with the structure, as with XML?

To those reading who are designing their own BookBuilder books, I would like to second the recommendation made by Laridian to use Textpad for your editing. I work with eCommerce data at Wal-Mart and use Textpad daily because of the sheer power of regular expressions.

I agree in principle with the idea of a formal DTD for our PocketBible format, but it’s not something we’re going to spend any time on. The reason is that there’s only one application that reads these files, and that’s BookBuilder. So if the file passes through BookBuilder without error, it’s correct. If it doesn’t make it, it’s incorrect.

The purpose of a formal DTD is that it defines the file format both for authors and for those who would parse the language. In the general case of XML, there are multiple parsers and none are definitive. So in this case, the DTD and a validator can tell you if your file is correct, even if your parser doesn’t fully render all elements of the file or your parser has bugs.

We’ve defined the PocketBible format less formally to make it easy for our editors and our customers to read. BookBuilder is the validator. If it makes it through BookBuilder, it’s a valid PocketBible book.

We might be interested in the work you’ve done even though it’s a copyrighted text. We might be considering licensing the text, or we might get it just because we have your tagged file available. Contact Jim at bookbuilder@laridian.com and tell him what the book is and see if we might be interested in paying you something for it if we like what you’ve done with it.