Background

HTML is a general-purpose markup language used for electronic documents, mostly for onscreen reading.
Some content, however, is more suitable for other kinds of presentation, and being able to reuse the same content for different media types has been a design goal of HTML and CSS.

It has been shown that HTML can be used as a format for book publishing. In the authoring process, it was helpful to use a set of class names on HTML elements to further classify content. The classes, along with their associated structural elements, mostly served as hooks for the associated style sheet. In particular, the class names helped separate the content into different sections of a book.

The main motivation for developing a microformat for books is to encourage reuse of content for different media types. By offering people a sample HTML file and an associated style sheet, HTML can become a compelling format to use for book production. As such, the class names described in a book microformat are primarily hooks for style sheets to use, and secondarily machine-readable semantics.

Parts of a book

The user interface of books is fairly standardized. There is typically a front cover that includes the title of the book and the name of the author(s). Inside the cover, one will find a table of contents, chapters, an index, and so forth. The table below lists commonly used section types.

| Section type | Description |
|---|---|
| frontcover | The front cover |
| halftitlepage | The halftitle page is simple, with only the title of the book and perhaps the names of the authors |
| titlepage | The title page contains (at least) the book title, the name of the author, and the name of the publisher |
| imprint | The imprint page typically starts with a copyright statement and also contains information about where the book is printed, its ISBN, etc. |
| dedication | The dedication page is where you find "for mom" |
| inspiration | Many books contain inspirational quotes by other authors |
| foreword | Many books contain a foreword written by someone other than the authors |
| preface | The preface is written by the authors and often contains an acknowledgement of other contributors |
| toc | Table of Contents |
| lot | List of Tables |
| lof | List of Figures |
| chapter | The content itself is typically organized in numbered chapters |
| uchapter | Many books contain unnumbered chapters (e.g., an introduction) that are formatted similarly to chapters |
| part | Some books organize sets of chapters into parts |
| afterword | An additional, often unnumbered chapter at the end of the book |
| references | References from the text of the book are often listed in a separate section |
| appendix | Additional information can be organized into appendices |
| bibliography | The bibliography lists other books and sources for further reading |
| glossary | The glossary defines terms used in the book |
| index | The index is a list of keywords with page references |
| colophon | The colophon page contains information about the production of the book |
| promotion | Promotional material from the publisher, e.g., a list of other titles in the same series |
| backcover | The back cover |

In boom, the section names are used as class names on the <div> element:
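A minimal sketch of what such markup might look like, using class names from the list of section types. The book title, author name, headings, and text are placeholders invented for illustration, and the use of an h1 heading inside each div is an assumption rather than a requirement stated here.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Example Book</title>
</head>
<body>
  <!-- Each section of the book is a div whose class is a section type. -->
  <!-- All titles, names, and text below are hypothetical placeholders. -->
  <div class="frontcover">
    <h1>Example Book</h1>
    <p>A. N. Author</p>
  </div>

  <div class="toc">
    <h1>Table of Contents</h1>
    <ul>
      <li><a href="#ch1">A first chapter</a></li>
    </ul>
  </div>

  <div class="chapter" id="ch1">
    <h1>A first chapter</h1>
    <p>Chapter text goes here.</p>
  </div>

  <div class="backcover">
    <p>Placeholder back-cover text.</p>
  </div>
</body>
</html>
```

Because the section type is carried only by the class attribute, the same file remains ordinary HTML that any browser can display without a special style sheet.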

Are there too many section types?

It may be argued that the list of possible section types is too long for a "microformat". While one should always strive for simplicity, a few things should be kept in mind:

the section names only affect one attribute on one element (namely, the class attribute on the div element)

publishing is an established industry and paper-based books are not likely to change. As such, the format describes something that already exists.

Nonetheless, some of the proposed sections could be combined. For example, the foreword and the preface are often formatted in the same manner, and there is no need to distinguish between the two in the style sheet. Another similar example is the list of tables and the list of figures. And having a colophon isn't that common, is it? However, all the proposed section types are in common use, and the cost of listing one more type is small compared to the extra cost of differentiating between sections through means other than standardized class names.
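To illustrate why keeping separate class names costs little, here is a sketch of a style sheet that formats similar section types alike with grouped selectors. The declarations are illustrative assumptions, not rules taken from any published style sheet for the format.

```html
<style>
  /* Hypothetical rules: foreword and preface share one block of
     formatting, yet either can still be styled separately later. */
  div.foreword, div.preface {
    page-break-before: always;
    font-style: italic;
  }

  /* Likewise, list of tables and list of figures are treated alike. */
  div.lot, div.lof {
    page-break-before: always;
  }
</style>
```

Merging two section types in the style sheet is a one-line change, whereas splitting a single combined type after the fact would require revisiting the markup of every book.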

Are there enough sections?

The list of possible section types is seemingly endless. For example, one could have a separate "acknowledgements" section instead of using the "preface" section for this. Also, one could have different types of sections for different types of promotional material. The postcard, which is often included in books, is formatted very differently from the list of other books in the same series. Thus, having several promotional elements would make sense.

However, in the interest of simplicity it is important to keep the number of section types at a manageable level.

In the end, determining the list of section types for a microformat is a judgement call.

Comparison with DocBook

DocBook [docbook] is an SGML/XML vocabulary that has been developed for "books and papers about computer hardware and software", but it can also be used for other kinds of books. DocBook is a complex specification; it contains around 400 different elements. Some of DocBook's elements are similar to the section types in the table above: