Typography Manual

Standard Ebooks is a brand-new project—this manual is a pre-alpha, and much of it is incomplete. If you have a question, need clarification, or run into an issue not yet covered here, please contact us so we can update this manual.

The Standard Ebooks Style Philosophy

Standard Ebooks’ goal is to bring classic public domain literature into the digital era by making it accessible and attractive, and by maintaining a high standard of quality. To that end we’re not interested in slavishly reproducing the formatting quirks, transcription errors, publisher’s ephemera, or other inconsequential style decisions of the past. While we strive to be good custodians of the literature entrusted to the public, we recognize that the freedom of the public domain intersects with the Internet era in a way that allows us to present that literature in an attractive, modern, and high-quality way.

Your task

As an editor, proofreader, or producer of a Standard Ebook, part of your task is to take the source transcription of your ebook and apply this typography manual to it. This manual outlines various standardizations and modernizations to old typography practices, making older texts easier to read for modern readers.

Many of the rules below have been accepted standards for a hundred years or more. That means that many of the ebooks you produce may not need that much adjustment. The typogrify tool in the Standard Ebooks toolset also automatically takes care of some (but not all!) of these rules. Typically, the older the source, the more of these rules you’ll have to check during production.

Normalizing different formatting styles

Our general rule is: If it doesn’t affect the meaning of the work, then normalize it according to these standards.

Common problems to keep an eye out for

Some sources use a two-em-dash to interrupt dialog. You should replace such two-em-dashes with a single em-dash according to the section on dashes.

“Why, I never——” she cried.

“Why, I never—” she cried.

Note that a two-em-dash is also used to signify a missing or purposefully obscured word. This is correct, but you should ensure that instead of two consecutive em-dashes, you use the two-em-dash glyph (⸺ or U+2E3A) for partially-obscured words, the three-em-dash glyph (⸻ or U+2E3B) for completely-obscured words, and a single em-dash for partially-obscured years.

Sally J⸺ walked through the town of ⸻ in the year 19—.

Small caps are commonly used instead of italics for emphasis in older texts, and some transcriptions use all caps instead of italics. These should generally be converted to regular case and wrapped in <em> tags. If a text truly does call for extreme emphasis, the <strong> tag can be used—but think twice about using it, and use it sparingly. See the section on text in all caps.

That donut was DELICIOUS!

That donut was <em>delicious</em>!

General

Our general style guide is the Chicago Manual of Style, 16th edition, with a few tweaks outlined below. Work following a different style guide should be converted to conform to ours, unless it changes the meaning of the work.

Do convert from logical punctuation to American punctuation where possible.

Do convert from British quotation to American quotation where possible. The british2american script is helpful for automating most (but not all!) of this.

Section endings

Some older books end with “The End”, “Fin”, or some other equivalent. Remove these.

Some books also end individual sections or chapters with “The end of such-and-such section”. Remove these as well.

Chapter names and titles

In the body text, always use Roman numerals for chapter numbers instead of Arabic numerals. But in an individual file’s <title> tag, do use Arabic numerals instead of Roman numerals.

Do not use the Unicode Roman numeral glyphs, as they are deprecated; use regular letters.

Convert all-caps or small-caps titles to title case. Use the se titlecase script in the Standard Ebooks toolset for consistent titlecasing.

Remove trailing periods from chapter titles.

Omit the word “Chapter” from chapter titles.

Some ebooks should keep “Chapter” in titles if clarity is necessary: for example, Frankenstein has “Chapter” in titles to differentiate between the “Letter” sections.

Chapter 33

XXXIII
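
As a minimal sketch of the chapter-number conversion, the helper below turns an Arabic number into a Roman numeral built from regular letters rather than the deprecated Unicode glyphs. The name to_roman is hypothetical and is not part of the Standard Ebooks toolset.

```python
def to_roman(number: int) -> str:
    """Convert a positive integer to a Roman numeral made of plain letters."""
    if not 0 < number < 4000:
        raise ValueError("expected a number between 1 and 3999")
    # Values are ordered from largest to smallest, including subtractive pairs.
    numerals = [
        (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
        (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
        (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
    ]
    parts = []
    for value, letters in numerals:
        count, number = divmod(number, value)
        parts.append(letters * count)
    return "".join(parts)
```

For example, to_roman(33) gives XXXIII, matching the chapter title above.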

Italicizing or quoting newly-used words

When introducing new terms, italicize foreign or technical terms, but use quotation marks for terms composed of regular English.

English whalers have given this the name “ice blink.”

The soil consisted of that igneous gravel called tuff.

Don’t italicize English neologisms in works where a special vocabulary is a regular part of the narrative; specifically, science fiction works that may necessarily contain made-up English technology words. However, do italicize “alien” language in such works.

Including both italics and quotes, outside of the context of quoted dialog, is usually not necessary. Use one or the other based on the rules above.

Names and titles: Italicize or quote?

Names and titles are usually either italicized or quoted, but almost never both. Pick one or the other based on the rules below.

Older work may pick the opposite of the rules below; change such texts to match this manual.

Older work may use quotation marks around proper names, like pub, bar, building, or company names. Remove those quotes.

He read “Candide” while having a pint at the “King’s Head.”

He read Candide while having a pint at the King’s Head.

In general, italicize things that can stand alone. Specifically:

Magazines

Plays

Books and novels except “holy texts,” like the Bible or books within the Bible

Long musical compositions, like operas

Albums

Films

TV shows

Radio shows

Titles of artwork

Long poems and ballads, like the Iliad

Pamphlets

Journals

Newspapers

Names of ships

Names of sculptures

In general, quote things that are short or parts of longer work. Specifically:

Short musical compositions, like pop songs

Chapter titles

Short stories

Individual newspaper or journal articles

Essays

Short films

Episodes in a TV or radio series

Capitalization

General

Some very old works frequently capitalize nouns that today we no longer capitalize. In general, only capitalize the beginnings of clauses, and proper nouns in the way that you would in modern English writing. Remove archaic capitalization unless doing so would change the meaning of the work.

Text in all caps

Text in all caps is almost never correct typography. Instead, convert such text to the correct case and surround it with a semantically-meaningful tag like <em> (for emphasis), <strong> (for strong emphasis, like shouting) or <b> (for unsemantic formatting required by the text). Then, use font-weight: normal; font-variant: small-caps; styling to render those tags in small caps.

The sign read <b>Bob’s Restaurant</b>.

“<strong>Charge!</strong>” he cried.

Apostrophes

When addressing something as an apostrophe, “O” is capitalized.

I carried the bodies into the sea, O walker in the sea!

Spacing

Sentences should be single-spaced. Convert double-spaced sentences to single-space.

Italics

Italics should generally be used for emphasis. Some older texts make frequent use of small caps for emphasis; change these to italics. Italics indicating emphasis must be wrapped with the <em> tag.

Set individual letters that are read as letters in italics...

He often rolled his r’s.

Unless referring to a name that happens to be a single letter or composed of single letters, or if that letter is standing in for a name...

...due to the loss of what is known in New England as the “L”: that long deep roofed adjunct usually built at right angles to the main house...

She was learning her A B Cs.

Or if the letter is meant to be a comparison to a shape:

His trident had the shape of an E.

When using the ordinal “nth,” italicize the n, and do not include a hyphen.

She got off the metro at the Place de Clichy stop, next to the Le Bon Petit Déjeuner restaurant.

“Où est le métro?” he asked, and she pointed to Place de Clichy, next to the Le Bon Petit Déjeuner restaurant.

If a certain foreign word is used so frequently in the text that italicizing it would be distracting to the reader, then only italicize the first instance. However, wrap the following instances in <span xml:lang="LANGUAGE">.

Certain exceptions to italicizing foreign words can be made if a specific word is in Merriam-Webster, but in the producer’s opinion is still too obscure for the general reader and thus should be italicized anyway. In this case ask the Standard Ebooks editor-in-chief for how to proceed.

Punctuation in italics

If italicizing a short phrase within a longer clause, don’t italicize trailing punctuation that may belong to the containing clause.

“Look at that!” she shouted.

“Look at that!” she shouted.

However, if an entire clause is italicized for emphasis, then do include the trailing punctuation in the italics, unless that trailing punctuation is a comma at the end of some dialog.

Result: “Charge!” she shouted.

Code: “<em>Charge!</em>” she shouted.

Result: “But I want to,” she said.

Code: “<em>But I want to</em>,” she said.

Taxonomy

Binomial names (generic, specific, and subspecific) are italicized with the <i> tag and with the z3998:taxonomy semantic inflection.

Result: A bonobo monkey is Pan paniscus.

Code: A bonobo monkey is <i epub:type="z3998:taxonomy">Pan paniscus</i>.

Family, order, class, phylum or division, and kingdom names are capitalized but not italicized.

A bonobo monkey is in the phylum Chordata, class Mammalia, order Primates.

Modern usage requires that the second part in a binomial name be set in lowercase. Older texts may set it in uppercase. Use the style in the source text.

Indentation

Body text in a new paragraph that directly follows earlier body text is indented by 1em.

The initial line of body text in a section, or any text following a visible break in text flow, like a header, a scene break, a figure, a block quotation, etc., is not indented.

For example: in a block quotation, there is a margin before the quotation and after the quotation. Thus, the first line of the quotation is not indented, and the first line of body text after the block quotation is also not indented.

Punctuation

Spaces

Quotation marks

Quotation marks that are directly side-by-side must be separated by a hair space (U+200A) character.

“hairsp‘Green?’ Is that what you said?” asked Dave.
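
The rule for side-by-side quotation marks could be applied with a sketch like the following. The function name is hypothetical, and the sketch assumes the text already uses curly quotation marks. It also treats an apostrophe (’) that touches a quotation mark the same way as a closing single quote.

```python
import re

HAIR_SPACE = "\u200a"
QUOTES = "“”‘’"

def separate_adjacent_quotes(text: str) -> str:
    """Insert a hair space between two directly adjacent quotation marks."""
    # The lookahead leaves the second mark unconsumed, so runs of three
    # or more quotation marks are separated pairwise.
    return re.sub(f"([{QUOTES}])(?=[{QUOTES}])", r"\1" + HAIR_SPACE, text)
```

Where the example above shows the hairsp marker, the function emits the actual U+200A character.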

Words with missing letters should use the right single quotation mark (’ or U+2019) character to indicate omission.

He had pork ’n’ beans for dinner.

Ellipses

Use the ellipsis (U+2026) glyph instead of consecutive or spaced periods.

When used as suspension points (for example, to indicate dialog that pauses or trails off), don’t precede the ellipsis with a comma. Commas followed by ellipses were sometimes used in older texts, but are now redundant to modern readers; remove them.

Note that ellipses used to indicate missing words in a quotation still require keeping surrounding punctuation, including commas, because that punctuation is in the original quotation.

Place a hair space (U+200A) glyph before every ellipsis that is not directly preceded by punctuation, or that is directly preceded by an em-dash or a two- or three-em-dash. Place a regular space after every ellipsis that is not followed by punctuation. If the ellipsis is followed by punctuation, place a hair space between the ellipsis and the punctuation, unless the punctuation is a quotation mark, in which case don’t add a space at all.

“I’m so hungryhairsp… What were you saying about eatinghairsp…hairsp?”
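
The spacing rules above can be sketched in code roughly as follows. This is an illustration under stated assumptions, not the behavior of the typogrify tool: the function name is hypothetical, "punctuation" is taken to mean any character that is neither a letter nor a digit, and a sentence-ending period directly followed by an ellipsis is not handled. It also does not remove commas before suspension points, since that requires judging whether the ellipsis marks omitted words in a quotation.

```python
import re

HAIR_SPACE = "\u200a"
ELLIPSIS = "\u2026"
DASHES = "\u2014\u2e3a\u2e3b"  # em-dash, two-em-dash, three-em-dash
QUOTES = "“”‘’\"'"

def normalize_ellipses(text: str) -> str:
    """Convert period runs to the ellipsis glyph and apply the spacing rules."""
    # Replace "..." or ". . ." (or an existing glyph), together with any
    # spaces around it, by a bare ellipsis glyph.
    text = re.sub(r"[ \u200a\u00a0]*(?:\u2026|\.(?: ?\.){2})[ \u200a\u00a0]*", ELLIPSIS, text)

    out = []
    for index, char in enumerate(text):
        if char != ELLIPSIS:
            out.append(char)
            continue
        before = text[index - 1] if index > 0 else ""
        after = text[index + 1] if index + 1 < len(text) else ""
        # Hair space before, unless preceded by punctuation other than a dash.
        if before and (before.isalnum() or before in DASHES):
            out.append(HAIR_SPACE)
        out.append(ELLIPSIS)
        if not after or after.isspace() or after in QUOTES:
            continue  # end of text, line break, or quotation mark: no space
        # Regular space before a word, hair space before other punctuation.
        out.append(" " if after.isalnum() else HAIR_SPACE)
    return "".join(out)
```

Run on the example sentence written with three periods in place of each ellipsis, this produces the spacing shown above, with real U+200A characters where the example shows hairsp.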

Dashes

There are many kinds of dashes, and your run-of-the-mill hyphen is often not what you should use. In particular, don’t use the hyphen for things like date ranges, phone numbers, or negative numbers.

Do not put spaces around em-dashes. Remove spaces if in the original text.

Use em-dashes (— or U+2014) to offset parenthetical phrases. These are usually the most common kind of dash.

Use an em-dash for partially-obscured years.

It was the year 19— in the town of Metrolopis.

Use a regular hyphen if only the last digit of the year is obscured.

It was the year 186- in the town of Metrolopis.

Some older texts use two em-dashes to indicate an interruption in thought or speech. Our style is to replace two em-dashes used as an interruption marker with a single em-dash. However, don’t replace two em-dashes used to indicate the omission of a word (like an anonymous name or an expletive); see below.

Use a two-em-dash glyph (⸺ or U+2E3A) to signify a purposefully partially obscured word. The two-em-dash glyph isn’t available in some fonts, but include it anyway; our build process will convert it later.

Sally J⸺ walked through town.

Use a three-em-dash glyph (⸻ or U+2E3B) for completely obscured words.

It was night in the town of ⸻.

En-dashes (– or U+2013) are used to indicate a numerical or date range; when you can substitute the word “to,” for example between locations; or to indicate a connection in location between two places.

We talked 2–3 days ago.

We took the Berlin–Munich train yesterday.

I saw the torpedo-boat in the Ems⁠–⁠Jade Canal.

Figure dashes (‒ or U+2012) are used to indicate a dash in numbers that aren’t a range, like phone numbers.

His number is 555‒1234.

Minus dashes (− or U+2212) are used to indicate negative numbers and are used in mathematical equations instead of hyphens.

It was −5° out yesterday!
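
A hedged sketch of the negative-number rule: the hypothetical function below replaces a hyphen with the minus glyph only when the hyphen directly precedes a digit and does not follow a letter, digit, or another dash. Ranges such as 2-3 are left alone, because choosing between an en-dash and a figure dash needs human judgment.

```python
import re

MINUS = "\u2212"

def hyphen_to_minus(text: str) -> str:
    """Turn a leading hyphen on a number into the minus glyph (U+2212)."""
    # The lookbehind rejects hyphens that follow a word character, an
    # em-dash, or another hyphen, so "2-3" and "well-4" are untouched.
    return re.sub(r"(?<![\w\u2014-])-(?=\d)", MINUS, text)
```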

Many older texts use archaic spelling and hyphenate compound words that are no longer hyphenated today. Use the modernize-spelling script to automatically find and correct candidates. Note that this script isn’t perfect, and proofreading is required after using it to make sure it didn’t wrongly remove a hyphen!

Do not use the modernize-spelling script on poetry.

Latinisms

Don’t italicize Latinisms that can be found in a modern dictionary, like e.g., i.e., ad hoc, viz., ibid., etc., except sic, which should always be italicized. Some older works might italicize these kinds of Latinisms; remove the italics.

Do italicize whole passages of Latin language (as you would italicize any passages of foreign text in a work) and Latinisms that aren’t found in a modern dictionary.

Latinisms that are abbreviations should be set in lowercase with periods between words and no spaces between them, except:

BC, AD, BCE, and CE should be set without periods and in small caps and wrapped with the <abbr class="era"> tag.

Initials and Abbreviations

An acronym is a term made up of initials and pronounced as one word: NASA, SCUBA, TASER.

An initialism is a term made up of initials in which each initial is pronounced separately: ABC, HTML, CSS.

A contraction is an abbreviation of a longer word: Mr., Mrs., lbs.

In general, abbreviations ending in a lowercase letter should be set without spaces and followed by a period. Abbreviations without lowercase letters should be set without spaces and without a trailing period. Always use a no-break space after an abbreviation that describes the next word, like Mr., Mrs., Mt., St., etc. A few exceptions follow.

Initials of people’s names should be separated by periods and spaces. Wrap such initials in <abbr class="name">.

Result: H. P. Lovecraft

Code: <abbr class="name">H. P.</abbr> Lovecraft

Compass directions should be wrapped in <abbr class="compass">, with periods and spaces.

For acronyms, initialisms, postal codes, temperatures, and abbreviated US states, remove periods and spaces. All of these are set in caps, except for temperatures and acronyms, which are set in small caps. The source code should represent the abbreviations in caps, but wrapped in an <abbr> tag.

All acronyms are set in small caps. Because acronyms are capitalized in the source code, use the CSS style font-variant: all-small-caps; to properly set them in small caps.

Chemicals and compounds

Elements should be capitalized according to their listing in the periodic table.

Amounts of an element should be set in subscript using the <sub> tag.

Result: H2O

Code: H<sub>2</sub>O
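
As an illustration, the subscript markup can be generated for a bare formula with a short sketch like this one. The function name is hypothetical, and it assumes the input is a chemical formula on its own, not running prose.

```python
import re

def subscript_amounts(formula: str) -> str:
    """Wrap digits that follow an element symbol or ")" in <sub> tags."""
    # A leading coefficient such as the first 2 in "2H2O" has no letter
    # before it, so it is left as a regular digit.
    return re.sub(r"(?<=[A-Za-z)])(\d+)", r"<sub>\1</sub>", formula)
```

For example, subscript_amounts("H2O") returns H<sub>2</sub>O.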

Temperatures

Use the minus glyph (− or U+2212), not the hyphen glyph, to indicate negative numbers.

Using either the degree glyph (° or U+00B0) or the word “degrees” is acceptable, but if a work uses both methods, normalize the work to use the dominant method.

If listing temperature as a digit followed by “F.”, “C.”, or another abbreviation, remove the trailing period and precede the letter by a hair space (U+200A). Wrap the letter in <abbr class="temperature"> styled with abbr.temperature{ font-variant: all-small-caps; }

Epigraphs in chapter headers

The source of the epigraph is set in small caps, without a leading em-dash and without a trailing period.

Bridgeheads in chapter headers

Bridgeheads are centered in the header.

Always include a trailing period at the end of the bridgehead.

Times

Times in a.m. and p.m. format should have the letters a.m. and p.m. set in lowercase, with periods, and without spaces. “a.m.” and “p.m.” should be wrapped in an <abbr class="time"> tag. If “a.m.” or “p.m.” is the last word in a sentence, omit a second period, but add the “eoc” (end-of-clause) class to the <abbr> tag.

Separate times written in digits followed by a.m. or p.m. with a no-break space. If the time is written out in words, use a regular space.

Separate the hour and minute with a colon, not a period or comma.

Do not hyphenate times when spelled out, unless they appear before a noun.

He arrived at five thirty.

They took the twelve-thirty train.

Military time that is spelled out (for example, in dialog) is set with dashes. Leading zeros are spelled out as “oh”.

He arrived at oh-nine-hundred.

Result: He called at 6:40 a.m., but she wasn’t up till seven a.m.

Code: He called at 6:40nbsp<abbr class="time">a.m.</abbr>, but she wasn’t up till seven <abbr class="time eoc">a.m.</abbr>

Note how the last <abbr> contains the period for the entire sentence, and consequently also has the “eoc” (end-of-clause) class.
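
The time markup could be produced by a sketch along these lines. The function name is hypothetical. As a simplifying assumption, "a.m." or "p.m." counts as end-of-clause only when it closes the paragraph; a sentence that ends mid-paragraph would still need manual review.

```python
import re

NBSP = "\u00a0"

def mark_up_times(paragraph: str) -> str:
    """Wrap a.m./p.m. in <abbr class="time">, adding eoc at the paragraph end."""
    def replace(match: re.Match) -> str:
        before, meridiem = match.group(1), match.group(2)
        # Digits get a no-break space; spelled-out times keep a regular space.
        space = NBSP if before.isdigit() else " "
        # Assumption: only an abbreviation closing the paragraph is "eoc".
        at_end = paragraph[match.end():].strip() == ""
        css_class = "time eoc" if at_end else "time"
        return f'{before}{space}<abbr class="{css_class}">{meridiem}</abbr>'

    return re.sub(r"(\w)[ \u00a0]([ap]\.m\.)", replace, paragraph)
```

Applied to the example sentence above, this yields the code shown, with a real U+00A0 character where the example shows nbsp.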

Ampersands in names

Ampersands in names of things like firms should be separated by no-break spaces.

The firm of Hawkinsnbsp&amp;nbspHarker.

Ligatures

Ligatures are symbols which combine two or more characters into one.

Some older texts use ligatures like æ and œ to represent diphthongs. The modernize-spelling tool will replace many of these for you, but keep an eye out for other instances, particularly in Latin phrases and in classical names such as Œdipus. These should either be replaced with “ae” and “oe” or with alternative modern spellings of the word they are in (check Merriam-Webster for these).

It’s very unlikely that you will encounter stylistic ligatures such as ﬂ or ﬃ in the source text, but if you do they should be replaced by the individual characters they represent.

Numbers, measurements, and math

Roman numerals should not be followed by periods, unless the period is there for grammatical reasons. Some European texts include a trailing period after Roman numerals as a matter of course; remove them.

Fractions should be written using the Unicode glyphs (½, ¼, ¾, etc., or U+00BC–U+00BE and U+2150–U+2189), if a glyph exists for your fraction.

I need ¼ cup of sugar.

If a glyph for a fraction doesn’t exist, compose it using the fraction slash Unicode glyph (⁄ or U+2044) and superscript/subscript Unicode numbers. See this Wikipedia entry for more details.

Roughly ⁶⁄₁₀ of a mile.
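
A minimal sketch of the two fraction rules: use a precomposed glyph when one exists, otherwise compose the fraction from superscript digits, the fraction slash, and subscript digits. The function name is hypothetical, and the PRECOMPOSED table is a deliberately incomplete subset of the available glyphs.

```python
FRACTION_SLASH = "\u2044"
SUPERSCRIPTS = str.maketrans("0123456789", "⁰¹²³⁴⁵⁶⁷⁸⁹")
SUBSCRIPTS = str.maketrans("0123456789", "₀₁₂₃₄₅₆₇₈₉")

# Partial table of precomposed fraction glyphs; extend it as needed.
PRECOMPOSED = {
    (1, 2): "½", (1, 4): "¼", (3, 4): "¾",
    (1, 3): "⅓", (2, 3): "⅔", (1, 8): "⅛",
}

def fraction(numerator: int, denominator: int) -> str:
    """Return a Unicode fraction, preferring a precomposed glyph."""
    glyph = PRECOMPOSED.get((numerator, denominator))
    if glyph is not None:
        return glyph
    # No glyph in the table: superscript numerator, slash, subscript denominator.
    return (
        str(numerator).translate(SUPERSCRIPTS)
        + FRACTION_SLASH
        + str(denominator).translate(SUBSCRIPTS)
    )
```

Here fraction(1, 4) returns ¼ and fraction(6, 10) returns ⁶⁄₁₀, as in the examples above.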

Dimensions and equations should use the Unicode multiplication glyph (× or U+00D7) instead of the letters x or X.

The board was 4 × 3 × 7 feet.

Feet and inches in shorthand are set with the prime (′ or U+2032) or double prime (″ or U+2033) glyphs, not single or double quotes, and with a no-break space separating feet and inches.

He was 6′nbsp1″ in height.
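
The feet-and-inches rule can be sketched as below. The function name is hypothetical. The pattern assumes the shorthand is written with straight or curly quotation marks directly after the digits, and that both feet and inches are present.

```python
import re

PRIME, DOUBLE_PRIME, NBSP = "\u2032", "\u2033", "\u00a0"

def shorthand_height(text: str) -> str:
    """Replace quote marks in feet-and-inches shorthand with prime glyphs."""
    # Matches 6'1" or 6’ 1” and joins the two parts with a no-break space.
    pattern = r"(\d+)['’] ?(\d+)[\"”]"
    return re.sub(pattern, rf"\1{PRIME}{NBSP}\2{DOUBLE_PRIME}", text)
```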

Coordinates should be noted with the prime (′ or U+2032) or double prime (″ or U+2033) glyphs, not single or double quotes.

lat. 27° 0′ N., long. 20° 1′ W.

(Note that in the above example your font might render the two glyphs in the same way, but they’re different Unicode glyphs.)

Operators and operands in mathematical equations should be separated by a space.

Remember to use minus dashes (− or U+2212) instead of regular hyphens, both for negative numbers and for mathematical operations.

6 − 2 + 2 = 6

When forming a compound of a number + unit of measurement, and the measurement is abbreviated, separate the two with a no-break space, not a dash.

A 12nbspmm pistol.
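
As a rough sketch of the number-plus-unit rule, the hypothetical function below replaces a space or hyphen between a digit and an abbreviated unit with a no-break space. The UNITS list is an assumption made for illustration and is intentionally short.

```python
import re

NBSP = "\u00a0"
UNITS = ("mm", "cm", "km", "kg")  # hypothetical, incomplete list of units

def join_number_and_unit(text: str) -> str:
    """Put a no-break space between a number and an abbreviated unit."""
    units = "|".join(UNITS)
    # Only the separator is matched; the digit and the unit are lookarounds.
    return re.sub(rf"(?<=\d)[ -](?=(?:{units})\b)", NBSP, text)
```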

Footnotes and endnotes

All footnotes should be converted to a single endnotes file. For more information on the structure of that file, see our structure and semantics manual.

“Ibid.” is a Latinism commonly used in endnotes to indicate that the source for a quotation or reference is the same as the last-mentioned source.

In the case where the last-mentioned source is in the previous endnote, we must replace Ibid. by the full reference; since ebooks use popup endnotes, “Ibid.” becomes meaningless in that context.

In the case where the last-mentioned source is in the same endnote as Ibid., we can leave Ibid. untouched.

The endnote reference number goes after ending punctuation. If the endnote references an entire sentence in quotation marks, or the last word in a sentence in quotation marks, then the endnote reference number goes outside the quotation marks.

Within an endnote, a backlink to where the endnote occurred in the text must be the last item. It is preceded by exactly one space.

Legal cases and terms

Legal cases are set in italic. Either “versus” or “v.” is acceptable; if using “v.”, a period must follow the “v.”

He prosecuted Johnson v. Smith.
