John Barth is visiting Penn, and so I took the opportunity to catch up on his most recent meta-fictions, specifically The Development and Every Third Thought. I read the second one first, and will make no comment on it here, except to note that (while suitably Barthian) it lacked the feature that struck me so forcefully in reading the first one.

What struck me about The Development was the transformation of every single instance of the singular inanimate possessive pronoun "its" into (the orthographically regular but culturally deprecated form) "it's". Here are some examples from the first couple of pages, starting with the book's second sentence (emphasis added, here and throughout):

My projected history of our Oyster Cove community, and specifically the season of it's Peeping Tom, is barely past the note-gathering stage, …

… the whole structure, which anyhow some residents like for it's ornamental (or prestige-suggesting) value.

Since "its" is common, and Barth writes in a periodic and parenthetical style, it's common to find three or more examples of possessive "it's" in one of his sentences:

But then, it's only we who remember, for better or worse and as best some of us can, when the neighborhood was in it's prime: "built out," as they say, after it's raw early years of construction and new planting, it's trees and shrubbery and flower beds mature, the villas comfortably settled into their sites but not yet showing signs of "deferred maintenance" despite the Association's best efforts to keep things shipshape.

The book's last sentence boasts no fewer than five instances of its characteristic typo:

Nothing very momentous or consequential in the larger scheme of things: one small tree-leaf in the historical forest, it's particular spring-summer-and-fall no doubt to be lost in Father Time's vast, ongoing deciduosity. But just as, now and then, one such leaf may happen against all odds to be noticed, picked up, and at least for some while preserved—between the leaves of a book, say—and may with luck outlast it's picker-upper as the book may outlast it's author and even it's serial possessors, so may this verbal approximation of the residential development called Heron Bay Estates and of sundry of it's inhabitants survive, by some fluke, that now-gone place and it's fast-going former denizens—whether or not it and they in some fashion "rebegin," and even if this feeble re-imagining them of, like the afore-invoked leaf-pressed leaf, itself sits pressed and scarcely noted in Papa T's endless, ever-growing library—

Or, more likely, his recycling bin.

The experience of reading this work left me with the familiar problem of attributional abduction: trying to find the most likely explanation for some puzzling aspect of a published work. Could Barth have done this on purpose, perhaps to make a point about the arbitrariness of writing's cultural conventions? Did he mean to insert one apostrophe into a mis-typed "its" and mistakenly change every similar string in the whole manuscript? Or was the error (if it was an error) introduced by a copy editor or someone further along the chain of production?

None of the 12 people who reviewed The Development on amazon.com remarked on this striking feature of (my copy of?) the book. So I checked the print edition, and found its "its" apparently intact, from the beginning to the end.

One of the amazon reviewers did note that

I read this on a Kindle. I do not know if the paper versions of the book had the same editing problems, but there were so many places where "off" was spelled "of." Considering Barth's interesting and literary writing style, it is possible, but I do not think so, that he intended "of" to be used where "off" is the only word that would make sense. There were other errors as well, but do not recall those.

I also read the Kindle edition, but my copy apparently lacked these transmutations of spatial displacement into prepositional possession. Or, at least, I didn't register them in reading it. So I checked again, and found just two examples:

I was tempted to explain and laugh it of, but held my tongue lest anyone get the wrong idea.

… the old ex-professor would most likely have kicked it of with one of those lefty-liberal rants that he used to lay on his Heron Bay friends and neighbors at the drop of any hat.

Elsewhere in (my copy of?) the book, however, there are nearly a hundred instances of off rendered conventionally as "off". Could that Kindle-reading amazon reviewer have caught a couple of missing f's, but missed hundreds of added apostrophes?

Or could different digital copies of The Development have been mischievously seeded with different typographical transductions?

The fact that I can semi-seriously entertain such questions, I think, reflects the temporary influence of (and thus is a kind of commentary on) John Barth's approach to creative writing. Or, as he sometimes puts it, "creative rotting".

40 Comments

Rod Johnson said,

In terms of: "the singular inanimate possessive pronoun "its" into (the orthographically regular but culturally deprecated form) "it's"" – well yes, regular if you compare "its" with "the dog's" but irregular if you compare the "it" form with other possessive pronouns: "hers, his, ours, yours, theirs". I guess the problem arises with "its" being used for the 'adjectival' usage too, where most of the other forms lack the 's' ("her, our, your, their", but still "his").

Clearly the answer is to remove apostrophes from all possessives!

Al Filreis said,

You would guess that these days "e-books" are made directly from good text files – the same files used to design the printed book. But is it possible in this case that at least some e-versions went through some kind of text-recognition process or that some sort of auto-correction was (with good intentions) brought to bear on it? If that's the case, we should consider that Barth's brilliant but unusual (sometimes archaic, sometimes neological) sentence structuring and grammatical patterning blew a few fuses during the conversion.

[(myl) I don't think it's very likely that OCR was involved — as you say, there would have been a digital master copy in Quark or whatever available to the publisher; and in any case, the pattern of errors is not at all typical of OCR problems.

Nor does it seem likely that this was the result of some allegedly "intelligent" spelling-corrector being confused by Barth's style — that sort of thing would not have created the observed outcome, which is that every single "its" was changed to "it's", with (as far as I can tell) no changes in the opposite direction.

The most obvious explanation is that someone involved in the e-book production process intervened to correct one "its" to "it's", and somehow turned this into a global search-and-replace, and then either didn't notice what had happened, or decided in a state of panic to ignore it.]

Craig Horman said,

It is also possible that this is a manifestation of Amazon's US Patent 7,610,382, "System and method for marking content", which is apparently intended to identify the origin of (presumably de-DRMed) content:

"Method and apparatus for programmatically substituting synonyms into distributed text content. A synonym substitution mechanism may programmatically replace selected words in textual data with synonyms for the selected words. The modification to an excerpt performed by the synonym substitution mechanism may not significantly alter the meaning of the excerpt to a human reader. By replacing one or more selected words in an excerpt with synonyms for the words, illicit copies of the excerpt may be recognized by comparing a copy of the excerpt to the original. Particular permutations of synonym substitutions may be provided in excerpts to particular requestors. The particular permutations may be recorded and used to determine a requestor as the source of a copy of the excerpt. Synonym substitution may make programmatic excerpt chaining difficult by substituting different synonyms for the same word(s) in an overlapping portion of two adjacent excerpts."

[(myl) This is an interesting patent, which I didn't know about (though I believe that the method has been in use for a century or so, as a way to determine the source of leaked secret documents). But it's not plausible that the list of "synonyms" to be substituted would include "it's" for "its"; nor that the substitution method would be global-search-and-replace.]

Once the global search-and-replace was done (and those are perilously easy to do) then – unless the copy editor had suffered such a painful lesson in the past and saved the manuscript after every edit – it would have been very difficult and time-consuming to fix. So even if they noticed what they'd done, they might have decided to leave it alone (maybe people would blame Barth, after all).

Missy said,

Ugh. Bad enough that this post pummels our eyeballs with barely tolerable typos; but then you evoke an imaginary state of panic to explain them. I'm going to have to swear off Language Log for a few days while I recover…

I'm not an early adopter, and I haven't yet got a Kindle (or any dedicated e-reader device). And I care about textual accuracy. So this story has had a major effect on me: I have just been pushed a long way back from the point of making a positive decision. I had been a bit worried about how the Kindle performs with books that have odd symbols, diagrams, and mathematical formulae in them anyway. But I really don't want to spend a hundred dollars equipping myself with a device that will feed me randomly garbled and misspelled books! Find out how and where this error occurred, Amazon, because otherwise you are going to lose a customer who was thinking of maybe purchasing one Kindle and several dozen books.

Craig Horman said,

myl: Not sure why you consider it implausible that a naive implementation of the patent could consider "it's" a synonym for "its" and do so globally. An overly-simple implementation by a non-expert (any computational linguists at Amazon are probably working on sentiment analysis) would not be a surprise.

McLemore said,

@Geoff: I mostly read ebooks on the iPad now, and, sadly, have to report that in the interest of not struggling against a massive cultural shift I'm slowly lowering my expectations for edit quality in that realm. Even more shocking, a steady stream of typos in the NYT online has started cropping up. Please, go to battle.

Steve Rafferty said,

As I finished _Every Third Thought_ recently, I noticed on the last page an (apparent) typo: alledgedly. I had the same thoughts with respect to Barth's playfulness and invention, but I've been unable to come up with an explanation for what deliberate purpose it may have been included. However, it seemed unlikely to me that the copy editor could have missed such a typo. It practically screamed at me when I came across it.

[(myl) I missed that one, which is not surprising since I'm one of the world's worst proofreaders. But it's a slightly different case, since the same mistake is found in the (online "Look Inside!" version of) the print edition:

In contrast, as noted in the post we're discussing, the "its" -> "it's" transduction is in the Kindle edition, but NOT in the (online "Look Inside!" version of) the print edition.]

Dan Lufkin said,

I was flummoxed when I read on my Kindle William Miller's excellent book Losing It, a meditation on getting old and the Icelandic sagas (the concept works out better than you would think), to see that Brennu-Njál was rendered as "Njdl" and Hávamál as "Hdvamdl". Not only that, every ð became a "5".

I reported this outrage to the author, who checked and told me that (as I'd expected) the fault lay with Amazon's OCR processing. I suggested challenging the editor to a hólmganga. Alas, we have no word in English for an axe fight with both contestants standing on an islet in a stream. I haven't heard yet how that came out.

Ellen K. said,

@Martin: its is also irregular in relation to the other pronouns, in speech. It's the only possessive pronoun which, in speech, forms by the same rules as other possessive nouns. So there's a logic to it doing so in writing too. It just happens that English language spelling uses a different logic.

Rebecca said,

I've noticed editing errors in ebooks I've bought through Apple's iBooks store, so, out of curiosity, I downloaded their free sample of The Development. Without checking the whole thing, it seems to have all the same unconventional "it's"s.

Is the original publisher providing faulty text to all the ebook vendors?

David said,

Not sure why you consider it implausible that a naive implementation of the patent could consider "it's" a synonym for "its" and do so globally.

The usual purpose for this sort of steganography is to mark each copy of the document uniquely (as Amazon does with many of its MP3s, hiding your account ID in the header), so that a leaked copy can be traced back to its source; global "it's" would hardly be useful for that.

[(myl) The idea of the patent is that for each of some set of effective synonyms X and Y, the publisher might or might not substitute some instances of items from X with the corresponding items from Y, and perhaps vice versa as well.

If X is (say) the word "purchased" and Y is the word "bought", then if there are (say) 20 instances of these words in the document, then a random 20-bit pattern represents 20 choices about whether (1) or not (0) to swap the words in each case. Since there are 2^20 = 1,048,576 possible 20-bit patterns, the chance of arriving at such a set of choices by chance would be small, and this would accurately tag a particular copy of the text as coming from a source with knowledge of the key. With a larger number of quasi synonyms, smaller bodies of text could be effectively tagged.

A general problem with approaches of this kind is that there are not very many (if any) pairs that are genuinely swappable under all circumstances. For example, "bought and paid for" is a sort of idiom, for which "purchased and paid for" might often sound odd. But I suppose you might try to make a cleverer selection of swappable environments.

However, I don't believe that Amazon is now using this method. If they were using it, I can't imagine that they would do so in distributing an e-book without informing the publisher and the writer in question, in which case I can't imagine that a belletrist like Barth would agree to having any of his word-choices overridden by such a method. If they were, counterfactually, using such a method, it would swap some Xs for Ys and (perhaps) vice versa, not all Xs for Ys. And in any case, they'd have to be complete idiots to think that "its" and "it's" are such a swappable pair in the first place.

So, in sum, the chances are nil that this is the source of the observed pattern.]
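The bit-pattern idea described in the reply above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the word pair comes from the "purchased"/"bought" example, while the function names `mark` and `recover` and the sample sentence are made up here and are not taken from the patent or from any Amazon system.

```python
import re

# Hypothetical swappable pair, from the "purchased"/"bought" example above.
SWAP = {"purchased": "bought", "bought": "purchased"}
PATTERN = re.compile(r"\b(?:purchased|bought)\b")


def mark(text, bits):
    """Swap the i-th occurrence of either word when bits[i] is 1."""
    remaining = iter(bits)

    def choose(match):
        word = match.group(0)
        # Occurrences beyond the end of the bit list are left unchanged.
        return SWAP[word] if next(remaining, 0) else word

    return PATTERN.sub(choose, text)


def recover(original, copy):
    """Compare a copy against the original to read the bit pattern back."""
    pairs = zip(PATTERN.findall(original), PATTERN.findall(copy))
    return [int(a != b) for a, b in pairs]


sample = "She purchased a villa, bought a boat, and purchased a Kindle."
marked = mark(sample, [1, 0, 1])
print(marked)                   # She bought a villa, bought a boat, and bought a Kindle.
print(recover(sample, marked))  # [1, 0, 1]
print(2 ** 20)                  # 1048576 distinct 20-bit patterns
```

Note that each occurrence is decided independently, which is exactly why a scheme of this kind would not be expected to turn every "its" into "it's".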

Andy Averill said,

The Amazon patent mentioned above also allows for misspellings instead of synonyms:

"While embodiments of the synonym substitution mechanism are generally described as substituting synonyms for selected words in copies of excerpts from textual works, it is noted that other embodiments of a substitution mechanism may substitute content using other substitution criteria into copies of excerpts from textual works. For example, instead of or in addition to synonyms, one embodiment may substitute icons, images, symbols, foreign language equivalents, alternative spellings and/or misspellings for selected words."

So maybe it's not so far-fetched to think that this booboo faux pas resulted from a single instance of intentional misspelling that went into sorcerer's apprentice mode. What could be a better choice for such a scheme than a word that a lot of people get wrong anyway?

alex said,

If this is a generic failure of proof-reading processes in producing e-books, it would not be the first time that a new printing technology took some time to establish its own error-detection and error-correction mechanisms. When printing was first invented in Europe, the new printed books were far less accurate than the hand-written manuscripts they replaced for about the first century or so. Hand-written manuscripts used proven and effective error-detection and correction processes while printed books, at least initially, did not. See:

Andy Averill said,

PS Incidentally, it seems like there are a lot of other ways to insert unique identifiers into text without being so conspicuous. You could for example substitute a non-breaking space for the normal space character after the first word of a randomly selected paragraph. This would be invisible to the reader but could be quickly found by a text search.

[(myl) There are also multiple options for dashes, quotation marks, and so on, depending on character set and font options. But those, like spacing alternatives, would be relatively easy to defeat.]
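A rough Python sketch of the non-breaking-space idea, and of how easily it is defeated, might look like this. The function names and the sample paragraphs are illustrative assumptions only; no real e-book pipeline is being described.

```python
NBSP = "\u00a0"  # non-breaking space


def mark_paragraph(paragraphs, index):
    """Return a copy in which the first space of one paragraph is a NBSP."""
    marked = list(paragraphs)
    marked[index] = marked[index].replace(" ", NBSP, 1)
    return marked


def find_marks(paragraphs):
    """List the indexes of paragraphs that contain a NBSP."""
    return [i for i, p in enumerate(paragraphs) if NBSP in p]


def strip_marks(paragraphs):
    """Defeat the scheme by turning every NBSP back into a plain space."""
    return [p.replace(NBSP, " ") for p in paragraphs]


paragraphs = [
    "Nothing very momentous or consequential.",
    "Or, more likely, his recycling bin.",
]
marked = mark_paragraph(paragraphs, 1)
print(find_marks(marked))                 # [1]
print(strip_marks(marked) == paragraphs)  # True
```

The last line is the point of the reply above: a single character substitution erases the mark.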

GeorgeW said,

Are there many contexts in which ambiguity or confusion would result from it's ~ its? I don't think any of the contexts noted in the OP result in ambiguity. And, of course, we have no difficulty in speech.

[(myl) In this case, the problem isn't confusing the readers so much as annoying them.]

Nick Lamb said,

The Ridger, this is why anyone editing text should use tools meant for the job. A real text editor has infinite branchable undo, because storage (for text) is cheap. I can switch right now to a text editor that's been open all day, type

:earlier 2h

and see exactly the state it was in 2 hours ago, and if I realise that's too far I can type e.g.

:later 5

to see the state after 5 further changes, or I can move one step at a time back and forth, saving the results at any moment as I travel through time.

Lossless infinite undo is available to the people who edit movies, hours of sound and video which take up terabytes of storage; it's embarrassing not to use this capability for a mere novel.

Text is _easy_. If they can't even take this much care over your text, what are the chances the publisher's financial affairs are in order? Authors who see their novel mangled should assume that their royalty statements may be just as error-filled and probably not in their favour.

Mark Mandel said,

Clicking on "attributional abduction" led me to… the Google search home page, where I plugged in the phrase and found that the top three hits were, unsurprisingly, LL. But was that meant to be a more direct link, e.g., to an LL category that hasn't been established, or an LL search of the archives, or the Google search?

In writing that post, I found that the excerpts quoted at Sony's e-bookstore and Barnes & Noble do *not* include the tell-tale typo. It would be nice if someone dropping by here could confirm or deny.

Charles Stross says (link above in CLP's comment) that Amazon has its own notoriously buggy ebook-ification process; this suggests to me that Sony and B&N could make their more accurate processes a big selling point. I would be interested to see e.g. Consumer Reports study e-book accuracy rates, especially since publishers don't appear to give a damn about quality assurance on their own.

[(myl) I downloaded the "free sample" of the Barnes & Noble Nook edition, and it has exactly the same problem as the Kindle edition:

So I strongly suspect that this is the publisher's fault, or the fault of some contractor they hired to produce the e-book versions, and not amazon's fault.]

Thanks for doing that research! I'm sure we'd all be very interested to hear what Mr. Barth thinks about the problem, should you bring it up when you see him at Penn.

The errors could have arisen in-house at Houghton Mifflin, but it's more likely, as you say, that they outsourced the work of making the e-file to a contractor. It's also possible, though, that HM gave each e-book company the same, accurate file, but that Amazon, B&N, Sony, etc., all go through a single sub-contractor to produce their different formats of e-book.

I'm going to check the Sony version now, I'll report back in a bit.

Jeremy Weatherford said,

Automated conversion tools seem to be the most likely suspect. My hypothesis:
– Barth's source document used "smart quotes" or some other odd type of Unicode character besides the standard apostrophe expected by Amazon. Result: all apostrophes elided, then ham-handedly reinserted by a proof"reader"'s global search and replace
– Barth's source document used a special ligature f symbol for the double-f in off, which was ignored by Amazon. Result: off becomes of throughout. Are there any other double-f words that suffered the same fate? Perhaps Amazon already found and fixed this one, since you didn't report seeing that particular typo in your edition, or maybe it's a device-specific error. Who knows… too many computers, not enough eyeballs!
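To make the two hypotheses above a little more concrete, here is a small Python sketch contrasting a lossy conversion with a gentler one. Everything in it is an assumption for illustration: the sample string is invented, and nothing is known about the tools actually used to produce the e-book.

```python
import unicodedata

# Invented sample containing an "ff" ligature (U+FB00)
# and a curly apostrophe (U+2019).
source = "o\ufb00 it\u2019s prime"

# Lossy conversion: silently drop every character that is not plain ASCII.
dropped = source.encode("ascii", "ignore").decode("ascii")

# Gentler conversion: expand compatibility characters such as ligatures,
# then map the curly apostrophe to the ASCII one.
normalized = unicodedata.normalize("NFKC", source).replace("\u2019", "'")

print(dropped)     # o its prime
print(normalized)  # off it's prime
```

In this sketch, dropping the curly apostrophe turns "it's" into "its", which fits the first hypothesis. Dropping a single "ff" ligature leaves "o" rather than "of", so the second hypothesis would seem to need a converter that maps the ligature to one "f" instead of deleting it.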

I see similar failures of formatting and sanity on Safari Books Online all the time, who seem to have even lower standards than Amazon if such a thing is possible.

I can't find a Sony e-reader preview page, but the same typos are there in Google e-books. Full-text search shows only one instance of "its" in the book, in the phrase "a past-its-prime computer" — which is indeed what we'd expect to see, if this was all caused by injudicious global search-and-replace "whole word only".

Searching the text for "it's" via Google Books gives a striking and harrowing set of results. Read 'em and weep.

[(myl) Returning to a Barthian inquiry into the nature of authorship, if any, we should consider the possibility that it was Mr. Barth himself who issued the fateful global search-and-replace command, with the results corrected in copy-editing on the way to the print editions, but not on the path to the e-book versions.]
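The surviving "past-its-prime" is consistent with a replace that treats a hyphenated compound as one word. A minimal Python sketch of that behaviour follows; which tool was used, and how it defined "whole word", is unknown, so the hyphen-aware pattern here is an assumption chosen only to reproduce the observation.

```python
import re

text = "The neighborhood was in its prime; a past-its-prime computer; it's late."

# Plain regex word boundaries treat the hyphen as a separator,
# so the compound is changed as well.
naive = re.sub(r"\bits\b", "it's", text)

# Assumed "whole word only" behaviour: hyphens and apostrophes count
# as part of a word, so "past-its-prime" is left alone.
WHOLE_WORD_ITS = re.compile(r"(?<![\w'-])its(?![\w'-])")
damaged = WHOLE_WORD_ITS.sub("it's", text)

print(naive)    # ... in it's prime; a past-it's-prime computer; it's late.
print(damaged)  # ... in it's prime; a past-its-prime computer; it's late.
```

Either way, a single unreviewed command is enough to change every free-standing "its" in a manuscript.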

D.O. said,

@Nick Lamb. Imagine that somebody made a mistake like the one discussed here, then, after many hours of work on the text, realized that it should be undone, but didn't wish to undo all the subsequent work as well. This would be harder to analyze, and some unexpected interference between edits might also happen. Of course, this is no excuse for leaving the text in the ridiculous form that Amazon did.

Seth Porter said,

The smart quotes hypothesis seems most likely to me in this case, possibly with someone manually 'fixing' them (maybe they forgot about "it's" when they did a set of global search and replaces, followed by the removal of all the broken smart quotes). In general, though, I have seen terrible OCR errors in Kindle texts – things like 'dam' for 'darn', which look alike after kerning to a man on a fast horse…

KevinM said,

Rod Johnson said,

I once went through a book manuscript and replaced all double spaces after periods with single spaces, in accordance with common typographic practice. Then the night before it went to press, a colleague, unforgivably, decided I had been wrong to do that and so did a global search and replace that inserted a space after every period, including decimal points. When she triumphantly announced she had "fixed" my "mistake" it was too late to undo. I still get a murderous little surge of frustration every time I think about that.

Maureen said,

Kindle doesn't like "smart quotes" and "smart apostrophes", and makes them vanish. So it's very likely that, while trying to replace smart quotes and apostrophes with the vanilla ASCII version, the Kindle version editor created the problem you mention.

But there's no excuse. Kindle-version editors easily can do a temporary conversion, upload it to various Kindle devices, and then page through the book looking for typos and other problems. (That's how I found out about the vanishing smart quotes, when experimenting for myself.) You'd have to be a pretty lazy editor not to do it.

David Eddyshaw said,

I've wondered about whether the transcription of ebooks involved some sort of OCR too, especially after reading Amazon's version of "The Worm Ouroboros" (OK, don't sneer, we all have our guilty pleasures) in which "hath" (exceedingly frequent in Eddison's hifalutin style) gets replaced more often than not by "bath", which rarely seems to make sense in context … though it might make for an interesting derivative work …

There were quite a few in/m type confusions too IIRC.

Morten Jonsson said,

Presumably Amazon does use OCR for older books like The Worm Ouroboros. It's why I prefer to go to Project Gutenberg, where the texts have been typed by the actual fingers of actual people. You'll see typos in their books, of course, but not systematic machine errors.

But for current books, Amazon and the other ebook publishers should generally be using what the publishers give them. As Doctor Science and Professor Liberman have established (see comments above), the errors in Barth's book are almost certainly the publisher's fault, not Amazon's. Publishers are confused; the transition to ebooks has been traumatic, and there's no general agreement yet on how best to manage it. The programs for reading ebook files aren't consistent in what characters they can recognize, so a book that looks fine on your Nook might look terrible on your laptop, or vice versa. It's a strange and difficult period. We're past the time when it was new and exciting just having an e-reader, no matter how awkward it was to use. Something like the way it was exciting to go for a jaunt on your safety bicycle, knowing you'd have to patch the tires every few miles. Now we just expect the damn things to do what they're supposed to, and they don't. Not quite, not yet.
