Self-enumerating pangrams

Pangrams

A pangram is a phrase or sentence that contains every letter of the alphabet at least once.

Which alphabet, you ask? Good question! In the following, we shall basically restrict ourselves to the 26 letters of the English alphabet. But you will also see many examples in Dutch, not only because that language has a rich history in recreational linguistics, but also because it plays a major role in this story.

Dutch has the special ligature `ij', which can sometimes be considered a separate letter in its own right. (Such as in the three-letter word `wijf', but not in the eight-letter word `minijurk'. You see, Dutch is recreational by itself. But we digress already.) However, in most of the Dutch examples we adhere to the English alphabet and count the `ij' as two letters, unless explicitly stated otherwise.

For the sake of completeness, it must also be noted that in general we don't differentiate between capital letters and lowercase letters. The `A' and the `a' are then two instances of the same letter.

The traditional pangram example is the well-known typewriter test phrase

The quick brown fox jumps over the lazy dog
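The definition is easy to check mechanically. The following is a minimal sketch for the 26-letter English alphabet, ignoring case as described above; the function name `is_pangram` is only an illustration.

```python
import string

def is_pangram(text, alphabet=string.ascii_lowercase):
    """Return True if every letter of the alphabet occurs at least once.

    Case is ignored: 'A' and 'a' count as the same letter.
    """
    return set(alphabet) <= set(text.lower())

print(is_pangram("The quick brown fox jumps over the lazy dog"))  # True
print(is_pangram("Hello, world"))                                 # False
```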

One of the pangram word games is to construct the shortest phrase that is still a pangram. This often leads to amusing yet rather bizarre solutions, but also to some really fine examples, such as

The five boxing wizards jump quickly

and in Dutch (using the `xyz' alphabet, the `xijz' alphabet, and the `xyijz' alphabet respectively)

Sexy qua lijf doch bang voor 't zwempak

Och, zwak vormpje blijft exquis ding

Doch 'n exquis gympje zwerft vlakbij

but here we are not really interested in those ones, and we will not elaborate on them any further.

Self-referencing sentences

A self-referencing sentence makes a true assertion about the sentence itself.

The following examples illustrate what is meant by this.

This sentence has five words

Deze zin heeft vijf woorden

Here we have twenty-eight letters

Hier zijn achtentwintig letters

Note that the English and the Dutch sentence of the first example are perfect word-by-word translations of each other. However, it is probably even more remarkable that in the second example we are able to make a valid (i.e. self-referencing) translation at all.
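Both claims can be checked by counting. A small sketch, assuming that words are separated by whitespace and that only alphabetic characters count as letters (so the hyphen in `twenty-eight' is ignored); the helper names are illustrative.

```python
def word_count(sentence):
    """Number of whitespace-separated words."""
    return len(sentence.split())

def letter_count(sentence):
    """Number of alphabetic characters; spaces and hyphens are skipped."""
    return sum(ch.isalpha() for ch in sentence)

print(word_count("This sentence has five words"))        # 5
print(word_count("Deze zin heeft vijf woorden"))         # 5
print(letter_count("Here we have twenty-eight letters")) # 28
print(letter_count("Hier zijn achtentwintig letters"))   # 28
```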

Things become a bit complicated when we consider the (supposedly self-referencing) sentence

This English sentence is difficult to translate into Dutch.

The straightforward translation of this sentence would be

Deze Engelse zin is moeilijk te vertalen in het Nederlands.

Does this Dutch sentence refer to itself? That cannot be true, because it would then incorrectly claim to be an English sentence. So it must refer to the above English sentence. But that would make that sentence self-contradicting, because it was actually very easy to make the translation. We are stuck. Therefore a better candidate for a Dutch translation of the English sentence would be

Deze Nederlandse zin is moeilijk te vertalen in het Engels.

It is probably more appropriate to call this a `transcription', `counterpart', or `equivalent' than a `translation'. We will encounter this sort of difficulty again later, when dealing with translations of pangrams.

A wealth of information is available on this topic, including many paradoxical assertions such as the famous Epimenides paradox. Again, we are not interested in them in general, but only in special cases about letters and words.

The beginning

In January 1982, a year after taking over Martin Gardner's column "Mathematical Games" in Scientific American, Douglas Hofstadter published in his column "Metamagical Themas" (note the anagram) solutions to the "problem of the self-answering question" posed in his maiden column. Among them was the following elaborate self-referencing sentence (from Howard Bergerson):

In this sentence the word AND occurs twice, the word EIGHT occurs twice, the word FOUR occurs twice, the word FOURTEEN occurs four times, the word IN occurs twice, the word OCCURS occurs fourteen times, the word SENTENCE occurs twice, the word SEVEN occurs twice, the word THE occurs fourteen times, the word THIS occurs twice, the word TIMES occurs seven times, the word TWICE occurs eight times, and the word WORD occurs fourteen times.

(A little side note: whereas in the English version the capital words are listed in alphabetical order, the capital words in the Dutch version are listed in the order in which they appear in the whole sentence. In a sense the sentence repeats itself in the capital words, at least partly.)

The big question is: can something similar be accomplished by counting the individual letters in a sentence rather than the words?

Self-enumerating sentences

A self-enumerating sentence is a self-referencing sentence whose text consists solely of the enumeration of its letter content. It is also called an autogram.

The answer to the question whether such a sentence exists was given by Lee Sallows, and was published by Hofstadter in the same January 1982 column in Scientific American mentioned above. Here it is:

Unfortunately, the truth which this sentence expresses turned out to be different from what he expected. Lee Sallows noted three (hmm, I see only two!) miscalculations in the counts, making the sentence no longer self-enumerating. So, ironically, it is indeed true that "only a fool believes that I took the trouble to verify this sentence ...".

This sentence just enumerates its letter content and no longer the punctuation marks, mentions every letter of the alphabet, and consists of a minimum of words. The first clean example of a real self-enumerating pangram!

The article ended with the devilish remark that Lee Sallows would without doubt have no trouble finding a divine English translation.

Lee Sallows set off to write a computer program (in Lisp) to search for possible solutions to this `standard' problem, but soon realized that the number of combinations to be investigated was overwhelming and that the search would take his program prohibitively long. He then took a radically different approach. Being an electronics engineer, he constructed a `pangram machine': special-purpose hardware with circuitry designed to solve just this one problem.

The pangram machine was unable to find a solution starting with the phrase ``This pangram contains ...'', the proper translation of the Dutch ``Dit pangram bevat ...''. Other verbs were substituted for `contains', and after many unsuccessful attempts the machine eventually produced a Eureka!

This pangram lists four a's, one b, one c, two d's, twenty-nine e's, eight f's, three g's, five h's, eleven i's, one j, one k, three l's, two m's, twenty-two n's, fifteen o's, two p's, one q, seven r's, twenty-six s's, nineteen t's, four u's, five v's, nine w's, two x's, four y's, and one z.

Still not completely satisfied, Lee Sallows got the inspiration to replace the word `and' at the very end of the standard sentence with `&', and on 22 November 1983 the pangram machine finally delivered the answer

This pangram contains four a's, one b, two c's, one d, thirty e's, six f's, five g's, seven h's, eleven i's, one j, one k, two l's, two m's, eighteen n's, fifteen o's, two p's, one q, five r's, twenty-seven s's, eighteen t's, two u's, seven v's, eight w's, two x's, three y's, & one z.

The pangram machine has discovered many other solutions since then, with different starting phrases. Some interesting ones will be reproduced below. In October 1984 Ed Miller proved that there are just two solutions with the standard phrase ``This pangram contains ...'', and disclosed the second one:

This pangram contains four a's, one b, two c's, one d, twenty-six e's, six f's, three g's, six h's, eleven i's, one j, one k, two l's, two m's, seventeen n's, fifteen o's, two p's, one q, eight r's, thirty s's, seventeen t's, four u's, four v's, six w's, six x's, three y's, & one z.
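Whether a given sentence really is self-enumerating can be verified by comparing the counts it claims with the counts it actually has. The following is a minimal sketch, not the method used by any of the people mentioned here. It assumes that every claim has the form `number-word letter' (optionally followed by 's), that the counts stay below one hundred, and that the starting phrase contains no such pattern by accident. All names in it are illustrative.

```python
import re
from collections import Counter

ONES = ["one", "two", "three", "four", "five", "six", "seven", "eight",
        "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
        "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["twenty", "thirty", "forty", "fifty", "sixty", "seventy", "eighty",
        "ninety"]

# Map every number word from "one" to "ninety-nine" to its value.
NUMBER = {word: value for value, word in enumerate(ONES, start=1)}
for t, tens in enumerate(TENS, start=2):
    NUMBER[tens] = 10 * t
    for u in range(1, 10):
        NUMBER[f"{tens}-{ONES[u - 1]}"] = 10 * t + u

# A claim looks like "four a's" or "one b": a word, a space, a single letter.
CLAIM = re.compile(r"\b([a-z]+(?:-[a-z]+)?) ([a-z])(?:'s)?(?![a-z])")

def claimed_counts(text):
    """Letter counts that the sentence asserts about itself."""
    return {letter: NUMBER[word]
            for word, letter in CLAIM.findall(text.lower())
            if word in NUMBER}

def actual_counts(text):
    """Letter counts found in the sentence (case-insensitive, a-z only)."""
    return dict(Counter(ch for ch in text.lower() if "a" <= ch <= "z"))

def is_self_enumerating(text):
    return claimed_counts(text) == actual_counts(text)

SALLOWS = ("This pangram contains four a's, one b, two c's, one d, "
           "thirty e's, six f's, five g's, seven h's, eleven i's, one j, "
           "one k, two l's, two m's, eighteen n's, fifteen o's, two p's, "
           "one q, five r's, twenty-seven s's, eighteen t's, two u's, "
           "seven v's, eight w's, two x's, three y's, & one z.")

print(is_self_enumerating(SALLOWS))                              # True
print(is_self_enumerating(SALLOWS.replace("one b", "two b's")))  # False
```

Because the comparison demands that the two dictionaries are equal, a letter that occurs in the text but is not enumerated makes the check fail, as does a claim for a letter that does not occur.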

The challenge of Lee Sallows

The results of the above exercises, including details about the pangram machine, were reported in October 1984 by A. K. Dewdney in his column "Computer Recreations" in Scientific American.

It was also in this article that Lee Sallows made the following wager: "I bet 10 guilders nobody can come up with a self-enumerating solution (or proof of its nonexistence) to the sentence beginning ``This computer-generated pangram contains ... and ...'' within the next 10 years."

How mistaken he was.

This computer-generated pangram contains six a's, one b, three c's, three d's, thirty-seven e's, six f's, three g's, nine h's, twelve i's, one j, one k, two l's, three m's, twenty-two n's, thirteen o's, three p's, one q, fourteen r's, twenty-nine s's, twenty-four t's, five u's, six v's, seven w's, four x's, five y's, and one z.

As early as January 1985, A. K. Dewdney reported in his column "Computer Recreations" in Scientific American that this solution had been found by four different competitors, independently of each other, and that it was indeed computer-generated in all four cases.

The four programmers were (in submission order) John Letaw, Lawrence Tesler, Ed Miller, and William Lipp. Later Michael Gayle and James Mittan submitted a fifth program, while Hans Buchwald and Robert Wolfson contributed proposed algorithms. Dewdney was surprised that all five solutions were exactly identical, and astonished by the remarkable differences in computer time used by the various programs. But there is an explanation.

Five overlooked pangrams

Only the program of Edward S. Miller employed an Exhaustive Search Method, with ingeniously designed cutoffs and search ranges to keep the execution time within reasonable bounds.

He could therefore prove that there is just one unique solution to the `10-guilder' problem, and also that there is no solution to the `standard' problem with the ``... contains ... and ...'' phrase, and exactly two solutions to the `standard' problem with the ``... contains ... & ...'' modification.

In February 1985 Ed Miller came to the conclusion that not only did the other pangram programs employ search ranges and boundary conditions that were too narrow (so that potential solutions could be skipped too early), but that even the pangram machine suffered from this. He examined seven variants of the `standard' problem (with the verbs "contains", "comprises", "consists of", "is composed of", "uses", "employs", "has") for which the pangram machine had found no solution at all. Wrongly, because the exhaustive search method revealed these overlooked pangrams:

This pangram uses four a's, one b, one c, two d's, thirty e's, five f's, three g's, seven h's, eleven i's, one j, one k, three l's, two m's, sixteen n's, twelve o's, two p's, one q, eight r's, twenty-nine s's, sixteen t's, four u's, six v's, six w's, five x's, three y's, and one z.

This pangram uses four a's, one b, one c, two d's, twenty-six e's, seven f's, three g's, six h's, eleven i's, one j, one k, two l's, two m's, fifteen n's, fourteen o's, two p's, one q, eight r's, thirty s's, sixteen t's, five u's, four v's, six w's, six x's, three y's, and one z.

This pangram employs four a's, one b, one c, two d's, thirty-six e's, three f's, three g's, ten h's, seven i's, one j, one k, three l's, three m's, seventeen n's, twelve o's, three p's, one q, ten r's, twenty-eight s's, twenty-two t's, two u's, six v's, seven w's, three x's, five y's, and one z.

This pangram employs four a's, one b, one c, two d's, thirty-six e's, six f's, three g's, eleven h's, ten i's, one j, one k, three l's, three m's, eighteen n's, thirteen o's, three p's, one q, fourteen r's, twenty-six s's, nineteen t's, five u's, three v's, three w's, four x's, four y's, and one z.

This pangram has five a's, one b, one c, two d's, twenty-eight e's, five f's, three g's, seven h's, ten i's, one j, one k, one l, two m's, twenty n's, thirteen o's, two p's, one q, five r's, twenty-three s's, twenty t's, one u, six v's, nine w's, two x's, five y's, and one z.
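Whatever search strategy is used, every candidate has to pass the same test: write the sentence out from a vector of 26 counts, recount the letters, and see whether the counts come back unchanged. The sketch below shows that fixed-point test together with a deliberately naive iteration. It is not Ed Miller's program, which is described above only as an exhaustive search with cutoffs. The function names and the `last_sep` parameter are illustrative assumptions, and the counts are assumed to stay below one hundred.

```python
import string
from collections import Counter

ONES = ["one", "two", "three", "four", "five", "six", "seven", "eight",
        "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
        "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["twenty", "thirty", "forty", "fifty", "sixty", "seventy", "eighty",
        "ninety"]

def number_word(n):
    """Spell out 1..99 in English, e.g. 27 -> 'twenty-seven'."""
    if n < 20:
        return ONES[n - 1]
    tens, unit = divmod(n, 10)
    return TENS[tens - 2] + ("-" + ONES[unit - 1] if unit else "")

def render(prefix, counts, last_sep=", and "):
    """Write out the full sentence for a candidate set of counts."""
    items = [f"{number_word(n)} {letter}" + ("'s" if n > 1 else "")
             for letter, n in sorted(counts.items())]
    return prefix + ", ".join(items[:-1]) + last_sep + items[-1] + "."

def recount(prefix, counts, last_sep=", and "):
    """Letter counts of the sentence that the candidate counts produce."""
    text = render(prefix, counts, last_sep).lower()
    return dict(Counter(ch for ch in text if "a" <= ch <= "z"))

def is_fixed_point(prefix, counts, last_sep=", and "):
    return recount(prefix, counts, last_sep) == counts

def iterate(prefix, counts, last_sep=", and ", max_steps=1000):
    """Naive search: keep recounting; return a fixed point or None."""
    for _ in range(max_steps):
        new = recount(prefix, counts, last_sep)
        if new == counts:
            return counts
        counts = new
    return None

# Counts of the `&' solution of 22 November 1983, in the order a..z.
SALLOWS = dict(zip(string.ascii_lowercase,
                   [4, 1, 2, 1, 30, 6, 5, 7, 11, 1, 1, 2, 2,
                    18, 15, 2, 1, 5, 27, 18, 2, 7, 8, 2, 3, 1]))

print(is_fixed_point("This pangram contains ", SALLOWS, ", & "))  # True
```

The naive iteration can wander or cycle without ever reaching a fixed point, and it can miss solutions entirely, which is one way to see why a carefully bounded exhaustive search is needed to prove that a solution is unique or that none exists.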

Variations on a theme

As said before, the pangram machine could be loaded with different starting phrases. By extending the text a bit and carefully selecting the verbs, it was possible to generate an ordered list:

This first pangram has five a's, one b, one c, two d's, twenty-nine e's, six f's, four g's, eight h's, twelve i's, one j, one k, three l's, two m's, nineteen n's, twelve o's, two p's, one q, eight r's, twenty-six s's, twenty t's, three u's, five v's, nine w's, three x's, four y's, and one z.

This second pangram totals five a's, one b, two c's, three d's, twenty-nine e's, six f's, four g's, seven h's, ten i's, one j, one k, two l's, two m's, twenty-one n's, sixteen o's, two p's, one q, eight r's, twenty-eight s's, twenty-three t's, four u's, four v's, nine w's, three x's, five y's, and one z.

This third pangram contains five a's, one b, two c's, three d's, twenty-six e's, six f's, two g's, four h's, ten i's, one j, one k, two l's, two m's, twenty-two n's, seventeen o's, two p's, one q, seven r's, twenty-nine s's, twenty-one t's, four u's, six v's, eleven w's, four x's, five y's, and one z.

After a while it becomes almost more difficult to choose the verb than to construct the solution.

The entire list of the first twenty-five numbered pangrams

The standard problem revisited

The solution to the `standard' problem had the flaw that the Dutch word `en' was translated into `&' instead of the desired `and'. By trying variations for the noun in the starting phrase instead of the verb, the following came out of the pangram machine (again with two alternatives):

This autogram contains five a's, one b, two c's, two d's, thirty-one e's, five f's, five g's, eight h's, twelve i's, one j, one k, two l's, two m's, eighteen n's, sixteen o's, one p, one q, six r's, twenty-seven s's, twenty-one t's, three u's, seven v's, eight w's, three x's, four y's, and one z.

This autogram contains five a's, one b, two c's, two d's, twenty-six e's, six f's, two g's, four h's, thirteen i's, one j, one k, one l, two m's, twenty-one n's, sixteen o's, one p, one q, five r's, twenty-seven s's, twenty t's, three u's, six v's, nine w's, five x's, five y's, and one z.

The three Dutch alphabets

The Dutch examples are mostly written using the English alphabet, i.e. the special ligature `ij' counts as two separate letters, and the 25th letter of the alphabet is the `y'. The same classification is used by the standard Dutch reference dictionary "van Dale". Here is another example:

If the ligature `ij' is considered as a separate letter (which is strictly speaking more conforming to the nature of the Dutch language) the alphabet is extended to 27 letters, "tolerating" both `y' and `ij'. This also has a solution:

Purists may claim that the `y' is actually a foreign letter and does not formally belong to the Dutch language. The 25th letter is the `ij' ligature, and there is no `y' at all. This "formal" standpoint yields even a double solution:

There is even a fourth alphabet, which is only used by the Dutch PTT in telephone books. The ligature `ij' is here discarded completely, and replaced everywhere by `y'. This is rather absurd. For the sake of pangrams, this "PTT" alphabet is equivalent to the formal alphabet.
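The difference between the three alphabets comes down to how a text is split into letters. The sketch below tokenizes Dutch text under the `xyz', `xyijz', and `xijz' alphabets and tests the three short Dutch pangrams quoted at the beginning of this document. It rests on a simplifying assumption: every `i' followed by `j' is treated as the ligature, which is wrong for words like `minijurk'. The names are illustrative.

```python
import re
import string

def tokens(text, alphabet="xyz"):
    """Split text into letters; outside 'xyz' the pair 'ij' is one letter."""
    text = text.lower()
    if alphabet == "xyz":
        return re.findall(r"[a-z]", text)
    return re.findall(r"ij|[a-z]", text)

LETTERS = set(string.ascii_lowercase)
ALPHABETS = {
    "xyz":   LETTERS,                    # 26 letters, `ij' is i + j
    "xyijz": LETTERS | {"ij"},           # 27 letters, both `y' and `ij'
    "xijz":  (LETTERS - {"y"}) | {"ij"}, # 26 letters, `ij' replaces `y'
}

def is_dutch_pangram(text, alphabet):
    return ALPHABETS[alphabet] <= set(tokens(text, alphabet))

print(is_dutch_pangram("Sexy qua lijf doch bang voor 't zwempak", "xyz"))  # True
print(is_dutch_pangram("Och, zwak vormpje blijft exquis ding", "xijz"))    # True
print(is_dutch_pangram("Doch 'n exquis gympje zwerft vlakbij", "xyijz"))   # True
print(is_dutch_pangram("Och, zwak vormpje blijft exquis ding", "xyz"))     # False
```

The last call fails because that sentence contains no `y', which is exactly why it only qualifies under the formal alphabet.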

Magic pairs

A certain starting phrase may yield more than one solution. We have seen this already for the results of some variants of the `standard' problem. But if some of the initial words can be reshuffled, a rather magic effect can be reached.

This pangram tables but five a's, three b's, one c, two d's, twenty-eight e's, six f's, four g's, six h's, ten i's, one j, one k, three l's, two m's, seventeen n's, twelve o's, two p's, one q, seven r's, twenty-nine s's, twenty t's, five u's, six v's, eight w's, four x's, four y's, and one z.

But this pangram tables five a's, three b's, one c, two d's, twenty-nine e's, six f's, six g's, eight h's, eleven i's, one j, one k, three l's, two m's, seventeen n's, fourteen o's, two p's, one q, eight r's, twenty-eight s's, twenty-two t's, six u's, four v's, eight w's, four x's, four y's, and one z.

Mutually descriptive pairs

There are other, even more magic, pairs. On 22 March 1985 Ed Miller printed (on two separate sheets, lacking a double-sided printer) the magnificent

The sentence on the reverse side contains three a's, one b, three c's, three d's, forty-three e's, seven f's, two g's, nine h's, eight i's, one j, one k, two l's, one m, twenty-four n's, sixteen o's, one p, one q, eleven r's, twenty-seven s's, twenty-three t's, three u's, six v's, seven w's, two x's, five y's, and one z.

The sentence on the reverse side contains three a's, one b, three c's, three d's, forty-six e's, four f's, two g's, ten h's, eight i's, one j, one k, two l's, one m, twenty-four n's, fifteen o's, one p, one q, eleven r's, twenty-nine s's, twenty-three t's, two u's, seven v's, seven w's, three x's, five y's, and one z.

However, it turns out after all that something similar can in fact be printed one-sided.

The adjacent text utilizes four a's, one b, two c's, three d's, thirty-six e's, five f's, three g's, nine h's, eleven i's, two j's, one k, four l's, one m, eighteen n's, thirteen o's, one p, one q, eight r's, twenty-seven s's, twenty-four t's, four u's, four v's, seven w's, five x's, four y's, and two z's.

The adjacent text utilizes four a's, one b, two c's, three d's, thirty-two e's, nine f's, three g's, eight h's, eleven i's, two j's, one k, three l's, one m, seventeen n's, fifteen o's, one p, one q, eleven r's, twenty-six s's, twenty-one t's, eight u's, six v's, six w's, three x's, four y's, and two z's.

And the next pair belongs to the same category, but is even more mutually descriptive.

The righthand sentence contains four a's, one b, three c's, three d's, thirty-nine e's, ten f's, one g, eight h's, eight i's, one j, one k, four l's, one m, twenty-three n's, fifteen o's, one p, one q, nine r's, twenty-three s's, twenty-one t's, four u's, seven v's, six w's, two x's, five y's, and one z.

The lefthand sentence contains four a's, one b, three c's, three d's, thirty-five e's, seven f's, four g's, eleven h's, eleven i's, one j, one k, one l, one m, twenty-six n's, fifteen o's, one p, one q, ten r's, twenty-three s's, twenty-two t's, four u's, three v's, five w's, two x's, five y's, and one z.

A bimagic angram

A special case of a double solution is worth mentioning separately. It has magic of its own.

This angram contains four a's, two b's, two c's, one d, twenty-seven e's, eight f's, four g's, five h's, ten i's, one j, one k, one l, two m's, twenty n's, fifteen o's, one q, six r's, twenty-seven s's, eighteen t's, five u's, six v's, seven w's, three x's, four y's, one z, but no _.

This angram contains four a's, two b's, two c's, one d, twenty-seven e's, eight f's, four g's, five h's, eleven i's, one j, one k, two l's, two m's, twenty n's, fifteen o's, one q, six r's, twenty-seven s's, nineteen t's, five u's, six v's, eight w's, three x's, four y's, one z, but no _.

We have now entered the realm of autograms that are no pangrams.

The solitary z

If a certain letter appears neither in the text of the starting phrase nor in any of the counting words, it has to be included in the enumeration with a count of `one' to make the sentence a pangram. So far we have seen this happen in all the examples above. Would it be possible to get rid of these rather redundant standalone additions?

Almost, so it seems, as we can produce the autogram

This sentence employs two a's, two c's, two d's, twenty-eight e's, five f's, three g's, eight h's, eleven i's, three l's, two m's, thirteen n's, nine o's, two p's, five r's, twenty-five s's, twenty-three t's, six v's, ten w's, two x's, five y's, and one z.

which has several trivial variations since `one z' could have been just as well `one q', but from an esthetic standpoint the `z' is the obvious choice. Nevertheless, it turns out to be possible to construct a pure solution after all.

This sentence employs two a's, two c's, two d's, twenty-six e's, four f's, two g's, seven h's, nine i's, three l's, two m's, thirteen n's, ten o's, two p's, six r's, twenty-eight s's, twenty-three t's, two u's, five v's, eleven w's, three x's, and five y's.

To emphasize this solitary occurrence of the `z', consider the following surprising triplet:

This sentence contains three a's, three c's, two d's, twenty-six e's, five f's, three g's, eight h's, thirteen i's, two l's, sixteen n's, nine o's, six r's, twenty-seven s's, twenty-two t's, two u's, five v's, eight w's, four x's, five y's, and only one z.

Only this sentence contains three a's, three c's, two d's, twenty-nine e's, five f's, three g's, eight h's, twelve i's, three l's, nineteen n's, nine o's, six r's, twenty-four s's, twenty-one t's, two u's, five v's, eight w's, two x's, five y's, and one z.

This sentence contains only three a's, three c's, two d's, twenty-five e's, nine f's, four g's, eight h's, twelve i's, three l's, fifteen n's, nine o's, eight r's, twenty-four s's, eighteen t's, five u's, four v's, six w's, two x's, and four y's.

Self-enumerating palingrams

A palingram is a phrase that has exactly the same syllable, word, or letter sequence when you read it backwards. A letter palingram is normally called a palindrome.

As such they are obviously a bit off-topic here, but Lee Sallows donated a special palingram that is also self-enumerating, so it should definitely be mentioned. It deserves a page of its own, and goes without further words.

The art of self-enumerating palingrams

Counting letters and letters

As an extension to the `standard' problem and the `10-guilder' problem, here are two pangrams which not only enumerate their individual letter content, but also mention their total letter count.

This pangram contains two hundred nineteen letters: five a's, one b, two c's, four d's, thirty-one e's, eight f's, three g's, six h's, fourteen i's, one j, one k, two l's, two m's, twenty-six n's, seventeen o's, two p's, one q, ten r's, twenty-nine s's, twenty-four t's, six u's, five v's, nine w's, four x's, five y's, and one z.

After eliminating some of the one-time letters, there is also a non-pangram version.

This sentence contains one hundred and ninety-seven letters: four a's, one b, three c's, five d's, thirty-four e's, seven f's, one g, six h's, twelve i's, three l's, twenty-six n's, ten o's, ten r's, twenty-nine s's, nineteen t's, six u's, seven v's, four w's, four x's, five y's, and one z.

So far, only the redundant one-time letters have sometimes been disposed of. But for constructing full sentences a starting phrase was needed, with a rather arbitrary text. Would it be possible to get rid of all such arbitrary elements altogether?

The shortest self-enumerating phrase

In June 1985 the Dutch linguist Hugo Brandt Corstius wrote under his pseudonym (one of many) Piet Grijs the article "Humor and wiskunde". He cited one of the "Stellingen" appended to the PhD thesis of Jan Moors. This proposition states that

Vijf v's, vijf ij's, vijf f's, vijf s's.

is the shortest self-enumerating phrase in Dutch. (Note that this time the `xijz' alphabet is used.)

Subsequently Eric Wassenaar challenged Ed Miller to construct the shortest English counterpart. In July 1985 Ed Miller came up with

Sixteen e's, six f's, one g, three h's, nine i's, nine n's, five o's, five r's, sixteen s's, five t's, three u's, four v's, one w, four x's.

which has many trivial variations (since `one g, one w' could be just as well `one a, one b' etc.). Without the arbitrary one-time letters the phrase becomes slightly longer.

Sixteen e's, five f's, three g's, six h's, nine i's, five n's, four o's, six r's, eighteen s's, eight t's, three u's, three v's, two w's, four x's.

Close behind, with just one letter more, comes

Fifteen e's, seven f's, four g's, six h's, eight i's, four n's, five o's, six r's, eighteen s's, eight t's, four u's, three v's, two w's, three x's.

The last two phrases are examples of what Lee Sallows calls a "reflexicon". This opens a whole new aspect in the field of recreational linguistics. But that goes beyond the scope of this document.

The longest self-enumerating pangram

By extending the leading phrase, one can try to create very long self-enumerating pangrammatic stories. How long was demonstrated by John Letaw. It needs its own page. All further comments seem superfluous and unnecessary.

Abraham Lincoln's Gettysburg Abragram

In hindsight, it contains one amusing error. The abragram claims 70 d's whereas there are actually 86. The pangram program probably didn't take into account the two d's in each of the eight occurrences of the word `hundred' in the enumeration. The letter `d' does not appear in number-words between 1 and 99. That fact is used by pangram programs to achieve considerable speedup.
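The observation about the `d' is easy to reproduce. The sketch below spells out the numbers 1 to 99 and collects the letters they use. Any letter outside that set has a count fixed by the starting phrase alone, which is presumably the kind of shortcut meant above. The helper names are illustrative, and the closing arithmetic only restates the figures quoted in the text.

```python
import string

ONES = ["one", "two", "three", "four", "five", "six", "seven", "eight",
        "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
        "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["twenty", "thirty", "forty", "fifty", "sixty", "seventy", "eighty",
        "ninety"]

def number_word(n):
    """Spell out 1..99 in English, e.g. 27 -> 'twenty-seven'."""
    if n < 20:
        return ONES[n - 1]
    tens, unit = divmod(n, 10)
    return TENS[tens - 2] + ("-" + ONES[unit - 1] if unit else "")

# Letters that occur in at least one number word between 1 and 99.
used = set("".join(number_word(n) for n in range(1, 100))) - {"-"}

# Letters that never occur there; their counts cannot change during a search.
never = sorted(set(string.ascii_lowercase) - used)

print("".join(sorted(used)))  # efghilnorstuvwxy
print("".join(never))         # abcdjkmpqz

# Eight times the word "hundred" adds two d's each: 70 + 16 = 86.
print(70 + 8 * "hundred".count("d"))  # 86
```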

Epilogue

This is not a static document. It will be updated as more information becomes available. It reflects what is known to me at the date that is mentioned in the header. All mistakes and errors are mine.

This epilogue contains three a's, one b, two c's, two d's, thirty e's, four f's, two g's, six h's, ten i's, one j, one k, two l's, one m, twenty-one n's, seventeen o's, two p's, one q, six r's, twenty-seven s's, twenty-one t's, three u's, five v's, nine w's, three x's, five y's, and one z.