Crypto Puzzle and NSA Problem

The NSA had an incinerator in their old Arlington Hall facility that was designed to reduce top secret crypto materials and such to ash. Someone discovered that it wasn't in fact working. Contract disposal trucks had been disposing of this not-quite-sanitized rubish, and officers tracked down a huge pile in a field in Ft. Meyer.

How did they dispose of it? The answer is encrypted in the story's text!

The story sounds like it's from the early 1960s. The Arlington Hall incinerator contained a grating that was to keep the documents in the flames until reduced to ash. The grate failed, and "there was no telling how long the condition had persisted before discovery."

Comments

The story can also be found in http://www.governmentattic.org/2docs/Hist_US_COMSEC_Boak_NSA_1973.pdf (A History of US Communications Security (The David G. Boak Lectures), as amended for public release). This page detailing that incident is the last one of text. The preceding few pages contain much additional amusement; I have not yet read further back.

In general it's true, but why did he start counting terrorism deaths since 2002? I think he should have averaged that over the same time period we have for the other maladies. It would be more fair, and seem less biased. Even if you start at 2001, that averages to something like 300 deaths a year due to terrorism. Much less than their figures for smoking, etc.

Typical government QA. What story's text was the answer encrypted in? I suppose you could call this a case of breaking the discrete "trash-o-rithmic" problem! But was this on an elliptic curve? Rubbish! (Notice it is spelled correctly?)

I did try for a very short time but failed, I am not very skilled at such stuff and something considered "easy" by the crypto geeks from Ft. Meade is probably very hard for anyone who isn't as skilled as they are.

Jan and Eric, The only mistake in the text is the misspelling of the word "fity" in "fifty-two pickup." That is the only grammatical error. There are many references "in the vernacular" that lead one to assume that a steganographic message is incorporated into the text. "Fity" is the 73rd word, and the 50th word is "burned." Try starting there and see where you get.

I love any story where:
"for years the screen at the top of the stack had a habit of burning through and then it would spew partially burned classified COMSEC and SIGINT materials round and about the Post and surrounding neighborhood."

Fun!
Just started looking at the puzzle. (I'm a sucker for these sorts of things.) Anyone solve it yet? Here's my speculation after an initial pass: Boak gives a clue at the end with the phrase "...one error has been deliberately incorporated, because that is par for the course". Par in golf is typically 72 (but can be 73). There's a typo in the 73rd word of the document ("fifty-two" is spelled "fity-two") which is presumably deliberate.

There are at least four other similar typos that I saw in the document. From the hints in the last paragraph, it appears that these, along with (perhaps) the numbers in the third paragraph?, may encode the coordinates of letters in a matrix -- I'm guessing it has 26x2=52 letters in it in total, maybe lower and upper case.

I'm new to this blog and my fascination with security is well beyond my ability to work this out.
So, Mr Schneier, if I come back later will you be providing the answer?
I hope so!
That way I'll be able to try to fathom the next one...

the code is beyond me so how about the solution?
They could turn the dump into a tank testing ground and let the tracks churn up the documents (too slow - too risky)
or they could turn the dump into a waste paper recycling yard - the tons of extra waste would make looking for classified material futile?

I must admit when the doc was first released I read it and had a look to see if there was a message there and if I could spot it.

However, after noticing one or two problems, it struck me that the text was not very "stylized".

Often what gives away a hidden message within a plain sight text is the need to get the plain sight to fit with the hidden message. This usually produces "odd / unnatural reading" plain text.

It just did not feel "odd" enough to give me confidence there actually was a message to find in a relatively short period of time.

However I'm British and not a golf player so I did not realise for instance that "par" had a real value that could be used.

Which brings you around to the "context" the message was in. The film Bat*21 was based very firmly around the idea that his potential rescuers understood golf, whereas the enemy trying to capture him did not.

At the very end, he says that the puzzle is very easy - this could be a clue that he has used the same system he describes in the last paragraph to encode the message. A second clue might be in the following sentences:

"Then those coordinates are buried in the text. About another million ways - a myriad - are available for that last step."

There's a mistake here: a myriad is 10,000, not a million. This might be a hint that the coordinates are buried in the text in the form of spelling and grammatical mistakes, of which there are plenty.

In addition to the ones noted by Mark above, there's also "Once you know the rules, solution is easy." (Missing "the" after the comma.)

It seems to me that the last two paragraphs have an odd style. The earlier text flows easily, but the last two paragraphs have very short and very long sentences, peculiar punctuation and some passages that are badly composed ("...a truly admirable example of how a special talent combined with a most fortuitous circumstance eventually allowed us to get all that stuff disposed of."). It looks to me as if this part of the text was heavily manipulated to get certain letters into certain locations. I spent a few minutes looking at, say, the Nth letter of a sentence N words long, but had no luck.

Here's the full text of the relevant section of the document, OCR'd and checked against the original scan. All errors (baside, tabluations, etc.) were preserved. Happy hunting!

CLASSIFIED TRASH

One day, back in the '60's, one of our people was poking about in the residue baside the Arlington Hall incinerator. The incinerator had been a headache for years: the screen at the top of the stack had a habit of burning through and then it would spew partially burned classified COMSEC and SIGINT materials round and about the Post and surrounding neighborhood. Troops would then engage in a giant game of fity-two pickup. This day, however, the problem was different - the grate at the floor of the incinerator had burnt out and the partially burned material, some the size of the palm of your band, was intermixed with the ash and slag.

There was no way of telling how long the condition had persisted before discovery, so we thought we had better trace the ash to the disposal site to see what else was to be found. The procedure was to wet down the residue for compaction, load it on a dump truck, and haul it away. In the old days it had evidently been dumped by contractors in abandoned clay pits somewhere in Fairfax County (and we never found them); but the then current practice was to dump it in a large open area on Ft Meyer, South Post, adjacent to Washington Boulevard.

Our investigator found that site, alright, and there discovered two mounds of soggy ash and assorted debris each averaging five feet in height, eight to ten feet wide, and extending over 100 yards in length. He poked at random with a sharp stick, and thought disconsolately of our shredding standards. Legible material was everywhere - fragments of superseded codes and keying material, intriguing bits of computer tabluations; whole code words and tiny pieces of text. Most ere thumb-size or smaller; but a few were much larger. Other pokers joined him and confirmed that the entire deposit was riddled with the stuff. Some of it had been picked out by the wind and was lodged along the length of the anchor fence separating the Post from the boulevard.

Our begrimed action officer was directed to get rid of it. All of it. Being a genius, he did, and at nominal cost. How did he do it?

The solution to this problem was most ingenious - a truly admirable example of how a special talent combined with a most fortuitous circumstance eventually allowed us to get all that stuff disposed of. I won't tell you the answer outright: instead, I will try to aggravate you with a very simple problem in analysis of an innocent text system. Innocent text systems are used to send concealed messages in some ordinary literature or correspondence. By about this time, you may suspect that perhaps I have written a secret message here by way of example. That, right, I have! What's here, in fact, is a hidden message which gives you the explanation of the solution we accepted for disposing of that batch of residue. If we ever have to do it that way again, it will be much more difficult for us because the cost of everything has escalated, and I doubt we could afford the particular approach we took that time.

If you are really interested in how innocent text systems are constructed, he advised that there are twenty-jillion ways to do it - every one of them different. Some of them may use squares or matrices containing an encoded text with their values represented by the coordinates of each letter. Then those coordinates are buried in the text. About another million ways - a myriad - are available for that last step. In fact, the security of these systems stems mostly from the large variety of methods that can be used and on keeping the method (the logic) secret in each case. Once you know the rules, solution is easy. So now, find my answer above - no clues, except that it's very simple, and one error bas been deliberately incorporated, because that is par for the course.

Well, I extracted all the numbers out of the text and got:
1 60 1 50 2 5 8 10 100 20
Which gives the words
One Post one burned day the of people size Hall
Which might become
One day the Hall people burned one sid[z]e of the Post.
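The word lookup above can be reproduced with a short sketch. It assumes words are whitespace-separated tokens with surrounding punctuation stripped, and that a free-standing dash does not count as a word; both are my assumptions, not rules stated in the puzzle. The helper names are hypothetical.

```python
import string

def words_of(text):
    """Split on whitespace and strip surrounding punctuation from each token.

    Tokens that are pure punctuation (a lone "-") are dropped, so they do
    not count as words. That choice is an assumption.
    """
    tokens = (t.strip(string.punctuation) for t in text.split())
    return [t for t in tokens if t]

def pick_words(text, positions):
    """Return the words at the given 1-based positions."""
    words = words_of(text)
    return [words[p - 1] for p in positions]

# First sentence of the story, copied with its original spelling.
FIRST_SENTENCE = (
    "One day, back in the '60's, one of our people was poking about "
    "in the residue baside the Arlington Hall incinerator."
)

# Only the positions that fall inside the first sentence are checked here.
print(pick_words(FIRST_SENTENCE, [1, 2, 5, 8, 10, 20]))
# ['One', 'day', 'the', 'of', 'people', 'Hall']
```

Feeding the whole story in place of FIRST_SENTENCE would let the remaining positions (50, 60, 100) be checked the same way.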

Nate, I picked up a couple of errors in your transcription by comparing it to my own - which also had errors :( Here's the result:

CLASSIFIED TRASH

One day, back in the '60's, one of our people was poking about in the residue baside the Arlington Hall incinerator. The incinerator had been a headache for years: the screen at the top of the stack had a habit of burning through and then it would spew partially burned classified COMSEC and SIGINT materials round and about the Post and surrounding neighborhood. Troops would then engage in a giant game of fity-two pickup. This day, however, the problem was different - the grate at the floor of the incinerator had burnt out and the partially burned material, some the size of the palm of your hand, was intermixed with the ash and slag.

There was no way of telling how long the condition had persisted before discovery, so we thought we had better trace the ash to the disposal site to see what else was to be found. The procedure was to wet down the residue for compaction, load it on a dump truck, and haul it away. In the old days it had evidently beem dumped by contractors in abandoned clay pits somewhere in Fairfax County (and we never found them); but the then current practice was to dump it in a large open area on Ft Meyer, South Post, adjacent to Washington Boulevard.

Our investigator found that site, alright, and there discovered two mounds of soggy ash and assorted debris each averaging five feet in height, eight to ten feet wide, and extending over 100 yards in length. He poked at random with a sharp stick, and thought disconsolately of our shredding standards. Legible material was everywhere - fragments of superseded codes and keying material, intriguing bits of computer tabluations; whole code words and tiny pieces of text. Most were thumb-size or smaller; but a few were much larger. Other pokers joined him and confirmed that the entire deposit was riddled with the stuff. Some of it had been picked out by the wind and was lodged along the length of the anchor fence separating the Post from the boulevard.

Our begrimed action officer was directed to get rid of it. All of it. Being a genius, he did, and at nominal cost. How did he do it?

The solution to this problem was most ingenious - a truly admirable example of how a special talent combined with a most fortuitous circumstance eventually allowed us to get all that stuff disposed of. I won't tell you the answer outright: instead, I will try to aggravate you with a very simple problem in analysis of an innocent text system. Innocent text systems are used to send concealed messages in some ordinary literature or correspondence. By about this time, you may suspect that perhaps I have written a secret message here by way of example. That, right, I have! What's here, in fact, is a hidden message which gives you the explanation of the solution we accepted for disposing of that batch of residue. If we ever have to do it that way again, it will be much more difficult for us because the cost of everything has escalated, and I doubt we could afford the particular approach we took that time.

If you are really interested in how innocent text systems are constructed, he advised that there are twenty-jillion ways to do it - every one of them different. Some of them may use squares or matrices containing an encoded text with their values represented by the coordinates of each letter. Then those coordinates are buried in the text. About another million ways - a myriad - are available for that last step. In fact, the security of these systems stems mostly from the large variety of methods that can be used and on keeping the method (the logic) secret in each case. Once you know the rules, solution is easy. So now, find my answer above - no clues, except that it's very simple, and one error has been deliberately incorporated, because that is par for the course.

Haven't cracked it, but I'm making a couple assumptions:
1) Presumably, the secret message must be fairly long - at least a few words - in order to give a good description. This means there must be a means of encoding a lot of information.
2) The misspellings are probably relevant, but so are the odd sentence structures.

I agree with Beta's observation: The message is probably contained within the last two paragraphs. My immediate intuition about the awkward syntax and the abundance of pointlessly long words is that the author is trying to achieve a particular sequence of word lengths. The strange sentence structure would then be a byproduct of trying to shoehorn a semblance of English grammar onto a predetermined sequence of word lengths.
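For anyone who wants to test the word-length idea, here is a minimal sketch that turns a passage into its sequence of word lengths. It assumes punctuation at the edges of a word is ignored and that an apostrophe inside a word counts as a character; the puzzle does not say either way.

```python
import string

def word_lengths(text):
    """Return the length of each word, ignoring surrounding punctuation."""
    lengths = []
    for token in text.split():
        word = token.strip(string.punctuation)
        if word:  # skip tokens that were pure punctuation, e.g. a lone dash
            lengths.append(len(word))
    return lengths

# One sentence from the last paragraph of the story.
SAMPLE = "Once you know the rules, solution is easy."
print(word_lengths(SAMPLE))  # [4, 3, 4, 3, 5, 8, 2, 4]
```

The resulting list could then be grouped in pairs or reduced mod 26 to try candidate mappings to letters.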

I would make the additional assumption that the steganographic scheme is designed to be able to hide an already encrypted message; such a modular approach would probably seem more realistic to the author of the puzzle even though we assume the hidden message here to be unencrypted. Under this assumption, we should not expect the letters of the hidden message to correspond to single letters in the story; that would require the steganographer to find use for too many z's and q's if he were hiding a random-looking ciphertext.

If these assumptions are right, most of the typos must be red herrings.

As for length, I doubt there is room to hide a satisfying punchline in the two last paragraphs if each letter takes much more than two or three words to encode. I think there is enough information to express a letter for each pair of words even if all that matters is their length, but I have yet to discover a scheme of this sort that gives meaningful results.

I'm currently concentrating on these two sentences: "What's here, in fact, is a hidden message which gives you the explanation of the solution we accepted for disposing of that batch of residue. If we ever have to do it that way again, it will be much more difficult for us because the cost of everything has escalated, and I doubt we could afford the particular approach we took that time."

Taking the first sentence very literally he's saying the hidden message is right here.

But taking the second sentence literally the information about the cost of everything would seem to be a clue. Yet at the end he says there are no clues. So this sentence might actually contain the hidden message, or the key to the message.

Although all of those numbers are in the text, not all of them are expressed the same way.

Assuming that all typos in the story are deliberate (in other words, fity is spelled that way to exclude it from careful analysis), we have two numerals (60, 100) and a sequence of very short numbers written out in text form:

one, two, two, five, eight, ten

I didn't get anywhere by counting words or letters in the last two paragraphs. Perhaps someone more mathematically inclined might see something significant.

@Tim - I've not verified it, but that seems promising. Perhaps they bought several shipments of dime paperbacks and shredded them for security through obscurity? Does that qualify as "special literature systems"?

Looking at Google Earth, there's no golf course nearby. Just buildings and some large parking lots.
I examined some other pages from the same document and found that those are almost typo-free. So the typos in the story are very likely deliberate. Also, the typos are not random characters; they seem to have been chosen to go undetected unless one reads carefully. So that would probably mean that it's the misspelled word (so not the character), or its position, that is the carrier of the message. The idea that 'par for the course' points to word 72 and so to the misspelled 'fity' may be a clue; perhaps this is also the deliberate error. The word numbers of the typos may thus be the way to go...

@Frank,
You may be on to something with the idea that the typos were chosen to be hard to notice. I've read that story quite a few times, but never noticed ANY of them, even the "fity", until everyone started analyzing it here.

I'm with Tim, I don't think the intention is to deliberately mislead and the "deliberately incorporated error" @ "par for the course" seems to point the way forward. But I think there are a few more steps, and I would expect the solution to be longer, or at least less ambiguous.

On the other hand, the (slightly artificial) inclusion of so many numbers in the text is intriguing as well.

Shock, any more flashes of insight? You definitely made the longest inroad so far. :)

To solve the code, we'd need coordinates and a matrix.

We've already got two possible sources of coordinates - numbers mentioned and the positions of errors.

Perhaps we should think more about the matrix. As well as the erroneous number at position 73, position 1 = one. If it's really so simple, could it just be a vector matrix... ie counting words from the beginning to the end of the entire text?

Otherwise, sticking with the "simple" theme, coordinates could very well refer to [line, word], [paragraph, word], or [paragraph, sentence, word].

All that said, someone made the very good point above that "simple" is a relative term...
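The [paragraph, word] reading of a coordinate can be sketched like this. The function name and the two sample fragments are only for illustration, and the 1-based counting and punctuation handling are assumptions.

```python
import string

def word_at(paragraphs, paragraph_no, word_no):
    """Look up a word by 1-based (paragraph, word) coordinates."""
    tokens = (t.strip(string.punctuation) for t in paragraphs[paragraph_no - 1].split())
    words = [t for t in tokens if t]  # drop pure-punctuation tokens
    return words[word_no - 1]

# Two short fragments taken from the story, standing in for full paragraphs.
SAMPLE_PARAGRAPHS = [
    "One day, back in the '60's, one of our people was poking about",
    "Our begrimed action officer was directed to get rid of it.",
]

print(word_at(SAMPLE_PARAGRAPHS, 1, 10))  # people
print(word_at(SAMPLE_PARAGRAPHS, 2, 3))   # action
```

A [paragraph, sentence, word] variant would only need one more split, on sentence boundaries, before counting words.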

This is a typed document, I'd say it's relatively safe to assume you ignore whitespace on any concealment cipher that isn't conveyed electronically, as there is really just no way to know how many spaces you're supposed to count between each word, punctuation mark, newline, etc. unless you happen to know the font+kerning values of the make+model of the typewriter and/or word processor that produced the original, et al.

I think everyone is thinking a bit too hard about this, although 'simple' is a relative term...

I don't think there's anything about the actual letters in the misspelled words. It's tempting to read too much into "which letter is wrong" but only some of the misspellings have a single letter changed. One is missing a letter, and one has 2 transposed letters. It's a little harder in those cases to see an obvious position associated with the error.

Also, if the individual letters in the misspelled words were there in order to force the message into the text, I don't think the misspellings would have been so ideal ("he" instead of "be", "beem" instead of "been"). Each misspelling is pretty much the most subtle possible way to do the job, if the only requirement is that the word needed to be wrong. That says to me at least that he had his choice of exactly how to misspell each word, and that if they mark anything, it's only the words themselves (most likely marking positions).

The awkwardness of some of the sentences does make it seem like they were manipulated specifically for something like a word count. Some sentences have little parenthetical phrases stuck in odd places, as if increasing the word count without changing something else in the sentence. Some sentences that could have stood on their own were joined together with semicolons and other punctuation to make for very large word counts.

It's hard to ignore some of those odd sentences, but without a better idea how he would write if he weren't constructing a puzzle, it's probably hard even to tell what's forced and what's not.

@Roger,
I believe there's a lot of his writing available for comparison. I've seen this passage in a PDF Bruce linked when the NSA declassified several. I was under the impression that the one that had this story was written entirely by the same guy.

I think it is a simple innocent text system, as advertised, with only one twist that slowed me down. Of course, I may be completely wrong.

I think the solution is: "The telling ash was simple buried" Allowing for one error, as mentioned in the text, this ought to be: "The telling ash was simply buried."

There are six errors that can be corrected with a letter and six paragraphs on the page. I found that interesting. There was also one transposition error, which has no single unique letter. Hmm, not sure what to do with that - maybe that's a typographical error (and thus the one deliberate mistake).

Selecting one word from each paragraph, whose position corresponded to the letter's alphabetic position, I obtained "The telling ash was with you." The first four words made me think I had the solution, but the 5th and 6th words don't make sense.
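Here is a rough sketch of that selection step for the first three paragraphs. The letters e, f and n are my reading of the corrections to "baside", "fity" and "beem", and pairing them with paragraphs one, two and three in that order is an assumption made to illustrate the idea; the function names are hypothetical.

```python
import string

def words_of(text):
    """Whitespace-split words with surrounding punctuation removed."""
    tokens = (t.strip(string.punctuation) for t in text.split())
    return [t for t in tokens if t]

def alphabet_position(letter):
    """a -> 1, b -> 2, ..., z -> 26."""
    return ord(letter.lower()) - ord("a") + 1

def select_words(paragraphs, letters):
    """Pick, from each paragraph, the word whose 1-based index is the
    alphabet position of the matching letter."""
    return [words_of(p)[alphabet_position(c) - 1] for p, c in zip(paragraphs, letters)]

# Openings of the first three paragraphs of the story (enough words to index).
PARAGRAPH_OPENINGS = [
    "One day, back in the '60's, one of our people was poking about",
    "There was no way of telling how long the condition had persisted",
    "Our investigator found that site, alright, and there discovered two mounds of soggy ash",
]

print(select_words(PARAGRAPH_OPENINGS, "efn"))  # ['the', 'telling', 'ash']
```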

So I reconsidered the possibility that the transposition error isn't a deliberate, non-cipher error. I think, instead, that it's an indicator that the transposition code for the system is changed somehow.

Figuring out the shift is a challenge, because there are many possibilities, once the simple alphabet position is dropped; I decided to work backwards a bit. It seems that the only word in the sixth paragraph that is a really good candidate to finish the sentence is "buried." Nothing in the fifth paragraph leaps out as well, but "simple" seems close to a word that popped into mind immediately after reading the first four words, "simply."

Wishful thinking, perhaps. But I tried to see what I could work out for the position of these words. If I start counting from the second sentence of these respective paragraphs, the position is 18 for "simple" and 27 for "buried." If you take these mod 26, the two numbers, 18 and 1, seem suspiciously close to the corresponding letter positions for the errors, 19 and 2, and have the correct spacing.

It is possible to arbitrarily define a shift to the transposition code after the transposition indicator, to make this fit the target text. The shift I posit is:
1. start counting words from the second sentence
2. add one to the basic alphabet cipher (or rotate the transposition so that B = 1)
3. add 26 to the cipher number up to (or through) the letter L
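The three rules can be checked mechanically with a sketch like the one below. It reads rule 3 as "through L" (inclusive), which is an assumption, as is the simple regex used to drop the first sentence; the function names are hypothetical.

```python
import re
import string

def shifted_index(letter):
    """Rule 2: rotate so that B = 1 (A = 0). Rule 3: add 26 for letters
    through L (reading the rule as inclusive, which is an assumption)."""
    index = ord(letter.lower()) - ord("a")
    if letter.lower() <= "l":
        index += 26
    return index

def word_after_first_sentence(paragraph, index):
    """Rule 1: count words starting from the second sentence (1-based)."""
    rest = re.split(r"(?<=[.!?])\s+", paragraph, maxsplit=1)[1]
    tokens = (t.strip(string.punctuation) for t in rest.split())
    words = [t for t in tokens if t]
    return words[index - 1]

# Openings of the fifth and sixth paragraphs, copied from the story.
PARA_5 = (
    "The solution to this problem was most ingenious - a truly admirable "
    "example of how a special talent combined with a most fortuitous "
    "circumstance eventually allowed us to get all that stuff disposed of. "
    "I won't tell you the answer outright: instead, I will try to aggravate "
    "you with a very simple problem in analysis of an innocent text system."
)
PARA_6 = (
    "If you are really interested in how innocent text systems are "
    "constructed, he advised that there are twenty-jillion ways to do it - "
    "every one of them different. Some of them may use squares or matrices "
    "containing an encoded text with their values represented by the "
    "coordinates of each letter. Then those coordinates are buried in the text."
)

print(shifted_index("s"), word_after_first_sentence(PARA_5, shifted_index("s")))  # 18 simple
print(shifted_index("b"), word_after_first_sentence(PARA_6, shifted_index("b")))  # 27 buried
```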

This provides the solution I offered. I think the change after the transposition error is a bit arbitrary, which makes it a little unsatisfying, but if you believe the code is as simple as one word per error, it's about the only sentence that makes sense.