Category Archives: Metadata

Cribs were fundamental to the British approach to breaking Enigma, but guessing the plaintext for a message was a highly skilled business. So in 1940 Stuart Milner-Barry set up a special Crib Room in Hut 6.

Foremost amongst the knowledge needed for identifying cribs was the text of previous decrypts. Bletchley Park maintained detailed indexes of message preambles, of every person, of every ship, of every unit, of every weapon, of every technical term and of repeated phrases such as forms of address and other German military jargon. For each message the traffic analysis recorded the radio frequency, the date and time of intercept, and the preamble (which contained the network-identifying discriminant, the time of origin of the message, the callsigns of the originating and receiving stations, and the indicator setting). This allowed cross-referencing of a new message with a previous one. Thus, as Derek Taunt, another Cambridge mathematician-cryptanalyst, wrote, the truism that “nothing succeeds like success” is particularly apposite here.
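To make the cross-referencing idea concrete, here is a minimal sketch in Python of a per-message record and a lookup index. The field names, the choice of (discriminant, originating callsign) as the index key, and all sample values are my own assumptions for illustration. They are not a description of how Bletchley Park actually laid out its card indexes.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class InterceptRecord:
    """One intercepted message, with the fields named in the passage above."""
    frequency_khz: int      # radio frequency (hypothetical unit)
    intercepted_at: str     # date and time of intercept
    discriminant: str       # identifies the network
    time_of_origin: str     # time of origin given in the preamble
    from_callsign: str      # originating station
    to_callsign: str        # receiving station
    indicator: str          # indicator setting


def build_index(records):
    """Group records by (discriminant, originating callsign)."""
    index = defaultdict(list)
    for rec in records:
        index[(rec.discriminant, rec.from_callsign)].append(rec)
    return index


def earlier_matches(index, new):
    """Return earlier messages on the same network from the same station."""
    return index.get((new.discriminant, new.from_callsign), [])


# Invented sample data, purely illustrative.
earlier = [
    InterceptRecord(4760, "1940-05-01 06:02", "QXT", "0600", "P7J", "SF9", "ABC"),
    InterceptRecord(5120, "1940-05-01 06:10", "LDR", "0605", "K2M", "SF9", "XYZ"),
]
new_message = InterceptRecord(4760, "1940-05-02 06:03", "QXT", "0600", "P7J", "SF9", "DEF")

index = build_index(earlier)
matches = earlier_matches(index, new_message)
for rec in matches:
    print(rec.intercepted_at, rec.indicator)
```

A new message that shares a network and originating station with an old, already-decrypted one is a natural place to start looking for a crib, which is why an index like this pays off more the more decrypts it holds.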

4) It won’t work. Apple sees no need for sensible organization in its App Store, and without Apple and its iOS devices, that’s a big part of the user population left in the dark. There will be no tributary from Apple feeding into the stream.

5) It won’t work. Amazon doesn’t use the standard for books, ISBN, preferring to assign its own internal ID scheme for books. If we can’t agree on something librarians have nailed down to a precise science — books — forget it ever working for everything else.

6) Google Now and Siri are both nice but those aren’t the future. How do you compile a concise and relevant report in response to querying them? See metadata and spam.

7) Reputation matters. Gelernter dismisses sites, but sites have humans behind them and sites have track records. (Some even have explicit ethics.) A quantum of information is worthless without being able to trust the source. (See how even those who seem trustworthy burn others on Twitter with lies.)

And that’s what I can think of immediately. I’m sure there’s much more. But without an underpinning of trustworthy and standardized metadata to begin with, nothing can work.

As Jaynes recognized, it is not a matter of simple analogy, but rather something far more subtle. The theories are similar because the ideas that lead to the theories are similar. These ideas are based on the quantification of order.

And:

Most exciting is the range of theories that have been successfully derived using this foundation: measure theory, probability theory, information theory, quantum mechanics, and special relativity. These results provide strong support for the claim that Information Physics, which relies on information about our descriptions of reality to derive physical laws, is a potentially useful general approach. With these positive examples as guideposts, we now aim to use these techniques to quantify new problems and derive new physical laws.

Fast-forward to the present, and SoundExchange is now showing an unpaid artist account balance of $294 million. Here’s how SoundExchange breaks down this latest monstrosity:

$111 million in ‘unpayable’ funds.

Of that…

$43 million are ‘unclaimed funds.’
$23 million stuck due to ‘bad metadata.’
$23 million stuck due to ‘unclaimed money by foreign collecting societies.’
$22 million stuck due to ‘account issues and paperwork.’
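
As a quick sanity check on the figures quoted above, the four categories can be added up and compared with the totals. This is only arithmetic on the numbers as reported (in millions of dollars). The variable names are mine.

```python
# Figures as quoted above, in millions of US dollars.
total_unpaid = 294
unpayable_parts = {
    "unclaimed funds": 43,
    "bad metadata": 23,
    "unclaimed by foreign collecting societies": 23,
    "account issues and paperwork": 22,
}

total_unpayable = sum(unpayable_parts.values())

# Share of the whole unpaid balance that is labelled 'unpayable'.
unpayable_share = total_unpayable / total_unpaid

# Share of the 'unpayable' pot attributed to bad metadata.
metadata_share = unpayable_parts["bad metadata"] / total_unpayable

print(total_unpayable)
print(f"{unpayable_share:.0%} of the unpaid balance is 'unpayable'")
print(f"{metadata_share:.0%} of that is blamed on bad metadata")
```

The four parts do add up to the $111 million stated, which is a bit more than a third of the $294 million balance, and the metadata line accounts for roughly a fifth of it.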

Boldfaced emphasis added by me.

Now it could be, given their history here, that they are outright lying about the metadata.