Thursday, September 5, 2013

Does information equal entropy?

I think both Schneider and Yockey have a problem with the term "negentropy", and it is somewhat understandable.

If someone says that information = uncertainty = entropy, then they are confused, or something was not stated that should have been. Those equalities lead to a contradiction, since entropy of a system increases as the system becomes more disordered. So information corresponds to disorder according to this confusion. --Schneider

This is one of several pages where Schneider makes it quite clear that information means one thing only: some reduction in uncertainty, defined at some receiver somewhere.

What you will see in many anti-ID writings are assertions that "meaningful" text is always less jumbled than random character sequences.

Sending compressed messages maximizes the uncertainty that is reduced with the reception of each character (Shannon information per character), and also drives the length of the message down toward its Kolmogorov complexity. But the compression will undo the sense of the message.
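A quick way to see the first half of that claim is to compare the empirical per-symbol entropy of a redundant message with that of its compressed form. This is only a sketch: the sample text is arbitrary, and zlib stands in for whatever compressor you like.

```python
# Sketch: compression raises the per-symbol Shannon entropy of a message.
# The sample text and the choice of zlib are illustrative assumptions.
import zlib
from collections import Counter
from math import log2

def per_symbol_entropy(data: bytes) -> float:
    """Empirical Shannon entropy of a byte string, in bits per symbol."""
    counts = Counter(data)
    n = len(data)
    return -sum(c / n * log2(c / n) for c in counts.values())

text = b"abracadabra " * 100          # highly redundant source
compressed = zlib.compress(text, 9)   # near-incompressible byte stream

print(per_symbol_entropy(text))        # low: the redundancy keeps it down
print(per_symbol_entropy(compressed))  # higher: each byte is more "surprising"
```

The compressed stream carries roughly the same total information in far fewer symbols, so the uncertainty resolved per received symbol goes up, exactly the trade the paragraph above describes.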

Ultimately, the "sense" of the message is the macroscopic effect it has on the receiver, and isn't that what Schneider means by saying that random messages have zero information: that such messages tend to leave the receiver in its highest-entropy state (no net reduction of uncertainty)?

Like I said, compression undoes the sense of the message unless it can be decompressed back into the form where the inherent redundancies in the text allow it to "mean something" to the receiver, i.e. to exhibit operational semantics.

Shannon information is maximized through a comm channel when each transmitted symbol tends to offer about the same amount of "surprisal" as any other symbol. If a symbol has a lower probability of being received, it conveys more information than the other symbols. It perhaps represents a much more interesting macrostate in the transmitter than the other symbols do, but in that case each of the other symbols pays a cost of conveying less information. If symbols {a, b, c} are equally probable but {d} represents a rare event/macrostate, then the information conveyed by a, b, or c will be closer to log 3 bits than to 2 bits. That's a hit to the communication rate.
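The arithmetic above can be checked directly. A minimal sketch, with illustrative probabilities (three common symbols sharing nearly all the mass, d rare):

```python
# Sketch of the surprisal arithmetic for the {a, b, c, d} alphabet above.
# The probabilities are illustrative assumptions, not from any real source.
from math import log2

def surprisal(p: float) -> float:
    """Information conveyed by receiving a symbol of probability p, in bits."""
    return -log2(p)

probs = {"a": 0.33, "b": 0.33, "c": 0.33, "d": 0.01}

for sym, p in probs.items():
    print(sym, surprisal(p))
# a, b, c each carry about log2(3) ~ 1.585 bits, closer to that than to the
# 2 bits they would carry under a uniform 4-symbol alphabet; d carries ~6.6.

# Average information per received symbol, i.e. the effective rate:
rate = sum(p * surprisal(p) for p in probs.values())
print(rate)  # below the 2 bits/symbol a uniform alphabet would achieve
```

The rare symbol d is individually very informative, but because it almost never occurs, the average rate is dragged below the uniform-alphabet optimum, which is the "hit to the communication rate."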

The leveling of symbol frequencies for optimized comm rates is analogous to the leveling of macrostate probabilities in a phase space. Schneider seems to acknowledge this. Reading an unlikely character conveys a lot of information, and a system/machine being in a tiny/rare macrostate in its phase space indicates that an event more interesting than equilibrium has occurred. There is a correlation, even though Schneider is terse about this and much more concerned with the sin of conflating entropy and information.
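The "leveling" in both pictures is the same mathematical fact: entropy over a fixed set of outcomes is maximized when the probabilities are uniform. A minimal sketch, with illustrative distributions:

```python
# Sketch: Shannon entropy is maximized when outcome probabilities are level,
# whether the outcomes are channel symbols or macrostates in a phase space.
# The distributions below are illustrative assumptions.
from math import log2

def entropy(probs) -> float:
    """Shannon entropy in bits, ignoring zero-probability outcomes."""
    return -sum(p * log2(p) for p in probs if p > 0)

uniform = [0.25] * 4            # leveled symbol frequencies / macrostates
skewed = [0.7, 0.1, 0.1, 0.1]   # one dominant outcome

print(entropy(uniform))  # 2.0 bits: the maximum for four outcomes
print(entropy(skewed))   # strictly less than 2 bits
```

Under the uniform distribution every observation is equally surprising; under the skewed one, most observations confirm the dominant outcome and resolve little uncertainty, which is the correlation the paragraph above points at.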

Going back to his idea of a random text conveying no information, . . .