Anticryptography: The Next Frontier in Computer Science

Example 1: Teaching addition using a set of
examples
1?1=2
1?2=3
2?2=4
2?5=7

Conclusion: ? means "+"

The problem with Lincos and similar languages was their limited scope.
While they could be used to describe a concept in purely mathematical terms,
many ideas do not lend themselves to such a neat and tidy description. But
what if you extended this concept to create a rudimentary programming
language?

What Can You Say with Computer Programs that You Can't Say
with Simple Equations?

A lot.

Think about all the different tasks that a desktop computer
can perform. You can use it to communicate (using text, images, and
sound). You can use it to process large volumes of data. You can
use it to create a virtual environment (a flight simulator, for
example). The number of potential uses for computers is limited
only by the creativity of software developers. Yet, for all of
their complexity, the basic language used by computers is very
simple and consists of a few dozen to, at most, a few hundred basic
machine instructions.

To create a rudimentary programming language, all you need is
about two dozen symbols. To do this, you need:

Delimiting characters (e.g., "(", ")")

Basic arithmetic operators (e.g., ADD, SUB, MULT, DIV)

Boolean arithmetic and logic (e.g., TRUE, FALSE, AND, OR, NOT)

Comparison operators (e.g., GT, GTE, LT, LTE, EQ, NEQ)

Variables (e.g., STORE{n}, RECALL{n})

Branching (e.g., JUMP{n}, GOSUB{n}, IF {x} DO {y})

All of these symbols can easily be taught to someone who
understands math. While the behavior of a program may be complex,
the individual symbols used to build it are easy to teach.

How Would You Teach Another Civilization to Recognize this
Language?

Because the basic vocabulary is small (about two dozen symbols),
and because the symbols are based on universal mathematical
concepts, they can be taught using sets of true and false examples.
This process can be fairly straightforward. By sending examples
similar to those in Example 1, the sender can train the recipient
to associate numbers with specific symbols or commands.

This may seem like a big assumption, but any civilization that
is capable of doing interstellar communication will need to understand
radio or lightwave communication. This implies that any civilization
must also understand math and geometry (both of which are key
to understanding the physics of communicating using electromagnetic
radiation).

Structure

The first step in the process is to train the recipient to recognize a
special series of digits or carrier signal states as delimiting characters.
These are used as open- and close-parenthesis symbols, and are used to add
structure to a linear (one dimensional) series of digits. They are repeated
constantly throughout the message and appear more often than any other
symbols.

Recognizing these symbols enables the recipient to parse the message into
discrete parcels and subparcels of information, as shown in the example in
Table 1. This is a critical step. Without it, the recipient will not know
where to begin the process of analyzing and decoding the message.

Table 1: Example series of digits with and without
delimiting symbols

Without delimiting symbols

011011100101110111

With delimiting symbols

(((0)(1)(10)(11))((100)(101)(110)(111)))

which can be read as

(((0)(1)(2)(3))
((4)(5)(6)(7)))

a two-dimensional array

The example shown in Table 1 illustrates the importance of delimiting
symbols. Even if the recipient can recognize only two symbols, the equivalent
of '(" and ")', they can see the general structure of the message. The example
in Table 1 shows what a two-dimensional array looks like with and without
delimiters. Without delimiters it is impossible to see the structure in the
message. With delimiters, it is obvious that the message contains a series of
numbers that are nested within a two-dimensional array.

At this stage, the recipient's goal is not to learn the meaning
of the message, but only its basic format. A good analogy is learning
the basic encoding format employed in DNA. We learned fairly quickly that DNA
encodes information using four base pairs (two bits per character), and that
base pairs are grouped into three character words called triplets (for a total
of six bits per word). Each word codes for either an amino acid, or as a
special "stop" word that signifies the end of an instruction set. Knowing
this does not reveal the meaning of the message encoded in DNA, it merely
tells us how to parse the data into manageable parcels and subparcels of
information.

Dewey Decimal System for Ideas

The next step in this process is to create a table of numeric
symbols that are used as placeholders for the arithmetic and logic
symbols we want to teach. This is very similar to machine language,
where numbers are used to represent commands to be executed by a
CPU. The numbers themselves are meaningless, and are used to provide
a unique address to each idea or symbol.

Table 2: Example set of
symbols and their numeric addresses

Table 2 depicts a partial list of the symbols needed to define a
basic programming language. Each symbol is associated with a unique
number that is used as its placeholder. Think of this number as an
IP address for an idea (instead of a workstation). The number itself
is meaningless, and serves only to prevent different symbols from
being confused with each other.

Each symbol is taught by sending a set of examples where there is one
unknown (as shown in Example 1). The recipient is prompted to recognize
a pattern within the example set, and by doing so learns the meaning of
each new symbol. This process is repeated for each symbol in the basic
lexicon.

Once the recipient has learned to recognize the basic symbols used
to build the programming language, they can then capture and run
programs without manually analyzing each instruction within each
program. Programs can be built by linking to instructions in other
programs (much like most software built today is based on modular
design). Once they reach this point, all they need to do is archive programs
and execute them to see what they do (much like you use desktop publishing
software without having to understand the function of every DLL or class
library used to build it).

Living Symbols

Programs are useful as a system for communication because they
enable the sender to condense a lot of knowledge into a compact
package. Take images as an example. A typical, uncompressed
high-resolution image requires 8 to 32 million bits of data
(for a 1000-by-1000 to 2000-by-2000 pixel image, 8 bits per pixel).
Since the signal will have a finite information-carrying capacity
(see the "Solving the Entropy Paradox" section later
in this article), this is an unacceptable waste of space. Just as we have a
strong incentive to compress images prior to transmitting them
across networks, so too will the sender of an interstellar message.

Instead of sending the raw, uncompressed data used to describe an
image, the sender can send efficiently compressed data to be
processed by a previously transmitted algorithm (the equivalent of a
JPEG image viewer, for example). What would otherwise require several
megabits of data would then require a fraction of that amount.

What is especially interesting about using programs to communicate
is that they enable the sender to deliver what are, in effect, living
symbols. Instead of manually picking through every character in the message,
the recipient would simply archive a program, run it, and observe what it
does. The recipient could even interact with programs while they are running
(by using variables as an input/output interface).

Imagine for a moment that you must define a set of symbols to describe
the concept of gravity. This is a difficult idea to describe because it is
an invisible force that acts over time. One way you can do this is to write
a program that simulates the effects of gravity on two or more bodies. This
program would run through a wide range of scenarios (e.g., small body
orbiting a large body, many small bodies interacting with each other, etc.).
The output from this program would, in turn, be associated with numeric
symbols related to gravity (e.g., gravitational force, various types of
orbits, etc.). One small program could encapsulate a lot of knowledge about
this phenomenon and could be used to define additional symbols for use in an
abstract language.

Figure 1: The output of a program is used to define or
amplify the meaning of abstract symbols

This is just one example. The variety of ideas that can be described
using this general technique is limited only by the author's creativity. If
you doubt this, just consider the variety of tasks a desktop computer can
perform. This machine, at its core, is merely a very fast calculator.

Programs are useful as a foundation for communication not only
because of their versatility, but also because they can solve the
so-called entropy paradox. The entropy paradox arises because
efficiently encoded information is nearly indistinguishable from
random noise.

Imagine for a moment that you are engineering a system for communicating
with another civilization. You know nothing about the recipient, so you want
to format the message so that the receiving party can decode it as easily as
possible. You are also forced to work within the following constraints:

You have a finite amount of energy with which to power your transmitter.

Because of (1), you have a finite amount of energy with which to
encode each bit of information within your signal.

Because of (2), your signal has a finite information-carrying capacity
(bandwidth).

Because of (3), you must encode information as efficiently as possible
within your message. This enables you to maximize the number of symbols
(useful information) encoded per bit.

Because of (4), this forces you to eliminate redundant information from
the message. This increases the apparent randomness of the raw data within
the message.

It's the last item in this list that leads to the entropy
paradox. In order to make a message easy to decode, it is necessary
to make the structure of the message obvious to the recipient.
Doing so requires the inclusion of redundant information that
guides the recipient in parsing and decoding the message. The
problem is that this undermines the primary goal of encoding the
message as efficiently as possible (which requires the
elimination of redundant information from the data stream).

Algorithms can solve this apparent catch-22 by enabling the
sender to embed instructions for decompressing data within the
message itself. We routinely use algorithms to compress data and to
detect and repair damage caused by transmission errors. The same
basic idea can be applied to an interstellar message. To do this,
the sender would transmit a small program, not unlike the widely
used PKZip program, which decompresses data that has been encoded for
compression and forward error correction. To decode efficiently
coded data, the recipient would run the bulk of the message through
this program.

Using this technique, the sender only needs to send a small part
of the message (the basic primer and the decompression program) in
uncompressed "plaintext." The rest of the message is transmitted in
the compressed, error-corrected format that the "pkzip" algorithm
is designed to handle. As a result, the sender gets to have his cake
and eat it too. Most of the message is transmitted in an efficient,
compressed format that maximizes the amount of information that can
be sent across the link. Yet it also contains a primer section that
is easy to capture and decode.

Locating the Primer

How would the recipient know where to look for the primer, the
part of the message that teaches them to recognize the basic lexicon of
symbols? This is not as difficult as it first seems. This technique adds a
clear signature to the message that reveals the location of the plaintext
information. This is betrayed by variations in the randomness of the data
stream over time. Figures 2 and 3, depict how this signature will
reveal itself.

In the nearly random (efficiently encoded) segments of the message,
there will be no correlation between a digit and those that follow
it. Conversely, in the plaintext segments of the message, there will
be many repeating patterns in the data. The degree of randomness in
a series of numbers can be expressed as a value H (a measure of
entropy). The lower the value of H, the more redundancy there is in
the data.

By graphing the value of H over time, the recipient will be able to see
which regions in the message contain a lot of redundant information. These
are the regions where the primer information is most likely to be.

Figure 2: Graph of H (randomness) over time

Figure 3: Graph of H (randomness) over time

Communicating in Depth

Using programs that are based on a minimum number of symbols,
it will be possible to communicate in greater depth than most people
assume is feasible. Because programs can be used to perform a wide
variety of information-processing tasks, they can be used to
describe many different types of objects and phenomena, and they can
be used to process and render efficiently encoded information
(e.g., compressed images, motion video, etc.). Like your desktop computer,
programs designed for the purpose of communication perform a wide variety of
tasks.

Furthermore, the output from these programs can be used to
describe or amplify symbols that can then be used in an abstract
language that is based on the same principle of using numbers as
words. Images produced by programs can be associated with numeric
symbols that represent objects or categories of objects. The output
from a simulation program can be used to describe processes or
forces (such as the gravitation example depicted in Figure 1).
Interactive programs (which interact with their users by passing
information through variables) will allow the recipient to explore
the behavior of a program in detail.

The variety of information and symbols that can be described in
this manner is essentially unlimited.

Anticryptography on Terra Firma

The idea of communicating with an extraterrestrial civilization
may seem outlandish or irrelevant to software development here on
terra firma, but there are many interesting applications for these
techniques that are waiting to be discovered. One of the applications is
a framework for distributing reusable software components via networks.
Another is a framework for delivering compressed multimedia content across
computer networks.

One of the key features of mathematical languages is the use of
numeric addresses as placeholders for symbols or ideas. This is not
a new idea.
René Descartes first proposed the idea in the
sixteenth century as the basis for a system for multilingual communication.
Though it's an old idea, it has been largely ignored, and can be used to
build a better system for designing and disseminating reusable software.

In this system, imagine that every software component has a unique
numeric address. Think of this as an IP (intellectual property?) address for
software. The number itself is meaningless, and serves to prevent one
component from being confused with another. This is a subtle difference from
the current system, in which software components are assigned short
alphanumeric names, typically on the whim of their authors. Though it's a
subtle difference, it has significant implications, as shown in the
hypothetical program in Example 2.

# declarations
Use lang.math.sine[100.1.2.3] as sine
Use graphics.2d.drawline[95.2.1.1] as drawline
Use lang.math.pi[100.1.2.1] as pi
# body of program
FirstPoint = true
OldY=0
For X = 0 to pi() step 0.01
OldY=NewY
NewY = sine(x)
If FirstPoint = False Then
Drawline(((x-0.01)*100),(oldy*100),(x),(newy*100))
End If
Next X

The hypothetical program in Example 2, while similar to
conventional programs, has one important difference. Notice how
several reusable components are invoked at the beginning of the
program. Each of these components has a uniform mnemonic or
numeric address that uniquely identifies it. When this program is
compiled, the resulting executable contains only the topmost layer
of instructions.

A computer running this program would determine, at run-time, what lower
level components were required to execute it. This computer would determine
that the components 100.1.2.3 (lang.math.sine),
95.2.1.1 (graphics.2d.drawline), and 100.1.2.1 (lang.math.pi)
were required to execute the program. It would obtain these components from a
nearby object server (similar to a domain name server) if they had not
already
been cached locally. The computer would then inspect these components to
determine what components they, in turn, needed to run. It would determine
that component 100.1.2.3 (lang.math.sine) needs the component
100.1.1.4 (lang.math.factorial) to compute the sine of an angle, and
would repeat the process of obtaining additional low-level components as
needed.

Figure 4: Graphical depiction of the hierarchy of
components used by the hypothetical program

This may seem like a trivial difference compared to the way programs and
operating systems are packaged today, but it has important implications if
it is ever widely adopted. The important difference is that the computers
running these programs can automatically acquire and cache the low-level
instructions required to run them. This approach would largely do away with
the bloated, multimegabyte installation packages that are commonplace today.
This can also be used to solve problems with component version control
(through a simple subaddressing scheme), and to reduce the size of operating
systems (additional components are obtained only on an as-needed basis).

Survival of the Fittest Codec: LGM (Little Green Men) Media Files

Video compression is a specific example of where these techniques
can produce big dividends by overthrowing the "tyranny of standards."
Industry standards for encoding images, such as the widely used JPEG format,
are important because they allow systems from different vendors to
interoperate, but they also tend to freeze the industry at a particular stage
of development.

By migrating to a system where the instructions for decoding an image (or
video clip or sound clip) are associated with the media itself, it is
possible to create a system that can evolve much more quickly. In the
current paradigm, we tend to stick with a particular system for encoding
content until it becomes so obviously obsolete that people begin adopting a
new system en masse.

Imagine a new system where the media and the instructions for
viewing it are intertwined, as shown in Example 3.

In this framework, the viewer automatically learns new
decompression algorithms as it encounters them. This would free
developers and multimedia to use whatever tools worked best for
encoding a particular type of sound or image (or region within an
image). This would also enable them to use multiple compression
schemes concurrently (e.g., use one codec to process unfocused background
regions in an image, another to process highly detailed regions
within an image). The important point is that it puts the decision
of which algorithm to use in the hands of the author, who will be
free to choose the algorithms that produce the best trade-off
between compression and quality.

Boldly Going Where No Algorithm Has Gone Before

The basic goal in anticryptography is to create messages or programs that
can be used by a recipient who has little or no prior knowledge about how
they were created. This is a subtle but important difference from the way we
do things today.

Most software today requires the user, or the client program downloaded by
the user, to have prior knowledge about the system or information it is
designed to access. For example, in order to view a QuickTime video clip, a
user has to download viewer software specifically for the purpose of
processing and rendering these files. Without the viewer software, the
information contained in the .MOV file is unintelligible.

By applying techniques from anticryptography to the systems and software
in use today, we can build software that automatically learns new procedures
as they are required and that allows designers to decide which components to
use based on technical merits instead of market share.

Sign up today to receive special discounts, product alerts, and news from O'Reilly.