The task.
We use the same perceptron-like network and the same input data as above.
The goal is different, however.
The task is to find weights such that
equals the sum of the positions
of on-bits in , for all
possible .
Again, the task will be made difficult by providing only very
limited training data.

The solution.
The only solution to the problem is: make all
equal to .
Like with the example above, there are short and fast
programs for computing the solution.

The training data.
The 3 training inputs , , and
from the previous task are used.
The target values are different, however.
Obviously, the target for input vector is 108.
The target for input vector is 126.
The target for input vector is 221.
Again, success is binary:
only if the solution candidate fits
the 3 training examples exactly,
the solution is evaluated on the test data.

Why conventional neural net algorithms fail
to solve this problem: for the same reasons they fail
to solve the previous problem (see section 4.3). The training
set is too small to obtain reasonable generalization on the test set.

RESULTS.
Programs fitting the training data were found in 10 out of runs,
using up a total search time of
time steps.
8 of the 10 successful runs led to perfect
generalization on the
161,697 unseen test examples.

The first weight vector fitting the training data was
found after 6,902,963 runs.
Again, the corresponding program
was a pretty wild one. But it led to perfect
generalization on all the test data.
Before halting, the program used 502 out of 8192 allocated time steps.
Its time probability was .
Its space probability was .
Table 3 shows the used part of the
storage after execution.
What the program does is this:

Table 3:Used storage after execution of a program for the adding perceptron.

(2) Get the contents of the input field (see list of instructions in section 3)

at position 11 (which is 0), and write it into address -2.

(5) Write the contents of address -3 onto the weight pointed to by WeightPointer

and increment WeightPointer. Halt if WeightPointer is out of range.

(7) If the contents of address -3 is less or equal to the contents of address 9,

goto address 11. Otherwise goto address 11.

(11) Increment the contents of address -3.

(13) Goto address -1.

(-1) If the contents of address 7 is less or equal to the contents of address 3

(always true), goto address 5.

The instructions beginning at the addresses (2), (7), and (-1) are useless.
But at least they are not
catastrophic. Essentially, the program first allocates space
for a variable (initially zero) on the work tape
(recall that the
program tape is ``read/execute'' only, and cannot be used for
variables). Then it executes a loop
for incrementing and writing the variable contents
onto the network's weight vector.

After runs, a
faster and rather elegant (nearly minimal) program was found.
Table 4 shows the used storage after execution.
The program ran for 302 out of 512 allocated time steps.
Its space probability is .
Inspection will reveal the operation of the program.

Table 4:Used storage after execution of a more elegant program for the adding perceptron.

Addresses:

-1

0

1

2

3

4

5

6

7

8

9

Contents:

100

7

1

1

-1

8

-1

0

1

2

2

Using different sets of 3 training examples
(obtained by randomly permuting the input units that are never on)
led to very similar generalization results.