Here are some musings on intelligent self-modification, based on past
experience prototyping similar things at Webmind Inc. and on our plans for
the future with the Webmind AI Engine.

It's a long and winding train of thought, but if you follow it to the end,
you may find it interesting...

In Webmind, the approach we use to schema learning (procedure learning,
program learning) is unique, but the best way to cast it in non-Webmindish
terms is to call it "a fusion of evolutionary programming and probabilistic
inference and neural-nettish association spreading." Of course it's really
nothing so awkward as a pasting-together of these three things, but it does
involve these three aspects.

In evolutionary computation terms, one of the issues that arises in schema
learning is brittleness. Mutating and crossing over most working programs leads to
useless or even meaningless programs. Of course, some programming languages
are brittler than others; LISP is much less brittle in this sense than C++,
for example.
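
To make the brittleness point concrete, here is a minimal sketch in Python
(not anything from Webmind; the function names and the test expression are
purely illustrative assumptions). It mutates one character of a small
arithmetic expression at random and measures how often the mutant still
compiles and evaluates. Character-level mutation is about the crudest
operator possible, so this only illustrates the general phenomenon.

```python
import random

# Illustrative alphabet for mutations; an assumption, not from the original post.
ALPHABET = "abcxyz0123456789+-*/() "

def mutate_chars(source: str, rng: random.Random) -> str:
    """Replace one randomly chosen character with a random one from ALPHABET."""
    i = rng.randrange(len(source))
    return source[:i] + rng.choice(ALPHABET) + source[i + 1:]

def still_runs(source: str) -> bool:
    """True if the expression compiles and evaluates without raising."""
    try:
        code = compile(source, "<mutant>", "eval")
        eval(code, {"__builtins__": {}}, {"x": 3, "y": 4})
        return True
    except Exception:
        return False

def survival_rate(source: str, trials: int = 1000, seed: int = 0) -> float:
    """Fraction of single-character mutants that still run."""
    rng = random.Random(seed)
    survivors = sum(still_runs(mutate_chars(source, rng)) for _ in range(trials))
    return survivors / trials

if __name__ == "__main__":
    print(survival_rate("(x + y) * (x - y)"))
```

A mutant that "survives" here merely runs; it usually computes something
different, so the fraction of *useful* mutants is smaller still.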

On the other hand, DNA, which is the programming language that is used to
encode living organisms on Earth, is remarkably robust and un-brittle in
this sense. DNA programs can be mutated and crossed over relatively freely
(yes, there are constraints, there are 'hot spots' where crossover is more
likely, etc.) with a decent probability of producing a functioning, though
different, offspring. The reason for the robustness here is that the
"compiler" (the network of protein-protein interactions that build cells,
cell interactions that build bodies, etc.) is very, very tolerant of errors,
and has a lot of clever ways to make things work and come out interestingly
instead of erroneously. The DNA program is fairly interpretable as the
initial condition of a complex self-organizing system (the developing cell)
rather than as a series of instructions for a logical system. (Both
interpretations shed some light, actually.)

How is this kind of robustness to be achieved in the mind, in the context of
schema learning?

I believe that in the mind there need to be at least two levels of
representation for procedures. This is not a new idea; it occurs in various
forms in the "AI planning" literature, but it's usually mixed up with
various other things.

1) "pseudocode" level, where the basic procedure being carried out is
outlined in "almost precise" terms
2) "program code" level, where everything is really laid out in detail

Of course, this is a very limited and flawed analogy, but there may be
something to be learned from it. That is, there may be some small aspects
in which the analogy is valid.

The idea is that, on the pseudocode level, mutations and crossovers lead to
algorithms that may be useless, but are generally semantically meaningful.
One can freely play around with pseudocode.
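
As a rough illustration of what "freely playing around" could look like,
here is a small Python sketch. It assumes, purely for the example, that a
pseudocode-level procedure is just a sequence of abstract step names drawn
from a fixed vocabulary; the vocabulary and function names are hypothetical
and are not the Webmind representation. Under that assumption, every mutant
and every crossover child is still a well-formed sketch, even if it is a
useless one.

```python
import random
from typing import List

# Hypothetical vocabulary of abstract steps (an assumption for illustration).
STEPS = ["locate", "approach", "grasp", "lift", "carry", "release"]

Sketch = List[str]  # a pseudocode-level procedure: an ordered list of steps

def mutate(sketch: Sketch, rng: random.Random) -> Sketch:
    """Swap one step for a random step from the vocabulary."""
    child = list(sketch)
    child[rng.randrange(len(child))] = rng.choice(STEPS)
    return child

def crossover(a: Sketch, b: Sketch, rng: random.Random) -> Sketch:
    """Join a prefix of one parent to a suffix of the other."""
    child = a[:rng.randint(0, len(a))] + b[rng.randint(0, len(b)):]
    return child or [rng.choice(STEPS)]  # never return an empty sketch

def is_well_formed(sketch: Sketch) -> bool:
    """A sketch is meaningful if it is non-empty and uses only known steps."""
    return len(sketch) > 0 and all(step in STEPS for step in sketch)

if __name__ == "__main__":
    rng = random.Random(0)
    fetch = ["locate", "approach", "grasp", "lift"]
    drop = ["carry", "release"]
    print(mutate(fetch, rng), crossover(fetch, drop, rng))
```

Whether a child such as "grasp, grasp, release" is any good is a separate
question, which is exactly the point: usefulness has to be tested, but
meaningfulness comes for free at this level.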

Getting from pseudocode to program code is a matter of "procedural
refinement", one aspect of procedural cognition -- and this can be very hard
thinking, of course, involving invocation of relevant experiences, formal
knowledge of algorithms, knowledge of the operating conditions of the
program, etc...

My feeling is that, in the human mind, we

a) come equipped at birth with the shell of a "pseudocode" representation of
procedures
b) fill out this shell into a complete internal pseudocode framework through
experimentation with procedures of physical action
c) gain, through the experience of translating pseudocode representations of
real-world actions into detailed real-world action programs, an
intuition for procedural cognition (including procedural refinement, as well
as the learning of new pseudocode programs)
d) later on, take our pseudocode and procedural cognition frameworks and
extend them to abstract procedural domains

A very common example of the distinction between what I'm calling the
pseudocode level and the program code level is *timing*. A plan, in the AI
literature, rarely includes precise timing of events; a schedule, on the
other hand, does. Planning involves figuring out what steps to take whereas
scheduling involves figuring out exactly when to take them. Filling in
timings is one example of something that happens during the transformation
from pseudocode to program code.
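
The plan-versus-schedule distinction can be sketched in a few lines of
Python. This is a deliberately trivial refinement step, assuming the steps
run strictly one after another and that a duration is known for each; the
names `schedule` and `durations` are made up for the example.

```python
from typing import Dict, List, Tuple

def schedule(plan: List[str],
             durations: Dict[str, float],
             start: float = 0.0) -> List[Tuple[str, float, float]]:
    """Turn an ordered plan into (step, start_time, end_time) entries.

    Assumes purely sequential execution: each step begins when the
    previous one ends.
    """
    timeline = []
    t = start
    for step in plan:
        end = t + durations[step]
        timeline.append((step, t, end))
        t = end
    return timeline

if __name__ == "__main__":
    plan = ["boil water", "steep tea", "pour"]
    durations = {"boil water": 3.0, "steep tea": 4.0, "pour": 0.5}
    for entry in schedule(plan, durations):
        print(entry)
```

The plan says only what to do and in what order; the timings appear only in
the refined output. Real scheduling (parallel steps, resource limits,
deadlines) is of course far harder than this.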

In Webmind terms, the pseudocode level basically consists of a node-and-link
representation of schema *without* any of the schema-specific bells and
whistles (like token-passing to indicate that one procedure can't start
until another has finished, etc.). The program code level consists of a
fully-specified node-and-link representation of a schema.
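
To give a feel for the difference, here is a generic Python sketch of the
token-passing idea. It is not the Webmind schema mechanism, whose details
are not given here; it simply assumes that the pseudocode level is a
dictionary of "precedes" links between named nodes, and that the added
execution rule is "a node may start only once every predecessor has
finished and passed it a token". All names are hypothetical.

```python
from typing import Dict, List

def run_with_tokens(links: Dict[str, List[str]]) -> List[str]:
    """Return an execution order in which each node starts only after
    all of its predecessors have finished (i.e. passed it a token)."""
    nodes = set(links) | {t for targets in links.values() for t in targets}
    waiting = {n: 0 for n in nodes}          # tokens each node still needs
    for targets in links.values():
        for t in targets:
            waiting[t] += 1
    ready = sorted(n for n in nodes if waiting[n] == 0)
    order: List[str] = []
    while ready:
        node = ready.pop(0)
        order.append(node)                   # "execute" the node
        for t in links.get(node, []):
            waiting[t] -= 1                  # pass a token to the successor
            if waiting[t] == 0:
                ready.append(t)
    return order

if __name__ == "__main__":
    sketch = {"fetch": ["open"], "open": ["read"], "read": []}
    print(run_with_tokens(sketch))
```

The `links` dictionary alone is the bare node-and-link sketch; the token
counters are the kind of extra machinery that only the fully specified
level needs. Nodes caught in a cycle never become ready and are simply
left out of the returned order.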

*IF* one had a procedural refinement approach that worked -- i.e. a set of
mind-processes that take in pseudocode and spit out program code (in the WM
case, the program code is a fully specified schema) -- then one would be
well-positioned to tune the system to learn new pseudocode to approach new
problems, because for each new candidate bit of pseudocode, the system
would know how to create new program code embodying the ideas outlined in
the pseudocode, and could thus test the pragmatic validity of the
pseudocode.
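
The loop described above (propose pseudocode, refine it, test the result)
can be sketched in Python on a toy problem. Everything here is an
assumption made for illustration: the "pseudocode" is a list of abstract
moves, `refine` is a trivial lookup that fills in step sizes, and the
search is simple mutation-based hill climbing rather than the mix of
evolution, inference and association used in Webmind.

```python
import random
from typing import List

MOVES = ["left", "right", "wait"]   # hypothetical abstract steps
Sketch = List[str]                  # pseudocode level
Program = List[int]                 # program-code level: signed step sizes

def refine(sketch: Sketch) -> Program:
    """Toy procedural refinement: fill in the concrete step size."""
    step_size = {"left": -1, "right": 1, "wait": 0}
    return [step_size[s] for s in sketch]

def run(program: Program) -> int:
    """Execute the program: return the final position, starting from 0."""
    return sum(program)

def score(sketch: Sketch, target: int) -> int:
    """Test pseudocode by refining and running it; 0 is a perfect score."""
    return -abs(run(refine(sketch)) - target)

def learn(target: int, length: int = 8,
          generations: int = 200, seed: int = 0) -> Sketch:
    """Mutate the current best sketch; keep the mutant if it is no worse."""
    rng = random.Random(seed)
    best = [rng.choice(MOVES) for _ in range(length)]
    for _ in range(generations):
        candidate = list(best)
        candidate[rng.randrange(length)] = rng.choice(MOVES)
        if score(candidate, target) >= score(best, target):
            best = candidate
    return best

if __name__ == "__main__":
    found = learn(target=3)
    print(found, score(found, 3))
```

Note that the search only ever touches the sketch; the program code is
regenerated by `refine` each time a candidate needs testing. In any real
setting `refine` is the hard part, which is the subject of the next
paragraph.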

If we're to use the human case as an analogy, then procedural refinement is
going to have to be learned, or at minimum parameter-tuned, by the system via
experimentation in a simple action domain (simpler than the domain of
modifying its own source code, for example). Here is where all the classic
AI problems of playing games, moving blocks around, etc., become potentially
interesting. If a system can't translate pseudocode into program code in
simple contexts like this, having it do so in the context of modifying its
own behavioral schema is obviously pretty hopeless...

The difference between these thoughts and our past WM work on schema
learning is basically the strictness of the breakdown into two levels.
Before, we had

--> SchemaConceptNodes, an abstract level for reasoning about procedures
--> SchemaNodes, for actually carrying out procedures

and we actually had relations between SCNs and SNs on different levels of
granularity, but we hadn't formalized a distinction between two granularity
levels of schema relationships (here posited as pseudocode-ish and
program-code-ish).

This is actually a point that was raised many times in past (WM-internal)
discussions on schema learning, in the guise of the need for a "plan
optimization" phase. But the idea there, I think, was that the first phase
of schema learning would result in an inefficient but detailed program, to
be refined into an efficient one. What I'm suggesting here is that the
first phase actually doesn't result in the learning of a detailed program at
all, but just a pseudocode sketch.

OK -- that's all for now. The next step in the train of thought goes into
the details of pseudocode versus program code for WM schema, and I won't
type that stuff in now.