Generic Object-Oriented Namespaces

Late in 1999, a discussion on the python community's types-sig led me to
make some suggestions, motivated to some degree by my much earlier work for
Malcolm Sabin at Fegs, for a self-bootstrapping class structure. It was based
on a minimalist conception of namespaces, with magic attributes to implement
particular functionalities, which I view as the heart of python's
goodness. The python community went a different way, for good reasons, and
I've been left with the ideas to play with. As a tip of the hat towards
python's naming, and towards four of the finest comedians in history, I've
chosen the name GOON for the language that's evolving as I follow these ideas
up. It's generic, it's object-oriented, and namespaces are the
fundamental idiom that makes it all work.

My thoughts on GOON are preliminary as yet (Spring 2003) and build
on what I wrote up in the course of thinking about python
and its potential for liberalisation. I intend that GOON do some form of
primitive boot-strapping which it can, at least, pretend it did in GOON; that
it support some structure for telling it, in the course of parsing a module,
to change the rules by which it parses the module (ideally, with enough
flexibility that it can flip into parsing TeX, XML, python, C, Lisp or ALGOL,
but perhaps FORTRAN would be asking a bit much) and even the semantics of what
it parses; that it support documentation strings in XML, complete with an apt
schema/DTD suitably tied to the introspection data of the entities
whose documentation it is; and that the language (in general, not just the
docstrings) be canonicalisable to XML in some guise, with its plaintext
form being merely a natural presentation (e.g. via a style sheet) of its
canonical XML form.

I intend GOON's primitive form (which might be over-rideable by telling it
new parsing and semantics) to use a more extreme, one-namespace variant
on the simple two-namespace approach of python before 2.2; I intend to
unify all types, notably making class not be a key-word, but merely a
particular creation operator for which various alternates may readily be
defined, with perturbed semantics.

One Namespace

In python, any piece of code comes equipped with two namespaces: one
termed local, the other global. The local namespace pertains to the immediate
piece of code being executed; the global belongs to its context - almost
always the module in which the code appears. This is, in some ways, limiting;
at the same time, however, it lends a clarity and simplicity which is
immensely powerful. In python 2, this has been abandoned in favour
of nested scopes; I do not intend that GOON follow suit - rather, it
shall solve the problems that motivated nested scopes by other means.

One of the issues motivating nested scopes is the desirability of granting
(for example) a function access to some values from the context in which it is
defined. A standard python-programming idiom for dealing with this is to give
the function a formal parameter with a default value - the context-supplied
value in question - and to document that the function's callers
should not supply a value for the given parameter. Such a parameter
is known by various names, but I call it a tunnel between the context
and the function. Crucially, it enriches the local namespace into
which it tunnels; it has no effect on the global namespace.
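
The idiom is easy to show in present-day python. The following is only a minimal sketch; make_scalers and scale are names invented for the illustration:

```python
def make_scalers(factors):
    """Build one scaling function per factor, tunnelling each factor in."""
    scalers = []
    for factor in factors:
        # factor=factor is the tunnel: the default is evaluated here, when
        # the function is defined, so it snapshots the context's value.
        # Callers are expected never to supply the second argument.
        def scale(x, factor=factor):
            return x * factor
        scalers.append(scale)
    return scalers


double, triple = make_scalers([2, 3])
print(double(10), triple(10))
```

Each function keeps the value its context had when it was defined; nothing is added to the module's namespace.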

Tunnels and Tunnelling

The problem with tunnelling via the parameter list is that it can get
messed up by argument passing, hence complicates what can and can't be done
with calls to functions exercising it. My 1999 thoughts on python proposed
some syntactic complications to the existing python argument list as a means
of getting round this: the more brutal approach I intend to take with GOON is
to provide a mechanism for explicit tunnelling. While this imposes a burden
on the author of code, it should make it easier for maintainers to see which
parts of one context are, or might be, accessed by a nested one.
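
A small sketch of how argument passing can mess up such a tunnel, in present-day python; make_logger and log are hypothetical names:

```python
def make_logger(prefix):
    # prefix is tunnelled in through a default; callers should leave it alone.
    def log(message, prefix=prefix):
        return prefix + ": " + message
    return log


log = make_logger("db")
intended = log("connected")            # uses the tunnelled prefix
clobbered = log("connected", "oops")   # a stray second argument replaces it
print(intended)
print(clobbered)
```

Nothing stops the second call, which is why the tunnel limits what may safely be done with the function's argument list.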

As an example of how this could be handled, consider extending the
function definition of python 1 with an optional tunnel-list,
the specification of which looks a lot like
that for a parameter-list, albeit with different semantics. Use of an
empty tunnel-list should be equivalent to leaving it out altogether (it
was optional) and leave the function definition meaning what it would in
python 1. By separating the tunnels from the parameters (whose defaults,
where given, will serve to import further values; but these may be over-ridden
by arguments) we ensure that the tunnels don't get messed up by
arguments.

The other problem with tunnelling, central to the motivation for nested
scopes in python 2, is that it's a one-way mechanism for passing
values into the function; it doesn't provide for the function to rebind any
names from its context. While it is possible to get round this - use a
one-way tunnel to pass in a mutable object (list, dictionary, or instance) so
that context (or other functions defined in it and passed the same object) can
consult that object for modifications - it would certainly be nicer to provide
access to context's namespace directly by the use of its names as if they were
locals of the function. It thus makes sense to so specify the meaning
of tunnel-list that it provides for this kind of access to context's
namespace.
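
The work-around with a mutable object might look as follows; this is a sketch only, with make_counter, bump and current invented for the purpose:

```python
def make_counter():
    cell = [0]                   # mutable object owned by the context

    def bump(cell=cell):         # one-way tunnel carrying the shared object
        cell[0] += 1             # mutates the object; re-binds no name
        return cell[0]

    def current(cell=cell):      # a second function passed the same object
        return cell[0]

    return bump, current


bump, current = make_counter()
bump()
bump()
print(current())
```

The functions communicate through the list, but neither can re-bind the name cell in the context itself, which is the limitation at issue.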

We thus have two types of tunnelling to support: a one-way mechanism,
providing a snapshot value obtained from context when the function was
defined, in exactly the same manner as a parameter with default, save that it
cannot be over-ridden by callers of the function; and a two-way mechanism,
providing mutable access to context's name-space, allowing the
function's suite to re-bind names visible outside the function. It is
kinder to the compiler (or, more specifically, the garbage-collector and the
optimiser) to include an explicit statement of which names are tunnelled in
the second manner (so that the interpreter can tell when some of
context's names are never going to be referenced again, so can
be del'd). It is thus entirely natural to have plain names in
the tunnel-list (resembling positional parameters with no default)
specify two-way tunnelling of the given names,
allowing name=expression tunnels (resembling parameters
with defaults) to provide one-way tunnels. It is desirable to segregate the
two types, so let us require all two-way tunnels to appear before all one-way
ones, which will make tunnel-list's syntax agree with that
of parameter-list, except that the *name
and **name forms are not supported in
a tunnel-list.
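
GOON's syntax for this is not settled, so the contrast between the two kinds of tunnel is sketched here with the nearest analogues in present-day python (version 3): a parameter default for the one-way snapshot and a nonlocal declaration for two-way access. The names context, total, step and add are invented, and the default remains over-ridable by callers, which a true one-way tunnel would not be:

```python
def context():
    total = 0
    step = 1

    # Roughly a tunnel-list of (total, step=step): total is tunnelled
    # two-way, step one-way.
    def add(step=step):
        nonlocal total           # two-way: re-binding reaches context's name
        total += step
        return total

    step = 100                   # too late to affect add's snapshot of step
    add()
    add()
    return total


result = context()
print(result)
```

The context sees the re-binding of total, while its own later re-binding of step does not reach the function.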

The above probably makes parsing a bit tiresome; an alternative approach
would be to define

tunnel-spec:
'tunnel' '(' tunnel-list ')'

and prefix both function-definition
and class-definition with an optional tunnel-spec. One possible way to
do this would be to have tunnel-spec take the form of a magic decorator,
e.g. @tunnel(tunnel-list).
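
A one-way version of such a decorator can be approximated in present-day python. The sketch below is an assumption about how it might behave, not a definition of GOON's tunnel: it rebuilds the function over a copy of its global namespace, overlaid with the tunnelled values, using the standard types.FunctionType constructor. The names tunnel, make_greeter, greet and salutation are hypothetical:

```python
import types


def tunnel(**snapshots):
    """Hypothetical decorator: one-way tunnel of the given name=value pairs."""
    def wrap(function):
        # Copy the function's global namespace and overlay the snapshots, so
        # the body can read the tunnelled names while no caller can supply
        # (and so clobber) them as arguments.
        namespace = dict(function.__globals__)
        namespace.update(snapshots)
        rebuilt = types.FunctionType(
            function.__code__, namespace, function.__name__,
            function.__defaults__, function.__closure__)
        rebuilt.__kwdefaults__ = function.__kwdefaults__
        return rebuilt
    return wrap


def make_greeter(greeting):
    @tunnel(salutation=greeting)
    def greet(name):
        return salutation + ", " + name   # salutation comes via the tunnel
    return greet


hello = make_greeter("Hello")
print(hello("world"))
```

Because the namespace is copied when the decorator runs, names bound in the module afterwards are invisible to the rebuilt function; a real implementation would need a fall-back rather than a copy.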

A possible implementation of this last provides another potential
approach: allow each namespace to have a magic
attribute __context__ to which the namespace falls back for
handling names it doesn't, itself, bind. This could then proceed recursively,
falling back to the __context__ of each namespace found in this way
that has this attribute. The result could function much like the lexical
scoping presently defined in python, with the module and each function
providing its own namespace as __context__ to each function defined
(possibly nested within arbitrarily many layers of class
statements) in its scope. Then @tunnel() would merely be setting
the __context__ of the function it wraps to a namespace it
constructs, substituted in place of that of its lexical context.
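
The fall-back chain can be modelled with a toy class. This is a sketch under the assumption that namespaces are ordinary objects whose names are attributes; the class Namespace and the names bound in the example are invented:

```python
class Namespace:
    """Toy namespace falling back to __context__ for names it doesn't bind."""

    def __init__(self, context=None, **bindings):
        self.__dict__.update(bindings)
        if context is not None:
            self.__context__ = context

    def __getattr__(self, name):
        # Called only when ordinary lookup fails: defer to the context, which
        # defers in turn to its own __context__, if it has one.
        context = self.__dict__.get("__context__")
        if context is None:
            raise AttributeError(name)
        return getattr(context, name)


module = Namespace(unit="cm", name="module")
outer = Namespace(module, name="outer")
inner = Namespace(outer, radius=2)
print(inner.radius, inner.name, inner.unit)
```

Here inner binds radius itself, finds name in its immediate context and unit two levels out; substituting a different object for __context__ is all that a tunnel-constructing decorator would then need to do.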

Pseudo-Tunnels

It remains to decide what, if any, semantics to give to
the *name and **name pseudo-tunnels, analogous
to the equivalent pseudo-parameters. Note that my 1999 ruminations provided
for parameter-lists using name-less variants on these; no parameter
appearing after * can get its value from a positional argument, no
parameter appearing after ** can get its value from a keyword
argument; so nameless variants make it a type-error to supply more positional
arguments than there are parameters before the nameless *, or to
supply keyword arguments whose name doesn't match a (non-*) parameter
before the nameless **.
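
Present-day python (version 3) accepts a name-less * in parameter lists with much this meaning; it has no name-less ** counterpart. A brief sketch, with connect and its parameters invented for the illustration:

```python
def connect(host, port, *, timeout=10):
    # Parameters after the bare * can only be supplied by keyword.
    return (host, port, timeout)


accepted = connect("localhost", 80, timeout=3)

try:
    connect("localhost", 80, 3)      # one positional argument too many
except TypeError:
    rejected = True
else:
    rejected = False

print(accepted, rejected)
```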

Now, clearly, the * and ** pseudo-tunnels should serve
to specify tunnelling of all names from context into the function, although
the optimiser may well thin this down to only those names actually referenced
by the function; and it will presumably skip any names used in
the parameter-list or elsewhere in the tunnel-list. One
pseudo-tunnel will provide for mutable two-way access to them, the other will
freeze a snap-shot of context's namespace, when the function is defined, with
which to initialise, each time the function is called, the namespace in which
its suite is executed. The former is equivalent to listing all
context's names early in the tunnel-list; the latter is equivalent to
including name=name, for each of
context's names, later in the tunnel-list. The latter is just
like a from context import * statement, so let
the name-less * pseudo-tunnel support it. This naturally
leaves the ** pseudo-tunnel to provide two-way access to the whole of
context's name-space.
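
The difference between the two pseudo-tunnels can be modelled with plain dictionaries standing in for namespaces. Everything in this sketch (freeze_all, share_all, bump and the use of a dictionary as context) is an assumption made for illustration:

```python
def freeze_all(context):
    """Model of the name-less *: snapshot context when the function is defined."""
    snapshot = dict(context)

    def run(suite):
        # Each call starts from a fresh copy of the frozen snapshot.
        return suite(dict(snapshot))
    return run


def share_all(context):
    """Model of **: the suite works directly on context's own namespace."""
    def run(suite):
        return suite(context)
    return run


def bump(names):
    names["count"] = names["count"] + 1
    return names["count"]


context = {"count": 0}
frozen = freeze_all(context)
shared = share_all(context)
frozen_results = [frozen(bump), frozen(bump)]
shared_results = [shared(bump), shared(bump)]
print(frozen_results, shared_results, context)
```

Re-binding through the frozen form never escapes the call; re-binding through the shared form is visible to the context and to later calls.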

It must be a syntax error to bind any name in both
the parameter-list and the tunnel-list, or to bind any name
twice in either list; thus any name bound explicitly in either must be skipped
from the names implicitly bound by any pseudo-tunnel, of either kind. For the
one-way pseudo-tunnel, this should be no problem: any name from context whose
value the function needs to access can always be given an alias via an
explicit one-way tunnel. However, for the two-way pseudo-tunnel, this may
present problems - for example, class APIs may require certain methods to take
keyword parameters, thereby requiring that the relevant names appear in
parameter lists; such a method, as context, cannot get out of using that name
and, if it defines a function to be used in similar manner, must use that name
as a parameter of the function also; yet it may need the function to have
two-way access to its own use of that name (one can get out of this by use of
a named ** pseudo-parameter and a little hard work; but that's
irksome in its own way).

This situation can readily be handled by supporting a named version of
the pseudo-tunnel, providing a dictionary or object to package context's
namespace and binding that to the pseudo-tunnel's name. Since this would be
mutable, it is only really suitable for the two-way pseudo-tunnel;
fortunately, as noted above, it is unnecessary for the one-way
pseudo-tunnel.
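
A rough model of the named pseudo-tunnel, in present-day python. Since python cannot package a function's locals as a mutable object, the context in this sketch keeps the relevant names on a types.SimpleNamespace itself; lookup, fallback and outer are hypothetical names:

```python
from types import SimpleNamespace


def lookup(table, key):
    # Stand-in for what a named pseudo-tunnel (say **outer) would provide:
    # an object packaging this context's names.
    outer = SimpleNamespace(table=table, key=key)

    def fallback(key, outer=outer):
        # The parameter key hides context's key, which remains reachable,
        # and re-bindable, as outer.key.
        value = outer.table.get(key, outer.table.get(outer.key))
        outer.key = key
        return value

    result = fallback("missing")
    return result, outer.key


print(lookup({"a": 1}, "a"))
```

The inner function uses its own key as a parameter yet still reads and re-binds the context's key through the packaging object.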

There remains the issue of which names the name-less **
pseudo-tunnel propagates: since it doesn't tell us which names it contributes
to our inner context, any reference or re-binding in the inner
context to a name that isn't a parameter or named tunnel could be construed as
referencing the outer name-space, so that only the parameters and
named tunnels would genuinely be local names of the inner suite. This
would appear somewhat excessive. So, instead, restrict ** to only
those names actually bound (in one way or another) by context itself; save
that we must now, in addition to assignment and del, count ordinary
two-way tunnels as binding operations - by the context executing the statement
with the tunnel - on the names tunnelled.

Classes (and their kin)

As discussed in my 1999 ruminations, a python class is created after its
suite has been executed; and this suite sees the module's name-space as
globals, giving it no means, if defined within a function, to access locals of
the function. I would certainly change both of these: create the class first,
then execute the suite using the class's namespace as locals; I proposed
previously that this suite should use, as globals, the locals of the most
closely-enclosing function suite, if any, else the module as in
python 1. An alternative to that proposal (either scrapping globals
altogether or retaining python 1's choice of module as globals) would be
to provide for the class definition to be able to tunnel values into the class
namespace. To this end, it would make sense for the class statement to allow
an optional '('tunnel-list')' following the
existing '('bases-list')', which should probably be
required in this case (omitting it is equivalent to supplying an
empty bases-list, which would now be required explicitly), even if
we're only allowing the tunnel-list to contain one-way tunnels -
i.e. name=expression entries and/or the *
pseudo-tunnel.
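
Present-day python (version 3) allows something close to one-way tunnels into a class: keyword arguments in the class statement are passed to the metaclass, whose __prepare__ method supplies the namespace in which the suite runs. The sketch below relies on that mechanism; TunnelMeta, make_shape and the tunnelled names are invented, and the keywords sit in the same parentheses as the bases rather than in a separate list:

```python
class TunnelMeta(type):
    """Hypothetical metaclass: keyword arguments act as one-way tunnels."""

    @classmethod
    def __prepare__(mcls, name, bases, **tunnels):
        # The class suite runs with this dict as its locals, so the
        # tunnelled names are already bound when the suite starts.
        return dict(tunnels)

    def __new__(mcls, name, bases, namespace, **tunnels):
        return super().__new__(mcls, name, bases, namespace)

    def __init__(cls, name, bases, namespace, **tunnels):
        super().__init__(name, bases, namespace)


def make_shape(sides):
    class Shape(metaclass=TunnelMeta, count=sides, label="n-gon"):
        description = "%s with %d sides" % (label, count)
    return Shape


Square = make_shape(4)
print(Square.description, Square.count)
```

In this sketch the tunnelled names remain as attributes of the finished class, as one would expect of one-way tunnels that simply pre-populate its namespace.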

Is there any sense to two-way tunnels into a class (or kindred) statement?
This would give the class a rather odd relationship with the namespace in
which it is defined: this namespace serves somewhat like a base, in that some of
its names - albeit only a specified subset, unless the **
pseudo-tunnel is used - are visible in the class namespace, with the curious
twist that attempting to re-bind these names in the class namespace (or,
presumably, as attributes of the class) would actually change the relevant
name's value in context's namespace, rather than concealing that value behind
a value in the class's own namespace (as when re-binding an attribute of an
instance conceals a like-named attribute of the class from which it is
derived). This potentially confusing complication might fairly be taken as an
argument against two-way tunnels into classes.

However, if a method of a class is to be able to hold a two-way tunnel out
to the context which defined the class, as would appear desirable, we must
allow the class to have pulled in the two-way tunnel if only so that it can
forward it to the method.

Two-Way Tunnels Are Transient

Particularly when a class only takes in a two-way tunnel so as to forward
it to some methods, but possibly in other contexts, it may be desirable to
have tunnels go away once used. This can't be implemented by
applying del to the tunnelled name, since the whole point of a
two-way tunnel is to let re-binding (including un-binding) operations on the
name apply themselves to the context from which the name was tunnelled. So,
if we don't want the names tunnelled through a class namespace to be left over
in the class as a side-effect of tunnelling, we need some other way of making
them go away.

Furthermore, while rebinding a two-way tunnelled name in the suite of the
class is properly redirected to affect context's version of the name, my main
dislike of two-way tunnels into a class arose from the fact that, as
attributes of the class, they would behave very strangely; re-binding
the given name on the class after execution of the class statement
would take its effect on the remnant of context's name-space that got
preserved to carry the names two-way tunnelled out of it. In all other
situations, re-binding a name on a class affects the attribute dictionary of
the class itself: if a class inherits some attribute from a
base, del-ing that attribute from the class does not affect the
base. Having a strange exception to that might be fascinatingly useful, but
it would also certainly be confusing (and liable to make for some bugs which
would be very hard to track down).

These two problems may be resolved by one simple expedient: make two-way
tunnels affect the suite into which they tunnel without affecting
any persistent namespace being created by it. Inner suites into
which a two-way tunnelled name is forwarded as a two-way tunnel will (in any
case) be accessing the name in the origin namespace from which the name's
two-way tunnelling journey begins (i.e. the context of the outermost statement
in the nested chain of two-way tunnels leading to it), not from the immediate
parent who only had access to the name via a two-way tunnel; so there is no
conflict with the forwarding of tunnels (whether via classoids or functions),
except in the case where the **name pseudo-tunnel is
used. This would, with the given resolution, only provide name's
name-space with entries for the actual locals of context, without any two-way
tunnels context received. This would be at odds with the name-less
variant, which would forward all two-way tunnels context received. I'm not
quite sure what to make of this, in all honesty, but it doesn't worry me
much.

This would make a class statement's suite see the two-way tunnels
of the tunnel-list, and enable it to forward them to methods as
required, without contributing those names to the class namespace once
constructed.
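
This behaviour can be approximated in present-day python (version 3) by having the metaclass's __prepare__ return a mapping that redirects the tunnelled names to the context and keeps everything else for the class. The sketch is an assumption about one way to model it; TwoWaySuite, TransientTunnels and the dictionary standing in for context are all invented:

```python
class TwoWaySuite(dict):
    """Suite namespace: tunnelled names live in context, the rest locally."""

    def __init__(self, context, names):
        super().__init__()
        self._context = context
        self._names = frozenset(names)

    def __getitem__(self, key):
        if key in self._names:
            return self._context[key]
        return super().__getitem__(key)

    def __setitem__(self, key, value):
        if key in self._names:
            self._context[key] = value     # re-bind in context, not the class
        else:
            super().__setitem__(key, value)


class TransientTunnels(type):
    @classmethod
    def __prepare__(mcls, name, bases, context=None, two_way=()):
        return TwoWaySuite({} if context is None else context, two_way)

    def __new__(mcls, name, bases, namespace, context=None, two_way=()):
        # Only the locally stored names reach the finished class.
        return super().__new__(mcls, name, bases, dict(namespace))

    def __init__(cls, name, bases, namespace, context=None, two_way=()):
        super().__init__(name, bases, namespace)


context = {"count": 0}


class Counter(metaclass=TransientTunnels, context=context, two_way=("count",)):
    count = count + 1      # reads and re-binds context's count
    limit = 10             # an ordinary class attribute


print(context, hasattr(Counter, "count"), Counter.limit)
```

The suite sees and re-binds count, yet the class ends up with no attribute of that name.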

Sanity check: suppose a method, of a class conforming to some API, defines
a class that must conform to that API, so (as discussed when introducing the
named pseudo-tunnel) has a method with a parameter with the same name as a
parameter of the method which defined the class; if the inner variant on the
method needs two-way access to the outer variant's parameter with the relevant
name, it has to get it via a named pseudo-tunnel; can it? We can two-way
tunnel the name
directly into the class but then we can't named-pseudo-tunnel it into the
method; so we'll need to named-pseudo-tunnel it into the class, then forward
the name used for this pseudo-tunnel to the method - either as a one-way
tunnel or as a two-way one - which will work, so we win.

If a class has a base with an attribute which coincides with a name
two-way tunnelled into the class, the base's attribute will be concealed for
the duration of the class statement (which will be unable to over-ride the
base's value for that name) but will subsequently provide the relevant
attribute for the class. If a class needs to define a method with two-way
access to what context provides under the same name as the method must have,
the class will have to use a named pseudo-tunnel to bring in all context's
names, then forward the pseudo-tunnel's name to the method (as either kind of
tunnel) so that the method can access its eponymous attribute off the named
pseudo-tunnel object.

No globals

The two-way form of tunnel should not depend on the relevant name having
yet been bound (when the function tunnelling it is defined) in the context
which defines the function; otherwise, defining recursive functions will
require special treatment, and defining a mutually-recursive collection of
functions would be severely tiresome. With this proviso, however, it becomes
possible to entirely do away with the global namespace.
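
The point about recursion can be seen in present-day python, where nested scopes look names up when the function is called, while a default-valued parameter needs its value when the function is defined. A sketch, with both builder functions invented for the comparison:

```python
def build_late_bound():
    # Each function looks the other up in context when called, so the order
    # of definition does not matter.
    def is_even(n):
        return True if n == 0 else is_odd(n - 1)

    def is_odd(n):
        return False if n == 0 else is_even(n - 1)

    return is_even


def build_snapshot():
    # A one-way tunnel needs a value when is_even is defined; is_odd has
    # none yet, so the definition itself fails.
    def is_even(n, is_odd=is_odd):
        return True if n == 0 else is_odd(n - 1)

    def is_odd(n, is_even=is_even):
        return False if n == 0 else is_even(n - 1)

    return is_even


is_even = build_late_bound()

try:
    build_snapshot()
except NameError:            # UnboundLocalError is a subclass of NameError
    snapshot_failed = True
else:
    snapshot_failed = False

print(is_even(10), is_even(7), snapshot_failed)
```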

It is perhaps sensible to retain the built-in namespace; but even that
could be delivered via some base-class from which module, class
and maybe even function evaluation namespace are derived, so that every
module can see the built-in names simply as part of this common heritage. Any
suite whose locals do not inherit the built-ins would need even these names
tunnelled in explicitly; arranging for every suite to be able to see the
built-ins (either by having them as a separate name-space, or by having all
namespaces inherit from a built-in-carrying base) is justified by
functionality delivered by the built-ins being universal, generic and
ubiquitously needed. The advantage of having the built-ins provided to suites
during execution by the interpreter, rather than having them inherited off a
base of every name-space, is that the latter would cause the built-ins to
appear as attributes of every object, which may fairly be deemed
over-kill.
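
The two options can be compared in a small model, using python's standard builtins module and collections.ChainMap. The class names and the choice of just two built-ins for the carrier base are assumptions made to keep the sketch short:

```python
import builtins
from collections import ChainMap

# Option 1: the interpreter consults the built-ins as a separate namespace
# behind the suite's locals; the locals themselves stay uncluttered.
suite_locals = {"items": [3, 1, 2]}
visible = ChainMap(suite_locals, vars(builtins))
largest = visible["max"](visible["items"])


# Option 2: every namespace inherits from a base carrying the built-ins
# (only two of them here), so they show up as attributes of every object.
class BuiltinCarrier:
    max = staticmethod(max)
    len = staticmethod(len)


class Module(BuiltinCarrier):
    pass


module = Module()
module.items = [3, 1, 2]
also_largest = module.max(module.items)

print(largest, also_largest, "max" in suite_locals, hasattr(module, "len"))
```

Both routes make max available, but only the second makes it an attribute of the namespace object itself.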