(Nevertheless, there are still some interesting ideas in these first several posts)

My goal is to build (a proof of concept for) a computer system in which the entire runtime is composed of objects that can be inspected and manipulated by the user at runtime, to the extent that even the language/compiler/system itself can be inspected & modified/redefined at runtime. This can be accomplished using a very lightweight self-defined object language. This idea came to me after reading The Deep Insights of Alan Kay and watching his video about Programming and Scaling, which inspired me to create a language & system consistent with the Kay/Smalltalk idea of objects being a recursion of the notion of "computer" (and vice versa). I'm also borrowing from the DCI philosophy that true "OBJECT"-Orientation is about the objects themselves (classes are not necessary).

I invite anyone to explore/steal my ideas and code on this topic freely without my permission. I just want to further the potential that this approach may offer to the future of software.

The core language will define simple objects (which consist of mappings to values, functions, or other objects), and will initially be implemented in JavaScript, wherein objects can be compiled directly to JSON.
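To make this concrete, here is a minimal sketch (the object and its properties are hypothetical) of what a "compiled" object might look like as plain JSON, and why that form already supports the runtime manipulation described above:

```javascript
// Hypothetical sketch: a "compiled" object is just JSON-shaped data, so it
// can be inspected and manipulated with ordinary property operations.
const clock = {
  hours: 10,
  minutes: 30,
  face: { shape: "round" }   // nested objects are just more mappings
};

// runtime manipulation: add a mapping, look one up by name
clock.seconds = 0;
const shape = clock["face"]["shape"];

// and it serializes directly
const json = JSON.stringify(clock);
```

Nothing here is specific to a compiler; the point is that JSON-shaped objects already satisfy the "add mappings / get mappings by name" requirements for free.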

The compiler will compile objects from source code into a form that allows them to be manipulated at runtime (e.g. add mappings to objects, get mappings by name, etc.), and will also generate a runtime interface for viewing/manipulating/creating all objects at runtime (both by the user and by code that has been compiled). The compiler & runtime will initially be in JavaScript, which already provides JSON as an ideal representation for the compiled objects (though the runtime interface will still be needed for the user).

I will then re-code the compiler in my object language and have it compile its own code (bootstrapping). This will allow the language to be extended or modified using only the language itself.

(At this point, I can now add code to the compiler for it to compile programs to various platforms (e.g. JVM, CLR, z80), and then I could compile the compiler itself to various platforms. This step can be done at any point in this flow, though it would be cool to have it working on multiple platforms from the start and all using the same exact code.)

From here, I can modify the compiler to compile its own code (excluding the code for this part) into the program being compiled, so that the output program has the compiler built into its runtime framework. I'd also modify the runtime interface for the compiled objects to utilize the compiled compiler, so that when objects are created at runtime, they can contain function-code that can be compiled on the fly! (Note: the code for the compiler should be small, because the language is small)

An interesting paradox occurs at this point:

The compiler and runtime-interface now exist as runtime objects, which means that they could be modified at runtime via the runtime interface, and the definitions of the system & language are now self-contained and modifiable within the running system. Though the compiler and source code that exist outside of the executing runtime remain as they were, it is now possible to keep program code within an executing runtime and compile it into execution (though perhaps this is pointless if you can just create and modify objects directly, anyway).

I'd have to have some UI features built into the underlying architecture to facilitate things like typing, drawing, I/O interactions, etc. (and how these are approached would differ depending on implementation: JavaScript, .NET, etc., but I'd try to have the underlying system provide the same interface to my runtime). Initially I might code up a JavaScript UI to provide the user access to the runtime-object interface; though once that is in place, I might use that UI interface to create runtime objects that provide a different view of the data, making (some of) the "built-in" UI obsolete. Or perhaps I'd find ways to code up parts of the same interface using my object language, and then it's all modifiable internally.

I've been batting variations of this idea around for a while now, so let me try (again) to nail down what I'm after:

The goal is a system/language/environment (as a proof of concept for Alan Kay's "Objects" vision) in which:

1. The user can mold the system like clay: EVERYTHING is an object (in the OOP sense), and the user can inspect/modify/create them directly. Not through an API or traditional "programming", but through some direct interface. For example, I just "Make" a clock, or a tool, etc. by directly viewing/editing/creating objects AT runtime!

2. The computer / system / language itself is (as much as possible) defined in terms of runtime objects, so the user (or any code) can modify/extend the system/language directly, including the code that controls how objects are wired, how code is interpreted, etc. For example, there will be pre-coded tools just sufficient to provide access to viewing/editing/etc. of objects, but then those can be used to create better replacements, or change the interface for such, etc. (There will have to be some base set of functionality for graphics/IO/etc., but the goal is to keep that minimal, and see what can end up being done in objects).

3. I'd also like to be able to bootstrap the entire system/language: that is, once I code up the base of it all, I'd rebuild the whole thing in its own language/objects. This (1) provides the ability to "drop" the compiler/kernel into the runtime itself so all of the system can be modified and inspected (at some level) and (2) once the source is all in its own language, I can code different backends to port it to different environments. (I'll start with JavaScript though, and then consider bringing it elsewhere: JVM, .NET, z80, MC68K, etc.)

Alan Kay's vision of "Objects":

THIS is my interpretation of what Alan Kay's "object vision" was about: "Objects" were originally meant as a tool to allow PEOPLE (even CHILDREN) to dynamically interact with the computer, making the computer a medium in which HUMAN mental models can be created and explored (in fact, usable software is software that captures the human mental model, and not this thing called "the programmer's mental model").

"Object Oriented Programming" was spurred by this vision; but because its inventors failed to fully understand this vision, OOP became a tool for PROGRAMMERS to do dynamic things to CODE. OOP should be called "Class"-Oriented programming, because it's structured around polymorphism/inheritance/etc. of DATA-TYPES, and this structure does NOT match the runtime structure of the OBJECTS. (JavaScript is a good "Object" Oriented language).

Implementation strategy:

I've come to realize two things that may differ from my original approach, but that would make a key difference in meeting the goals above:

A. Develop a runtime model, not a language. Because the user would be inspecting/modifying/creating runtime objects (rather than the "code" behind them), I just need to code a runtime system of objects and an interface for interacting with them. Do not worry about syntax and language, worry about interface and operations.

B. Code (objects have methods) is not compiled, but is stored in a format that is interpreted by the runtime. This is important to allow code to be modified/specified through a runtime interface. As per goal #2 above, the interpreter itself should be coded in terms of the same code / objects, so that the interpreter/language can also be modified/extended at runtime.

B.i. As per strategy #A, code is stored in an object representation rather than as text. Instead of ever "compiling" code, you'd use some interface to modify/create the AST directly. As per goal #2 above, this interface would be coded in terms of objects and modifiable/replaceable. (You could even code a compiler/interpreter to read from text).

B.ii. The "language" of the code should be very small and orthogonal to simple runtime operations on objects. This will allow code to be concise, interpreted quickly, and easy to understand in terms of the system/objects.

Other than that, I'd still support a model of "objects" similar to JavaScript, where each "object" contains mappings from strings to other things (values, objects, functions, etc.). This allows new entries to be added to objects (Foo.x = 5, whether Foo previously had an x or not), and for entries to be looked up by string (Foo["x"] in JavaScript).

Message-Passing approach:

I've decided that a "message-passing" paradigm (as with Smalltalk) based on a series of symbols would suit these needs very well, and tie member-access (Foo.bar), method-invocation (Foo(bar)), and object definitions into a uniform pattern, while also allowing most of the "language" to be defined/modified/extended in terms of runtime objects (which, like all others, can be accessed/modified at runtime).

Essentially, code would consist of a series of symbols, where each symbol passes a message to the result of the previous evaluation. Generally, the first symbol gets passed as a message to the global (or local) context, which responds by returning the object or value or function defined with that name (or the null/"empty" object); the next symbol gets passed as a message to that, and so on. When an object is passed a message, it returns the object/value/etc. defined with that name as a member of the object. When a function is passed a message, the function is evaluated with the message as an argument, and a value is returned. Multiple arguments can be passed to a method by grouping multiple expressions (separated by a comma symbol) in parentheses. Parentheses also start a new expression, and the result of the entire expression is passed as a message to the previous symbol. A comma also starts a new expression.

An example might look something like this (Where Foo[x] means Foo.(contents of x), e.g. Foo["a"] is Foo.a, and I'll put comments after // just for this):
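The original example listing appears not to have survived here; as a stand-in, here is a minimal JavaScript sketch of the evaluation rule just described (sendMsg, evalSymbols, and the sample context are all hypothetical names of my own):

```javascript
// Each symbol is sent as a message to the result of the previous step:
// objects respond by returning the named member; functions respond by
// being applied to the message; anything else yields the "empty" (null) value.
function sendMsg(receiver, msg) {
  if (typeof receiver === "function") return receiver(msg);
  if (receiver !== null && typeof receiver === "object" && msg in receiver) {
    return receiver[msg];
  }
  return null;
}

// evaluate a series of symbols against a starting context
function evalSymbols(context, symbols) {
  return symbols.reduce(sendMsg, context);
}

// a tiny "global" context: Foo bar baz should walk down to 42
const context = { Foo: { bar: { baz: 42 } } };
const r = evalSymbols(context, ["Foo", "bar", "baz"]);
```

This deliberately ignores parentheses, commas, and argument grouping; it only shows the core left-to-right message chain.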

A big idea here is that, because an entity decides how to respond to a message, then an object can be defined as a function which returns certain items for specific symbols, and symbols like ":" and "," and "+" as functions which perform the appropriate operations. This allows symbols to be defined however you want (kind of like Forth), so you could define DSLs like {email blah@blah "The message" cc Joe}. When a function is executed, it could provide a new context in which to look up and define symbols (local vars), and I could chain contexts together so that global/outer contexts provide default behaviors. I could also treat objects as contexts which chain to the current executing context, or even allow this link to be defined explicitly, thus allowing an object-prototyping mechanism!
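Both halves of that idea (symbols as ordinary function entries, and contexts chained so outer ones provide defaults) can be sketched in a few lines of JavaScript, using the prototype chain as the context chain (the names here are illustrative):

```javascript
// Symbols like "+" are just entries in a context object, and contexts
// chain via prototypes: a local context falls back to the global one.
const global_ = { "+": (a, b) => a + b };

const local = Object.create(global_);  // chained context
local.x = 5;                           // a "local var"

// looking up a symbol walks the chain; applying "+" is just calling its entry
const result = local["+"](local.x, 2);
```

Redefining "+" in `local` would shadow the global definition for that context only, which is exactly the Forth-like redefinability described above.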

The root of this all would be some global context (which would be the whole system, I suppose), which would be pre-set to respond to certain symbols a certain way; but this global object would just be another object in the system, so it could be modified directly, and the default behavior/language changed in the runtime! Also, the behavior of even looking-up a symbol would be coded as behavior of some object, so even that can be modified.

(I must stop here and go to bed. Otherwise I'd have edited the above to be less prose-like and more spelled out).

You can see in the sendMsg function that I am treating arrays as functions, but also attaching a hook from the "global" object to "window", and treating native functions as functions too, so that there is an easy direct way to define new things into the language (for JavaScript, anyway). I'll see where this takes me, and I might revisit how I put the code together. Ideally I'd switch some of those direct checks for '(', ')', ',', etc. into simple entries within the "global" object, which would allow them to be redefined at any level. I am still figuring out how exactly I want to chain contexts, and pass arguments.

The real goal behind the message passing language suggestion in the previous post was that the code in objects needs to be easily viewable/editable/runnable, and there are really multiple ways that can be accomplished. Here are some of those ways:

1. Store the syntax tree and have a built-in interpreter for it. Everything is then editable, etc., but aside from this interpreter and a small kernel, nothing is ever compiled down into the underlying "machine" code. This makes the kernel and interpreter unmodifiable.

2. Similar to 1, but code is represented/stored in the same form as data (as lists, blocks, objects, etc.) rather than having a special representation for code. This is like Lisp, Forth, and Rebol. Forth never compiles down to machine code, but claims to make programs much smaller and often faster & more efficient due to the fine-grained level of control it provides using just a few building blocks.

3. Add Just-In-Time compilation to either 1 or 2. When code is defined/edited (or when data is used as code), it is compiled and the compiled code/function is cached with the source code/data. When "code" is changed, it is recompiled (or perhaps just part of it is). This WOULD allow the kernel and interpreter to be visible and modifiable by its own runtime environment. This probably cannot be used backwards (i.e. JIT decompile) to expose external code to the runtime, because not all external code might map onto the provided code constructs, and (for JavaScript) this would break closures.

I also have to worry about closures/context. That is, if code (a function) refers to some "x" that is defined in the surrounding context (but not within the function itself), what happens when that code (function) is passed elsewhere and executed outside of that context? If "code" is stored as an array of symbols for the interpreter (or some object that it is passed as a message to) to evaluate, then does it see "x" and look it up in the current context, or does it see a reference to some object (that was just called "x" only when the code was built)?

In JavaScript, you can "edit" a function by toString()-ing it (which gives you its source code WITHOUT the function declaration), edit the code as a string, and then "eval()" it inside a new function-declaration string to get a new function; but this breaks closures: if the function originally referred to some "x" in its code, that "x" gets mapped to whatever x was in the context it was originally created in; but when the function is "edited", then "x" is re-mapped to whatever it means in the new context. The underlying representation of the function is thrown out and recreated all over again!
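A small runnable demonstration of that breakage (variable names are arbitrary):

```javascript
// "Editing" a function via toString()/eval() re-binds its free variables
// to the new surrounding context, discarding the original closure.
function make() {
  const x = "original";
  return function () { return x; };  // closes over make's x
}

const f = make();
const src = f.toString();            // source text, WITHOUT its closure

// re-create the function where "x" means something else
const x = "rebound";
const g = eval("(" + src + ")");     // the "edited" copy resolves x here
```

`f()` still sees the captured "original", while `g()` sees "rebound": the edit silently rewired every free variable.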

When I provide a means to edit code directly in runtime objects, the underlying structure needs to be preserved. If captured variables are stored as references, then they remain intact if that part of the code is unmodified, even if surrounding code changes.

The problem with storing references though is that you lose the names and drift away from message passing, and it becomes harder to serialize runtime objects as data (cannot easily "save" things in the system; it all resets when the whole thing shuts down). This might be alleviated by having code have a context object associated with it, and then symbols are looked up (or passed as messages to) the context object. There can be a hierarchy of contexts, too. This also exposes the "closure" mechanism, which is good, because then perhaps even those workings can be something that is defined/modifiable as part of the runtime.
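A sketch of that alternative: the "closure" becomes an explicit context object attached to the code, with a hierarchy of contexts formed by chaining (all names here are hypothetical):

```javascript
// Instead of a hidden closure, a code-object carries an explicit context;
// free symbols are looked up in it by name at call time, so names are
// preserved, the context is serializable data, and it remains editable.
const outer = { greeting: "hello" };
const inner = Object.create(outer);  // context hierarchy via chaining

const codeObject = {
  context: inner,
  body: function () { return this.context["greeting"]; }
};

const before = codeObject.body();    // resolves through the context chain
outer.greeting = "goodbye";          // the "closure" is now just data
const after = codeObject.body();
```

Because lookup goes through an ordinary object, even the closure mechanism itself is exposed for inspection and modification, as the paragraph above hopes.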

1. Create a small runtime "Objects" API:
- JSON Objects (and their properties) can be created & modified on the fly at runtime (think JavaScript).
- Code is represented (as ASTs) as such objects ("Code-objects").
- A built-in interpreter (or JIT compiler) that can run such code.
- The API exposes all such objects and operations at runtime.

2. Create the runtime API in terms of itself:
- Code-objects are directly tied to natively-compiled version of themselves, and are JIT-recompiled when modified.
- All native code for the API is tied to pre-initialized code-objects, thus exposing the API (and runtime) itself for runtime modification.
- Modify/expand the API at runtime (can even expand its native interface & add code for decompiling native code into runtime code-objects).

3. Expose the runtime API to the user through a simple UI:
- The UI should dynamically expose all objects and API operations.
- The code for the UI is tied to code-objects, and thus exposes itself (and everything else) for runtime modification by the user.
- Modify/extend the runtime & UI directly (no more need for development tools!)

4. Now you have a system that can be modified/viewed/etc. DIRECTLY!
- Once started, the runtime is kept running, and any further changes occur at runtime through the API & UI.
- You can now view and modify everything within (and about) the system from a user perspective.
- Other views for the runtime can now be created in addition to (or replacement of) the initial UI & API.
- Can now even remove portions of the runtime that were only needed to create the initial runtime (perhaps even by code pre-initialized to do so).

5. Create a distributed environment based on this runtime. For this to work, requirements need to be added (or conventions followed), as follows:
- All objects should be self-contained (e.g. not contain external references). This allows them to be serialized and transferred (e.g. over a network) as values between environments.
- An "execution environment" is modeled as an object, and objects may contain their own (nested) execution environments:
. . - Allocations, execution state, etc. are all properties of the environment-object.
. . - An executing program can thus be serialized from an object and transferred as a value. An execution environment can now be inspected as a value, or nested within another execution environment.
- This is what Alan Kay meant by "objects", and how they are a recursion on the notion of "computer" all the way down.
- The distinction between networked computers and networked objects within a computer becomes much more trivial.
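The runtime "Objects" API from step 1 might be sketched as follows. The operation names, and the {op: ..., args: ...} shape for code-objects, are illustrative only (that shape does echo the convention used in the later posts here, before it was revised):

```javascript
// Minimal runtime API: plain prototype-free objects, CRUD operations on
// them, and code-objects (AST nodes) run by a tiny built-in interpreter.
const O = {
  create: () => Object.create(null),             // clean object, no prototype
  get: (obj, key) => obj[key],
  set: (obj, key, val) => { obj[key] = val; return obj; },
  has: (obj, key) => key in obj,

  // a code-object describes an operation and its arguments; the interpreter
  // evaluates nested code-objects first, then applies the named operation
  exec(ctx, code) {
    if (code === null || typeof code !== "object") return code;  // literal
    const args = code.args.map((a) => O.exec(ctx, a));
    return O.ops[code.op](ctx, ...args);
  },
  ops: {
    add: (ctx, a, b) => a + b,
    lookup: (ctx, name) => ctx[name]
  }
};

// a program is just data: (lookup x) + 2
const ctx = { x: 40 };
const program = { op: "add", args: [{ op: "lookup", args: ["x"] }, 2] };
const result = O.exec(ctx, program);
```

Since `O.ops` is itself an ordinary object, new operations can be added (or existing ones replaced) at runtime through the same API, which is the whole point of step 2.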

Implementation Notes:
- Do not get stuck on specific implementation choices that can be changed later, since everything will be openly modifiable.
- It is OK to build a layer that is not modifiable, because such a layer must exist (this is either the hardware, or an object-based "soft-machine"). The main importance is that things WITHIN the runtime are open, and that the runtime objects (and the API operations for them) are exposed and modifiable.
- Do not get stuck on LANGUAGE: Objects and operations are exposed directly rather than through a syntax. There is no "source" representation of entities, just the runtime entities themselves.
- Code (within code-objects) would invoke the runtime operations to create objects rather than "declaring" entities. This may include copying pre-existing ("static") objects stored directly within the code (or elsewhere).
- Code-objects may contain direct references to objects, rather than having to "identify" them indirectly. However, this makes it hard to move (or supply) code to different contexts and makes it harder to reason about code, so loose coupling is preferred.
- Loose coupling within code can be achieved by having code-objects keep a reference to an activation-record object, and mapping identifiers in code to entries in this object.
. . - Activation Records would contain allocations for "local"/temp values within the code.
. . - Activation Records would act as a "root" node from which the code can reference out into the runtime.
. . - Code-objects and their Activation Records can be modified (externally or from within their own code) as any other objects.
. . - This mechanism for mapping code-identifiers to objects in the Activation Record might be controlled by code that is modifiable at runtime. Perhaps code invokes the mechanism directly, and thus other code can be provided for custom look-up behavior (thus allowing a form of DSL implementation).
- All objects should be part of some tree, so that the runtime structure of all entities is accessible (externally or within code). For example, in JavaScript you cannot change a function without breaking references to the closure-context; but if this is exposed in the form of an Activation Record, then it can be kept intact and/or edited directly.
- Function arguments can be passed by setting values directly into the function's activation record. Multiple instances / stack-frames of the same function can have their own activation record objects, but reference the same code-body object.
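The activation-record scheme in the last two notes can be sketched like this (callWith and the record layout are hypothetical):

```javascript
// A code-body reads its identifiers out of an activation-record object;
// arguments are "passed" by setting entries into a fresh record, and many
// frames can share one code-body object.
const codeBody = {
  run: (record) => record["a"] + record["b"]   // identifiers resolve by name
};

function callWith(body, args) {
  const record = Object.create(null);  // fresh frame for this invocation
  Object.assign(record, args);         // pass arguments: just set entries
  return body.run(record);
}

const r1 = callWith(codeBody, { a: 1, b: 2 });
const r2 = callWith(codeBody, { a: 10, b: 20 });  // same code-body, new frame
```

Because a record is an ordinary object, it can be inspected or edited mid-flight like anything else in the runtime, which is what keeps the closure-context from being opaque.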

Edit - Modeling Execution State:
Execution state should be stored separately from the code-objects being executed. There would be one object containing a list of stack-frames, representing the call-stack, with each frame containing a reference to the code being executed (function & position) and an activation record of local variables & arguments. Code would be allowed to access/manipulate the execution stack directly. This allows for the following benefits:
- Code can be modified/serialized separately from its execution state.
- Serializing execution state thus becomes a separate matter (or a non-concern).
- This makes it easier to model execution in the environment itself, because it all comes down to modifying the execution stack object, which only requires the use of the basic object operations (e.g. get/set property). For example, handling an "if" by setting the execution point to the next piece of code within or after the block. Perhaps instead of having an "if" or a "while", etc., code would just directly specify the operations associated with those things.
- This opens the possibility of having multiple execution threads at once, each with references to parent/child threads. (Whether or not more than one thread executes at a time is a separate matter. JavaScript has threads while only supporting single execution). Per the previous point, code might even accomplish this explicitly, whether or not there was "built-in" support for it to begin with.
- This also allows closures to be modeled as code-objects which populate the stack-frame with the stored data before calling the main code-object.
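A minimal sketch of that separation (the stack/frame shapes are illustrative, not a spec): the call-stack is just an object, a frame just references code, a position, and an activation record, and "executing" is ordinary property manipulation.

```javascript
// Execution state lives in its own stack object, separate from the code.
const stack = [];

function call(code, record) {
  stack.push({ code, pos: 0, record });      // a stack-frame is just an object
}

function step() {
  const frame = stack[stack.length - 1];
  const instr = frame.code[frame.pos++];     // advance the execution point
  instr(frame.record);                       // run one instruction
  if (frame.pos >= frame.code.length) stack.pop();  // frame done: "return"
}

// a tiny two-instruction "program" over its activation record
call([(r) => { r.x = 1; }, (r) => { r.x += 1; }], {});
const rec = stack[0].record;
step();
step();   // after this, the frame has been popped
```

Since `frame.pos` is plain data, an "if" or "while" really is just code that sets the execution point, and a serialized copy of `stack` is a suspended computation.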

Edit: ...and again. Added a mechanism to O.exec to "break" out of (an arbitrary number of layers of) code, and to abort execution with an error message. Currently, this is not exposed to the code being evaluated within O.exec, and I don't see a way to do so that does not require hard-coding the instruction into the exec function itself (otherwise the indicator would just get nested as the "result")

Here's another update (I'm going to stop editing the same code above).

Changes:
* Updated the "break" instruction to break at a label instead of by a given number of levels of nested execution. Each code entity may contain a "label" property for this comparison.
* Added a "return" mechanism. The exec code considers a "call" to be a code-object that was accessed by a lookup, rather than directly nested within the "outer" code.
* Wrapped object-literal definitions in calls to O.copy, which constructs empty objects using Object.create(null). This results in a CLEAN object without any prototype chain.
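For anyone unfamiliar with the Object.create(null) trick, this is what "clean" buys you:

```javascript
// Object.create(null) yields an object with no prototype chain at all:
// every visible key is one the runtime put there, with no inherited noise.
const clean = Object.create(null);
clean.x = 5;

const plain = {};  // by contrast, a literal inherits from Object.prototype
```

This matters for a system built on "get"/"has" by name: with a clean object, property lookups can never accidentally find inherited entries like `toString`.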

Changes:
* I actually tested stuff, and fixed some errors
* Added a prototype mechanism and updated the "get" and "has" behaviors accordingly
* Updated "exec" to return an error (instead of the code object) if the specified "op" does not exist
* Currently in the process of deciding how to link contexts between calls. For now, inner calling-contexts have a prototype chain to the outer calling contexts by default. Still unable to "get" the current context using the existing mechanisms (unless you have a direct reference to begin with). I will fix this and update it later.

...I'm getting the feeling that I'm more or less just reinventing LISP. However, the next step will be to build a UI out of the same constructs, which can then be modified in place using itself, and (through iterative changes) result in the goal of a "malleable" software system.

Changes:
* Replaced the {op:opname, args:arglist} convention with just {name:arglist}
* Added ability to "get" the current executing context, code-object, or arguments. Also, the executing context now contains references to the args and parent-code by default (so you can get the context and then get those properties of it; but the built-in "args" and "code" commands are simpler).

To Do:
* Perhaps I can tweak the "return" behavior to just break out of the "code" reference within the current executing context, rather than trying to determine the base "function" by how it was called (though it essentially results in the same thing)

...I'm getting the feeling that I'm more or less just reinventing LISP.

LISPers are convinced that this statement accurately describes the last 30 years of work in programming languages. You're going to need some good metaprogramming facilities next, which means you might want to read this.

LISP was onto something big: the code, and even the language itself, can be inspected & modified by itself; and thus a hope for the "stuff" of computers (objects, widgets, programs) to also be grown and manipulated in the same ways -- not just from a programming perspective, but from a USER perspective. Those possibilities, and not LISP itself, are what make LISP so profound.

I think the theory that LISP eventually shows up (in some convoluted form) in any sufficiently complex software could more accurately be stated as: "living structure" of the kind of power I mentioned above is what shows up repeatedly, and because LISP maps more directly to this nature than any other language, you can generally map (or equate) those patterns to LISP in some form.

I think even the vast majority of LISPers miss this fact, and understand the "power of LISP" only within the context of programming & coding, which is an extremely limited view if you consider the larger possibilities of extending this power to the end-user: being able to "make" or change anything in the computer as you see fit, through some DIRECT means of interaction (or the illusion of it), and even change how that process of interaction works (again, through direct means). Take the "power of LISP" and pull in HUMAN expression, not just code.

Two take-aways: (1) The "power of LISP" is actually a more fundamental property of "living structure", and fundamental properties and behaviors have been found that are consistent in ALL living structures, whether in nature or in man-made things. LISP has some of these properties, and in fact cannot present the power that it does WITHOUT having these properties; they are universal and can be studied directly in all "living" structures. (2) This power can be extended to all aspects of software through to the end user to create "living structure" in the world in ways yet unrealized. That is, LISP may have been successful in bringing some amount of "living structure" to the CODE, but this has yet to happen with the actual resulting software itself. When it DOES, then the "real" computer revolution can begin.

I'm hoping to achieve some of "(2)" by creating an open-ended structure of code & data (LISP has done this, and hence the similarity); but I am THEN hoping to extend this through a user interface. From that point, I can modify all the code and the UI by interacting directly through the UI, and not necessarily just because I can "view the 'code' ", but because all the pieces of the system correspond to objects that make sense in a direct-representational manner (and not just in the context of "code").

Part of the reason why LISPs have been able to pull off their grand unification of code and data so effectively is that they are also syntactically trivial. Smalltalk had a lot of similar ideas in terms of "objects all the way down" (vs lists all the way down), but I think this was unfortunately obscured by some fairly messy syntax.

Also, as a slightly different note, "object orientation" (without classes) is semantically equivalent to closures. Adding in classes and metaclasses is just adding additional layers of closures. Recognizing this is important to clear thinking on the matter of designing a system that is "objects all the way down".

"object orientation" (without classes) is semantically equivalent to closures. Adding in classes and metaclasses is just adding additional layers of closures. Recognizing this is important to clear thinking on the matter of designing a system that is "objects all the way down".

Agreed. In fact, here's an excellent article from vpri.org that does just that. Also, in JavaScript, I like to use closures over a captured "me" instance rather than using "this", and I think that's the same idea.
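The "objects are closures" equivalence, in miniature, using that captured-"me" style (makeCounter is just an illustrative name):

```javascript
// An "object" built purely from a closure: the instance state is the
// captured "me", and methods reference it directly instead of "this".
function makeCounter() {
  const me = { count: 0 };
  me.increment = function () { return ++me.count; };  // no "this" needed
  return me;
}

const c = makeCounter();
c.increment();
c.increment();
const detached = c.increment;  // still bound to me, unlike a "this"-based method
detached();
```

Every call site sees the same captured state, even when the method is detached from the object, which is exactly why classless object orientation and closures come out semantically equivalent.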

As for syntax, I won't disagree. However, once the elements of code are exposed through an interface that is 1:1 with the AST structure, syntax becomes irrelevant. That is, there is less a concept of textual representation and parsing than there is of just direct representation and manipulation of the underlying structure. Of course, LISP did textualize that very very well.

Having just spent 18 months manipulating C++ ASTs, both through libclang, and as Racket syntax objects (parsed S-Expressions with source metadata), I can tell you that trivial parsing is only half the battle. Any visitor for your AST needs to implement special-cased logic for every core form in the language (if you're using a typed AST, this means essentially for every node-type, or at least a large subset of all node-types). Now compare the grammars for Racket (in which every language feature that doesn't appear in a fully expanded program is implemented as a macro), and C++.

I appreciate that advice (it's awesome that we can even be having this conversation -- Cemetech++), but it might not be entirely applicable to what I'm after:

There is no parsing, because one essentially just edits an AST directly. This is more like a LISP program modifying (e.g. CONSing) data/code entities at runtime, and less like evaluating source code at runtime. I think you understood that already, but the point is a self-modifiable system rather than a simpler compilation model. In fact, there is no compilation either, because the runtime executes the AST directly.

As for the "core forms" of the "language" ... neither of these terms refers to anything concrete, because the whole system is self-defining and self-modifying. This is more like exposing the interpreter (or JITer) for on-the-fly modification at runtime, and less like having a compiler for X written in X. It's kind of like having a MOP (Meta-Object Protocol) that is its own MMOP.

So anyway, I don't know how much sense it makes to talk about visitors and core forms, just as it doesn't make sense to talk about "reflection" when the source constructs do not differ from the runtime constructs.

Now there WILL be a base structure and base operations (I'm going for CRUD operations on ad hoc objects, whereas LISP has CONS/CAR/CDR on lists); but these operations are provided through a message-passing paradigm rather than through a "spec" or grammar. There is less "case"-ing over a set of operators and more invoking operations by name.

One other thing which distinguishes this work from a traditional "language" or runtime is that it is meant to evolve over time. This is different from "starting" from a predictable initial state every time a program is run, even if the program is allowed to modify it from that point (e.g. as with a MOP). Instead, it keeps running (and/or saves its state), somewhat like a never-ending debugging session. This is what an OS does. However, with this, you don't NEED an OS, because whatever your computer does or whatever is on it is all the same kind of "stuff".

It doesn't stop there, as one would expand this system through UI interactions in meaningful ways. For example, when one edits an image, one IS editing the data directly, but the display and tools and idioms are in terms that make the most sense for what it is, rather than just editing raw data values. In this system, you could define and modify whatever idioms & tools for whatever you want, and the hope is to arrive at something that is very easy to make do whatever you want in the most meaningful ways. See these articles here for good examples in the right direction.

The great thing is that, once this kind of practice becomes better understood, then it would take VERY little to get a system up & running with whatever you want. See this article for a somewhat related example.

... I actually recently came up with a "story" for the progression to such a system, and the logic of what kind of system it requires, which I will post next. I took several stabs at this already in previous posts here, but those were mostly stabs at specific solutions rather than at the problem itself (and the gain from solving it), which I think I have now come much closer to describing directly, and which I will share next.

A "computer" can be thought of as an environment for interacting with virtual "Artifacts" (images, videos, documents, etc.). "Tools" (programs) provide the ability to view, edit, and create artifacts (an image-editor, a document-editor, etc.). Thus, the kinds of artifacts one can make and the ways in which one can interact with them depends on the available tools. If you want a new kind of artifact, you need a new tool; if you want a new way to edit (e.g. your current image-editor does not provide a "blurring" effect), you need a new tool. Where do these tools come from? Someone else has to have already made them for you. In other words, all the things that one "can do" with a computer are not made available simply by "having a computer" (or device). That's ridiculous, and there's no reason it has to be that way.

For any raw material, the ways it can be crafted are limited only by the tools needed (and available) to do so. However, if those tools (and the tools for making those tools) are all made of that same material, then there is nothing to prevent one from crafting that material in any way imaginable! Computer artifacts and tools are made of the same stuff (binary bits), ergo the tools are also artifacts, ergo one should be able to create & modify whatever tools one desires, and thus also whatever artifacts one desires. Spoiler alert: Computer programming does NOT fit this definition! ( Here's why )

Analogy / Side Story:
Imagine a type of clay that can be made as soft, hard, rigid, or flexible as you like, simply by applying the right kind of pressure. One can make just about any artifact from it, but the quality will be limited to the precision of one's fingers. However, one can make rudimentary tools that could in turn be used to make more refined tools, and thus more refined and complex artifacts, essentially at will, and essentially "by hand". All such tools are freely modifiable by the same means. To do anything that can be done with the clay, all you need is the clay itself. Software has this same potential.

It sounds like we need a tool-tool (a tool for making and editing other tools), and from there one can extrapolate whatever tools one wants. So, let's say we have this tool, and it provides operations for building a software tool (menus, buttons, triggers, graphics, behaviors, etc.); but what happens when we need to add features to a tool that this tool-tool does not support? If we need a different tool-tool every time, then we are back in the same situation we started in. Thus, what we really need is a way to edit the existing tool-tool. (Uh-oh, does this sound familiar?)

A tool-chain is not the answer either, because then it's the same situation at a higher level: either the tool-tool-...-tool needs to be editable, or we are stuck with a decided set of operations (as if one can really predict everything that will ever be needed) and thus forfeit having a truly evolvable system. (Also, it's ridiculous to have to edit A to edit B to edit C, etc.)

So, if unlocking the full potential of computer systems (i.e. being able to build & interact with whatever one wants and however one wants) requires open-ended tooling, but a tool-chain is not the answer, then what we really need is a tool that can modify every aspect of itself.

Such a tool could be that universal "clay" for computer stuff: If you have the clay, then you also have every other tool you want. Thus, such a self-modifying tool IS a "computer", and could thus obviate (replace) the need for an "Operating System". Such a tool fulfills Alan Kay's "dynabook" vision and Bret Victor's "dynamic images" tool.

So both of these posts suggest to me that you should read up on phase-separation, macrology, and desugaring, which together are the core ingredients for what you're trying to achieve.

I understand that your goal is direct AST modification, through "message passing", and that the goal is for the system to be extensible, but at some level you're going to need to emit machine instructions, and the sanest way to do that is to define a subset of your AST language which is directly intelligible to some abstract virtual machine, and delegate responsibility for emitting machine code to that abstraction. Then, the rest of the node types have associated code that handles the process of self-modification (i.e. "desugaring") into the subset which is intelligible to your abstract virtual machine. In the case of Racket or ML, this abstract virtual machine is basically just an interpreter for a typed lambda calculus, but in your case you may prefer a different model of computation.

Macros are the code responsible for the "self modification" of the language (i.e. AST manipulations, or "functions from source code to source code"), and they are a pretty well-studied problem in the PL community. The key ingredient for expressive macros is hygiene. This is essentially the property that syntax trees should be compose-able without side-effects (where side-effects means "accidental capture of identifiers").
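The capture problem that hygiene guards against can be sketched in the thread's JavaScript/JSON terms (all names hypothetical): a "macro" that introduces a temporary variable will silently shadow a user variable of the same name unless the temporary's name is freshly generated.

```javascript
// A classic gensym: each call yields a name no user program contains.
let counter = 0;
function gensym(prefix) { return `${prefix}_${counter++}`; }

// A "macro" as a function from AST to AST, expanding (or a b) into:
//   let t = a; if t then t else b
// The temp name is generated fresh, so it cannot capture the user's "t".
function expandOr(a, b) {
  const t = gensym("t");
  return {
    op: "let", name: t, value: a,
    body: {
      op: "if",
      test: { op: "var", name: t },
      then: { op: "var", name: t },
      else: b,
    },
  };
}

// Even when the user's own expression mentions "t", no capture occurs:
const tree = expandOr({ op: "var", name: "t" }, { op: "lit", value: 0 });
console.log(tree.name); // a fresh name like "t_0", distinct from "t"
```

Full hygiene systems (as in Racket) do this automatically and bidirectionally, but gensym conveys the core idea.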

The last insight, which is directly relevant to your project, is that there are two benefits to the macro-language sharing an implementation and grammar/AST-structure with the target language. First, it reduces programmer overhead, since they only need to learn one language and not two. The second, and more critical, is that it pretty trivially allows you to extend your definition of macros to allow macro-generating macros (i.e. "functions from source code to (functions from source code to source code)"), and collapse your tool-tool-...-tool-chain into a single tool capable of recursive self-modification via "phase-separation" (also called "multi-stage compilation"): each piece of code and each identifier in your AST is "aware" of its depth in the recursive chain, and that chain is then recursively evaluated until you end up with a complete program running at the base phase, which is what we traditionally think of as "run time". The two languages I'm familiar with which really do justice to this notion are Racket and MetaML. Of the two, Racket has the implementation I'm more familiar with (and appreciate more, since it's a LISP), but MetaML has the better explanation.
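In the plain-data AST terms used earlier in the thread, a "macro-generating macro" is just a function that returns another AST-to-AST transformer -- which is what collapses the tool-tool-...-tool chain into one mechanism. A minimal sketch (names hypothetical):

```javascript
// Stage 2: given an operator name, produce a stage-1 transformer...
function makeWrapper(opName) {
  // ...which, given an AST node, wraps it in that operator:
  return (node) => ({ op: opName, arg: node });
}

// "Generating a macro" and "applying a macro" are both ordinary calls:
const negate = makeWrapper("neg");
const tree = negate({ op: "lit", value: 7 });
console.log(tree); // { op: 'neg', arg: { op: 'lit', value: 7 } }
```

Phase-separated systems like Racket and MetaML add the bookkeeping of which stage each identifier belongs to; the functions-returning-functions shape is the kernel of it.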

No, no, if what I was trying to achieve was generally well understood or has already been done, then we'd have true dynabooks. One would poke and prod with their fingers (rather than write code) and get it to do whatever one wanted, easy-peasy.

I know how I've made it sound by focusing so much on the mechanism and "language" (and maybe I started with the thought that a "language" is what's needed). A magic AST is not the goal, though it happens to be the closest description to the means.

When I discovered the works of [url=worrydream.com]Bret Victor[/url], I was blown away that there was someone else out there (aside from Kay himself) who really understood this vision well enough to give near-concrete examples of what some of this might look like. Nothing else I have ever found on compilers or languages has ever come close to envisioning software like he has. Maybe some of the same flavor of goals exist, but it's almost always to make programming easier; not to redefine software as something to be controlled as an end-user.

To get there, we need to start with something moldable; from there, we need to depart from just molding a language and start molding a user environment, until the user experience is as moldable as the crude building blocks needed to even begin to have something that can evolve to that point.

EDIT: LISP did not quite get there, because it's used & thought of as a language rather than as a system; but if the strengths & dynamics of LISP were extended through to the end user, it would be a lot closer. The closest thing I've seen to this is VIM, but that's only almost scratching the surface of what could be. Smalltalk perhaps came closer, but somehow while being even more language-tied than LISP. The way that the concept of "classes" was built into the system ... YUCK! Though not nearly as gross as the JVM and CLR, which have class structure built into the VM assembly. Very short-sighted.

Anyway, one does need dynamic building blocks to get there, so yes we do focus on that; but maybe I can show how dynamics at the language / compiler level alone will not get us there:

elfprince13 wrote:

...phase-separation, macrology, and desugaring ... are the core ingredients ...

... for compiler & language magic, but the end product is still whatever you make of it regardless of the language used. I'm talking about redefining how people interact with software, which I'm trying to show depends on the end-product being modifiable (versus just the code used to generate it). These ingredients are focused on static elements of source code.

elfprince13 wrote:

At some level you're going to need to emit machine instructions, and the sanest way to do that is to define a subset of your AST language which is directly intelligible to some abstract virtual machine, and delegate responsibility for emitting machine code to that abstraction. Then, the rest of the node types have associated code that handles the process of self-modification (i.e. "desugaring") into the subset which is intelligible to your abstract virtual machine. In the case of Racket or ML, this abstract virtual machine is basically just an interpreter for a typed lambda calculus, but in your case you may prefer a different model of computation.

... Actually, that's not far off! Specifically, I plan on picking a very simple structure with very simple operations, and having those things be interpreted directly, rather than emitting machine code. The "interpreter" itself, though, needs to be made of these same pieces so that it can also be inspected and changed by the very code that it is interpreting. This would necessarily involve some way to go back & forth between the two ... but that part does not need to be great, it just needs to be possible, and then it can be made better (if needed) later; but the "base" (kernel) engine would at least be wrapped in the same abstractions that are fed into it. This is similar to the FORTH language, though I don't think that exposes the "kernel" for modification. (Side thought: if I did implement a 2-way JIT-er, then one could re-code the "kernel" as if it were no different than any other piece ... but anyway, that's a tool that could be made later, and is thus not worth focusing on to get this started; this is all very much about having enough of something to get a process started so that it can evolve freely.)
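The "interpreter made of the same pieces it interprets" idea can be sketched in JavaScript terms (all names here are hypothetical): the dispatch table is itself an ordinary, inspectable object in the system, so running code can reach back and redefine how it is being interpreted.

```javascript
// The interpreter's dispatch table is plain data, visible to the code it runs:
const interp = {
  handlers: {
    lit: (node) => node.value,
    add: (node) => evaluate(node.a) + evaluate(node.b),
  },
};

function evaluate(node) { return interp.handlers[node.op](node); }

console.log(evaluate({
  op: "add",
  a: { op: "lit", value: 2 },
  b: { op: "lit", value: 3 },
})); // 5

// Interpreted code can modify the interpreter mid-flight -- here,
// wrapping "add" so every addition is logged:
const oldAdd = interp.handlers.add;
interp.handlers.add = (node) => {
  const r = oldAdd(node);
  console.log("add ->", r);
  return r;
};
evaluate({ op: "add", a: { op: "lit", value: 1 }, b: { op: "lit", value: 1 } });
```

In a real bootstrap, `evaluate` itself would also be expressed as one of these objects; this sketch only shows the dispatch table half of that loop.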

elfprince13 wrote:

Macros are the code responsible for the "self modification" of the language (i.e. AST manipulations, or "functions from source code to source code"), and they are a pretty well-studied problem in the PL community.

... This, again, is focusing on modifying source code, rather than the end product (object code). I want something that is modifiable as it sits. This is part of the reason that something like a "runtime AST" is necessary for what I'm after. As far as code that manipulates other code or generates other software, that's no different than any other code: thinking in LISP, it's just a function that takes a list and processes it in some way; and that way can be a grammar unique to that function, and thus one can have a DSL right there. What makes a LISP macro different from other functions is that it is evaluated at compile time, and that its arguments are not "evaluated" (e.g. the whole "expression" is passed in, in its literal structural AST form) ... With my system, there is no "compile time", and I'm working out whether "expressions" are even auto-evaled at all, or whether one has to say "eval this and pass the result". ... Anyway, you essentially get any "language" from a function (or family of functions). I know the gain is supposed to be code transformation (e.g. at compile time), but what's to keep a "tool" (as I've called them in the previous post) from generating code and handing the result back? I think that's like converting "compile time" into an on-demand thing, just like lazy evaluation.
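That "compile time as an on-demand thing, just like lazy evaluation" idea could be sketched like this (the transformer, the `lazily` helper, and the node shapes are all hypothetical): the "macro" is an ordinary AST-to-AST function, run when first needed and its result cached.

```javascript
// An ordinary function from AST to AST: constant-fold literal additions.
function transform(node) {
  if (node.op === "add" && node.a.op === "lit" && node.b.op === "lit") {
    return { op: "lit", value: node.a.value + node.b.value };
  }
  return node;
}

// "Compile time" on demand: transform on first use, then reuse the result --
// exactly the shape of lazy evaluation.
function lazily(node) {
  let expanded = null;
  return () => {
    if (expanded === null) expanded = transform(node);
    return expanded;
  };
}

const thunk = lazily({
  op: "add",
  a: { op: "lit", value: 2 },
  b: { op: "lit", value: 40 },
});
console.log(thunk()); // { op: 'lit', value: 42 }
```

Nothing here needed a distinguished "compile" phase; the transformation is just another runtime operation a tool can invoke.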

elfprince13 wrote:

The last insight, which is directly relevant to your project, is that there are two benefits to the macro-language sharing an implementation and grammar/AST-structure with the target language. First, it reduces programmer overhead, since they only need to learn one language and not two. The second, and more critical, is that it pretty trivially allows you to extend your definition of macros to allow macro-generating macros (i.e. "functions from source code to (functions from source code to source code)"), and collapse your tool-tool-...-tool-chain into a single tool capable of recursive self-modification via "phase-separation" (also called "multi-stage compilation"): each piece of code and each identifier in your AST is "aware" of its depth in the recursive chain, and that chain is then recursively evaluated until you end up with a complete program running at the base phase, which is what we traditionally think of as "run time". The two languages I'm familiar with which really do justice to this notion are Racket and MetaML. Of the two, Racket has the implementation I'm more familiar with (and appreciate more, since it's a LISP), but MetaML has the better explanation.

... Yeah, I guess that's not all that different from what I was thinking. I mean, as I said just previously, since there is no building (e.g. the runtime is always running), the concepts of "compile time" and "run time" are blurred, and so you could do all of this same thing with just runtime code anyway; only you'd do it "first", whenever/whatever that means in whatever context.

... I suppose maybe there is more in common between your suggestions and my goals than it seemed at first. I guess the BIG difference is that I want an environment that is always running, in which you lose the distinction between compile time and runtime. "Compile time" here just means invoking a transformation on some code, and then invoking the result as needed. ALSO THOUGH, I am hoping to generate something beyond a dynamic language: a dynamic user environment that is its own tool. Go read/watch Bret Victor's materials at worrydream.com (I've scattered links in previous posts), as it comes closer to this than anything else I've ever seen (though I'd had the general idea before I found his stuff). Specifically: Stop Drawing Dead Fish, Learnable Programming, Inventing on Principle, Humane Representation of Thought, Pictures that Change, Magic Ink, Substroke, and A Few Words on Douglas Engelbart.

Anyway, thanks for the information; I really will have to investigate some of this further, as it may have more relevance than I thought; but yeah, I'm not making a language, because even that can (and NEEDS to) change if needed, if we really want to be able to make computers whatever we want without having to reinvent them from scratch each time. Otherwise we presume that the kernel of our magic dynamic system is the end-all be-all. I think such a system can come out of a simple spec, and could thus even be sculpted from, or even as, sub-systems within one another (and not just by virtue of transforming A into B, because then A is still the end-all be-all, and B is just A in a different language).

I do intend to continue completing my "story" above. Key points I have yet to include are ... probably anything I said here (minus anything overly specific) that wasn't already mentioned, but also how programming languages are not the same thing as a dynamic system, and how modifying source & compiler and regenerating is not the same as modifying the (generated) artifact directly (which is one reason that the artifacts & system all have to be in a dynamic AST form akin to LISP).

I don't think having a static core interpreter is any more contrary to your vision than running on fixed hardware. At a fundamental level, you have to be aware that you're not going to break the Church-Turing thesis: it doesn't matter what your core model of computation is, but you have to have one. The interesting part is what you build on top of it, and how humans interact with it, and enabling composability. Anyone who's programmed z80 (or other assembly language) for any length of time knows that self-modification isn't the barrier (after all, to an assembly program, self-modification is trivial), it's building the right layers of abstraction, and that requires that code be composable.
