Sunday, November 29, 2015

I've been pondering for a while a concept for a new language. Partly its interesting to me because I want to get more diverse compiler experience than just Wake's set of problems,and partly because its a program I know I would use.

I'm going do something that I didn't do with Wake, that I think in hindsight is actually the best way to start a language -- I'm going to make it open and public from day one.

Now that wake has been officially "alpha released," there are still mountains to do with it. And just the same I have seen interest and I feel like I have made a positive impact on programmers who even just ask the question, "what problem is wake trying to solve?"

With that new experience (along with some advice that its never too early to publish an idea for an open source project), I wanted to share what little I have to say on it yet, and invite you all to critique it -- or, if it ever makes it into other developers hands in a serious way, it will serve as a cool journal of how Encase came to be.

A publicly commentable specification draft of Encase can be found here. It is still a work in progress.

What problem is Encase trying to solve?

I love the command line interface, and one of the reasons why is the magnificent composability of bash. Naturally I have a sort of terminal-formatting fetish, where I'm constantly trying to make my programs look as cool as possible.

But the two concepts are actually very separate. On the one hand, you have awk, cut, and xargs, which cooperate as beautifully as a centaur hugging a unicorn, and on the other hand you have top, less, and vim, which are a set of distant vacation spots. They look great by themselves but you can't be in both Sicily and Venice at the same time. These programs are built in ncurses, a complex GUI framework for the terminal.

I believe that ncurses was not *nix-y enough, and that it is the choice of making ncurses a library that separates the top from the less. I believe that if we could make ncurses into a command in just the right way, you could awk your less and maybe even less your top.

My approach with Encase will be to adapt ncurses into a language. You can write a program that simply spits out a mix of content and Encase, and pipe it into any program you choose, usually that program being an interpreter that renders it for you live as your program runs.

Many immediate uses come to mind.

top # view the encase code raw
top | encase # view it how humans are meant to see it
top | strip-encase > computerFriendlyLog # save JUST the actual content to a file
top | encase-record > topReplay.ncs # "record" a specific run of a program
top | custom_top_reskin.sh | encase # completely re-skin an Encase app??!

The last one specifically may be a stretch; but the great thing about ncurses as a language is it would suddenly just be strings. You could trivially, say, replace all instances of 'red' with 'blue', all borders with one twice as thick, etc etc.

As an added benefit, you would be able to run curses in any language without any required re-learning: python, bash, c, c++, php, ruby, javascript. No new APIs to learn, no special porting or required compilation flags (looking at you, PHP).

And last but not least, if done well it would make a sleek CLI program interface a breeze, no longer requiring you to look up the right ANSI escapes, and in a much more tailored way than a library (ncurses) could ever provide.

What inspired the (perhaps temporary) name?

At first I was going to go with Curse as a blatant ncurses reference, but, I thought it was too negative and too similar to ye olde curses. After all it will be a very different beast in the end.

I then thought about using "jinx" as a synonym of curse, but wanted a stronger hat tilt to ncurses than that. I almost chose "njinxes" but it was very similar to the web server nginx.

Playing still more off of ncurses, I found myself saying "ncase" when trying to play off of its phonetics. Combined with the layout engine inside it that "encases" your content, I decided that "encase" hit on the marks of inspiration, coolness, and brevity.

Some Examples

While literally every idea I'm about to present is subject to change, and/or may never see the light of day, it wouldn't be very "open" of me to tell you the goals (and even describe the name!) without showing what I had in mind.

First of all, one goal is to make it very easy to ignore most (if not all) Encase code if you want to. For this reason, I've decided to make every line of Encase begin with a standard sequence. The rest is output.

$$ this is some Encase code
this is raw output

I chose $ because it is (hopefully somewhat uniformly) rare, and also does not require an escape. I almost used backslash for its rarity, but didn't want it to be heavily escaped in other languages (ie, println("\\ your Encase code")). I chose to use multiple $s so that the chances of ever needing to worry about escaping your output is really really low.

Beyond that, it should have imports, and it should be turing complete (so that your turing complete custom UI components can be shared across languages).

$$ import alert
$$ n + 1 times alert "hello" // turing completeness

Here we assume someone has written a plugin in Encase which creates momentary popup alerts. We can import it (provided its installed on the right path...) and then use it. Notice that we're using a variable (n), doing some math (n + 1), and looping (n + 1 times ...).

I have a feeling the times construct will work very nicely in creating formatted output. I included it in Wake as part of the standard library and liked it; its a great example of the type of constructs one would find in a language tailored to string formatting.

Writers of plugins (such as this alert plugin shown above) would be able to leverage a box layout engine to create the effect of windows, panes, buttons, and more. They would (ideally) all be easily describable in Encase itself even if the layout engine they used was a native layer.

Here we would create a format named window, which would require the title as a parameter. The whole box is inside a border format (which would be equally simple and just draw a box around whatever inner content it gets). Inside that border, we create a horizontal box containing the title and its dividing line, then below that a scrollbox. We name those sections for use later.

When we say open border we can print to it and put other formats inside it. Same thing goes for our window. This is why we say render content, we are going to describe how to render or rerender a window whenever our content or sizing has changed. content in this case is a variable but its quite nicely readable when content is the chosen name.

The rendering engine would buffer this in memory, and at a fixed frame rate it would diff the previous state with subsequent states, performing as few and as optimal updates as possible to keep performance high.

In terms of the layout engine itself, there are many papers on some layout engines and their time complexity. I have yet to choose which I'll use but I'm sure a good one exists.

This leaves us in a nice place to design a layout, name its components, and then as our app runs we can switch panes, change their contents. We should be able to add and remove components on the fly to create very dynamic interfaces.

Problems I'm Still Thinking On

This section will be very disorganized! Muddle through as long as it holds your interest.

Unlike most GUIs, I am deliberately running the visual code in a separate process than the application. This makes conventional two-way binding and/or event handler concepts more or less impossible. For a while I thought my biggest problem would be managing scroll states, but I did realize that I can rely on the mouse (where supported by the terminal) to handle scrolling. Still, this leaves a burden on the app developer to performantly follow the state of the visuals and update them as needed, capture keystrokes and form navigation themselves. If you wanted to add click-to-select-text-box functionality, you'd be out of luck.

I think this might be elegantly solvable by changing top | encase to encase top, and let the encase interpreter call top to watch all I/O. However I'm not sure how exactly this would all work out. Perhaps a fifo streams events into the inner process? Still an unsolved issue. Remember too that the goal is to make programs composable; how could such a fifo be made simple to use in your own way?

I'm also not entirely sure what to make of multithreading. And I don't mean multi threading in terms of efficiently rendering the layouts on a multi core processor, I mean a threaded program where each thread spits out encase code that works perfectly on its own, but doesn't work perfectly when the lines get all jumbled up. On the one hand multi threading support would be difficult to seamlessly add (obviously the interpreter would have to be threadsafe, and have multiple states per thread all interacting at once...). On the other hand it might be clunky. Every thread would need to have its own prefix, and going into no-code mode for printing content would be ambiguous. If Encase ran all lines of code on one thread, anyone doing parallized output would need to manually synchronize their updates into sensical batches of code -- not exactly a minor quirk.

I'm also kind of curious about late binding variables. Notice in the window example, we have a title that is passed in when the window is created. What if you want to change the title? We would probably add in some functions specific to windows that could be called whenever you are inside a window box. But it would be cooler if there was a slick way to make it automatically rerender the title of the window whenever the original variable it came from was changed.

I'm also thinking pretty hard on what types of mistakes would be common to code in encase as its written. Anythnig I can do to avoid confusion (forgetting to end a window, forgetting a rerender function, using formats in the render function instead of the format body and getting lowered performance, etc etc) and make it less obvious. There will be no type system to catch this stuff for the users!

What I'm Excited About

Mostly I'm excited to try to add JIT aspects to the compiler. It may be a challenge for me to get it right cross platform and everything, but I think this is a great example of a language that could use a JIT (all that rerendering from scratch in a necessarily dynamic language!), and I would have a blast even optimizing a few things for a specific architecture.

I'm also excited in that it seems like it would be very easy to whip up some clones of some classic unix utilities like nano, top, and/or some bigger fish like a simple vim clone. It sounds like a cool project...maybe even make a Roguelike or two!

Either way (if I make it and its as good as I hope, or not) it has been fun to think about, and I invite you to comment with your thoughts/suggestions.