In OSH 0.2, I fixed the bugs revealed by torturing the parser with a
million lines of shell. I also introduced parser
benchmarks.

OSH 0.3 sped up the parser by 6-7x. I introduced more
benchmarks, including ones that measure execution speed.

Now OSH can parse abuild in about 250 milliseconds. That's still too slow,
but it's not blocking progress.

I plan to release OSH 0.4 at the end of this month. It will be able to run
not just abuild, but also shell scripts from Aboriginal Linux and Debian.

After that, the stack will be empty again. I had to shave some yaks, but I
didn't lose sight of the goal!

What Is a Linux Distro?

I didn't understand how Linux distros worked until pretty recently. It's
useful to think of them as having (at least) these four components:

A set of source tarballs from "upstream" sources, e.g.
GNU, Linux, Apache, or LLVM.

A "meta" build system that turns source tarballs into binary
packages. This build system invariably uses shell scripts. Sometimes GNU
make is used; sometimes Python is used; but there are always shell
scripts.

A script to create the root file system — i.e. to "bootstrap" the
system so that it can build its own packages. We'll see this below.

It's ~2600 lines of shell. Here is an excerpt. (I worked
with this script a few years ago, and I remember it looking scary. It
used weird incantations that I didn't understand. Now it appears relatively
easy to read, which I think means I've spent too much time with shell :-) )

What I accomplished: I used OSH to build an Ubuntu Xenial image,
chroot into it, and run commands. The sections below describe the fixes
required to make this work.

What I accomplished: I ran the i686 target, which builds a complete
system from source. In contrast, debootstrap assembles an image from
binary packages. (I haven't yet run Debian's package build system, which is
based on GNU make.) I booted the resulting image in QEMU and got a
shell prompt!

In summary, I tested OSH on a diverse set of shell scripts found in the
wild, and did whatever was necessary to make them run.

I started this process after the last release, and I honestly
didn't know how long it would take. There were more problems than I expected,
but I was also able to fix them more quickly than expected!

Features Added

This section describes the holes I filled in to make these scripts work.

Tracing Support

Some errors I ran into had obvious causes. For example, OSH would throw
NotImplementedError when it encountered ${s:1:2} (string slicing).
Implementing slicing and getting past the error was easy.

However, other errors required debugging thousands of lines of other people's shell scripts.
This motivated me to learn more about bash and debugging. In particular, this
tip on making xtrace useful by setting $PS4 helped me figure
out where scripts were going wrong.

I implemented these debugging features in OSH:

Implemented set -x / xtrace, with $PS4 support.

Added support for $SHELLOPTS, so you can inherit xtrace. Shell scripts
often invoke other shell scripts, and there needs to be a way to preserve
-x.

Added variables that are useful in the PS4 string: $LINENO, and my own
$SOURCE_NAME.
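As a rough illustration of the $PS4 technique, here is a minimal sketch in plain bash, not OSH's implementation. The helper name trace_demo is made up for this example. It uses only $LINENO; in bash the file name would come from ${BASH_SOURCE[0]}, which is roughly the role $SOURCE_NAME plays in OSH.

```shell
# Hypothetical helper: run one command under xtrace and capture the trace.
trace_demo() {
  (
    # Single quotes delay expansion: the shell expands PS4 just before
    # printing each trace line, so ${LINENO} is current at that point.
    PS4='+ line ${LINENO}: '
    set -x
    echo hello
  ) 2>&1   # xtrace output goes to stderr; fold it into stdout
}

trace_demo
```

In bash, running export SHELLOPTS after set -x is one way to make child bash processes start with xtrace already on, which is the inheritance behavior described above.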

Shell Options for Strict Behavior

A recurring theme was relaxing OSH's strict behavior in order to accommodate
common shell usage. However, I added the ability to opt in to the strict
behavior. I added set -o strict-control-flow, strict-array, and
strict-errexit. I'll address this topic in another blog post.

Overhaul of Word Splitting and Evaluation

POSIX has quirky rules for the $IFS variable, which determines both how words
are split and how the read builtin splits fields.

I rewrote the crappy regex-based version of IFS-splitting with an explicit
state machine. This is an interesting piece of code which I may explain in
another blog post. It's in core/legacy.py. It turned a lot of red tests
green.
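A few of the quirks are easy to see from the command line. The helper count_fields below is a name invented for this sketch. It is unrelated to the state machine in core/legacy.py and only shows some of the behavior that code has to reproduce.

```shell
# Hypothetical helper: count the fields produced by splitting
# string $2 with $1 as IFS.
count_fields() {
  (
    IFS=$1
    set -f        # turn off globbing so only splitting happens
    set -- $2     # deliberately unquoted: field splitting applies
    echo "$#"
  )
}

count_fields ' ' ' a  b '   # whitespace IFS: runs collapse, edges are trimmed
count_fields ':' 'a::b'     # non-whitespace IFS: adjacent delimiters make an empty field
count_fields ':' 'a:b:'     # a trailing delimiter does not add an empty field
```

The subshell keeps the IFS and set -f changes from leaking into the caller.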

Two Kinds of C-Escaped Strings

echo -e '1\n2' and echo $'1\n2' both print the lines 1 and 2. Their
relationship is the same as the relationship between [ and
[[ — the former is dynamically parsed, and the
latter is statically parsed. For example, dynamic parsing allows this:
char=n; echo -e "1\\${char}2", but static parsing doesn't.

I implemented these with similar but not identical lexers. Metaprogramming
let me avoid duplication.
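The difference between the two forms can be sketched in bash as follows. The variable names are only for illustration.

```shell
char=n

# Dynamic: the escape sequence is assembled at run time, and the echo
# builtin interprets the backslash when it executes.
dynamic=$(echo -e "1\\${char}2")

# Static: the shell's parser decodes \n while reading the source text.
static=$'1\n2'

[ "$dynamic" = "$static" ] && echo 'same two lines'
```

There is no way to splice ${char} into the $'...' form, because its escapes are resolved before any variable is expanded.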

Prefix/Suffix Strip Operations Use the Conventional Algorithm

This is another feature that touches on some computer science. Originally, I
translated globs to Python regexes, in order to take advantage of Python's
non-greedy matching, e.g. the expression ${x%%*suffix} could be implemented
with the regex .*?(suffix).

However, abuild uses character classes in globs, e.g. ${i%%[<>=]*}
which isn't easy to translate reliably.

So instead I had to implement these operators using a linear number of
calls to fnmatch(), which makes the overall algorithm quadratic. If
fnmatch() isn't linear in the worst case, which it often isn't,
then the algorithm could be cubic.

However, this issue doesn't appear to arise in practice, as all shells use this
slow algorithm. Strings are generally short.
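The shape of that algorithm can be sketched in bash itself. This is not OSH's code, which is written in Python and calls fnmatch(). Here a case pattern match stands in for each fnmatch() call, and strip_longest_suffix is a hypothetical name.

```shell
# Hypothetical helper emulating ${s%%pat}: remove the longest suffix
# of $1 that matches glob $2.
strip_longest_suffix() {
  local s=$1 pat=$2 i
  # Candidate suffixes, longest first: a linear number of match calls.
  for ((i = 0; i <= ${#s}; i++)); do
    # Each case match plays the role of one fnmatch() call and may
    # itself scan the whole suffix, hence quadratic overall.
    case ${s:i} in
      $pat) printf '%s\n' "${s:0:i}"; return ;;
    esac
  done
  printf '%s\n' "$s"   # no suffix matched: the string is unchanged
}

strip_longest_suffix 'pkg>=1.0' '[<>=]*'
```

Because the glob is handed to the matcher as-is, character classes like [<>=] need no translation step.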

Minor Features

There were several other minor features to implement. In most cases, I had
already done the hard part: representing code with the lossless syntax
tree. The implementation often "falls out" after choosing a good
representation.

Slicing of strings and arrays: ${s:1:2} and ${a[@]:1:2}.

Process substitution: diff <(sort left.txt) <(sort right.txt). This
feature is inherently flaky because it doesn't wait() on the forked
process, and it didn't set $! until bash 4.4.

The type builtin without -t. abuild unfortunately matches
the output of type with a regex.

More of the test builtin:

-L and -h are aliases to check if a file is a symlink.

[ -t 1 ] to check if stdout is a TTY. There is no color in
abuild without this!

-nt and -ot to compare timestamps on files.
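For reference, here is a small bash sketch that exercises several of the features in this list. The variable and file names are arbitrary, and the old timestamp is forced with touch -t so the comparison is deterministic.

```shell
s='abcdef'
a=(w x y z)
str_slice=${s:1:2}             # 2 characters starting at offset 1
arr_slice=("${a[@]:1:2}")      # 2 elements starting at index 1

tmp=$(mktemp -d)
touch -t 200001010000 "$tmp/old"   # force an old timestamp
touch "$tmp/new"
ln -s new "$tmp/link"

is_link=no
newer=no
[ -L "$tmp/link" ] && [ -h "$tmp/link" ] && is_link=yes
[ "$tmp/new" -nt "$tmp/old" ] && [ "$tmp/old" -ot "$tmp/new" ] && newer=yes
if [ -t 1 ]; then color=on; else color=off; fi   # is stdout a terminal?

rm -r "$tmp"
echo "$str_slice / ${arr_slice[*]} / link=$is_link newer=$newer color=$color"
```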

Shell WTFs

Reimplementing these shell quirks was both fun and depressing. As penance,
I've been maintaining a wiki page of Shell WTFs (which is not
well-organized).

I could blog every day about one of these and not be done for months. But I
remind myself that my main goal is to improve shell with the Oil
language, not dwell on the past. Legacy behavior is only useful
as far as it gives people an upgrade path to Oil.

Bugs Fixed

File Descriptor Usage

As far as I know, a shell must handle file descriptors differently than any
other Unix program. It can't open any files in the descriptor range 3-9,
because shell scripts may use them directly.

The main program and the source'd scripts are now moved out of the way with
dup2() immediately after opening them.

I fixed a crash in statements like echo hi 6>&1, which debootstrap
uses.

I used the /proc/$$/fd/ mechanism I mentioned in OSH Runs Real Shell
Programs to debug these problems. It's a very useful
way of showing the file descriptor state of a process.
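To make the descriptor issue concrete, here is a short sketch of the kind of usage a shell has to tolerate. The variable names are illustrative, and the /proc listing assumes Linux.

```shell
# A redirect can name any descriptor; here fd 6 becomes a copy of
# stdout for the duration of one command.
out=$(echo hi 6>&1)

# Scripts also open low-numbered descriptors themselves, which is why
# the shell must keep its own files out of that range.
tmp=$(mktemp)
fd3_content=$(
  exec 3>"$tmp"        # the script claims fd 3
  echo 'to fd 3' >&3
  exec 3>&-            # close it again
  cat "$tmp"
)
rm -f "$tmp"

# On Linux, list this shell's open descriptors if /proc is available.
if [ -d "/proc/$$/fd" ]; then
  ls -l "/proc/$$/fd"
fi
echo "$out / $fd3_content"
```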

Bugs Related to CPython's Buffering

I encountered another problem: Python does its own buffering of file I/O.
I believe this is on top of libc's buffering, although I haven't looked
into it deeply.

sys.stdout.flush() is required after type; otherwise $() may be
incorrectly evaluated. Hat tip to timetoplatypus for mentioning this with
respect to the dirs builtin.

The read builtin can't use Python's f.readline(). The descriptor that
underlies the sys.stdin file object changes whenever you redirect, which
interacts badly with buffering.

Instead, I have to read a byte at a time from file descriptor 0. This seems
inefficient, but I noticed that dash, mksh, and zsh also do this
(in C).
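A small example shows why read can't read ahead. It assumes nothing beyond a POSIX shell, and the variable names are arbitrary.

```shell
# read must consume exactly one line and leave the rest for cat. If it
# had buffered a whole block from the pipe, cat would see nothing.
result=$(
  printf 'one\ntwo\nthree\n' | {
    read -r first
    echo "first=$first"
    cat
  }
)
echo "$result"
```

A pipe can't be rewound, so any bytes the shell reads past the newline are lost to the next process that shares the descriptor.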

Other Bugs

Fix precedence of && and ||. Confusingly, they have equal precedence
in the command language, but the normal unequal precedence in the [[
expression language.

Fix the scope of variables set with FOO=bar myfunc. Shells differ in behavior here!

Fix ${x/pat/replace} when x is undefined. (This case revealed a bug in
mksh.)

Fix a crash when cd-ing away from a directory that's been removed.

readonly R; unset R should return 1 and respect errexit, not
unconditionally fail. Although I consider this a programming error,
errexit will be on by default in Oil. (It would also be nice to make this a
statically-detected error.)
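The precedence difference in the first item can be checked with a short bash sketch. The variable names are made up for the example.

```shell
# Command language: && and || have equal precedence and associate left,
# so this parses as (true || echo a) && echo b.
cmd_out=$(true || echo a && echo b)

# [[ language: && binds tighter than ||, so this is T || (T && F).
if [[ -n x || -n x && -z x ]]; then
  expr_result=true
else
  expr_result=false
fi

echo "command: '$cmd_out'  [[: $expr_result"
```

If the command language gave && higher precedence, the first line would group as true || (echo a && echo b) and print nothing.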

What Was Not Done

I punted on a few things which weren't strictly necessary to build system
images, or which had easy workarounds:

The trap builtin is unimplemented; warnings are printed on stderr.

alias is also unimplemented. I changed a couple lines in
alpine-chroot-install. Trivia: bash is the only shell that doesn't
expand aliases by default; it requires shopt -s expand_aliases.

set -h / hashall is a stub that does nothing. This option is used by
Aboriginal and affects bash's $PATH cache, which I don't yet understand.

Also note that these OSH builds are in a sense "shallow". I changed the
shebang lines of top-level scripts totaling thousands of lines, but they often invoke
more shell scripts with a #!/bin/bash or #!/bin/sh shebang line.

For example, building any Linux distro will require running dozens of
configure scripts. Fortunately, OSH can already run those.

What's Next?

As mentioned, the upcoming OSH 0.4 release will include all this work.

I also have several writing tasks on my TODO list:

Why Write a New Shell? After every release, I receive questions about
the project's motivations. There are many motivations, but I need to explain
them more concisely, and link to them all from one place.

Project Retrospective. I consider the work described in this post a
major milestone! It's worth reviewing how we got here. What's left?

Lexing Posts. I have several unpublished drafts of posts in this series
(see the lexing tag).

Now that OSH is in better shape, I'd like to resume writing about
shell-the-good-parts. The first two posts are now a year old!

If I have time: a review of academic papers about shell. nickpsecurity
brought an interesting paper to my attention, and I followed the citations
and read two more papers. I responded in comments on lobste.rs
and reddit. There is more to say about them!

It would also be nice to get oil-dev@ going again. If you're interested in
contributing, e-mail me or leave a comment.