Many people find Lisp s-expressions hard to read as a programming notation.
This paper discusses various ways to extend/modify
s-expressions so they can be more readable without losing their power
(such as quasiquoting, macros, and easily-manipulated program fragments).
The goal is a notation
that can be trivially translated to and from traditional
s-expression notation (both by computer and in people’s heads), and
isn't dependent on a particular underlying semantic.
The paper
identifies and discusses three approaches that seem particularly promising:
indentation, name-prefixing (so func(x y) is the same as (func x y)),
and infix support.

It then defines a particular way of combining these approaches, called
“sweet-expressions”,
that can be viewed as an essentially backward-compatible extension
of s-expressions.
A sweet-expression reader can accept typical
cleanly-formatted s-expressions without change, but it also supports
various extensions (optional syntactic sugar)
that make much clearer code possible.
This is purely a matter of screen presentation; underlying systems can
continue to use s-expressions, unchanged.
For example, here’s a trivial Common Lisp program
that takes advantage of sweet-expression’s formatting extensions
(the Scheme version is similar):

Indentation. Indentation may be used instead
of parentheses to start and end
expressions: any indented line is a parameter of its parent,
later terms on a line are parameters of the first term,
lists of lists are marked with GROUP, and
a function call with 0 parameters is surrounded or followed by a pair of
parentheses.
A “(” disables indentation until its matching “)”.
Blank lines at the beginning of a new expression are ignored.
A term that begins at the left edge and is immediately followed by newline
is immediately executed, to make interactive use pleasant.

Name-prefixing. Terms of the form ‘NAME(x y...)’, with no whitespace before
‘(’, are interpreted as ‘(NAME x y...)’;.
If its content is an infix expression, it's considered one parameter.

Infix. Optionally,
expressions are automatically interpreted as infix
if their second parameter is an infix operator
(by matching an “infix operator” pattern of symbols),
the first parameter is not an infix operator,
and it has at least three parameters.
Otherwise, expressions are interpreted as
normal “function first” prefix notation.
Infix expressions must have an odd number of parameters with the
even ones being binary infix operators.
You must separate each infix operator with whitespace on both sides.
You can chain the same infix operator, so (2 + 3 + 4) is fine; to mix
infix operators, use parentheses.
Thus "2 + (y * -(x)" is a valid expression, equivalent to (+ 2 (* y (- x))).
Infix operators must match this pattern (and in Scheme cannot be =>):

This paper describes the rationale behind the
older sweet-expressions version 0.1;
see Sweet-expressions: Version 0.2 for information
on the changes made to sweet-expressions since this paper.

Introduction

S-expression notation is a very simple notation for programs,
and programs in variants of the Lisp programming language
have traditionally been written using s-expressions.
In this notation,
an operation and its parameters is surrounded by parentheses;
the operation to be performed is identified first,
and each parameter afterwards is
separated by whitespace. So “2+3” is written as
“(+ 2 3)”.
As noted in the May 2006 Wikipedia, this syntax
“is extremely regular, which facilitates manipulation by computer.
The reliance on [s-]expressions gives the language great flexibility.
Because Lisp functions are themselves written as lists,
they can be processed exactly like data:
allowing easy writing of programs which
manipulate other programs (metaprogramming).”
In short, s-expressions are a powerful and regular way to
represent programs and other data.

I’ve written a lot of Lisp code,
so I’ve learned to read s-expressions fairly well.
(I wrote a lot of Lisp code in the late 1980s on a $120,000 system.)
But I am never the only one who reads
my programs -- I need to make sure others can read my programs too.

Here’s the problem: for most software developers,
programs written solely using s-expressions are hard to read,
and they will only voluntarily
use programming languages that allow, at least optionally, a
more common notation.
This is particularly true for the usual infix operations
(+, <, and so on).
People who use Lisp-based languages all the time eventually learn,
but not everyone wants to use them all the time, and even developers
who are comfortable with programs in s-expression notation
need to share their work with others.
Wikipedia notes that
“the heavy use of parentheses in S-expressions has been criticized --
some joke acronyms for Lisp are ‘Lots of Irritating Superfluous Parentheses’,
‘Let’s Insert Some Parentheses’, or ‘Long Irritating Series of Parentheses’ “.

Yes, I know the arguments.
“S-expressions are powerful and regular” !
Of course they are. They are a wonderful intermediate representation for
lots of things, in fact.
But they are a terrible user interface, especially
if you are trying to share your results with others.
People today -- even most programmers -- want systems that are
“easy to use”, and one of the best ways to make something
easy to use is to make it familiar.
Most software developers have been trained for many years to use
traditional infix mathematical notation, and S-expression notation
fails to use it.
Programming is often not a solo effort; development today is
practically always a group effort, and readability for a large
diverse group matters.

All the loud statements about the power of S-expressions cannot compete
with time looking at real programs.
Most software developers laugh at languages that are
“obviously weak” (to them) because their default parser cannot
handle <, *, and - in conventional ways.
Below is a trivial Common Lisp program to compute
a factorial - now ask, “is this really the most readable format
possible for other readers?”

(defun factorial (n)
(if (<= n 1)
1
(* n (factorial (- n 1)))))

Maybe you think so now, but I use this trivial program as an example to
show some alternatives that I think many people would prefer.
For example, here’s the same program, but exploiting the
abilities of sweet-expressions;
a reader for sweet-expressions could read both the previous expression
and this one:

defun factorial (n)
if (n <= 1)
1
n * factorial(n - 1)

The 1993 paper
“The Evolution of Lisp” by Guy L. Steele, Jr. and Richard P. Gabriel
section 3.5.1 discusses “Algol-style Syntax”,
give a history, and slyly makes fun of efforts
to try to use anything other than S-expressions:
“Algol-style syntax makes programs look less like the data structures
used to represent them. In a culture where the ability to manipulate
representations of programs is a central paradigm, a notation that
distances the appearance of a program from the appearance of its
representation as data is not likely to be warmly received... it is
always easy for the novice to experiment with alternative notations.
Therefore we expect future generations of Lisp programmers to continue
to reinvent Algol-style for Lisp, over and over and over again, and we
are equally confident that they will continue, after an initial period
of infatuation, to reject it. (Perhaps this process should be regarded
as a rite of passage for Lisp hackers.)”
Paul Graham posts a longer version of this quote.

Perhaps. But it’s worth noting that thousands of languages have been
invented over the years, but almost none decide to use
S-expressions as their surface expressions.
Smalltalk and
Python took many ideas from Lisp
(see Norvig’s
“Python for Lisp programmers” for more)... but
not its surface syntax.
Languages like ML and Haskell have strong academic
communities... and do not use S-expressions for their surface
syntax either.
Logo was devised to have Lisp’s power, but intentionally
chose to not use its syntax.
Dylan actually switched from S-expressions to a more
traditional syntax, when they wanted “normal” people to use it.
Yacas
is intentionally Lisp-like underneath, but completely abandoned Lisp's
surface syntax for normal user interaction
(as have essentially all computer algebra systems, even though many
have a Lisp underneath).

The fact that so many people are compelled to find a “fix”
to Lisp syntax indicates to me that there is a problem...
the fact that so many efforts fail suggests that
it is a hard problem.
Norvig’s Lisp retrospective
found that even in 2002, when comparing Lisp to Python and Java,
Lisp was far faster and more extensible, and had many powerful
properties... but it was gaining few users (it was mostly stagnant).
I argue that one key reason is the syntax; Lisp's syntax looks
genuinely hostile to most “normal programmers”.
Instead of Lisp-like languages gaining converts who adjust to prefix syntax,
many people are ignoring or abandoning Lisp-like languages
to use languages designed to be readable by humans.
John Foderaro correctly said, “Lisp is a programmable programming language”,
but people expect standard notation to be already available.
Obviously it’s easy to create a some kinds of front-end syntax for Lisp-like
systems, but they generally don’t catch on -- in part because they
usually don’t provide the full power of the system underneath
(e.g., you often can’t do backquoting with comma-lifting, or insert new
parameters in a control structure, or handle macros well).
I prefer to use the best tool for the job; when it's not Lisp,
don't use Lisp.
But some jobs are naturals for Lisp-based langauges, yet their
poor readability seriously interferes with that.
I think there are ways to retain the power of Lisp-based languages,
while improving their readability and retaining their flexibility.

Oh, and let’s debunk one claim here.
Steele and Gabriel claim one problem with Algol-like syntax
is that there are “not enough symbols”.
Yet even if that were true, by combining characters
lots of symbols can be created without resorting to fancy characters.
Indeed, <= is a common combined character for “less than or
equal to”; nobody has trouble understanding that.

So below, I discuss a lot of past and current work to improve
the syntax of Lisp-like languages.
I then focus on a few especially promising areas that can make
Lisp more readable while still accepting normal s-expression notation
and having only minimal changes to it.
These include:

Indentation for meaning (like Python and Haskell);
there is already work on this, particularly with I-Expressions.

Infix approaches - how in the world can we
rationally add infix support?
There are lots of options, which I explore.

After looking at many of the possibilities, I then present
“sweet-expressions”,
which I believe is an approach that resolves these problems in
a reasonable way.
I then end with conclusions.
In an appendix I document BitC a little further, since it started the
whole thing for me, and in a separate paper on
alternatives for s-expressions
I document some of the alternatives I tried before I came up with
s-expressions.

Goals

My goals were originally for the BitC project, but now I think there might
be ways to work more widely for any language that uses Lisp-like notations.
Here were my goals for a programming notation for Lisp-like systems:

Readable. It should be more “readable” to the uninitiated,
in particular, it should look like more traditional notation.
For example, ideal format would support infix notation
for operations that normally use infix (+, -, <=, etc.),
support having function names before parentheses, and not require
as many parentheses.

Mappable. There needs to be an obvious
mapping to and from current s-expression notation,
which must work for both data and code.
The key advantage of Lisp-like languages is that you can easily manipulate
programs as data and vice-versa; that must remain for any
modified syntax.
Otherwise, there’s no point;
there are lots of other very good programming languages that support infix
and other nice notations.

General/Standardizable.
It should be very general and standardizable across all systems that
accept s-expressions (Lisp's original syntax);
nobody wants to relearn syntax everywhere.
It should be useful in Common Lisp, Scheme, Emacs Lisp, BitC, ACL2,
DSSSL, AutoLISP (built into AutoCAD), ISLISP (standardized by ISO)
BRL,
and so on.
A thread
about guile noted this very need.

Easily implemented (relatively speaking).
It must not require tens of thousands of lines of code to do.
However, if it takes a little extra code to produce nice results, that is fine;
better to do things well once.
It need not be easily implementable via a few tweaks
of an existing reader, though that’d be nice.
Yes, rewriting a reader is a pain, but those only have to be written once
per implementation.
Note that even among implementation of a particular language there
is often much variance, so the format needs to be simple enough to
support many implementations.

Quote well. In particular, for both forward quote (’) and
backquote/quasiquote (‘), it must be easy to find the end of the quote,
and for backquote/quasiquote,
it would be very desirable to support initial comma (,)
and friends
to locally reverse the quoting (the whole point of quasiquoting).

Backward-compatible.
Ideally, it should be able to read regular s-expressions (at least
normally-seen formats of them) as well as the extensions.
I’m willing to give a little on this one where necessary.
Note: Version 0.2 of sweet-expressions is not perfectly
backward-compatible, but most typical s-expressions as people actually
use them are also valid sweet-expressions.

Work with macros.
There will probably be some tweaks for indenting and infix notation
(especially since infix reorders things!), but
macro processing should continue to work in most cases.

My notion is that the underlying s-expression system would not change...
instead, the system would support a reader that takes an extended notation
and converts it into s-expressions.
A printer could then redisplay s-expressions
later in the traditional notation, or same kind of notation used to input it.
It would also be nice if the notation was not confusing or led
to likely errors.

I’ve written this document to put at least a few ideas down in writing.
In the long term, I suspect what needs to happen is for there to
be some sort of “neutral” forum where ideas can be discussed, and
code shared.
For any syntax
to be widely used, there must be trustworthy, widely-usable implementations.
I think at least one FLOSS implementation
with a generous license that permits use by proprietary programs and
Free-libre /
open source software (FLOSS) programs.
Note that the LGPL doesn’t work as intended with most Lisp implementations;
Franz has created the
Lisp LGPL (LLGPL) which is a specific clarification of the LGPL
for use with Lisp.
Lisp code is typically licensed
under the LLGPL instead of the LGPL, since the LLGPL
clarifies some otherwise sticky issues and ambiguities in the LGPL.
Note that any FLOSS software should be
GPL-compatible,
since there is so much GPL’ed code.
The implementations would need to be widely portable and modular,
so that they can be widely depended on.

Scheme and Common Lisp aren’t really compatible at all.
Translators like
scm2cl
(Scheme to Common Lisp translator)
could help initially, certainly to get started
(suggesting that it'd be best to start with Scheme, and then transition
at least an initial version to Common Lisp).
But in the end, specialized implementations
well-tuned to different environment will be necessary for an
improved syntax to really work.

Past Work

Here are some past efforts to try to make s-expressions more readable.

Special syntax for various constructs

Most readability efforts focus on creating special syntax for every
language constructs; these often end up unused (because they cannot
keep modifying the grammar to match the underlying system),
or end up creating a completely new language
less suitable for self-analysis of program fragments.

“The Evolution of Lisp”
lists many efforts to create "Algol-like notations for Lisp",
which generally included infix notations and attempts to
be more “readable”.
Originally Lisp was supposed to be written in M-expressions,
which were more traditional in format.
Function calls were written as F[x;y...] instead of (F x y...), for example.
One problem is that they kept adding new syntax, and never found a
good “final” M-expression format, so it was never implemented... and
it just receded into the never-finished future.
Many other notations were developed, including those of LISP 2.
These generally had if ... then ... else and other more
traditional naming conventions.

If R is a word tagged as RLIS, then R takes as its operands
a whole bunch of things linked by commas. At present I have this
grammar ambiguous on
R1 a, b, c, R2 d, e, f;
where R2 could (as far as the grammar is concerned) be being
given one, two or three arguments. This problem arises if the
operands of R may themselves end in an R. This is harded to avoid
than I at first thought - one might well want conditionals in the
are list of an R, but then
R1 a, IF x THEN R2 b, c;
comes and bites. I guess this is a “dangling comma” problem.
The above two problems are resolved by the parser genarator favouring
shift over reduce in the ambiguous cases.

“IN”, “ON” are both keywords, as used in
for each x in y do ...
and words with the RLISTAT property. This is sordid! Similarly
“END” has a dual use. This is coped with by making special provision
in the grammar for these cases.

One trouble of many of these notations is that it becomes
difficult to see where the end of an expression is (e.g., there
might be no way to indicate the “end” of the if statement); this can
creates ambiguities and makes it harder to easily match
the infix notation and s-expression if you need to.
And, if you want to describe arbitrary Lisp s-expressions, not having
an end-marker means that you may not able to access some of the
capabilities of the underlying s-expressions
(though that may not be an issue for all uses).

The ACL2 language is Common Lisp-based, but it has a separate
front-end for a more traditional interface including an infix processor;
it is named
IACL2.
This is not often used, for several reasons:

ACL2 is only defined in s-expression form, so you cannot read any
of the documentation or examples without knowing s-expression form anyway.

IACL2 is not as portable as ACL2, nor as supported.

IACL2 does not support all the capabilities as ACL2 -- and who wants
to use a tool that is known to not work when you most need it?

IACL2 has a “dangling else”, so it is not always trivial to see
where something ends.

Logo
is basically Lisp with an infix and more readable syntax.
Instead of “(”...”)”, Logo uses “[”...”]”.
Normally, all commands begin with the name of the function, just like
Lisp, and Logo even has text names for math functions:
sum, product, difference, and quotient.
More interestingly, infix is also available... by using symbols,
Logo automatically uses the infix forms instead.
Logo knows the number of parameters for each function, so once the
number of parameters is provided you can just provide another
function call on the same line without any marking of the end of the
call (see the “rt” call below). This is not as flexible as
s-expressions, where you can always add another parameter.
Here’s a sample Logo program:

There is also
CGOL (an Algol-like language that compiles into Common Lisp);
this was developed originally by Vaughn Pratt, and written by
a UC Berkeley graduate student, Tom Phelps.
This program is a Common Lisp implementation of CGOL that is translated
into Lisp before execution.
I do not know what its license is.
You can look more generally at the
CMU archive, though beware of
licensing problems.

Clike
(see also Clike on Sourceforge) is
a compiler which converts code written in a simple C-like language to Scheme.
Pico is an educational Scheme with C-ish syntax.

TwinLisp is a "new way of programming in Common Lisp"; it
creates a new reader with a more conventional syntax, which it then
translates to Common Lisp and executes.
It predefines precedence - which unfortunately means that it is tied
to a particular interpretation of all operations.
It also defines special control structures;
thus, like most past efforts to create "readable Lisp", it ends up
sacrificing generality of the resulting format
(the reader is rigged to a specific set of operators with special
syntax - so creating new operators is not simply a matter of defining
the meaning of the operator).

Lisp resources
gives pointers for where to go for more information.
For extensions specifically focused on indentation, see the text below
on indentation;
for extensions specifically focused on infix notation, see the
section on infix.

Skill

Skill from Cadence is
a proprietary Lisp-based extension language.
By examining that paper,
this text, and
this document,
a hint of its syntax can be inferred.
Skill supports name-prefixing, where FUNC(x y) is treated as meaning (FUNC x y).
Skill also supports infix processing, automatically translating
infix operators into an internal prefix format.
In Skill, it appears that all of the prefix functions have alphabetic names,
e.g., "(plus 3 4)" adds 3 to 4.
This means that infix operators can be unambiguously detected and used without
surrounding whitespace; thus "3+4" is automatically translated to "(plus 3 4)".
Skill does not consider indentation significant.

I point out Skill specially because it does shows that it's possible to
have a very general syntax that is easier to use.

Instead, he wants Arc to have a syntax, because it makes programs
shorter (a big win), but have additional notation as simply
an abbreviation for the longer (still valid) syntax.
He also would like to show structure by indentation instead of
parentheses (which would become optional where no ambiguity results).
His ideas sparked many others to re-investigate s-expression syntax,
generally thinking in terms of “syntax as abbreviation”.

Arc has inspired many people to re-evaluate Lisp syntax, and
see if there are ways it can be made more readable.
Some
comments on Arc made some interesting points:

Peter Norvig said,
“For the Scheme I wrote for Junglee’s next-generation wrapper language, I
allowed three abbreviations: (1) If a non-alphanumeric symbol appeared in
the first or second element of a list to be evaled, then the list is in
infix form. So (x = 3 + 4) is (= x (+ 3 4)), and
(- x * y) is (* (- x) y). And (2), if a symbol is delimited by a “(”, then
it moves inside the list. So f(a b) is (f a b), while f (a b) is two
s-exps. And (3), commas are removed from the list after infix parsing is
done, but serve as barriers to infix beforehand, so f(a, b) is (f a b),
while in f(-a, b + c), each of -a and b + c gets infix-parsed separatly, and
then they get put together as (f (- a) (+ b c)). This seemed to satisfy the
infix advocates (and annoy some of the Scheme purists). You might consider
something like this.” Here are some of my thoughts about this:

Rule 1 (the check on the first and second elements of the list) is
a little hackish, and by having a check on the first element too, you
can’t use usual Lisp notation at all for infix operators.

Rule 3 (removing commas) is simple. Amazingly enough, this doesn’t
interfere with quasiquoting as long as commas are always
required as function parameter separators; since you cannot have a
null-length parameter, any parameter beginning with a comma has a
comma for lifting!
But that doesn’t mean that the resulting code is easy to read; all those
extra commas can make it hard to find the comma that is doing the lifting,
and unless prefix operators are allowed, they aren’t needed for
parameter separation... they just create more syntactic noise.
So I haven’t gone further with this, though others could.

Rule 2 is a neat idea; there’s the minor risk of bad spacing causing
problems, but it is a trivial way to get more traditional functional
notation without harming the syntax (functions cannot
have “(” in their name anyway).
Indeed, I think there’s a good idea here, as I discuss below
in “Name-prefixing”.

Sudhir Shenoy’s “ideas from Perl” said:
“Please don’t use parentheses for s-expressions (’[’ and ‘]’ or
even ‘{’ and ‘}’ would be preferable. The biggest complaint I have
about parentheses is that it makes infix mathematics (which you are
planning to introduce in Arc) really hard to read. e.g. (def interest
(x y) (expt (-r * (t2 - t1) / (days-in-year)))) reads really badly
when compared to [def interest (x y) [expt (-r * (t2 - t1) /
[days-in-year])]]. The overloading of meanings of parentheses
(grouping + s-expression) which isn’t a problem in Lisp may cause a
loss of readability in Arc.”

”Paul Graham notes that you can infer some parens
in Common Lisp from indentation and newlines.
In particular, if a line contains more than one
s-expression, it must be a prefix of a list of
those s-expressions. If a sequence of lines is
indented from the previous text line, it must be a
continuation of that line’s s-expression...
All of this leads to needing extra parens around
invocations of parameterless methods. Oh well.”
He explicitly assumes that adding parentheses will
not disable indentation processing.

”In general, infix-to-prefix transformation should
be relatively simple, reversible, and allow mixing
of infix and prefix expressions; infix operators
can’t be infix if they’re the first element of a
list anyway. However, infix expressions could be
valid prefix expressions; how about (reduce +
mylist), for example? OCaml solves this problem
by adopting the infix interpretation where there
is ambiguity, requiring parens around infix
operators used as values. This probably isn’t the
best solution for a Lisp, but you could probably
require (function +) instead of (+) and get away
with it.
I’d like to be able to use at least (+ - * ** /
mod) infix, with their usual precedence; and a
left-associative : for cons would shorten a heck
of a lot of Lisp programs.”

“On multiline string syntax: it’s ridiculous to
include the leading spaces on successive lines.
Spaces to get subsequent lines of the string in to
where the first line of the string started should
not be included in the string; their absence
should simply be a syntax error.”

“These syntax rules still don’t help cond much,
either the Arc variant of cond or the traditional one.”
He suggests borrowing McCarthy’s syntax from the 1960s:
“condition => consequent”.
I (Wheeler) think that
using “=>” this way would be a bad idea, at least in Scheme;
Scheme already uses “=>” for a completely different
meaning inside “cond” constructs.
Another operator could work, though finding a clear one may
be more difficult.
Often “->” means “implies”, and having to similar-looking
operators with different semantics is a bad idea anyway.
Actually, I think that when you add indentation (see below),
constructs like “cond” actually look pretty reasonable, so
this is may be best solved a different way.

BitC version 0.9 uses this “syntax as abbrevation” approach.
You can indicate that “Value” has type “Type” using the quasi-function
“the”, e.g., as (the Type Value),
but the preferred form is “value : type”.
It does not go far right now, though; + is still prefix (not infix).

Quack is yet another Lisp derivative.
This adds two syntactic items.
First, "postfix colon syntax" - colon with indentation as a shortcut for
parentheses.
Second, "infix colon syntax" - a colon without spaces around it are
a shortcut for function/macro call, right-associative, so
cdr:cdr:obj is (cdr (cdr obj))).
He doesn't know if they're workable, and I don't find
the examples convincing.
Indentation has promise, but the colon ends up in a bad place and it's
completely unnecessary.
And while an infix operator like the infix colon is useful, making that
the only infix operator is rather weak; where's 3 + 4?

1. We use () to indicate grouping. ((((a)))) will mean the same thing as (a).
2. We use right associative infix operator notation -
a * b + c translates to (* a (+ b c)) in lisp.
3. We use indentation (tab-size = 4) to convey grouping -
say hello becomes (say hello (to lisp))
to lisp
4. We use square brackets notation to encode lists - for instance -
[a,b,c] means (a b c)
The items in a list can also be separated by line breaks. So
[ one
two means (one two three)
three ]
Since indentation is significant,
[
one
two means ((one two) three)
three
]
There are a few special operators that help reduce the need for
parentheses and aid readability -
1. a : b translates to (a . b) and due to right associativity,
a : b : c translates to (a b . c)
2. a :: b translates to (a b)
3. a := b translates to (define a b)
Here are the cases where you won’t need parentheses -
1. Your expression starts on its own line. In this case,
the first term is the head term and the rest of the terms
on the same line are tail terms.
2. Your expression is the second argument to an infix operator.
For example -
mid :: quotient (low + high) 2
is equivalent to
mid :: (quotient (low + high) 2)
Exceptions -
1. Use func() notation to indicate that you mean (func) (a function
call) and not just the value of func.
Limitations of current implementation -
1. Back-quoting not supported.
2. The use of comma separator outside of list expressions is not
defined and handled well enough.
Enjoy! ... and, for the record, I’ll stick to the parentheses,
thank you :)

This aids readability, but note the inability to handle all
of Lisp's capabilities (like back-quoting), and that the
creator won't use it (clearly a sign that it's not good enough).

So now, let’s turn to three areas that have great promise,
and then see if we can combine them together.

Lisp programs are normally presentated using indentation;
the Common Lisp standard even includes a pretty printer!
Experienced Lisp programmers eventually stop seeing the parentheses,
and see the structure of a program instead as suggested by indentation.
So why not use indentation to identify structure, since people do anyway,
and eliminate many unnecessary parentheses?

Other languages, including Python, Haskell,
Occam, Icon,
use indentation to
indicate structure, so this is a proven idea.
Other recently-developed languages like
Cobra (a variant of
Python with strong compile-time typechecking) have decided to use
indentation too, so clearly indentation-sensitive languages are considered
useful by many.
In praise of mandatory indentation notes that
it can be helpful to have mandatory indentation.

Past work on indenting to represent s-expressions

Paul Graham (developer of Arc)
is known to be an advocate of indentation for this purpose.
As I noted above,
Kragen Sitaker’s notes on Graham and Arc discusses how
indentation can really help (in this notation, functions with no
parameters need to be surrounded by parentheses, to distinguish them
from atoms - “oh well” ).
Graham's
RTML
is implemented using Lisp, but
uses indentation instead of parentheses to define structure.
RTML is a
proprietary programming language used by Yahoo!’s Yahoo! Store
and Yahoo! Site hosting products, though Yahoo are transitioning away from it.
Paul Graham’s comments about the RTML language design and
this introduction to RTML by Yahoo.

Darius Bacon's ”indent” file,
includes his own implementation of
a Python/Haskell-like syntax for Scheme using indentation
in place of parentheses, and in that file he also includes
Paul D. Fernhout's implementation of an indentation approach.
Bacon's syntax for indenting uses colons in a way that is limiting
(it interferes with other uses of the colon in various Lisp-like languages).
I have not had a chance to examine Paul D. Fernhout's yet.
(It also includes an I-expression implementation.)
All of the files are released under the MIT/X license.
(Darius Bacon also created mlish, an infix syntax front end listed
earlier).
Lispin discusses a way to
get S-expressions with indentation.

I-expressions

I-expressions
are an alternative method for presenting s-expressions
(either program or data), using indentation.
They are defined in SRFI ("surfie") 49; this has final status,
making I-expressions a quasi-official part of Scheme.
I-expressions have no special cases for semantic constructs of the language.
SRFI 49 includes a sample implementation with an MIT-style license
(based on the
Sugar project).

As you can probably guess, both of these are equivalent to this
traditional s-expression representation:

(define (fac x)
(if (= x 0)
1
(* x (fac (- x 1)))))

The keyword “group” is used to begin a list of lists.
(One minor drawback: it's more difficult to use a function
named "group", so it's best to simply not create such a function.)
You can drop back several indentation levels without dropping them all:

let
group
foo
+ 1 2
bar
+ 3 4
+ foo bar

The SFRI permits both tabs and space characters.
I-expressions can also be quoted (including being
quasi-quoted); see the proposal for information.
Note this means that ’ followed by whitespace is now the
beginning of a quote, because the initial whitespace is significant,
and newlines become important.

In any indentation system for s-expressions,
you need to be able to express s-expressions such as
(f a), (g (h 1)), and (j (k)).
In the first case, the first parameter is simple symbol,
in the second case, the first parameter is a call with at least one parameter,
and in the third case, the call to k has no parameters at all.
I-expressions represent 0-parameter function calls
by having the parameter surrounded with (...) (or having () as a name-ender).
Let’s see what the indentation rules by themselves look like:

The idea is simple, but it can be hard to reason about how it works -- so
here is one way to think about it (if you're using this to write a program).
On any line, the first term is the function to be called; the rest of the
items on the line, and each indented line below, are its parameters.
If a line has no other parameters (on the rest of the line or indented), then
it's an atom; otherwise, it's treated as one expression and parentheses
surround that line and its indents (if any).
The “group” term creates an extra surrounding (...) in the
resulting s-expression, so you can create lists of lists.

So how do you present computing a function and then calling it?
Personally, I’d just switch to traditional s-expression
notation in this case.
But the I-expression representation actually isn’t bad;
just use a “group” followed
by how to compute the function to be called:

; ((getfunction x) a b) can be represented as:
group
getfunction
x
a
b
; or:
group (getfunction x)
a
b

The SFRI supplies Guile code (not fully portable Scheme code) to
implement I-expressions (e.g., it uses define-public and not define).
However, it should be easy to port.

In the I-expression sample implementation a
“(” disables I-expression processing until its matching
“)”, and this is an intended part of the definition
of I-expressions.
This has all sorts of wonderful side-effects:

I-expressions parsing becomes very safe to use with existing code -
pre-existing oddly-indented code will almost certainly start each
expression with an opening parenthesis, disabling indentation processing.

It supports dealing with text that is very
close to running off the right-hand side; just use parentheses
to disable indentation processing. Python does the same thing.

Disabling indentation this way
doesn’t interfere with the use of parentheses to invoke
function calls with 0 parameters - such calls
don’t take that much room on a line!

There are a few special cases where the reader with I-expressions
enabled parsed a file differently.
When two top-level s-expressions follow each other,
either (1) on the same line or (2) on consecutive lines with
the second expression indented, the lines will be combined by
an I-expression reader.
The first seems unlikely to me; the second is a little unfortunate.
What's worse, though, is that in a read-eval-print loop, users
have to enter return twice to see their results, and it's easy to
end up trying to evalute an empty list.
We'll discuss those below in possible extensions to I-expressions.

The mailing list discussion of I-expressions includes a
posting of an I-expression pretty-printer; the author says
"I just couldn't get to sleep, so I wrote one".
Be warned: I believe this pretty-printer has a bug, because it
does not handle (f (g)) correctly.
The first pretty-printer I wrote for I-expressions in
Common Lisp had the same error, too, so this is a common mistake
when doing this task - be warned.
The expression (f (g)) should print like this:

f
(g)

I believe that
users who use indenting expressions should only use space characters,
and never tabs, to indent.
The R5RS Scheme specification doesn't even officially permit tab characters
(though implementations generally do).
The real problem is that tabs produce a varying number
of spaces on real systems;
experience with Python suggests that using tabs can cause a lot of problems.
Most of today’s text editors can be configured to turn tabs into spaces
automatically.

Possible extensions to I-expressions

I-expressions and similar indentation approaches are a very nice approach.
But there are several ways it could be extended or modified, now that I've
had a chance to experiment with them.

Immediate execution if begin on left-hand-edge

If you use indented I-expressions interactively
(in a read-eval-print loop), you have to enter an extra
a blank line (an extra "enter") before the expression is evaluated.
That's because the system needs to know when an expression has ended;
the system believes you might indent the next line and add more expressions.
This is fine for multi-line expressions, but it's very annoying
for simple single-line expressions.
If you do a lot of simple one-line commands this becomes very annoying.
And this is likely; few people type in long, multi-line definitions
into a read-eval loop, because they would stick those into a file instead!

So how can we make indenting easy to use at a command line?
We could create a special command for use at the beginning of
a line, such as "! ", that means "at the end of the line, you're done" .
Or we could create a special indicator that means "execute now" .
But these don't seem any easier than entering a blank line, and they
are yet something else to remember.

One simple modification could be to add a special case, based on
whether or not text was entered in column 1.
There are several obvious variants:

If leftmost edge has a "(" ,
and the entire expression is immediately followed by a newline, then the
expression is complete.
This has lots of problems; this doesn't handle evaluating atoms,
such as running "x" to see what x contains.
So I'll reject that.

If text begins at the leftmost edge (no leading horizontal space),
and the following term is completely read, followed immediately
by a newline, then the expression is complete and the read function returns.
Thus, if these are entered at the leftmost edge, they will immediately return:

x
(load "file")
load("file")
(3 + 4)

However, the following will not (they will wait for a blank line)
if entered on the leftmost edge:

load "file"
define x
3 + 4

The system will wait for a blank line in these cases,
because the first term is followed by a space, not a newline... so
it is waiting to see if you'll add more parameters by indenting them.
The 'define x' is an example of good news (probably), and the
'load' command is an example of the bad news.
Another problem is that the rule is a little complicated to explain.
The good news is that 'define x' on the left edge
will not trigger an immediate return.
With these semantics, if you want to enter an indented expression,
you'll need to indent the first line or include
at least one parameter (expression) on the same line.
If you indent at all on the command line, or enter one or more
terms on one line, you must type an extra blank line to execute the sequence;
this seems reasonable.

If text begins at the leftmost edge (no leading horizontal space),
and there are no unmatched parentheses,
it is immediately completed at newline.
In this approach, you must indent the first line to
have an indented expression.
This means that if these were entered on the left edge, they would
be executed immediately:

load "file"
3 + 4
define x

The good news with this approach is that
'load "file"' on the left edge works as a user might expect -
it will trigger an immediate return.
The same is true for 3 + 4 ... suddenly an interactive Lisp interpreter
makes a plausible calculator!

The bad news is that 'define x' on the left edge followed by return
will also trigger an immediate return of the read expression, which is
almost certainly not what was intended.
That is not such a big deal in an interactive command loop, since the user
could try again.
But more seriously, this would be a likely
thing to do in a file, and it could cause very hard-to-detect bugs.

How can we counter the risks of not noting the change in semantics
for text on the left edge?
It might be possible to have a special loop for file-reading
that detected the case of "unindented expression followed immediately
by indented expression", and trigger a warning or error.
Another variant would be to have two slightly different modes for read
(or two different starting functions):
one intended for interactive use, where unindented lines are ended on newline
where possible, and one intended for file reading (where there is no
such distinction).
This does complicate
reasoning and implementation, and
it's quite normal to cut-and-paste into an interactive session.

One promising approach to countering the problem
is to configure other tools to help detect such problems.
Text editors, for example, could be colorize such constructs.
Thus, if a left edge that would run immediately is followed by an
indented line, it would be labelled with red lines.
And special tools could be designed to detect this as well.

One good point is that the rule is simple: "if you want to indent,
you must start by indenting the first line".
When you indent, you enter a blank line or another expression at the
same indentation level.
Frankly, that's much simpler than the rules above.

If text begins at the leftmost edge (no leading horizontal space),
and there is exactly one term on the line or it is a legal
infix expression, then the expression is complete and the read function returns.
Thus, if these are entered at the leftmost edge, they will immediately return:

x
(load "file")
load("file")
(3 + 4)
3 + 4
3 + (4 * 2)

And these will not:

define x
load "file"
3 + 4 * 2
3 + 4 +

For sweet-expressions, for a long time I keep going back and forth between
option 2 and option 3.
Option 3 presents the risk of subtle bugs, and that is of great concern.
However option 3 is a joy to work with interactively.
I then said,
"This is a hard choice to make, so I plan to experiment.
Expressions like load("filename") and (3 + 4)
seem particularly easy to explain,
and do not have the drawbacks of this alternative, so they suggest
using alternative 2."
It was only in November 2007 that
I finally realized that what I wanted was option 4.

Note that when activated these also partly eliminate one
of the minor incompatibilities of I-expressions with traditional s-expressions.
If there are two separate outermost s-expressions on consecutive lines, with
the second expression indented from the first, as long as the first expression
was not indented (a likely case) the lines will no longer be combined by
the I-expression reader.
Of course, this could be misleading, because the second line might
appear to be combined with the first; this is a trade-off with
no perfect answer.

Ignoring excess blank lines

Excess blank lines get interpreted oddly, as '(), which the
underlying system may then try to run (Scheme in particular
complains about them).
It can also cause confusion if interpretation gets "out of sync".

A plausible modification would be to ignore any additional
blank lines before a (next) expression.
In short, a blank line shouldn't really return '() to the system;
if they want that, users should need to enter that directly.
If there blank lines at the end of the file, the system should probably
return the end-of-file marker (if it supports that) once it has
consumed the final blank lines trying to read the next expression.

An alternative would be to ignore any additional
blank lines after an expression, but you don't want to do that.
Then you need to treat the beginning of a file specially (assuming that
there was a beginning of file).

Also, should horizontal spaces followed by newline be treated the
same as a blank line?
Users cannot easily see the difference, especially on typical printouts.
That isn't clear, so I leave that as a question for now.

Indentation, a wrap-up

I-expressions and their extensions are a real improvement.
But they are not enough; let's keep looking.

Peter Norvig had some interesting ideas, as noted earlier.
Let’s look at one of his rules, the rule that says
“if a function name ends with an open parentheses, move it inside the list
(when converting to an s-expression)”.
This means that “(fact x)” and “fact(x)”
will mean the same thing.

Obviously, this is trivial to parse.
We don’t lose any power, because this is completely optional -- we only
use it when we want to, and we can switch back to the traditional
s-expression notation if we want to.
It’s trivially quoted.. if you quote a symbol followed by “(”, just keep going
until its matching “)” -- essentially the same rule as before!
Technically,
this is a change from some official Lisp s-expression notations
and implementations.
For example, entering “a(b)” into CLisp (a Common Lisp implementation) is
the same as “a (b)” -- its parser tries to
return the value of a, followed by running the function b.
But it’s not clear it’s a big change in practice;
commonly accepted style
always separates parameters (including the first function call
name) with whitespace.
So normally, what follows a function call’s name is whitespace or “)”,
and this is enforced by pretty-printers.
Thus, many large existing Lisp programs could go through this kind of parsing
without resulting in a change in meaning!

Does this help? Let’s rewrite the CL factorial example, but
not make the infix operations do this:

defun(factorial (n)
if((<= n 1)
1
(* n factorial((- n 1)))))

It looks slightly more familiar, but not that much.
Let’s try again, but move the infix ops out too;
the result is actually not bad:

defun(factorial (n)
if(<=(n 1)
1
*(n factorial(-(n 1)))))

If we really wanted it to look conventional, we
could use wordy names instead of symbols that are traditionally infix;
that isn’t too horrible for + (use “sum”),
but that is rather wordy for others (I don’t like it):

Note that as far as I can tell,
you do not need any whitespace after the opening
parentheses, because atoms (including function names) cannot have
parentheses in their name in s-expressions.
A variant of this idea would be to require whitespace after the
opening paren if name-prefixing is used, but I see no need for that.

Mathematica does something similar.
As noted in
How Purely Nested Notation Limits The Language's Utility,
its FullForm notation can be transformed into Lisp using simple rules,
starting with transforming f[a,b] into (f a b).
But it isn't designed to be backward-compatible with existing Lisp
notations; its use of "," to separate arguments would cause confusion
since "," is also used in macro handling.

Name-prefixing can be combined with indentation (e.g., I-expressions).
Let’s presume that
if you indent further and begin the expression with a parentheses,
or by a function name, no space, and a parenthesis, that parenthesis
and its mate are silently ignored
(they don’t add yet another subexpression in the s-expression).
Furthermore, let’s presume that if you have a name-ender format
(function name followed immediately by open paren), it does not switch
to some “no more I-expression” mode.
Then you could do this (using English names for operators):

This does introduce the risk of someone inserting a space
between the function name and the opening “(”.
But whitespace is already significant as a parameter separator, so
this is consistent with how the system works anyway... this is
not really a change at all.

I think this is slightly better for untrained eyes.
It’s hard to argue that this is a major improvement, at least by itself,
but the more I look at it the more I like it.
What’s more, the pain is tiny, and that’s a good thing.
This is a lower pain, lower gain approach, and it can be combined
with SWP-expressions, which I’ll describe in a moment.

The article
Improving lisp syntax is harder than it looks
discusses name-prefixing, but I think it makes a number of errors.
It first claims that this would be hard to integrate with macros -
this actually isn't true if it's built into the reader
(and the macros aren't doing reading themselves).
If the reader transforms a(x) into (a x), then when the macro has a chance
to run all it sees is (a x) - exactly what it was expecting to see.
He also says that
"Another disadvantage to this change of syntax is that it makes
functional programming much more odd looking.
Lets say you have a list containing functions and you want to call the
first one. In Scheme you write ((car lst) params) and in Common Lisp
(funcall (car lst) params). However in our new syntax it looks like:
car(lst)(params) and funcall(car(lst) (params)). Neither of these is
very elegant, and it only gets worse if that call in turn returns
a function, which would look like: car(lst)(params)(params2) and
funcall(funcall(car(lst) (params)) (params2))."
But I find this remarkably elegant, and better than the
traditional notation - to do functional programming, just cuddle
up the parentheses. It's much easier to understand sequential
parentheses compared to a deeply nested list.

Again, let’s continue the thought that we want to support
a syntax that is maximally Lisp-like, generally accepting existing expressions.
The biggest issue in making Lisp s-expressions easier to
read are in whether or not to provide infix support, and if so, how.

The first question is, should you support infix at all?
Because if we do, and we presume that
the underlying s-expressions always have the operation first, this means
that we are changing the external presentation of (some) data.
Note that we are only changing the presentation for entering
programs and program-like data,
and possibly changing how they are externally displayed;
the traditional s-expression would still be used internally.
Thus, if “{3 + 4}” is interpreted as an infix expression,
it will quietly be transformed to the s-expression “(+ 3 4)”,
so the “car” (head) of “{3 + 4}” would be “+”.
The
ll1-discuss mailing list includes
discussions on the issues of infix.

The reality is that nearly everyone prefers infix notation;
people will specifically avoid Lisp-based systems solely because they
lack infix support built-in.
Even
Paul Graham, a well-known Lisp advocate,
admits that
"Sometimes infix syntax is easier to read. This is especially true
for math expressions. I've used Lisp my whole programming life and
I still don't find prefix math expressions natural."
Paul Prescod remarked, “[Regarding] infix versus prefix...
I have more faith that you could convince the
world to use esperanto than prefix notation.”
Nearly all developers prefer to read infix for many operations.
I believe Lisp-based systems have often been specifically ignored even
where they were generally the best tool for the job,
solely because there was no built-in support for infix operations.
After all, if language creators can’t be bothered to support the standard
notation for mathematical operations, then clearly it isn’t
very powerful (as far as they are concerned).
So let’s see some ways we can support infix, yet with minimal changes
to s-expression notation.

If we are willing to support infix, there are many options.
The big issues are:

How do you determine when to use infix notation?
Is this manually specified (if so, how, and how deep does it go?) -
or is it automatic (if so, how, and how do you override it?).

What are the legal infix operators?

What are the semantics? (Do you do something trivial, like swap the
first and second parameters, allow many parameters for the same operator,
or go all the way to full infix with precedence?)

For all of these, you also need to determine how to display them too.

There are some code samples that might be used as a starting point
for implementations; after going over the issues, I identify
some of them.

Nathan Baum (of Bournemouth, United Kingdom) posted on
4/14/2006 8:19:16 PM the idea of using [...] to surround infix notation
(I’m sure he’s neither the first nor last).
He said,
“Lisp actually has a few syntactic shortcuts of its own.
Essentially anything that isn’t part of a name is a character which
triggers a specialised reader function.
Strings, for example, don’t need to be built in to the Lisp core:
a reader can be associated with the double quote character,
and that’ll return a string object.
This also means you can redefine it yourself to get
shell/Perl/Ruby-style interpolated strings, for example.
Even lists themselves are read using a reader function triggered by
[parentheses].
One simple shortcut is to have ‘normal’ Lisp expressions
delimited by [parentheses], and infix expressions delimited by brackets...

(print (+ x (- (/ y z) a)))
; or
(print [x + [[y / x] - a]] ”.

How do you determine when to use infix notation?

There are several major options:

Not automatic (special syntax).
If infix is not automatic,
you add characters that say “make this infix” ,
using pairs such as {...} or [...] around infix operators,
or use single expressions before the first or second parameter.
The single expression might be single characters, like ? or !, or
multiple characters, like the dispatching macro characters (#...).
I don’t recommend “!” because it looks too much like “not”
in other langauges.
Using something short is a good idea, because infix is very common.
The
Lisp FAQ mentions
#I being used as the “Portable Infix Package” marker
(I have searched and not found out more about this; suggestions would be
appreciated).
(This discussion
mentions an author using #I(3+4) to notate infix, and remarks that
[3+4] would have been simpler.)
Requiring a syntax marker
is not as “pretty” as automatically notating infix operators, though
it has the advantage of clarity if you process the resulting s-expressions
often.
Kantrowitz’s infix macro package for Common Lisp
is an example of this approach.

Pairs of characters can force infix interpretation,
and seem like the obvious idea, but they do have some disadvantages.
Infix operations can work across a list, so it makes sense to mark the
end of such a list and use a special pair of characters like {...}
instead of (...) to mark an “infix list”.
There are two pairs available: {...} and [...].
Advantages of {...} is that they are more commonly used
for statement blocks in other programming languages (Logo being an exception),
and in particular users may want [...] for other purposes
(such as identifying lists, like BitC).
A disadvantage of using {...} is that in many fonts these characters
look extremely similar to (...), making them harder to see;
the characters [...] are more distinct.
But in either case ({...} or [...]),
using a pair means you have to match the correct closing pair,
so inside many parens you have to make sure you use the right ones in the
right place. This can create much extra work, and it is debatable if
they help much in error detection.
Also, infix notation is really obvious when you see it... so there
really isn’t a big need for a special marker to show its end.
Combining this with name-prefixing does not seem hard; just accept
“{” for name-prefixing as well.

Combining the paired characters {...} or [...] with
indented forms like I-expressions requires careful thinking about
what you are doing.
In particular, if you want to identify something as a parameter
of a function f and switch to infix notation, do you
indent inside function f?
I think you should, so the system should only
add one level of s-expression (from the indent)
if the first character after the indenting whitespace is “{”,
instead of the two that would otherwise be implied.
In all other cases, other than the beginning of a line, a “{”
would normally end up matching a “(” in the corresponding s-expression
if you want to stay similar to usual s-expression syntax
(if not, see the relaxed infix notation such as ZY-expressions, below).
This means that you need to think carefully about the meaning of {...}
(or [...]).
Basically, don’t think that {...} create a matching s-expression (...).
Instead, have the mindset that the infix operators (+, -, etc.)
create the s-expressions, and that the surrounding characters {...}
simply clarify that what’s inside has an infix interpretation.
Also, do {...} disable I-expressions inside them (as I’ve proposed (...) do),
or should a “{” at the beginning of a line only
switch to infix without disabling I-expressions?
I’m not sure what the meaning of indented sub-expressions would even
be, if indentation still mattered.
Thus, for the moment, I’ll assume that {...} disable I-expressions
inside them, as I’ve proposed (...) do, for the simple reason of simplicity:
if an infix expression is so complicated that it goes beyond one line,
indentation like I-expressions are more likely to harm than help.

Note that if it’s not automatic, there’s a follow-on question...
how deep does infix go when it is enabled?
Do the infix operators only work at one level (one list), or do they
keep going down to lists contained inside?
This question doesn’t come up if the detection is automatic,
so let’s address that next.

Automatic.
To work automatically,
if some parameters look infix, then an infix interpretation is used.
E.G., if presented with “(3 * 4)”, it is automatically interpreted
as “(* 3 4)” internally.

There are two major variants that I can see:
(1) if the first or second parameter is an infix-type operator
switch to infix notation (Norvig’s approach),
or (2) only the second parameter can be infix to trigger infix processing
(my approach).
The first approach has the advantage that it can work
with prefixed “-” traditionally, e.g., (- x * y) works, becoming
(* (- x) y).
But there are also many problems with that approach.
For example, then it is much harder to use traditional
Lisp notation (with the infix operator in front) at the same time,
and it’s easier
to get confused when looking at traditional s-expression notation.
It is all too
easy to accidentally trigger the infix notation when it wasn’t intended.
And the traditional Lisp notation (- x) is not that hard to write or read;
if you support traditional function notation, like -(x), it's even easier.
Finally, if you only trigger on the second parameter, you can
add a subtle safety check: the switch to infix notation can be made to only
occur if there are three or more parameters.
This safety check avoids unintentionally enabling infix in some cases,
and makes it very simple to implement trivial quoting mechanisms:
since expressions like “(self +)” will not be interpreted as
being infix, using functions as parameters is simplified.
In fact, you can go further on the safety check: You can say that
infix lists have to have an odd number of parameters, and that the first
parameter (as displayed) cannot be an infix operator (so a list of
infix operators won't cause problems).
I think the second approach is much better.
For fully-traditional infix notation, the first is slightly better,
and we’ll discuss that later, but for our purposes I don't think it's
as useful.

If it’s automatic, now you need to determine the legal pattern of infix
operators, since they will trigger the infix interpretation.
This is probably some sort of regular expression;
we’ll discuss this below.

You’ll also need a syntax to say when it is not
infix at least (to disable the automatic part).
Again, this can be identified using
surrounding pairs such as {...} or [...] around non-infix operators,
or using single expressions before the first or second parameter
to declare that this is not infix.
For functions, in many real-world use cases
Common Lisp’s #’ notation or simply “(function +)” would do.
Indeed, if at least three parameters are required for infix,
any function that simply returns its second parameter
could do the escaping.
As noted below, I'm currently using "as(+)" aka "(as +)" to do this escaping.

Automatic approaches, either way, are in some sense a little “hackish”..
the second parameter determines the order of parameters in the
resulting s-expression!
Yet it produces very nice-looking results, and I suspect that over time
it will become workable.
After all, in all other languages you have to look for
infix operators too, so this actually not that strange a requirement.
And the results (shown elsewhere) are startlingly clear.

You might still have an optional “make this infix” syntax indicator
as well, or at least a “warning, this is infix”,
which can be used when printing.
That way, when you read
a result as infix, you’ll get warned that the expression is infix.
If this is optional, you might still want to make it very short, since
you may have to read a lot of them in printouts.
For example, let’s say that #_ means “no infix” and is placed
before the first parameter (mnemomic: “base case” ),
while ? means “infix notation follows” is placed before the first
(displayed) parameter (#^ might make a reasonable alternative).
Thus, “(3 + 4)” , “(?3 + 4)”, and “(#_+ 3 4)”
could all mean the same as “(+ 3 4”), and the “car” of all of them is “+” .
Or, if [...] are the infix warnings, [x >= 4] is a way to hint that
the s-expression is actually (>= x 4).

Of course, if the notation permits mixed infix and prefix notations,
you can easily have a “manual” notation that just means
“turn on autodetection all the way down from here” and end up having
a combination of properties.
Alan Manuel K. Gloria’s “infix notation macro” is a good example of a
mixture - in this implementation,
there must be an initial marker (nfx ...), but it then descends all
the way down to all subexpressions.
As a result, you could use the marker at a very high level, and
have a result similar to automatic detection.
A similar result could be had by having a function like
“(enable-infix)” or “(sugar)”
which you could insert at the beginning of a file as a top-level command
(this could also enable indenting and the name-prefixing notation), which would
then enable the infix operations (even at the
topmost level) until specifically disabled.
Using such a
command would mean that everything below, for the rest of the file,
would use automatic infix detection... but since there would
be a specific a command to enable it first,
it would be compatible with existing code (which didn’t enable infix).
A command-line flag, or invoking a command with a different name
(possibly via a different file extension), could enable automatic
detection of infix notation.

What are the legal infix operators?

If detection of infix operators is
automatic, now you need to determine the legal pattern of infix
operators, since they will trigger the infix interpretation.
Even if it’s manually triggered, if you permit arbitrary expressions, or
want error-checking of the operators
(”you said this was infix but there were none”)
you’ll need to detect infix operators.

There are several options:

You could support only a fixed list (+, -, etc.).
That is very inflexible, though.

You could start with a fixed list, and allow the user to specify which
ones to add/override.
The additions would probably include information on precedence and direction
(right or left),
and if so, these additions would need to be made before they are used.
One big risk is that if the infix expressions are read before the
commands to make the additions,
the infix expressions would be misinterpreted.

You could define some sort of pattern.
The pattern needs to be simple, so humans can easily remember it.
This has the advantage of not depending on proper setup of a
table of operators.

You could support default patterns and allow additions
as well, if you wanted to.
Or define a pattern, but it's an error to USE the pattern before
declaring it (and maybe its precedence).

If we use a pattern, what is the pattern?
Patterns are often expressed as regular expressions (REs), and an
obvious RE for this purpose is: [+-\*/<>=]+.
Basically, it has to be a set of one or more special characters.
(One alternative, which might be nice, would be that it has to start
with a special character, but after that, other characters
are allowed; be wary that +h1 is an infix operator but +1 is not).
This is enough to cover the 4 arithmetic expressions (+, -, *, and /),
and all the usual comparisons (<, <=, =, >=, >).
It’s also enough to cover other operations that people like
to have as infix, such as “**” (exponentiate),
“->” (implies), “<->” (if and only if), and so on.
You can probably add “&”, which is enough to represent common
representations of “and”.
But there are some issues, especially with the characters not noted here:

Adding “:” is tempting, because “:” and “::” are sometimes used as
infix operators (BitC uses “:” for declaring types).
On the other hand, “:” is sometimes used as filler, so there are
pluses and minuses to including it in the pattern.
In Common Lisp, a “:” by itself is not a problem (it’s illegal normally),
but the “:” separates package names from their symbols.
In addition, in Common Lisp a blank package name (e.g., :x) is a keyword...
so anything with a colon at the beginning or middle is probably going
to be a problem.
Similarly,
a Scheme
proposal (SRFI 88) proposed using
names ending in “:” as keywords, with
SFRI 89
using them for optional parameters.
Thus, while a single colon (:) appears to be a fine name for
an infix operator, more complex names with colons in them
appear to be more problematic for backward-compatibility with existing Lisp-like
systems.

One possible concession would be to say that “and”
and “or” are specially
recognized as infix operators; that’s a kludge, though all Lisp-like
languages have “and” and “or” operators, so it would make sense.
On the other hand, if infix operators automatically switch lists to
infix notation, s-expressions of sentences like
“(Jack and Jill)” would silently
become “(and Jack Jill)” - and that
seems dangerous.
It especially seems dangerous if s-expressions are sometimes displayed
with automatic infix notation without
any indication that the parameter order has changed.
So I think this is an unwise choice, even though at first it seems reasonable...
but that still begs the question of how to represent "and" and "or", since these
are incredibly common infix operations.

Many languages use && or & to represent "and", and
use || or | to represent "or".
Adding “|” or “||” is tempting
because it is a common notation for “or”,
and in Scheme the character is reserved for possible future extensions anyway.
But simply adding it to the list of allowed symbols
is more problematic, because in Common Lisp the “|”
already has a meaning - it escapes symbol text until the next “|”.
So if “|” keeps its usual meaning, then “|” as
an operator has to be written as “\|” in Common Lisp (yuk).
But there's a clever trick here - the syntax "||" can always be used to mean
"or", whether or not "|" has the special role of escaping symbol text.
This would mean that in Common Lisp, the function's name would actually be
an empty string (!)... but since its name would print as "||", it would
look quite clear and understandable.
This wouldn't be combinable with "=", (e.g., "+=" is a fine atom, but "||=" would
immediately simplify to "=")... but that seems liveable.
It would be possible to force a literal | as the real symbol, but this would
look ugly; \| or \|\| is hideous for code reading.
In Scheme and many other LISPs this would be a non-issue; the || would just
be interpreted as a two-character name for an atom, which is fine.
The & can be confusing in some contexts, too;
it is sometimes interpreted specially in parameter lists.
But as far as I can tell, & is
not normally interpreted in any special way outside of parameter lists, and
even there, && isn't normally accepted.
It's certainly possible to use "^" to represent "and"
(using ** for the power operator),
but that still begs how to represent "or"; few keyboards provide the
mathematical "or" symbol (an upside-down wedge).
So && and || are probably reasonable text representations
for logical "and" and "or"... which
is certainly consistent with typical practice.
Note that "and" and "or" are special forms in Scheme (and many other Lisps),
so that they can short-circuit; this means that redefining them is often
more complicated.

An unfortunate oddity happens with Scheme:
Scheme actually defines a “=gt;” syntactic marker in its “cond”
processing, and it occurs as the second parameter.
This means that if you are using the second parameter to automatically
detect infix operations, you will almost certainly consider this
construct to be an infix operation.
Unfortunately, you cannot solve this by creating a new simple
function => that can handle this (having only one form at this
location changes the meaning of the construct).
There are several solutions; two obvious ones are to require manual
detection or to have a specific list of infix operations.
Another solution is to specifically state that “=>” is excluded from
the pattern of allowable infix operators, at least for Scheme.
You can partly justify
this on the additional grounds that -> and => are easily confused.
It would probably be possible to solve this with a complex
Scheme-unique macro for => that essentially restored its original
meaning, though this will probably cause problems if an error occurs
nearby; I think simply declaring that it's not an infix operator when
processing Scheme is safer.

How do you represent assignment and field accessors,
if the underlying language has them?
The pattern supports "=", but
if you use "=", then you have the problem that this is easily
confused with equal-to.
This is one of the big problems with C and C++
(confusing = with ==), and it'd be unfortunate to duplicate that mistake.
In addition, if the system maps any infix name directly to the identical
prefix function names, then you may not have a choice... in most
Lisp-like systems, "=" is already an equality operator.
The term "<-" is compelling for assignment, and "->" is
compelling for field access, but the two do not go together well;
just imagine deciphering
"x -> a <- b -> c".
Parentheses help, e.g.,
"(x -> a) <- (b -> c)" - but
it's still a little awkward.

One possibility is to use a different form for assignment, such as
"<--" or "<==".
The term "<==" is actually fairly easy to distinguish, so
"x -> a <== b -> c" is easier to read,
and
"(x -> a) <== (b -> c)"
is actually especially nice.

Another
possibility is to use "<-" for assignment, and use [...] as a
field accessor. That looks nice:

x[a] <- b[c]

But if we use [...], we need to figure out
what general function to map [...] to, and we just lost the phrase
[...] for other purposes (such as describing special lists).
(For sweet-expressions, I'll presume that the language or user will
define the infix macros/functions, and not add [...] simply because
it's easier and more general not to.)

Since Unicode/ISO 10646 is finally becoming widely available, and many
systems can process them (e.g., using UTF-8 encoding), additional infix
symbols could be accepted that are not in ASCII.
This would certainly resolve the “and” and “or” issues, since there are
standard mathematical symbols for them:
∧ (”logical and”) is character U+2227 (decimal 8743) and
∨ (”logical or”) is character U+2228 (decimal 8744).
See Unicode
symbols chart, especially the mathematical operators chart,
for more information.
The character set from U+2200 through U+22FF is allocated to
mathematical symbols.
This will remind those of us who’ve been around a while of APL!
Probable characters for infix operators include
U+2227 (”and”),
U+2228 (”or”),
U+220A (”element of”),
U+2209 (”not an element of”),
U+220B (”contains as member”),
U+220C (”does not contain as member”),
U+2229 (”intersection”),
U+222A (”union”),
U+225D (”defined as”),
U+2260 (”not equal to”),
U+2254 (”colon equals”),
U+2282 (”subset of”),
U+2283 (”superset of”),
U+2284 (”not a subset of”),
U+2285 (”not a superset of”),
U+2286 (”subset of or equal to”),
U+2287 (”superset of or equal to”),
U+2288 (”neither a subset of or equal to”),
U+2289 (”neither a superset of nor equal to “),
U+228A (”subset of with not equal to”),
U+228B (”superset of with not equal to”),
U+22C8 (”bowtie” - sometimes used for join),
U+22BB (”xor”),
and
U+22BC (”nand”).
But not everyone has good support for these yet, so while
include a set of Unicode infix operators in the standard set might
be a good idea (anticipating their use),
it’s probably premature to require their use.

If the printing routines normally convert to infix,
then you need to not have a pattern that is likely to be
engaged unintentionally... which are good reasons to not have
the terms “and” and “or” be infix operators, and certainly a
good reason to stay away from anything other than an isolated “:”.
You can probably limit the regular expression length, say up to 3-6 characters,
to help limit unintentional conversions as well.
I would not include “and” and “or”, instead
add & and | to the infix operators (and say that | has to be escaped),
and limit it so it must be 1 through 4 characters in length.
I would accept “:” alone (so it can be used for infix type declarations),
but nothing else with colons.

These considerations suggest that “=>” be quietly prevented from
being an infix opreator, and then use
this regular expression for infix operators;

As a related matter, you need to decide how to separate infix operators
from non-infix operators.
In many other language families, “x-y” would be parsed as x minus y.
Unfortunately, in many Lisp-like languages, names often
include typical infix operator characters such as “-”.
Both Common Lisp and Scheme would be unusable without the - symbol,
because they have many functions with “-” in the name
(in addition, Scheme has a convention
where “->” embedded in the name indicates a conversion).
The simplest and most obvious approach is to require that
any infix operator be surrounded by whitespace;
this is very easy for humans to remember, and is consistent with
normal s-expression syntax anyway.
You could require escaping such characters as part
of a name, e.g. |simple-string-p| or simple\-string\-p, but this is
both ugly and unnecessarily incompatible with common practice.
If you’re devising a new language, you could simply forbid such
characters in regular symbol names, and then you could automatically
detect all the infix operators without requiring that they be surrounded
by whitespace.
However, for general-purpose parsing, sweet-expressions will require
that infix operators be surrounded by whitespace
(in cases where function names don't have such symbols,
this requirement could be relaxed).

What are the semantics of the infix operations?

Once we detect that we are using infix notation, what are the
semantics of the “infix” notation?

Swap first two parameters.
An extremely easy-to-implement and general approach is
to just swap the first two parameters.
A good idea for this case would be to require exactly 3 parameters -- that
way, we won’t accidentally screw things up, because when
limited to exactly 3 parameters, there’s no
difference between swapping and the normal interpretation of infix.
This is trivial to implement,
and yet it’s enough to implement basic infix operators in a way
that looks nicer.
This means that the presented expression is very similar to the
actual s-expression, which has its advantages.
If you want to add a fancier infix format later, this approach
is a reasonable stepping-stone towards that.

Allow chaining (duplication) of identical operator.
This extends the previous option, but you can have an odd number of
parameters, and the even ones much match.
Thus, (3 + 4 + 5 + 6) becomes (+ 3 4 5 6).
Trivial to implement (just swap the first two parameters and
ensure the rest match).
This approach makes it possible to fully use the capabilities of
underlying functions that allow multiple parameters, and that's a good thing.
Interestingly enough, Common Lisp’s definition of comparison operators
would make expressions like
(x <= y < z) work as it does in mathematics!

Note that this does not support precedence, so
you have to surround different operators with parenthesis or indents,
e.g., (3 + (5 * 6)).
In one view, that's a disadvantage - you still have to indicate some
information that would be automatic in other languages.
On the other hand, this makes when there are new lists explicit,
and that has some pretty big advantages.
It also sidesteps the problems of defining precedence
(either fixed or a way to add them).
Those are pretty big advantages.

There are many options for determining when things aren't correct, and
then deciding if that is an error or merely an indicator that infix was
not intended.
Note that this only supports an odd number of parameters, and obviously
there shouldn't be an infix operator presented as the first parameter
in the text.
Doing otherwise could be an error, or could be an indicator that it
should not be treated as infix.
For the moment, I'll plan to consider both as an indicator that infix
was not intended.
This only permits all the operators to be identical - if they're different,
but all infix operators, then the writer probably intended some sort
of precedence (and so an error should be reported).
Otherwise, infix was probably not intended, so again, I'd probably
interpret that as "not infix".

Full infix of binary operations, with precedence rules.
Allow infix and add precedence rules (* before +, etc.).
This would mean that “(3 + 4 * 5 ** 2)” would quietly transform into the
s-expression “(+ 3 (* 4 (** 5 2)))”.
In cases where it is
important that you be able to clearly see the mapping between
the surface presentation and the underlying s-expression, just
don’t use precedence - instead, parenthesize
(so those who do not need to see that representation don’t have to).
This is trivial to implement in Lisp... the problem is what to
do about the precedence of unknown operators.
You’d need to either not support precedence of them,
have a default for unknown operators,
or have a way to set precedence values (and make sure the settings
are read before processing the code).
Setting precedence is easy enough, but its disadvantage is that
it invites trouble - you need to make sure that the
rules are set before the parameters are swapped, and mistakes
could result in hard to find errors.
Left association could probably be safely assumed
for more than one instance of the same unknown operator.
Alternatives are giving
unknown operators a specific precedence relative to other operators...
or simply requiring that other operator’s precedence be made
clear with parentheses (or their equivalent).
Predefining precedence for common functions, and
requiring statements about the others, isn’t a bad thing;
people usually don’t want
to have varying precedence tables (it’s confusing), and they
also don’t want to memorize large tables.
Again, I suggest making duplicate operators (3 + 4 + 5) turn into one
s-expression (+ 3 4 5); have the user override if they want something
different.

Here, we don’t allow prefix/suffix operators,
such as “- x”, or suffix operators, like “x !”.
There is a trade-off here;
prefix/suffix don’t work well with automatic detection or
trying to also read traditional s-expressions correctly - it would
be all too easy to misread such expressions.
Using name-prefix notation for them, such as -(x), is a very reasonable
alternative.

Full infix of binary and non-binary operations,
with precedence rules.
This is like the above, only now we also allow prefix operators,
such as as “- x”, or suffix operators, like “x !”.
Allowing them adds a little extra flexibility, and might make sense
if the specific expression is always manually marked as being infix.
But it has many problems, as noted earlier.

Precedence rules

If you have precedence rules, what are they, and are they controllable?
It's not hard to add a command that sets precedence rules, but
there's always the problem that you risk not getting
that called at the "right time" (before reading).
Worse, if there are global precedence rules that can be changed,
setting them in one environment may manipulate another (unintended)
environment, so now we have to have a parameter for (read), and that may
be hard to pass down.
This is particularly a problem for Lisp-like systems, where you may have
multiple levels and meta-levels that you might not want to interact.
Another big challenge is that if different developers set precedences, it's
harder to combine their code.
Different operations may have different meanings in different
circumstances, so the lack of control over precedence can be a problem too.

A chain of the same operation can normally be combined into a
single operation with all those parameters, e.g.,
(a + b + c) should become (+ a b c).
This adds capability, without any problems or loss.

Everyone agrees that * and /
should have the same precedence, both of which have
greater precedence than the equal-precedence binary + and -,
and that all of these are left-to-right.
So in theory that, at least, could be implemented.
Originally I thought that
it may be best to just implement those universal rules,
and normally use parentheses most everywhere else for controlling
infix precedence (or at least discourage lots of precedence-setting).
Chaining is fine, but as discussed below, precedence causes a lot of problems
in many use cases, so in the end I decided against them for sweet-expressions.

Printing expressions that might be infix-able

So far, I’ve primarily discussed how to read in expressions, but
expressions need to be printed too.
A likely answer is “just print s-expressions as usual” by default.
That means that the output would be “(+ 3 4)”, even if the input was “(3 + 4)”.
For debugging code, this is probably a Good Thing.
There might be value in presenting expressions back with some
infix operations “re-inserted”, at least as some sort of “pretty-printing”
function or option.
It might be wise to identify which lists are being interpreted as
infix in these cases (e.g., by surrounding them with {...}).
I think there is no reason to try to redo precedence when re-displaying;
showing how the lists are actually stored is very clear and
eliminates questions about precedence.
Thus, if “(3 + 4 * 5)” is input, the resulting s-expression would be
“(+ 3 (* 4 5))”, and the infix-printer might show this as
“{3 + {4 * 5}}”.

Combining infix with Name-prefixing

There is a subtlety when combining name-prefixing with automatic infix:
I think people expect an infix expression to be considered a single
expression, even though at first it appears to be a multi-parameter list.

For example, most people would accept this syntax for calling function f with two parameters, x and y:

f(x y)

However, if infix is normally done automatically, they would expect this to
be computed with f given one parameter, not three:

f(x + y)

Note that this is actually a very nice thing; it means that in many cases,
name-prefixing causes the number of parentheses that need writing to reduce.
So the original Lisp:

(f (+ x y))

becomes:

f(x + y)

Infix control

There is likely to be a need to control infix processing.
These should be a standard way
of saying “disable infix” and “enable infix”.
You'd like to be able to enable or
disable infix at only one level, in particular,
so that you could leave infix on as the default and yet disable it in
one particular expression.
It would be especially useful to have an
“enable infix for this one level of expression” operation that
must receive
an expression that “looks like an infix expression”
or it is in error.

The following text are some very early ideas on the topic;
I include them here, but it's all very early, and at least some
is likely to be very wrong.
Still, you may find useful thoughts here.

One implementation issue is that macros work from outside in, and
most macros will not expect special additional macros.
In particular, it might be very useful to cuddle an infix operator
with a macro that says do not use infix, like this (where "as"
means "as-is"):

(define as(+) ...)

However, the obvious solution will fail; typical implentations of
"define" do not expect to be followed by another macro in the definition!
This means that any such function reference will need to be removed at
read time or very early on during eval time (e.g., by
an outside cuddling nfx(...)),
because other macros won't know what to do with this.

There should be a standard way of inserting at the
beginning of a file or read-eval-print loop “please switch to and from
special processing, and
for setting some of its options (e.g., infix control).
For debugging, you could just print s-expressions as now.
However, a standard way to request printing these expressions
that shows infix as such, but
marking infix operators and non-infix, would be a good idea
and each list which could be misinterpreted as an infix expression
but is not has another marker.
This implies a need for a marker that says “the following list is
an infix expression and it’s an error if it is not”.
If adding specific new infix operations is supported
(such as “and” and “or”), a standard name and
syntax for this add operation would be a good idea too.
Ideally, the external interfaces for these operations
be pseudo-standard across all Lisp-like notations.
Bridging the gap between Common Lisp and Scheme might be challenging
for a standard interface (in some cases string
parameters might be needed; string syntax is now universal, but
other syntax is not - e.g., the keyword syntax of
Common Lisp is incompatible with the proposed keyword syntax for Scheme).

The notation for controlling all this is to be determined.
Here’s a start.
Originally, I looked at implementing another # character option.
If you want to do that, you might look for
sharpsign options that no one one currently uses, examining sources such as
the Common Lisp Hyperspec
(particularly the
sharpsign section),
the
R5RS Scheme specification,
the
GNU Emacs Lisp Reference Manual,
as well as less-common systems like
NewLisp.
Using #I for infix is tempting, but
Scheme uses #i for inexact; it’s not clear if #I
would get interpreted the same way on some systems, but it’s not worth risking
and might be confusing to developers anyway.
I especially considered using #/ to mean "infix".
The #@ combination is used by Emacs Lisp, so we should avoid that.
However, Scheme only supports one-character lookahead; if you want to
be able to call the "ordinary" Scheme reader for other #-beginning
constructs, there isn't an easy way to do that - because to read the
character after #, you have to consume the #, making that more
difficult to do.
So I don't think beginning with # is a good idea.

I’ve looked for some possible notation, under the presumption that this
is a future standard format common to Common Lisp and Scheme
(so we need something unallocated by either).

A simple approach might this naming convention, with function-call-like
macros:

nfx(...): Everything inside is interpreted as infix if it can, recursively.

unfx(...): Everything inside is NOT infix, recursively.

nfx1(...): The immediately contained inside is infix, and it is an
error if it is not.

unfx1(...): The immediately contained expression is not infix.

Each of the above could either accept a single list (in which case it's the list that is being referred to), otherwise it is the expression itself as
the rest of its parameters that is being referred to.
Since any infix expression must have at least three parameters
(2 if unary operators are used), this isn't ambiguous.

We could interpret unfx(...) around a second parameter to quietly
disable infix processing.
Alternatively, we could have a different function/macro name such as
"as(...)". This would Treat the inside (function) as-is, so it won't be
considered an infix operation (use this if an infix operator is the
second parameter, but you don't want the expression to be
considered infix).
Here's an example:

defun as(+) (left right) ...

One interesting challenge in doing this is recursion in the read function.
I decided to implement much of the processing in the reader, and
to have the processing go "as I go" in the reader,
rather than have the reader automatically add nfx() in front of expressions
and then have eval() fix it up by calling a macro.
Since the reader is called on ordinary data in s-expression format,
it is very inconvenient to automatically have the reader
add nfx(), etc. calls!
Yet if there is no outermost call to "fix up" the expression,
then the outermost parameters would be defun, define, or other terms
that would not handle these macro calls correctly.
And you dare not remove things piecemeal - read can be called recursively,
so if you call read, remove things partly, but then later all "fix up"
all the way down again, you can "unfix" things.
This is one reason to have a separate "as" function, so that these
different forms can be differentiated.

Note that “(s (b))” is interpreted as you might expect during
sweet-expression processing, but (s(b))
is interpreted as the s-expression ((s b)), not (s (b)).
In practice this is not a problem - nobody likes the
format (s(b))
for s-expressions, and not all s-expression processors would even accept them.
More also needs to be done to describe their interaction with macros;
this is a critical area and hard to get right, but since there is
relatively little that is changing, this may not be so bad.

A simple program could transform an existing program
to sweet-expression format, including its comments,
and with any necessary markers to control infix interpretation.
I expect that the need for
controlling infix interpretation will be exceedingly rare, so
the results should look very nice.

Infix implementations

Many, many people have implemented infix processors.
Here are some, not including the
many larger notational systems (like IACL2) that have an infix notation
built into them:

Alan Manuel K. Gloria’s “infix notation macro”
is extremely promising.
His approach is to create a macro “nfx”, and then put spaces
around everything. From then on, everything inside transitively
can use either infix or prefix notation.
It detects which lists are infix by examining the second parameter,
and determining if it is in a list of infix operators.
At this time the license intends to allow arbitrary use, though
not in a clear legal manner.

Mark Kantrowitz wrote an infix reader macro that is typical of
many such packages (dated Jan 18, 1995); his runs on Common Lisp.
It is often mentioned, though its restrictive license makes it
useless to many.
It “allows the user to type arithmetic expressions in
the traditional way (e.g., 1+2) when writing Lisp programs instead of
using the normal Lisp syntax (e.g., (+ 1 2)). It is not intended to be a
full replacement for the normal Lisp syntax. If you want a more complete
alternate syntax for Lisp, get a copy Apple’s MLisp or Pratt’s CGOL.
Although similar in concept to the Symbolics infix reader (#<DIAMOND>),
no real effort has been made to ensure compatibility beyond coverage
of at least the same set of basic arithmetic operators. There are several
differences in the syntax beyond just the choice of #I as the macro
character. (Our syntax is a little bit more C-like than the Symbolics
macro in addition to some more subtle differences.)
It is not open source software; derivatives are allowed, but
no fees or compensation may be charged for “use, copies,
distribution or access to this software”, which probably forbids
Linux distributors from distributing it
(it is in Debian’s “non-free” section for this reason).
Here it is.
Norvig extended it.
With this, you can write #I(3+4).
It supports all sorts of precedence rules, etc.
However, its restrictive license makes it useless to most people;
it cannot be distributed by many FLOSS vendors, it cannot be used
in commercial settings, and since there
is no hope of commercial support for it, its code is unusable to most people.
However, it is a good idea that
with a fresh re-implementation might be practical for some
(after all, this package is a re-implementation itself).

”mlish” is an infix syntax frontend for Scheme (from the same person who made “indent” ).

The
Gambit Scheme implementation reader includes the
SIX (Scheme Infix eXtension).
“The backslash character is a delimiter that marks
the beginning of a single datum expressed in the infix syntax ...
the backslash character escapes the prefix syntax temporarily
to use the infix syntax.
For example a three element list could be written as ‘(X \Y Z)’,
the elements X and Z are expressed using the normal prefix syntax and
Y is expressed using the infix syntax.
When the reader encounters an infix datum, it constructs a
syntax tree for that particular datum. Each node of this tree
is represented with a list whose first element is a symbol
indicating the type of node. For example, ‘(six.identifier abc)’
is the representation of the infix identifier ‘abc’ and
‘(six.index (six.identifier abc) (six.identifier i))’ is the
representation of the infix datum ‘abc[i];’.

guile-arith
is a module to demonstrate how to build a parser in the
Guilde version of Scheme, but is very immature.
Its web page says “this is not meant to be used.”
It builds yacc/lex parsers into guile C modules.
It has a demo, supporting notations like this:

Putting it together: Sweet-expressions

Obviously, there are lots of different ways to try to make
Lisp-like languages easier to read.
So let’s combine them.
In particular, I’ve identified a particular set of options that I
think are tasty.

Sweet-expressions: A basic definition

I have developed a combination I call “sweet-expressions”
that I hope will be a good solution.
In this notation, normally-styled s-expressions will typically
work “as normal”,
but various extensions are added that make it possible to create
easier-to-read programs:

We’ll support indentations using augmented I-expressions.
An indented parameter with no following parameters (on the same line or
via indentation) is considered a symbol, so a function with no
parameters must be invoked by surrounding it with “(...)”
or terminating its name with “()”.
We'll augment it by ignoring blank lines at the beginning of an
expression, and by immediately executing a term that begins at the
left edge and is immediately followed by newline
(to make interactive use pleasant).

By default, any expression is an infix expression
if the first term (the function name) is not an infix operator,
its second term is an infix operator,
and it has at least three terms;
otherwise it is a traditional prefix expression.
Infix expressions must have an odd number of
terms, and the even parameters must be the identical infix operators.
So, (2 + 3 + 4) is okay, but (2 + 3 * 4) is not; you must use
(2 + (3 * 4)).
That way, you only need to look at the first two terms of any
expression to see if something is infix or not.
If all the (odd-numbered) infix operators are the same, they are chained
into a single infix operator, so (2 + 3 + 4) becomes (+ 2 3 4).
Note that in a name-prefix form, it's the parameters that are considered,
so f(2 + 3) becomes (f (+ 2 3)).
Lists where the even-numbered parameters are
different are considered non-infix (at least if any of them aren't operators); that
will mean that more "traditional" Lisp s-expressions will be read
without meaning change, even if they have an operator in the second position.
You must separate each infix operator with whitespace on both sides.

Previously I planned to use built-in precedence rules
if the infix operators differ, but
after some experience I'm currently thinking that it'd be better to
not support precedence at all.
Precedence is nearly universal among programming languages;
they're very useful, and only a few infix-supporting languages
(such as Smalltalk) lack them.
But in the typical use cases of a Lisp-like language, precedence has some
significant downsides that are irrelevant in other languages.
First, let's talk about a big advantage to not supporting
precedence in sweet-expressions:
It makes the creation of every new list obvious in the text.
That's very valuable in a list processing language; the key advantage of
list processing languages is that you can process programs like data, and
data like programs, in a very fluid way, so having clear markers of new lists
like parentheses and indentation is very valuable.

What's surprising is that precedence turns out to have a host of problems in
specific context, leading me to believe that it's a bad idea for this usage.
Precedence is very useful in general (and it could be added for
special cases), but adding precedence rules to a general-purpose list expression
processor creates a slippery slope of complexity.
You can trivially create commands that create new infix operators, and
express their precedence and direction - no problem.
But you're using a Lisp-like language, the use case tends to be different,
with objects shifting between being data and being code.
Often you're working at multiple levels and meta-levels of input -
and it's not likely that you'll want their precedences to be the same.
Yet trying to handle that turns out to require very complex management of
the reader routine.
Also, combining code from different sources can be a nightmare if
they have different precedence values.
In short, having precedences vary by meta-level creates complexities
you just don't want to deal with.
That suggests moving to a completely static, pre-canned set of static
levels is required.
Okay, but someone has to predefine that set.
That turns out to be really hard; the meanings of various infix operators
are not cast in stone, so it's difficult to preset the precedence
levels meaningfully (since you don't know what the operators mean).
The one obvious case is that * and / have higher precedence than binary
+ and - basically everywhere.
That particular rule could be built-in, but if that's all you
build in, it's not clear that it's worth the trouble.
Especially since the idea that completely forbidding mixing the operators
means that we make the mapping between lists and the textual representation
much clearer.
It's easier to add precedence later, if that turns out to be important after
more experimentation.
But after the current experimentation
it appears that precedence simply isn't worth it in this case;
it creates complexity, and hides where the lists begin/end.
Precedence is great in many circumstances, but not in this one.

Of course, you can write code in some Lisp dialect to implement a language that
includes precedence.
Many programs written in Lisp, including PVS and Maxima, do just that.
But when you're implementing another language, you know what the operators
are, and you're probably implementing other syntactic sugar too, so
adding precedence is a non-problem.
But sweet-expressions are intended to be a universal representation,
just like S-expressions are, so their role is different... and in that
different role, precedence causes problems that don't typically show up
in other uses.
Not supporting precedence turns out to be better in this different role.

This approach supports binary infix operators, but doesn't include any
special mechanism for unary minus.
For unary minus, you must use the name-prefixed
form “-(x)” or the prefix form “(- x)” .
By only handling binary infix operators,
it is much less likely for an expression to be “accidentally”
interpreted as an infix expression when we did not mean it to.
For example, you can even have quoted lists of infix operators,
and freely mix s-expressions that begin with infix operations, without trouble.
But there are a few cases where automatic infix interpretation
while reading can be troublesome, so there will be ways to control it.
Note that an expression is not infix if you do
just about anything with the second parameter, or if you use
an infix operator as the first operator.
This means that:

(2 + (3 * 5)) <==> (2 + (* 3 5)) <==> (+ 2 (* 3 5))

We’ll allow the use of name-prefixing, so NAME(x y) is the
same as (NAME x y). NAME must be followed
immediately by a “(” without whitespace between them.
For the infix rules purposes we’ll
perform this transformation after doing infix expression handling,
so if the cuddled expression is an infix expression,
it's a single-parameter function
with an expression in infix form (you can use parentheses to cause a
multi-parameter interpretation).
Here are some examples:

I call this combination “sweet-expressions” - by adding
syntactic sugar (which are essentially abbreviations),
I hope to create a sweet result.
In the few cases where the sweet-expression is awkward, just use the
usual s-expression notation (with infix escapes if necessary) --
well-formatted s-expressions are still legal.
The result supports quoting and backquoting, is very
compatible with typical s-expressions, and should work well with most macros.

Controlling sweet-expression interpretation and next steps

There should be standard ways
controlling this, particularly for infix interpretation.
Here are some early ideas about this.

For the moment, I plan to just implement an "as(..)"
expression that can be cuddled around the infix operator (the
second parameter).
This must be removed at read time or very early on during
eval time, because other macros won't know what to do with this.
In other words, to ensure that "(defun as(+) ...)" will work correctly,
we need to process as(..) before we process "defun".
(Alternatively, use name-prefixing: defun(+ ...) works as well).
I plan to implement this at read time for now, so that all other macros
will work perfectly normally.. by the time normal macros get the
expressions, they only see standard s-expression format.
(Instead of as(...), I could use some notation beginning with #, but this
turns out to be awkward in Scheme (because handling # as a secondary reader
is painful), and it also turns out to be awkward in Common Lisp
(because functions have a different name space, so you have chained #
operators that look confusing).

Controlling infix processing is probably the most troublesome issue.
For new code, it's more convenient to presume infix unless disabled -
you really only need the occasional "as".
Yet for maximum compatibility with existing code, it's better
to expressly turn it on and to have control over it, so you want
default off, with various ways to enable it.
I believe this is very solvable.

For now, I'll just implement the "as" version, which is enough to
see how it works.

A related issue is defining what the infix operations are.
It seems clear to me that = (equality), <=, +, and so on should
map to their prefix versions.
I also think that && should map to "and", and || should
map to "or" -- these are common conventions in other languages, and
thus should be easy to read.
I think "**" should become "expt" (exponentiate).
But there are many other prefix operations with text-based names that
should be defined (by defining a function or macro) that aren't
as obvious.
All these mappings can be done using regular definitions, without
anything special happening in the language.

First, assignment (setf in Common Lisp, set! in Scheme).
Gloria suggests using "=" for setting values (setf in Common Lisp),
but I think that's a bad idea.
After all, "="... that's already the equality test, and having the same
symbol mean different things is likely to be disastrous; C already has
trouble simply because "=" and "==" look too similar.
The phrase ":=" has its problems too; it'd be considered "=" in the
keyword space (confusing!) in Common Lisp, and doesn't prefix well.
A plausible alternative is "<-", so "x < 5" reads as
"set x to the value 5". This also combines well with C-like
operators, e.g., "+<-" could be "add to"... that means that
"x +< 5" would be the same as "(setf (+ x 5))" in Common Lisp
or "(set! x (+ x 5))" in Scheme.

What should ++ be... increment, string concatenation, something else?
There should probably be a string concatenation infix operator,
but I'm not sure what to name it.

I would use "->" for implies, and
"<->" for iff (if and only if).
One danger is that mixing "->" and "<-" in the same expression
can be very confusing, but I don't see a simple alternative.

A "standard list" of common infix operations and their
meaning would make it easier to read and understand such expressions.
But this is difficult to do, and in case only makes sense once there's
more experience with sweet-expressions.

What about writing a list back out?
It'd be nice to have a written form that could always be read back
in as a sweet-expression.
An optional writer would be nice - one that added as(...) where needed,
and which optionally showed infix as infix, or optionally used indenting.
It could be made really obvious that it was the special writer by
using name-prefixing where possible -- that would be a glaringly obvious
difference from the traditional s-expression writer.
So it would write stuff like:

if((x = 5) display("Five") display("Not five"))

Alternatively, the writer could normally write without
using name-prefixing:

(if (x = 5) (display "Five") (display "Not five"))

It could also just write traditional s-expressions, but adding the as(...) in the
few places where it's necessary; that'd be good for debugging:

The two samples above are also legal sweet-expressions.
Why?
Well, the initial “(” disables I-expression processing, so even if
the formatting is ruined, we’re simply processing as usual.
There’s no function name ajoined to the right with “(”; normal
formatting rules wouldn’t even consider that.
As is usual for Lisp code, “infix” operators are listed first (not second)
so there’s no infix operator as a second parameter to cause a change.

But what if we take advantage of the new capabilities of sweet-expressions?
A pretty-printer could even do this automatically, as a way to start.. and
the results look fantastic.
Here's the Common Lisp version:

defun factorial (n)
if (n <= 1)
1
n * factorial(n - 1)

And here's the Scheme version:

define factorial(n)
if (n <= 1)
1
n * factorial(n - 1)

What’s even more interesting, we haven’t lost any power.
We can still use quotes, or quasiquotes with comma-lifting.
We can still have functions that return functions, and so on.
We can also mix up regular s-expressions as part of sweet-expressions, or
even mix things up back and forth:

can be expressed the same way in sweet-expressions.
But if you use I-expression indentation, they can become
staggeringly more obvious to the uninitiated:

cond
group (condition1)
result1
group (condition2)
result2
group t
otherwiseresult

Since the infix operations are “merged” if they are all the same,
this works as it would in mathematics in a sweet-expression:

if (0 <= x <= 10)
(dostuff) ; x is between 0 and 10 inclusive.
(dontstuff) ; x is not between 0 and 10 inclusive.

Discarded Variation: Initial "(" on block disabling more

In the definition of Sweet-expressions above,
any "(" disables indentation processing, but name-prefix and
infix process continue to work. This is mostly compatible with
existing Lisp code, but not completely. Name-prefixing is unlikely to cause
a problem in well-formatted code, but expressions that look like
infix expressions can certainly occur, and that could be a problem.
Basically, it'd be very nice to be able to disable infix
processing in some reasonable way.

I thought of one variation, which I describe here.. but then explain
its problems.

One solution is to modify sweet-expressions (as described above) so that,
if the first non-whitespace character of a newly-read block
is "(", more of sweet-expressions could be disabled.
We could disable infix processing, at least. Of course, once you do that,
why not just disable everything and process it as a traditional s-expression?
The advantage of doing this is that you can call the "original" read routine,
with all of its local extensions.
The rules would then get a little more complex to explain, but you also get
massive backwards compatibility.
If the first non-whitespace character is a special character
(mainly ";", "'", and ","), process them and try again.
Otherwise, after than initial character
normal sweet-expression processing occurs... so an embedded "("
in expressions like load("name") or if (n < 3)
would work normally.

There are disadvantages to this approach, too.
One sad thing is that parentheses-surrounded infix calculations would no
longer work at the topmost level, e.g., "(3 + 4)" would work
inside a block, but not as the topmost block.
You could enter "3 + 4", but that would require two "Enters" if we
keep the rule that "multiple terms on the initial line mean wait for more lines".
The user could call using name-prefixing any function that
returns its first parameter, e.g.,
if "calc" is a function that returns its first parameter, then
"calc(3 + 4)" could do the calculation.
An alternative for the "usual case" would be to
add one more rule: if there are 3 or more terms on the first line,
and it matches the infix pattern, then return immediately after the first
line is entered - so "3 + 4" now works on the command line.
I like that idea; it seems unlikely to happen by accident, and it's even
simpler than "(3 + 4)".
You'd still need a "calc()" function if on the initial line
you have an infix operation whose first parameter is also infix, e.g.,
"(5 * 6) + 3" won't work - you'd have to use "calc(5 * 6) + 3".
Such special-casing is not as "clean", unfortunately.
The advantages in backwards-compatibility and ease-of-use are significant,
though, so I'm leaning towards that change.

So here's how the rules would look:

If the first non-whitespace character of a new block is "(", it's
a traditional s-expression; it may be prefixed by arbitrary numbers of
blank lines (which may include spaces or tabs),
";...." comments, quoting ('), quasiquoting (`), or
comma-lifting (,).
Any spaces or tabs after the expression will be consumed before returning,
to increase compatibility with older systems
(an s-expression, followed by a bare atom, will be processed like a traditional
Lisp processor would process it).

Otherwise, if (after skipping content-free lines) the line begins
without any space or tab, and the line is either
one complete term like load("...") or one complete
infix expression like 5 + 6 + 7, it's considered a one-line sweet-expression;
it's returned (and run) immediately at the end of the line.
This special rule makes interactive use much more pleasant.
Without this special rule, we have to
enter extra blank lines all the time for sweet-expressions,
even when it's "obvious" that a new line isn't needed.
Note that more lines will always be requested
if there are any unclosed parens.

Otherwise, it's a multi-line sweet-expression.
Basically, putting tabs/spaces at the beginning of a sweet-expression
forces a "multiple lines" meaning.
While there's an open paren, indenting is ignored.
If there's not an open paren on the first line, the second line's tab/space indent
sets the "smallest indent"; any the line with less
indentation (including a blank line) ends the sweet-expression block.

The blank-line rule is needed to make interactive processing pleasant.
I think a space-and-tab only line should be considered the same
as an immediate-return blank line;
you could make lines with only spaces and tabs have a different meaning,
but that's certain to cause mysterious problems (since they'd look
the same).
Lines which include only a ;-prefixed comment should be completely
ignored, even if they have less indentation, so that they
can be included without ending the block.

Note: If the first line was indented, but the
second line is not indented at all, then the
previous line is processed and returned, with the second line unconsumed.
This deals with weird formats like:

hello
(more stuff....)

Such formats were probably intended to be interpreted as traditional
s-expressions (they make no sense as multi-line sweet-expressions),
so we'll interpret the first line as a traditional format to increase
backwards compatibility.

The rules are complex, though.
What's disturbing is that "(3 + 4) * 5" works completely differently
from "3 + (4 * 5)"; that's the sort of inconsistency that makes a
notation a burden instead of a pleasure.
One complication is that the "read" function needs
to return when it's read an expression, and ideally doesn't store internal
state of whitespace alrady read.
That means that recording the indentation of the initial block should be
avoided, if it's reasonable to do so... otherwise we have to record
that information.

In short, striving for backwards-compatibility, ease-of-use at the
command line, ease-of-use for programming, readability, and
full list processing without special-casing a lot of syntax involves
tradeoffs... my goal is to find the "best" trade.

Variations: Using different grouping symbols

There are times when you don't want infix processing
(or want to control it specially).
Also, it's clearly possible that reading an existing s-expression,
but using infix-interpretation, could occasionally silently change
the meaning of that expression.
And that is obviously of concern to some.

In discussions with Alan Manuel Gloria, another idea popped up:
really using multiple different grouping/list markers.
The original discussion noted
using [] for grouping/list initialization instead of ().
Basically, [...] would start a new list and
turn off indentation, but continue name-prefix and infix processing.
In contrast, () would at least disable infix processing.

Although the original discussions used [...], I tried out various
experiments and think that {...} is a better choice.
The problem is that [...] is used in mathematics, and many programming
languages including Python and Haskell, to mean a literal list.
But this is exactly what we don't mean - instead, we want a
"grouping" symbol that means that we specially interpret what's inside
(in particular, that we interpret infix operators).
In contrast, all languages with C-like syntaxes (C, C++, Java, C#) have
trained folks to read {..} as possibly meaning a block, and in mathematics
{...} can denote sets (again, special interpretation used for what's inside).

So, for the moment, let's presume that unprefixed (...) will indicate
"no infix" (since this is much more backwards-compatible with s-expressions).
There are actually two alternative ways to interpret (), if it's to be
distinct from {} grouping.
One is to say that () disables only infix, but you still have name-prefixing,
and you can switch back to infix inside by using {}.
Another is that () switches to pure s-expression processing.

There's a big advantage to saying that
() switches to pure s-expression processing
(disabling both indentation and infix processing, as well as indentation).
If you define it that way, the sweet-expression reader
can immediately call the original built-in (read) routine instead.
That's important - while implementing sweet-expressions, I've found that
there's a lot of implementation-unique read processing that you have to
try to simulate if you want ALL the functionality of the underlying system.
(E.G., standard Scheme doesn't have |..| constants, but some implementations
do, so to read stuff for their Scheme you have to implement these new
oddities.)
That makes it essentially impossible to write a portable implementation
of sweet-expressions that retains the capabilities of the underlying system.
As a side-benefit, most s-expression files would be just readable as-is...
which I admit is NOT a 'most important' criteria, but it'd be nice to have.

On the other hand, sometimes you'd like to say "no infix or indent processing",
but be able to use name-prefixing, and switch to infix inside it.
Guess what - there's a symbol pair already with that meaning: [..].
So [f g(x) + h(x)] is interpreted as (f (g x) + (h x)), without any
infix operator in sight.
Let's say that [..]
has this meaning as long as it's not name-prefixed;
that way, we can later give expressions like name[i]
another meaning.
If we use [...] to mean "list, no infix" and {...} to mean
"list, with infix", you can then write sweet-expressions without (...) at all.
This means that we could then give the reader a choice of how to handle
(...)... as grouping, like {}, or as no infix, as [].
The latter is more backwards-compatible, and so I'd expect that to be
the default; the former is much more similar to other languages.

I think it'd be nicer to start with "infix as the default"
(maybe a callable parameter if you want something else).
That's a reasonable default simply because MOST people are more
comfortable with it. Since ANY open paren disables the infix until its
matching paren, users that use ordinary s-expressions might not
even notice sweet-expressions.
It's not like normal Lispers type in "3 + 4" and expect it to work :-).
So it won't harm those who want s-expression prefix, but it
will help those who want infix.
That might make sweet-expressions more likely to be accepted;
you could enable them, and yet for most people they wouldn't even
notice the difference (except for the addition of new abilities).

The rules are simple: When unprefixed, use {} for grouping, which disables
only indentation; use [...] to disable indentation and infix;
use () to disable everything (infix, name-prefix, and indentation).
A prefixed {} or () is a function call, and only disables indentation.
We'll reserve prefixed [...] for future use.
This use of {} instead of () for grouping infix operators like {4 + {3 * 2}}
is _slightly_ nonstandard, but not terribly so, and lets us be
completely backwards-compatible with s-expressions.
That's pretty compelling.

What about name-prefixed forms? Even if (f x) disables infix processing,
that does not mean that f(x) must do the same.
It's consistent in one sense if it does, but f{x} looks really funny while
f(x) looks very conventional.
I'm presuming, for the moment, that unprefixed (...) will really only be
used for backwards-compatibility.

So now we're back to nice forms, without any calc()-like irregularities:

There's another interesting benefit too. This would mean that
sweet-expressions would be full of {...} when infix is used...
which would make them even easier to tell apart from s-expressions.

Interestingly, this also makes it easy to use special "infix" macros
(often written as "nfx").
Just tell users that "for advanced infix processing, use (nfx ....)
and then follow these infix rules (which can then support precedence
levels, defining infix operators, etc.)".
So now you can have a choice:
(1) a built-in "base" system doesn't need to be told about operators,
precedence, or associativity, but you have to group all different
operators using [] and you can only use punctuation as infix operators... and
(2) advanced "nfx-like" functions, but one where you have to
define all that stuff.

In theory, a program written using Scheme RSR6 might use [...] to
mean the same as (..). But I bet almost no one actually DOES that.

This switch to using {...} everywhere for grouping, and (...) for
straight s-expressions, is very tempting.
The factorial example becomes:

It's easy to accidentally use parens when you meant to use {...}, particularly
in expressions like {4 + {3 * 2}}.
But it's still tempting to make this change,
since this makes the entire notation even more backwards-compatible,
as well as giving it an easy (and clear) way to disable infix.

Since this gives an easy way to disable infix, this does raise the
possibility of allowing "and" and "or" as built-in
infix operators. After all, now infix operators are easy to disable.
On the one hand, it's horribly inconsistent to have "and" and "or" built-in
as infix, since all other infix operators are punctuation.
On the other hand, "and" and "or" are essentially supported by all major
Lisps.
Also, these two operators are often treated specially;
provers often treat them specially, and some Lisps treat them as
short-circuiting operations (so in some since they are already
very special).
A typical dilemma: You don't want many special cases, but if a special
case really makes something easy to use, it may be worth it.
And of course, once "and" and "or" are built-in, along with relational
operators and + - * /, perhaps some built-in precedence does make sense.

In the end, the best approach may be to choose the "most easily changed"
decision.
If precedence is not supported at first, then adding it later is
easy (since any written code won't use precedence).
On the other hand, removing precedence (or fixing precedence levels)
is hard/impossible.
Similarly, if () disables sweet-expressions entirely, then code won't be
written inside () that uses sweet-expressions... so it won't be a big deal
to add that functionality later.

It's really too bad about (...).
In any other language, (...) merely groups expressions, but if we're
serious about backwards-compatibility, then an unprefixed
(...) needs to turn off infix processing inside it.
If you don't need backwards compatibility, accepting unprefixed (...) for
simply grouping (without disabling infix) sure looks nicer.
It's too bad that there is a big trade-off here, I'd like the benefits of both.

Ideally, any data format would not be easily misread, and it would not
be easy to make a small mistake with disastrous consequences.
This is less likely if any minor change to an expression would make it
illegal.
Any format that can read both original S-expressions and a new format will
have some issues in this regard, simply because there are many more
expressions that were once illegal but are now legal.
A particular problem is that in some cases, expressions could be interpreted
as being infix but were not intended to be infix.
However, I believe these are very rare.
Conversely, expressions that are intended to be infix might not be
(e.g., because of a failure to surround infix operators with spaces); I
think that with experience this goes away.
The indenting is a non-issue; since this is only meaningful outside of
parentheses, any existing Lisp code will be inside parens, and
indenting like this is common practice, there's no real issue.
The prefixed names are unlikely to be in real code; there's a
risk that people will accidentally put spaces between the name and the opening
paren, but I think it's manageable.

If there's a really major risk that a misplaced space would immediately cause
your deep space probe to crash, perhaps a Lisp-like notation isn't the
best idea for your application.
Even regular s-expressions have similar problems; a space anywhere in an
s-expression separates parameters, and there's little compile-time checking
of the kind expected for such circumstances.
Lisp-like languages don't do a lot of compile-time checking in general
(of the kind expected for an "any mistake is disastrous" application).
Specialized notations that make it hard to make that kind of
mistake, and easier to
detect every mistake at compile time, might be better (Ada comes to mind).
But in many circumstances, that's not the primary need; if you need to do a pile
of list processing, including processing/analysis of language,
a list processing language makes sense.
For example, analysis of a critically-important program might be
a perfectly fine use for sweet-expressions.

Another problem is when there are multiple ways to express something, it
can be hard to choose between them. In particular, you can use
name-prefixing or indentation to implement a function call, as well as
traditional s-expressions - which do you choose?
And when displaying a list, which do you choose?
I think "name-prefixing for short expressions, and indentation when their
parameter lists get long", is a reasonable rule of thumb when writing code
where you intend to use it as a function call. If it's just a list of
information, a traditional s-expression shows it clearly.
I would probably use traditional s-expressions to write the list back out;
that is clear, and can help people avoid misunderstandings and misuse.

There are lots of other options, but I only have so much time.
In any case, I’ve investigated a lot of options, and I think I’ve
found a plausible combination.
What I have here is a paper design that looks like it should work.

I think the next step is to establish a project where people can
discuss this, and fix it based on those comments.
Then, after a period of discussion, create some FLOSS code
to implement it
that is widely-usable (say under the MIT, BSD-new, or LLGPL license).
One of the first things to implement are pretty-printers for this
format, so we can see what big programs look like in this
format (and see its problems).
Once the format is updated through experience coding it, I would want
the resulting programs would be available to all.
Multiple implementations will be necessary (for Scheme, Common Lisp, etc.)
and they will need to support several different usage scenarios:

the readers must accept sweet-expressions and use them to create
s-expressions -- and must be able to work as
a callable translation function, a preprocessor, and
in a read-eval-print loop (supporting other macros).

the writers of sweet-expressions
must support many pretty-printing options when taking s-expressions
and generating a “pretty” version.

My current thought is to develop it in a subset of Scheme, which can then
be easily translated to ACL2, Common Lisp, and so on
(Scheme has far fewer operations, and is much stricter about types such as
booleans, so it's easier to translate Scheme to something else instead of the
other way around.)

Finally, after experimentation and testing in the real world,
the final form of sweet-expressions could be set and released to the world.

I've created a SourceForge project named "readable" -
the project's role is to discuss ways to make s-expressions more readable,
create at least one (sweet-expressions), and promulgate FLOSS code
to support readable representations of s-expressions.
I could possibly create "sweetexpr" for specifically
working on sweet-expressions, or work via the Scheme SFRI process.

Appendix: BitC

I recently encountered the new language
BitC,
a language devised to support low-level programs
(e.g., operating system kernels) that are easy to mathematically
prove correct.
(I learned about this while writing my paper
High Assurance (for Security or Safety) and
Free-Libre / Open Source Software (FLOSS)... with Lots on Formal Methods.)
BitC combines ideas from Scheme and ML, including an
emphasis on functional programming and tail recursion,
with low-level capabilities like those of C.
In particular, it’s designed so that you do not need to
do heap allocation (e.g., “malloc” or “new”) or garbage collection
to create useful programs, a property that functional programming
languages generally do not have.
This is very interesting work, and BitC version 0.9+
uses an S-expression-like notation.
The reason it does, intead of a more common notation,
is because the authors want to make it easy to prove them correct.
One way to prove them correct is to use reasoning where program
fragments are repeatedly replaced by other fragments -- and
since s-expressions are very regular, s-expressions simplify the process.
They hope to reuse some of ACL2’s capabilities, which is written in
Common Lisp and expects s-expressions as input.

The problem is that s-expressions are still not easy for humans
to read, and BitC’s creators are well aware of that.
Also, while Lisp programmers routinely “write programs to write
programs”, here the application is different -- generally
a theorem prover is doing the rewriting, and displaying LOTS
of output from text generated from a program.
This means humans will be doing lots of reading, possibly of
long, complicated structures;
what you want is an input and output that is simple for the human to read
and unambiguous.
What’s more, this is a new language, not Lisp, so trying to stay
compatible with Lisp for compatibility’s sake is not necessary.
Indeed, the BitC folks had already strayed from the strict s-expression
camp; they had included a few abbreviations, like an infix “:” to
indicate types (the abbreviations quietly transform to regular
s-expressions).
Keeping the power of s-expressions (e.g., the regularity
of processing) would be advantage for BitC.

In an email I received from Jonathan Shapiro, he noted that
“I share your concern about using an [s-expression-based] syntax
[that users may find it hard to use]. The problem is
that we have a bunch of very nasty experience in Isabelle/HOL, TWELF,
and friends with the alternative...”, in part because
“All known examples for quoting programs so that they can be
manipulated as terms in a prover are a complete disaster if the
[programming language] is infix.”
He continued,
“When we get around to dealing with the logic language, it will become
clear that treating program fragments as terms is vital. In LISP this is
quite natural: ’(eval 1).
... ACL2 exploits a devious pun between applicative
functions and logic terms... [defining a function]
simultaneously [admits] admitting a procedure and a [mathematical pure]
function...
Because [this pun]
is maintained so successfully, godelization requires no explicit syntax...
Our experience with parsing infix languages as terms has been that (a)
it is truly dismal, and (b) it is incomprehensible to the developer...
even the people at UT could not make IACL2 work well...
One reason that we did NOT adopt the ‘{’ syntax for vectors was the
anticipation that we might use it for infix terms in some future
implementation. I’ve been very careful to preserve the possibility of an
equivalent infix syntax. It doesn’t mean that we would have to drop the
existing lisp-ish syntax.”
He also doesn’t want to spend time devising syntax,
since that is not the focus of his work.

Note that this application needs to be able to manipulate fragments
of program code, repeatedly -- a traditional Lisp strength,
and a traditional reason to use s-expressions.
Yet, we want the s-expressions to be easier to read!

Whether or not the BitC folks pick up these ideas immediately,
I think finding a way to make s-expressions more accessible
is an interesting puzzle, one that would be great to solve.
I suspect there are many people who might be
interested in picking up an improvement, if it really was an improvement.
The ACL2 folks have the same issues, as do many others, too.
So it'd be great to solve it once, and then anyone who wants to use
the solution can do so.

Appendix: Other Notes

Of course, making programming languages readable is much more
general topic, and there's a lot of literature on that.
The literature on parsing is also extensive.
One paper on parsing that I should mention is
Haskell, do you read me?: constructing and composing efficient top-down parsers at runtime,
Proceedings of the first ACM SIGPLAN symposium on Haskell table of contents
Victoria, BC, Canada, pp. 63-74, 2008.
This is implemented by "ChristmasTree":
"Changing Haskell's Read Implementation Such That by
Manipulating Abstract Syntax Trees it Reads Expressions Efficiently".