Semicolons in Go

When we started working on Go, we were more concerned with semantics than syntax, but before long we needed to define the syntax in order to write programs. One syntactic idea we tried was to reduce the number of semicolons in the grammar, to make the source code cleaner-looking. We managed to get rid of many of them, but the grammar became clumsy and hard to maintain as we worked on the compiler, and we realized we had overreached. We backed up to a compromise that had optional semicolons in a few places, a couple of rules about where they go, and a tool (gofmt) to regularize them.

Although we acclimated to the rules and were comfortable with them, once we launched it became clear that they were not really satisfactory. The compromise didn't seem to fit right with most programmers. The issue needed to be rethought.

The language BCPL, an ancestor to B, which is an ancestor to C, had an interesting approach to semicolons. It can be summarized in two steps. First, the formal grammar requires semicolons in statement lists. Second, a lexical (not syntactic) rule inserts semicolons automatically before a newline when required by the formal grammar.

In retrospect, had we thought of BCPL's approach at the time, we would almost certainly have followed its lead.

We propose to follow it now.

Appended to this mail is a formal specification of the proposed rules for semicolons in Go. They can be summarized much as in BCPL, but Go is a different language so the detailed rules differ; see the specification for the details. In short:

- In the formal grammar, semicolons are terminators, as in C.

- A semicolon is automatically inserted by the lexer if a line's last token is an identifier, a basic literal, or one of the following tokens: break continue fallthrough return ++ -- ) }

- A semicolon may be omitted before a closing ) or } .

The upshot is this: given the proposal, most required semicolons are inserted automatically and thus disappear from Go programs, except in a few situations such as for loop clauses, where they will continue to separate the elements of the clause.
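
As a small illustration of that upshot, here is a sketch (the name sumBelow is made up for this example) in which the only semicolons left are the ones separating the parts of the for clause; every statement-ending semicolon is supplied by the lexer:

```go
package main

import "fmt"

// sumBelow adds the integers 0..n-1. The semicolons inside the for clause
// are written out; no line in this function needs a trailing semicolon.
func sumBelow(n int) int {
	total := 0
	for i := 0; i < n; i++ {
		total += i
	}
	return total
}

func main() {
	fmt.Println(sumBelow(4))
}
```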

No more optional semicolons, no more rewriting; they're just gone. You can put them in if you want, but we believe you'll quickly stop and gofmt will throw them away.

The code will look much cleaner.

There are compatibility issues. A few things break with this proposal, but not many. (When processing the .go files under src/pkg with an updated parser, most files are accepted under the old and new rules.)

By far the most important breakage is lexical string concatenation across newlines, as in

    "hello "
    "world"

Since the language permits + to concatenate strings and constant folding is done by the compilers, this feature is simply gone from the language; use a + to concatenate strings. It's not a huge loss because `` strings can still span newlines.
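
A minimal sketch of the two replacements mentioned above, with hypothetical constant names: an explicit + at the end of the line (so no semicolon is inserted after the first literal), and a back-quoted string that spans a newline.

```go
package main

import "fmt"

// greeting spans two lines; the trailing + keeps the lexer from
// inserting a semicolon after the first literal.
const greeting = "hello " +
	"world"

// raw is a back-quoted string literal, which may contain a newline.
const raw = `hello
world`

func main() {
	fmt.Println(greeting)
	fmt.Printf("%q\n", raw)
}
```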

With the new rules, a semicolon is mistakenly inserted after the last element of a multi-line list if the closing parenthesis or brace is on a separate line:

    f(
        a,
        b
    )

To avoid this issue, a trailing comma is now permitted in function calls and formal parameter lists, as is the case with composite literals already.
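
The trailing comma can be sketched as follows; add and firstPrimes are names invented for this example. In each list the closing parenthesis or brace sits on its own line, so the last element carries a comma.

```go
package main

import "fmt"

// add has its closing parenthesis on its own line, so the last
// parameter is followed by a comma.
func add(
	a int,
	b int,
) int {
	return a + b
}

// firstPrimes shows the same layout in a composite literal.
func firstPrimes() []int {
	return []int{
		2,
		3,
		5,
	}
}

func main() {
	total := add(
		1,
		2,
	)
	fmt.Println(total, firstPrimes())
}
```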

A channel send spanning a newline can accidentally become a receive, as in

    a = b
    <-c

Inserting a semicolon after the b changes the meaning from a non-blocking send to an assignment followed by a receive. For this transformation to mislead, however, the types of a, b and c must be very specific: interface{}, chan interface{} and chan. So a program might break but it is almost certain never to succeed incorrectly. We are aware of the risk but believe it is unimportant.

A similar thing can happen with certain function calls:

    f()
    (g())

If f() returns a function whose single argument matches the type of a parenthesized expression such as (g()), this will also erroneously change meaning. Again, this is so rare we believe it is unimportant.

Finally, a return statement spanning a newline is broken up into two statements:

    return
    f()

For this to miscompile, the return statement must be in a function with a single named result, and the result expression must be parseable as a statement.
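
The split can be observed with the go/parser package of a current Go toolchain, which follows the semicolon rule as it was eventually adopted (an assumption relative to the draft in this mail). The sketch below only parses; it does not type-check, and the helper name bodyStatements and the two sources are invented for the example.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// split has the result expression on the line after return.
const split = `package p

func h() (r int) {
	return
	f()
}
`

// joined keeps the expression on the same line as return.
const joined = `package p

func h() (r int) {
	return f()
}
`

// bodyStatements parses src and returns the number of statements in the
// body of its first function declaration, or -1 if there is none or the
// source does not parse.
func bodyStatements(src string) int {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "example.go", src, 0)
	if err != nil {
		return -1
	}
	for _, decl := range file.Decls {
		if fn, ok := decl.(*ast.FuncDecl); ok && fn.Body != nil {
			return len(fn.Body.List)
		}
	}
	return -1
}

func main() {
	fmt.Println(bodyStatements(split), bodyStatements(joined))
}
```

The split form parses as two statements (a bare return and a call), the joined form as one.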

Gofmt's style already avoids all three problematic formattings.

This proposal may remind you of JavaScript's optional semicolon rule, which in effect adds semicolons to fix parse errors. The Go proposal is profoundly different. First, it is a lexical model, not a semantic one, and we believe that makes it far safer in practice. The rules are hard and fast, not subject to contextual interpretation. Second, since very few expressions in Go can be promoted to statements, the opportunities where confusion can arise are also very few - they're basically the examples above. Finally, since Go is statically type-safe, the odds are even lower.

Another language the proposal may evoke is Python, which uses white space for indentation. Again, the story here is very different. Program structure is not defined by white space. Instead, a much milder thing is happening: lists of statements, constants, etc. may be separated by placing them one per line instead of by inserting semicolons. That's all.

Please read the proposal and think about its consequences. We're pretty sure it makes the language nicer to use and sacrifices almost nothing in precision or safety.

Rolling out the change.

Gofmt will be a big help in pushing out this change. Here is the plan.

1. Change gofmt to insert the + for string concatenation. Give it a flag to omit the semicolons but leave them in by default.

2. Reformat the tree with that gofmt: this inserts + for lexical string concatenation but otherwise is a no-op.

3. Update the compilers to insert semicolons. They should then accept gofmt output (with semicolons) and semicolon-free programs just fine.

4. Try things out, revising the specification and tools as required.

5. Once happy, make the gofmt default "no semicolons". Reformat the entire tree.

The formal specification.

The following changes are applied to the spec.

1) New semicolon rules:

a) When the input is broken into tokens, a semicolon is automatically inserted into the token stream at the end of a non-blank line if the line's final token is:

- an identifier or basic literal

- one of the keywords break, continue, fallthrough or return

- one of the tokens ++ -- ) ] }

b) To allow complex statements to occupy a single line, a semicolon may be omitted before a closing ) or }.

2) The interpretation of comments is clarified:

a) Line comments start with the character sequence // and continue through the next newline. A line comment acts like a newline.

b) General comments start with the character sequence /* and continue through the character sequence */. A general comment that spans multiple lines acts like a newline, otherwise it acts like a space.
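
The comment rules in 2a and 2b can be checked the same way with go/scanner from a current Go release (again assuming it reflects these rules; the helper name tokens and the "auto;" marker are invented for this sketch). Comments are skipped by the scanner here, so only their effect on semicolon insertion shows up.

```go
package main

import (
	"fmt"
	"go/scanner"
	"go/token"
	"strings"
)

// tokens scans src and returns its tokens separated by spaces.
// An automatically inserted semicolon is shown as "auto;".
func tokens(src string) string {
	fset := token.NewFileSet()
	file := fset.AddFile("example.go", fset.Base(), len(src))
	var s scanner.Scanner
	s.Init(file, []byte(src), nil, 0) // mode 0: comments are not returned
	var out []string
	for {
		_, tok, lit := s.Scan()
		if tok == token.EOF {
			break
		}
		switch {
		case tok == token.SEMICOLON && lit == "\n":
			out = append(out, "auto;")
		case tok.IsLiteral(): // identifiers and basic literals
			out = append(out, lit)
		default:
			out = append(out, tok.String())
		}
	}
	return strings.Join(out, " ")
}

func main() {
	fmt.Println(tokens("x /* one line */ y\n"))    // comment acts like a space
	fmt.Println(tokens("x /* two\nlines */ y\n")) // comment acts like a newline
	fmt.Println(tokens("x // note\ny\n"))          // line comment acts like a newline
}
```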

3) Replacements:

a) All uses of StringLit are replaced by string_lit.

b) All uses of StatementList are replaced by { Statement ";" }.

4) The following productions are simplified and always use semicolons as terminators. In idiomatic use, the semicolons are inserted automatically and thus won't appear in the source code.

5) The following productions permit optional commas. This enables multi-line constructions where the closing parenthesis or brace is on a new line if the last element is followed by a comma. The optional comma is only new for parameter lists and calls; composite literals permit it already.

Parameters = "(" [ ParameterList [ "," ] ] ")" .

CompositeLit = LiteralType "{" [ ElementList [ "," ] ] "}" .

Call = "(" [ ExpressionList [ "," ] ] ")" .

6) The following productions are gone since they are not needed anymore with the simplified productions outlined in 3) and 4).

StringLit

StatementList

Separator

FieldDeclList

MethodSpecList

ImportSpecList

ConstSpecList

TypeSpecList

VarSpecList

7) The two exceptions about when semicolons may be omitted in statement lists are superseded by 1b above.

> Appended to this mail is a formal specification of the proposed rules for
> semicolons in Go. They can be summarized much as in BCPL, but Go is a
> different language so the detailed rules differ; see the specification for
> the details. In short:
> - In the formal grammar, semicolons are terminators, as in C.
> - A semicolon is automatically inserted by the lexer if a line's last
> token is an identifier, a basic literal, or one of the following tokens:
> break continue fallthrough return ++ -- ) }
> - A semicolon may be omitted before a closing ) or } .
> The upshot is this: given the proposal, most required semicolons are
> inserted automatically and thus disappear from Go programs, except in a few
> situations such as for loop clauses, where they will continue to separate
> the elements of the clause.

Please, don't. I know that making semicolons optional is popular and
is done in a number of popular languages like JavaScript and Ruby,
however I don't like it.

Which do you prefer to read? Which makes the structure of the
calculation clearer at a glance?

Your semi-colon insertion rule would make the first style break
horribly, and would force the second style. Which I think is
significantly worse.

That said, I do like allowing trailing commas in argument lists.
That's good because it makes errors less likely when you come along
and add another element to a list. But I like putting my operators
where my eye can scan down and see them.

It's funny. I typed a response to this defending the change saying that it does what you want. But it doesn't; I misread your mail because I assumed you disliked the first form. Because I do; the second is the one preferred by everyone here. Those trailing operators make it clear that we're not done yet, which makes the code more readable to my eyes.

So to answer your questions: Which do I prefer to read? Which is clearer? The second.

I think these are minor points anyway. Again, it's about having a style, not what the style is. If the consequences of the rules were the other way around, I'd still be happy about them because I'd adjust quickly and not care. Gofmt saves all. Legislated taste trumps endless formatting arguments.

> That said, I do like allowing trailing commas in argument lists.
> That's good because it makes errors less likely when you come along
> and add another element to a list. But I like putting my operators
> where my eye can scan down and see them.

On Wed, Dec 9, 2009 at 6:49 PM, Rob 'Commander' Pike <r...@google.com> wrote:
> - In the formal grammar, semicolons are terminators, as in C.
> - A semicolon is automatically inserted by the lexer if a line's last
> token is an identifier, a basic literal, or one of the following tokens:
> break continue fallthrough return ++ -- ) }
> - A semicolon may be omitted before a closing ) or } .

I think you're making things worse, this is more complicated than
before and for what? The *unbelievable* feature of being able to leave
out semicolons? I thought this was an April Fools' message at first.
Sigh.

Who *cares* about leaving out semicolons? Hordes of C programmers
you're trying to convert? Unlikely, they are used to it! Hordes of
Python addicts? Unlikely since they'll complain about {} right away!
Do *you* really care? I mean is it really making your code a whole lot
better or more enjoyable to write? There are plenty of things to fix
in Go, and you fix *semicolons*?

Don't get me wrong, I really, really like Go. Heck, I do almost
nothing but write Go code these days, for hours, and for fun. But I
strongly believe that you're heading the wrong way by adding more
rules to achieve a goal that's dubious in the first place.

Heck, you could probably have Robert Griesemer put together a clean
Oberon-style syntax in a few hours that would make all this go away.
Or just go for *real* C-style if you still believe that the world
can't deal with capitalized keywords. Either is a lot simpler than a
whole bunch of special cases. Yeah, I don't have to remember them
because of gofmt. Programming languages shouldn't depend on tools,
they should make good sense by themselves.

Just my $0.02 of course. Sorry for the rant, I don't mean to flame
anyone. I mean to flame the topic.

Right. The point is that the newlines are already there. A tiny fraction of statements span lines. Let's make the default support the common case instead of requiring fly specks everywhere to enable the rare case.

> Do *you* really care? I mean is it really making your code a whole lot
> better or more enjoyable to write?

Honestly, it is. Since we started toying with this idea
I've had to go back in and add semicolons to every snippet
of code I've written in mail to this list, and even that has
gotten frustrating.

> Don't get me wrong, I really, really like Go. Heck, I do almost
> nothing but write Go code these days, for hours, and for fun. But I
> strongly believe that you're heading the wrong way by adding more
> rules to achieve a goal that's dubious in the first place.

Believe it or not, this removes rules. Robert played with
this in the parser and the language spec and was surprised
how much simpler things got. I am doing the same conversion
in the 6g compiler right now, and the dead code I'm cutting
away is one of the ugliest parts of the compiler. I'd
forgotten writing it, but boy is it ugly. And soon it will
be gone.

The status quo lets you omit semicolons in a handful of
magic places (basically, at top level and after certain
kinds of }, but not all, and not always). The new proposal
is dramatically more regular, much simpler to parse.

An earlier version of Go did single-file (as opposed to
single-package) compilation; if you wanted to refer to a
type or function in another file or even later in the same
file, you had to forward declare it, just like in C. For the
cross-file references you had to import your own package to
get access to the other files. In practice, this was fine,
it worked, we got used to it, but it was weird and a
stumbling block for new Go programmers. When we decided to
compile a package at a time and throw away all the forward
declarations, it took a couple weeks of work to make the
change in the compiler, because it was breaking a
fundamental architectural assumption in the compiler
(one-pass compilation). When we were done it felt like
maybe that wasn't such a great use of our time, since we'd
fallen behind on other things and hadn't really changed the
language that much. But looking back, it was time well
spent. Ken says it's one of his favorite features. He observes
that you don't realize quite how taxing the forward declarations
are in aggregate until they're gone, because each one costs next
to nothing.

I think we're going to feel the same way about semicolons in
a few months (and it's a much simpler change).

Syntax matters. Otherwise the world would be programming in
Modula-3 (or Oberon?) or Lisp.

> In this case, I'd suggest taking a page from the book of Python and
> making them both an error but allowing
>
> foo := (a +
> b -
> c)
> or
> foo := (a
> + b
> - c)

That's a good idea, but there are a couple places
in the language where semicolons are needed inside
parens (the factored import and declaration blocks);
having a special rule for them would make the change
depend on parser state instead of being purely lexical.
It's a very important distinction, both for predictability
and for simplicity of implementation.
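
Because the rule is purely lexical, parentheses do not suspend it. A sketch using go/parser from a current Go release (assumed to follow the adopted rule; parses and the two sources are names made up here) shows that a parenthesized expression survives line breaks only when each line ends in an operator:

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
)

// trailingOps ends each continued line with an operator.
const trailingOps = `package p

var foo = (1 +
	2 -
	3)
`

// leadingOps starts each continuation line with an operator, so a
// semicolon is inserted after the 1 even though a parenthesis is open.
const leadingOps = `package p

var foo = (1
	+ 2
	- 3)
`

// parses reports whether src is accepted by the Go parser.
func parses(src string) bool {
	_, err := parser.ParseFile(token.NewFileSet(), "example.go", src, 0)
	return err == nil
}

func main() {
	fmt.Println(parses(trailingOps), parses(leadingOps))
}
```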

On Thu, Dec 10, 2009 at 12:56 AM, John Asmuth <jas...@gmail.com> wrote:
> How about allowing a \ to mean "end of line, but please, no semi-
> colon"? Pretty standard.

Backslash-newline is pretty standard in unix to mean "don't end my
line," but I think I prefer the elegance/simplicity of the proposed
new standard. The nice thing about not having the backslash is that
if you decide to join the two lines, you don't need to remove any
characters.

Hmmm. I guess I have to think back to when I first encountered the
style in Perl Best Practices. I thought it sounded silly. Then I
tried it for a day. Then I never looked back. In fact I changed how
I format other languages as well along similar lines, and I've been
very happy with the change. (And frustrated by languages like Ruby
whose mostly-optional semicolon rules prevent that style.)

The difference is that eyes more easily scan down the left-hand column
than they do the right. When you are orienting yourself in unfamiliar
code this makes the structure clearer.

> So to answer your questions: Which do I prefer to read? Which is
> clearer? The second.
>
> I think these are minor points anyway. Again, it's about having a
> style, not what the style is. If the consequences of the rules
> were the other way around, I'd still be happy about them because
> I'd adjust quickly and not care. Gofmt saves all. Legislated taste
> trumps endless formatting arguments.

I've seen and made the "having a style matters more than what it is"
argument plenty of times myself. However I've also become convinced
over the last few years that some stylistic choices really are better.
Not a lot better. But noticeably better. And the effect of
consistently doing something noticeably better adds up over time.

If I walk into a code base that does things differently, the win is
sufficiently small that it is almost never worth the effort to change
the existing style. Even if you discount the certainty of political
battles over it. But given the choice I want the following:

- An indent of 2-4 spaces. Research indicates that people can read
and understand code better with an indent in that range than one which
is larger or smaller than that.

- When formatting lists, a standard that makes it not a syntax error
to copy the last line and add another. In a language which allows
trailing commas, this means with a , on the end. In a language
without, this means putting commas *before* list elements.

- Operators lined up on the left where they are easily scanned. This
is the one I just brought up.

The purpose of semicolons in C/C++ is to allow multiple statements on
the same line and statements spanning several lines. Since those cases
are the minority it means that most of the time semicolons are just
noise.

On Dec 9, 6:49 pm, "Rob 'Commander' Pike" <r...@google.com> wrote:
> No more optional semicolons, no more rewriting; they're just gone. You
> can put them in if you want, but we believe you'll quickly stop and
> gofmt will throw them away.
>
> The code will look much cleaner.

When I first read this I was all like, "WTF?" But now that I've had a
chance to think about it, I realize that was my Java/C/Obj-C etc
experience talking. Now I'm all like, "OK, let's do it."

Please do implement this. I'm all for getting rid of "administrative
debris" and giving the code a cleaner look. (In this respect I like
function signatures in Go better than Haskell, where the ::, optional
=>, and all the -> are tedious and unclear to the novice unaware of
typeclasses and currying).

> - An indent of 2-4 spaces. Research indicates that people can read
> and understand code better with an indent in that range than one which
> is larger or smaller than that.

gofmt does have a mode that uses spaces for alignments and tabs for indentation. We are considering making this the default. With this setting, you are free to change the tab width in your editor to your liking and the code will remain properly formatted - or in other words: changing the tab width won't require re-gofmt'ing. If you want to try this, use the latest version of gofmt as follows: gofmt -spaces -tabindent <files>

> - When formatting lists, a standard that makes it not a syntax error
> to copy the last line and add another. In a language which allows
> trailing commas, this means with a , on the end. In a language
> without, this means putting commas *before* list elements.

This is now possible consistently for element lists of composite literals (as before) and for parameter and argument lists (part of the new semicolon proposal).

Not that my opinion should necessarily carry significant weight, but I
like this for the most part. I like the simplifying effect it has on
the grammar. I especially like the shift from semicolons being
statement separators to statement terminators. I always disliked that
aspect of Pascal. And the lexical inference of a semicolon feels not
unlike the type inference in a statement like:

x := foo()

However, I would voice my support for some clean way to deal with the
general case of a multi-line statement, either a mechanism to escape
the newline or a way to say of a group of lines, "this is a multi-line
statement; turn off the inference for it." (But I'd prefer not to go
to the FORTRAN-style continuation character at the beginning of the
second line.) The case of the multi-line mathematical expression is a
significant one. Most mathematical typesetting I've seen puts the
operator at the beginning of the line, rather than the end. It's that
well-established style that I follow in code, and I know that I'd find
it an irritation if a semicolon rule made that style impossible.

It occurs to me that most cases could probably be handled with a
modification of the inference rule that says a semicolon is inserted
if the newline separates two tokens of the special class, rather than
follows one. But for now I stop short of actually suggesting it for
two reasons. First, I haven't looked to see if that re-complicates
the grammar significantly. Second, it makes an annoying special case
of a line beginning with a channel receive operator. Though while
annoying, the second issue could be addressed by either remembering to
put a semicolon at the end of the previous line or always writing:

White space, formed from spaces (U+0020), horizontal tabs (U+0009),
carriage returns (U+000D), and newlines (U+000A), is ignored except as
it separates tokens that would otherwise combine into a single token.
Comments behave as white space.

Because of the lexical scanning rules (rule 1(a) applied to an
identifier at the end of a non-blank line for the first of two lines
"label[whitespace]\n[whitespace]:"), I believe that the rule for
labeled statements needs the following addition. Any white space
between the label identifier and the colon must exclude newlines.
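
The label case can be checked with go/parser from a current Go release (an assumption: it implements the rule as later adopted). In this sketch, parseError, sameLine and nextLine are hypothetical names; the second source puts the colon on the line after the label, so a semicolon is inserted after the identifier.

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
)

// sameLine keeps the label and its colon on one line.
const sameLine = `package p

func f() {
outer:
	for {
		break outer
	}
}
`

// nextLine moves the colon to the following line.
const nextLine = `package p

func f() {
outer
:
	for {
		break outer
	}
}
`

// parseError returns the parser's error for src, or nil if it parses.
func parseError(src string) error {
	_, err := parser.ParseFile(token.NewFileSet(), "example.go", src, 0)
	return err
}

func main() {
	fmt.Println(parseError(sameLine) == nil, parseError(nextLine) == nil)
}
```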

> Here is a proposal.
>
> - rob, for The Go Team
>
> . . .
>
> With the new rules, a semicolon is mistakenly inserted after the last
> element of a multi-line list if the closing parenthesis or brace is on
> a separate line:
>
> f(
> a,
> b
> )
>
> To avoid this issue, a trailing comma is now permitted in function
> calls and formal parameter lists, as is the case with composite
> literals already.
>
> A channel send spanning a newline can accidentally become a receive,
> as in
>
> a = b
> <-c
>
> Inserting a semicolon after the b changes the meaning from a non-
> blocking send to an assignment followed by a receive. For this
> transformation to mislead, however, the types of a, b and c must be
> very specific: interface{}, chan interface{} and chan. So a program
> might break but it is almost certain never to succeed incorrectly. We
> are aware of the risk but believe it is unimportant.
>
> A similar thing can happen with certain function calls:
>
> f()
> (g())
>
> If f() returns a function whose single argument matches the type of a
> parenthesized expression such as (g()), this will also erroneously
> change meaning. Again, this is so rare we believe it is unimportant.
>
> Finally, a return statement spanning a newline is broken up into two
> statements:
>
> return
> f()
>
> For this to miscompile, the return statement must be in a function
> with a single named result, and the result expression must be
> parseable as a statement.
>
> Gofmt's style already avoids all three problematic formattings.
>
> . . .
>

This seems like a win. (I'd also have been happy to type semi-colons
everywhere, which is what I've been doing thus far.) The current
"sometimes you can drop them, sometimes you can't" rule is more
confusing than it really should be.

> Any white space between the label identifier and the colon must exclude newlines.

This is true in many places. The grammar explicitly
avoids all this complication by being phrased in terms
of semicolons, with the leading caveat that under very
specific conditions, a newline acts as a semicolon.

First of all, I love the proposal in general. Let's get this done in
one form or another. If anyone here thinks semicolons are not a big
deal, you are probably just used to it. Ever get a new chair,
monitor, or keyboard, and notice that for a long time you've been
uncomfortable? Semicolons are just as bad.

Some coworkers of mine have taken that style in SQL. After a few
years of reading their code, I personally can't stand it. Given the
way a comma is used in English, I think commas don't belong on the
*beginning* of anything. (Disclosure -- I suggested starting lines
with semicolons a week or so ago, but I tried it and didn't like it
after all.)

But, to Ben's point, I do like to put arithmetic operators at the
beginning of newlines when it comes to long expressions. Unlike
commas, which never begin a line of prose, arithmetic operators almost
never end a line of written arithmetic. For instance, in elementary
school you learn:

5
- 3

This has several distinct meanings:

A binary operation: "5 minus 3."
A list operation: "The sum of 5 and -3."
A set of steps: "Take 5, then subtract 3."

All of these are equivalent, but they are each useful in different
cases. This written form allows you to think about it any way you
like.

Now consider the alternative:

5 -
3

This form clearly has the meaning "5 minus 3", but I don't feel like
this means "the sum of 5 and -3" anymore, and I definitely don't see a
list of steps. In other words it begins to feel like a parsed
expression tree, and not like arithmetic. It's also harder to
understand what's going on, in the same way a program with semicolons
is harder to make syntactically correct. In other words it's an
unnecessary expense of a programmer's brain power.

The first form is much clearer because it gives me multiple ways to
think about the meaning of the expression. Please allow it in the
grammar.

I would be OK with a terminating "\", or requiring an unmatched
parenthesis or something. You smart guys figure it out. :)

In code, you could write the first form using an accumulator
variable. This clearly makes it into a "set of steps" like I was
hoping for, so it's not like you can't write it that way:

acc := 5
acc -= 3

But, not everybody will think to do that just for the sake of
readability. I expect multiline expressions to be used more of the
time. If the language is a little easier to read by default,
everybody wins.

On Thu, Dec 10, 2009 at 10:55 AM, Kevin Conner <con...@gmail.com> wrote:
> On Dec 10, 1:50 am, Ben Tilly <bti...@gmail.com> wrote:
>> ...
>>
>> Here is an example of a SQL formatting style that meets these rules
>>
>> SELECT s.foo
>> , t.bar
>> , t.baz
>> FROM table1 s
>> JOIN table2 t
>> ON s.some_id = t.some_id
>> WHERE s.blat = 'some'
>> AND t.blot = 'condition'
>> ;
>>
>> Now this may look odd at first glance. But write a hundred complex
>> SQL queries that way and the rationale behind it becomes obvious.
>
> Some coworkers of mine have taken that style in SQL. After a few
> years of reading their code, I personally can't stand it. Given the
> way a comma is used in English, I think commas don't belong on the
> *beginning* of anything. (Disclosure -- I suggested starting lines
> with semicolons a week or so ago, but I tried it and didn't like it
> after all.)

I agree that I prefer to see the comma last. However SQL does not
like trailing commas, and I've too many experiences where I make a
quick tweak to a long-running report that runs multiple queries, run
it as a sanity check, and then trip over the missing comma. Therefore
I've found it valuable to pick a formatting style that eliminates that
error.

Work as a reporting engineer for a year or two, constantly doing stuff
in SQL and you may come to appreciate the practicality of leading
commas in SQL.

On Thu, Dec 10, 2009 at 10:55 AM, Kevin Conner <con...@gmail.com> wrote:
> On Dec 10, 1:50 am, Ben Tilly <bti...@gmail.com> wrote:
>> ...
>>
>> Here is an example of a SQL formatting style that meets these rules
>>
>> SELECT s.foo
>> , t.bar
>> , t.baz
>> FROM table1 s
>> JOIN table2 t
>> ON s.some_id = t.some_id
>> WHERE s.blat = 'some'
>> AND t.blot = 'condition'
>> ;
>>
>> Now this may look odd at first glance. But write a hundred complex
>> SQL queries that way and the rationale behind it becomes obvious.
>
> Some coworkers of mine have taken that style in SQL. After a few
> years of reading their code, I personally can't stand it. Given the
> way a comma is used in English, I think commas don't belong on the
> *beginning* of anything. (Disclosure -- I suggested starting lines
> with semicolons a week or so ago, but I tried it and didn't like it
> after all.)

A similar approach can be taken with function arguments or enum values
(until we get C++0x compilers that allow a trailing comma in an enum
list).

The advantage such a construction has is that in an organization that sends diffs around for code reviews, it makes it very clear what initializers have been added, deleted, or modified from the end of the list. Otherwise, when adding a new initializer (or a new field to the SELECT query in the original example) you have to add a comma to the last item, forcing the code reviewer to look more carefully to see that nothing important was changed.

It's only a small thing, but it made life as a code reviewer just a little more pleasant.

And here is another argument why semicolons should be dumped. Every
introductory book on programming I have read (for a language with
semicolons) contains the statement that semicolons allow you
to put multiple statements on one line.

This is inevitably followed by a paragraph explaining why you should
not use this feature, as it makes programs harder to read and maintain.
So, in the words of Troy McClure: now that we've shown you how it's done,
don't do it.

On Dec 10, 10:30 pm, Antoine Chavasse <a.chava...@gmail.com> wrote:
> I like the proposal.
>
> The purpose of semicolons in C/C++ is to allow multiple statements on
> the same line and statements spanning several lines. Since those cases
> are the minority it means that most of the time semicolons are just
> noise.

After a bit of reading I can understand the rationale behind leading
commas rather than trailing. It is indeed easy to forget to add a
comma to the previous line when adding a line to a multi-line
initializer, so requiring a leading comma works quite well. One way
around this is to take the Python approach: allow a comma at the end
of a list even if it is not followed by a value. The following works
in Python:

    a = [1,
         2,
         3,
         4,
        ]

    def do_stuff (a,
                  b,
                  c,
                  d,
                  ):
        pass

    do_stuff (1,
              2,
              3,
              4,
              )

It also remains naturally readable in a way consistent with written
English.

Of course, this does not work with arithmetic or logical operators in
Python. Nor does it work if one wants to declare a tuple without
using parentheses. One must use a trailing \ to indicate that the
statement continues onto the next line.

Go would have a similar issue, lacking semicolons: multiple assignment
and multiple return values. If it had tuples, there would be no
problem, as one could explicitly declare the multiple values as a
tuple.

I can imagine the advantage in terms of readability in using leading
arithmetic operators, though. Something like:

    a := b
         - c
         + d
         - e

. . . has the effect of saying: "b, then subtract c, then add d, then
subtract e". It does look a bit ugly, but as you say: it is
reminiscent of columnar notation in addition and subtraction.

As I said in my less useful reply to this thread: the semicolon rules
are not confusing. There aren't that many of them, after all. The
only major issue I find is that having optional semicolons means that
sometimes one might forget to add a semicolon to the previous line
when adding a line to the end of a block. I do that all the time.

Great! I support this. It means programmers will go from getting
compile errors for a missing ; here and there to almost never having
to worry about ; at all, while the formal grammar stays strict. One
step up!

With all due respect, the second version is much easier to read than
the first version. Perhaps because I'm European, or maybe because of
any other reason. The first one looks awkward.

My current (most used) programming language does not have semicolons
as line terminators or separators, and it works just fine. Just as
Unix Shell works fine.

Please remove them.

On Dec 10, 1:56 am, Ben Tilly <bti...@gmail.com> wrote:
> On Wed, Dec 9, 2009 at 3:49 PM, Rob 'Commander' Pike <r...@google.com> wrote:
> [...]
>
> > Appended to this mail is a formal specification of the proposed rules for
> > semicolons in Go. They can be summarized much as in BCPL, but Go is a
> > different language so the detailed rules differ; see the specification for
> > the details. In short:
> > - In the formal grammar, semicolons are terminators, as in C.
> > - A semicolon is automatically inserted by the lexer if a line's last
> > token is an identifier, a basic literal, or one of the following tokens:
> > break continue fallthrough return ++ -- ) }
> > - A semicolon may be omitted before a closing ) or } .
> > The upshot is this: given the proposal, most required semicolons are
> > inserted automatically and thus disappear from Go programs, except in a few
> > situations such as for loop clauses, where they will continue to separate
> > the elements of the clause.
>
> Please, don't. I know that making semicolons optional is popular and
> is done in a number of popular languages like JavaScript and Ruby,
> however I don't like it.
>
> Which do you prefer to read? Which makes the structure of the
> calculation clearer at a glance?
>
> foo := some_big_piece()
>     + more_calculation()
>     - last_bit_of_calculation();
>
> or
>
> foo := some_big_piece() +
>     more_calculation() -
>     last_bit_of_calculation();
>
> Your semicolon insertion rule would make the first style break
> horribly, and would force the second style. Which I think is
> significantly worse.
>
> That said, I do like allowing trailing commas in argument lists.
> That's good because it makes errors less likely when you come along
> and add another element to a list. But I like putting my operators
> where my eye can scan down and see them.
>
> Cheers,
> Ben

On Fri, Dec 11, 2009 at 8:57 AM, Santidhammo <svan...@gmail.com> wrote:
> With all due respect, the second version is much easier to read than
> the first version. Perhaps because I'm European, or maybe because of
> any other reason. The first one looks awkward.

It's one of those personal taste things. I prefer the first version
myself (operator followed by operand), but since it's rarely necessary
to write code spanning several lines like that I don't mind having to
use the second version. Unlike braces which are used all the time :|

In all honesty, the way I learned at school to do a sum is the
following:

  1
  2
  3 +
-----
  6

I know that in some other countries, it is taught as:

  1
  2
+ 3
-----
  6

This is why I thought that some people liked to read this:

( a
, b
, c)

instead of

( a,
b,
c )

However, in human language, the comma is typed _after_ the last word,
not in front of the new word. So I really consider the latter one to
be linguistically better. The possibility of adding a trailing comma,
for those who are lazy, finishes it off.

Having the operators aligned in a column, matched with what they are
using, is many times clearer. You don't say: "One plus, two plus, three
minus, four plus, five". You say: "One plus two, plus three, minus four,
plus five". We naturally group the operator with the second operand. The
indenting quite effectively signals "I'm not done!". However, I know
it's not uncommon to hit the compile button and get "Syntax error
near ..." because there's a semicolon missing.

I suppose surrounding the entire thing in brackets isn't so bad,
though there are already a lot of brackets, and not always having
consistent meaning. (receiver, parameters, returns, function calls,
conversions, type assertions, declaration blocks...). Oh well.

I like getting rid of semicolons in general. But it seems like it will
be difficult for a teacher to explain some of the weirder exceptions
where it doesn't do what you mean. I also wonder how this affects
error message reporting?

One idea I had that might get rid of some of the weirdness is to count
parentheses and other matching brackets and only interpret newlines as
semicolons when they appear within curly braces. Although strictly
speaking, regular expressions can't handle checking for matching
braces, perhaps it wouldn't actually be that hard to modify a lexer to
keep a stack of them?
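
A rough sketch of that idea, only to show that a stack is enough. This is a toy pass over raw text, not how Go's lexer works; the function name insertSemicolons is invented, it ignores strings and comments, and it assumes that top-level code counts as statement context just like the inside of curly braces.

```go
package main

import (
	"fmt"
	"strings"
)

// insertSemicolons appends ";" to a line only when the innermost open
// bracket is "{" or nothing is open. Newlines inside ( ) or [ ] are left
// alone, so an expression can span lines there.
func insertSemicolons(src string) string {
	var stack []rune // currently open brackets, innermost last
	var out strings.Builder
	for _, line := range strings.Split(strings.TrimSuffix(src, "\n"), "\n") {
		for _, r := range line {
			switch r {
			case '(', '[', '{':
				stack = append(stack, r)
			case ')', ']', '}':
				if len(stack) > 0 {
					stack = stack[:len(stack)-1]
				}
			}
		}
		trimmed := strings.TrimSpace(line)
		inStatements := len(stack) == 0 || stack[len(stack)-1] == '{'
		out.WriteString(line)
		// Skip blank lines, lines that open a block, and lines that
		// already end in a semicolon.
		if inStatements && trimmed != "" &&
			!strings.HasSuffix(trimmed, "{") &&
			!strings.HasSuffix(trimmed, ";") {
			out.WriteString(";")
		}
		out.WriteString("\n")
	}
	return out.String()
}

func main() {
	fmt.Print(insertSemicolons("foo := (a\n  + b)\nbar := foo\n"))
}
```

With this scheme a leading operator survives inside parentheses, since no semicolon is added after the first line of the example. It does nothing for an unparenthesized expression, which is the case most of this thread is arguing about.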

I like it, but I believe one more rule is needed so that
the Allman brace style will work. I know it's not the
standard Go style, but I still think it should be possible
to use it. The semicolon should not be inserted by the
lexer if the newline is followed by a {.
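
One way to see the effect described here is to ask the scanner from Go's standard library (go/scanner, which implements the insertion rule in released versions of Go) what tokens it produces for each brace style. The helper name tokenStrings and the ";(auto)" marker are made up for this sketch.

```go
package main

import (
	"fmt"
	"go/scanner"
	"go/token"
)

// tokenStrings scans src and returns each token as text. Semicolons that
// the scanner inserted at a newline are rendered as ";(auto)".
func tokenStrings(src string) []string {
	fset := token.NewFileSet()
	file := fset.AddFile("example.go", fset.Base(), len(src))
	var s scanner.Scanner
	s.Init(file, []byte(src), nil, 0) // nil error handler, default mode
	var out []string
	for {
		_, tok, lit := s.Scan()
		if tok == token.EOF {
			break
		}
		switch {
		case tok == token.SEMICOLON && lit == "\n":
			out = append(out, ";(auto)")
		case lit != "":
			out = append(out, lit) // identifiers, literals, keywords
		default:
			out = append(out, tok.String()) // operators and delimiters
		}
	}
	return out
}

func main() {
	fmt.Println(tokenStrings("if i > 0 {\n}\n"))
	fmt.Println(tokenStrings("if i > 0\n{\n}\n"))
}
```

In the second form the line ends with the literal 0, so a semicolon should show up between the condition and the opening brace, which is why the Allman style does not parse as an if statement.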

>> I strongly believe that you're heading the wrong way by adding more
>> rules to achieve a goal that's dubious in the first place.
>
> Believe it or not, this removes rules. Robert played with
> this in the parser and the language spec and was surprised
> how much simpler things got. I am doing the same conversion
> in the 6g compiler right now, and the dead code I'm cutting
> away is one of the ugliest parts of the compiler. I'd
> forgotten writing it, but boy is it ugly. And soon it will
> be gone.

Presumably this simplification means that compilation will be even
faster, which is a nice side effect. And it should be easier to build
source parsing tools for IDEs etc.

I hadn't thought about this. I haven't had the chance to play much with
Go lately, and not at all with the semicolon-free version yet, but this
is worrying.
For instance, for someone who uses this brace style at work, it means
that extra care is required when writing Go code, because the
program will not even compile when using this style of braces (and
when you use it all day long you don't even think about it).

Forcing a style in the formatter is one thing, but being inflexible in
the way the program has to be typed in by hand is another.

>>
>> if i > 0 {
>>     ...
>> }
>>
>> is not equivalent to
>>
>> if i > 0
>> {
>>     ...
>> }
>>
>> with the current rules.
>
> I hadn't thought about this. I haven't had the chance to play much with
> Go lately, and not at all with the semicolon-free version yet, but this
> is worrying.

Extremely so. I am a newcomer to the language, and like very much what I
saw so far - except this issue.

I just searched the ~620 messages I have received since I subscribed to
the list (49 in this thread) for the phrase "free format". No hits.

In my not-so-humble opinion, computer languages should be format free.
That overrides any other concern regarding parser complexity, readability,
etc.
Anything else is a step backwards. That's why I never liked Occam and Python.

And for simplifying the language/parser/grammar, what can be simpler than
"every statement ends with a semicolon" ? Any exception adds complexity
instead of removing it.

> Forcing a style in the formatter is one thing, but being inflexible in
> the way the program has to be typed in by hand is another.