Cow #1: source code is ASCII text on disk.

There may be interesting things to do by making source code more complex than simple text files, but the justifications for leaving that simplicity are not apparent from the rant. It sounds like Josh may want the IDE to be the compiler; that is, the IDE takes care of maintaining the relationships between pieces of code in some way other than "divining" them.

What's an example of the solution? Hypercard? AmigaVision?

What exactly is wrong with a computer programming language that humans can read?

I think Smalltalk had non-text-file sources. It is often mentioned as having a good development environment, with good refactoring for a dynamic language. It probably has some source control and search functionality. Never got to use it, so I can't say much more.

EDIT: Mathematica also has a non-text-file source format. It is probably useful since it supports charts, images and similar as source. Not the best example of good language design, but it has its niche.

It doesn't need to be a visual thing (as in drawing diagrams and such). It can still be text based (as in variables have names like "foo") and very similar to today's languages, except that you navigate and manipulate (insert/delete) sub-expressions instead of individual characters.

It's a self-contained VM. An entire graphical operating system that exists in its own little world, where Smalltalk is the only development tool. It's a case of the OS being the IDE. (I speak only from having used SqueakVM. If there are other kinds of Smalltalk environments, I'm not aware of them.)

My goal for this essay was to spark some discussion, so I'm glad it did.

Let me ask you this. Is software for the Starship Enterprise written as plain text on disk using VIM? I don't think it will be. Something will replace the way we write software today. So let's think about what that something will be.

1. I do use a real IDE. IntelliJ. (BTW, did you actually read the article or just the summary points? I specifically mention IntelliJ.) IntelliJ does an amazing job of turning text on disk into a full structured and semantic understanding of code, and then writes it back to disk as plain text. Doesn't it feel wrong that the IDE must go through all of this rigmarole, and then throw away that information when you restart the IDE, and that every IDE must re-implement it? Why isn't the structured form what we actually store? Why can't that be the canonical storage, with pure text simply being what we see when it's rendered for us humans?

And whether it is stored as XML or JSON is irrelevant. That's an implementation detail that the programmer never needs to see.

2: I specifically said we need a spec for this so any IDE can implement it. You shouldn't be required to use a particular IDE. Interoperability is extremely important.

4: I mention the whitespace and tabs and bracing because it is often used on the front page of a programming language's site to say why theirs is superior to other languages. In reality, we shouldn't have to care.

5: type systems. I agree. Type systems are good and important. Lots of academic languages say they are innovative because of their type systems. I wish they'd spend their effort elsewhere on more important problems.

6: For a lot of programming languages syntax extension is considered a cow. ex: no operator overloading in Java. I argue that the problem wasn't the concept but the implementation. Properly scoped it's fine.

7: My point is that it's not a point of differentiation. It's a sliding scale and people spend way too much time arguing about it.

8: a lot of new languages reject GC for its failings. I say we should fix those failings instead.

Doesn't it feel wrong that the IDE must go through all of this rigmarole, and then throw away that information when you restart the IDE, and that every IDE must re-implement it? Why isn't the structured form what we actually store? Why can't that be the canonical storage, with pure text simply being what we see when it's rendered for us humans?

That doesn't feel wrong at all. The alternative is to use a disk format that is easier for programs and more difficult for humans? Why would we do that?

Saving the kitchen sink to disk is not a silver bullet. If you start saving derived/cached information in there, then what happens if someone updates one part of the source without updating everything? Is the source now in an inconsistent state? If you start saving IDE-specific information in there, then does that mean I can't share my source code with my friend Bob who uses a different IDE? You're opening up a giant can of worms with this approach. Also where did you get this rule that the IDE must throw away information on restart? Microsoft Visual C++ saves Intellisense data to disk.

If you want to make it easier for IDEs and tools to do refactoring and whatnot, then that's fine, we can adjust the language to make that easier. Ban C-style macros, prefer static typing, etc. But those features are totally separate from the disk format of code. We can have all those features without making the source code difficult for humans.

Humans never need to look at the raw files, just like a graphic designer doesn't need to look at the raw Photoshop files.

The problem is not the disk format, but the format we edit code in. Do you believe that the optimal editing interface for code involves an array of characters and commands consisting of mostly insert char, delete char and move cursor to next/previous char? Suppose that there are hyperintelligent aliens somewhere. Are they editing their code like that? I don't think so. They probably developed an interface that involves a syntax tree and commands like insert subexpression, delete subexpression, move cursor to next/previous/parent/child subexpression. And more advanced structural commands like rename identifier, extract subexpression, etc.

IDEs are increasingly good at bolting this kind of interface on top of a character array. That is very hard when working in a format where you are constantly going through invalid states. For example, if you delete an open parenthesis, the structure of the code is all messed up. Providing accurate intellisense, code highlighting, renaming, etc. while the file is in an invalid state is a hard problem, and IDEs solve it in ad-hoc heuristic ways. Imagine that virtually all Photoshop commands resulted in a corrupted PSD file, and that Photoshop would have to work around this by trying to heuristically interpret the corrupted file! If you are editing an AST, these things are trivial. When editing a new node in the AST, you look at the parents of that node to find all the variables and functions in scope, and provide that in the intellisense list. It is no accident that this was invented in a structured editor and only much later poorly emulated in character array based editors. What other great ideas have we not discovered because they are simply too hard in character array based editors?
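To make the "look at the parents of that node" step concrete, here is a minimal sketch in Python. The Node class, its `declares` field and the `names_in_scope` helper are all hypothetical names made up for illustration, not taken from any real editor, and the sketch ignores details such as declaration order within a block.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical AST node; `declares` lists the names introduced at this node."""
    kind: str
    declares: List[str] = field(default_factory=list)
    parent: Optional[Node] = None
    children: List[Node] = field(default_factory=list)

    def add(self, child: Node) -> Node:
        """Attach `child` under this node and return it."""
        child.parent = self
        self.children.append(child)
        return child


def names_in_scope(node: Node) -> List[str]:
    """Walk up the parent chain and collect visible names, innermost scope first."""
    names: List[str] = []
    current = node.parent
    while current is not None:
        for name in current.declares:
            if name not in names:  # an inner declaration shadows an outer one
                names.append(name)
        current = current.parent
    return names


# Build a tiny tree: module -> function -> block -> the node being edited.
module = Node("module", declares=["print", "main"])
function = module.add(Node("function", declares=["count", "label"]))
block = function.add(Node("block", declares=["i"]))
hole = block.add(Node("hole"))

completions = names_in_scope(hole)
print(completions)  # ['i', 'count', 'label', 'print', 'main']
```

The point is only that the lookup never has to parse or guess: the tree is valid at every step, so the completion list is a plain walk over parents.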

Right, that's the pragmatic thing to do. You can also flip this around: source is stored as AST, and you have a tool for pretty printing it if you want to deal on a character level. Both approaches have different advantages. In any case the big deal is the structured editing (& structured version control, etc.). How the files are actually represented on disk is a relatively minor detail.
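As a rough illustration of the "stored as AST, pretty printed on demand" direction, here is a sketch in Python. The JSON layout (the "type", "test", "then", "else" keys and so on) is an assumption invented for this example; it is not an existing spec.

```python
import json

# Hypothetical on-disk form of a small program; the schema is made up here.
STORED = """
{"type": "if",
 "test": {"type": "name", "id": "ready"},
 "then": [{"type": "call", "func": "print",
           "args": [{"type": "string", "value": "foo"}]}],
 "else": [{"type": "call", "func": "print",
           "args": [{"type": "string", "value": "bar"}]}]}
"""


def render_expr(node: dict) -> str:
    """Render an expression node as text."""
    kind = node["type"]
    if kind == "name":
        return node["id"]
    if kind == "string":
        return json.dumps(node["value"])  # adds the surrounding quotes
    if kind == "call":
        args = ", ".join(render_expr(arg) for arg in node["args"])
        return f'{node["func"]}({args})'
    raise ValueError(f"unknown expression type: {kind}")


def render_stmt(node: dict, indent: int = 0) -> list:
    """Render a statement node as a list of text lines."""
    pad = "    " * indent
    if node["type"] == "if":
        lines = [f'{pad}if ({render_expr(node["test"])}) {{']
        for stmt in node["then"]:
            lines += render_stmt(stmt, indent + 1)
        lines.append(f"{pad}}} else {{")
        for stmt in node["else"]:
            lines += render_stmt(stmt, indent + 1)
        lines.append(f"{pad}}}")
        return lines
    return [f"{pad}{render_expr(node)};"]


text = "\n".join(render_stmt(json.loads(STORED)))
print(text)
```

Brace placement and indentation live entirely in render_stmt, so two people could view the same stored tree with different formatting preferences.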

ASCII has been around for nearly 50 years, Emacs and Vi have been proudly dying for 35 years, and Vim for 20. These things have survived because they are evolutionarily fit to do the job at hand, like sharks. They have survived dozens of methods of programming, thousands of languages, lots of different project management ideals, gobs of operating systems, and fads.

Let me ask you this. Is software for the Starship Enterprise written as plain text on disk using VIM? I don't think it will be. Something will replace the way we write software today. So let's think about what that something will be.

Based on my recollections from the Star Trek: TNG Technical Manual and my recent viewings of the series, I've surmised that it's written directly in an object format managed by a simple AI, which is trivially decompiled into a human-readable high-level language format. Given the way the workstations are used on the Enterprise D, it's clear that the LCARS editor has some usage characteristics reminiscent of Vi and Emacs, though it's unlikely to be the same editor(s). (Neither Voyager nor DS9 shows the use of a workstation in this detail, and neither Enterprise nor the original series depicts crew members programming a computer with anything more complex than the old switches-and-save-register-button interface common to microcomputers in the 1970s and early 1980s.)

Doesn't it feel wrong that the IDE must go through all of this rigmarole, and then throw away that information when you restart the IDE, and that every IDE must re-implement it? Why isn't the structured form what we actually store? Why can't that be the canonical storage, with pure text simply being what we see when it's rendered for us humans?

Why should it? How is it relevant to anything other than composition of the program? Surely the bitcode produced by the compiler is all that is sufficient - as it is already used.

And whether it is stored as XML or JSON is irrelevant. That's an implementation detail that the programmer never needs to see.

Then why mention it if you don't have a concrete example to discuss?

I specifically said we need a spec for this so any IDE can implement it. You shouldn't be required to use a particular IDE. Interoperability is extremely important.

We already have this flexibility with text. Why should we need an IDE at all? Why is a text editor and a shell for invoking a compiler not sufficient?

I mention the whitespace and tabs and bracing because it is often used on the front page of a programming language's site to say why theirs is superior to other languages. In reality, we shouldn't have to care.

Do you have examples? I'm hard-pressed to think of enough languages to make this anything more than a curiosity. Certainly not enough to make it an epidemic as you seem to insist is the case.

type systems. I agree. Type systems are good and important. Lots of academic languages say they are innovative because of their type systems. I wish they'd spend their effort elsewhere on more important problems.

Such as?

Type systems aim to solve a class of problems that otherwise make programming difficult and complex. What's more laudable than this?

For a lot of programming languages syntax extension is considered a cow. ex: no operator overloading in Java. I argue that the problem wasn't the concept but the implementation. Properly scoped it's fine.

That's not what I gathered from your rant. You crow about maintaining clean scoping rules, and then hold up JavaScript as a good example... Despite its rather dirty scoping rules.

Nevertheless, what examples do you have of this "sacred cow?"

My point is that it's not a point of differentiation. It's a sliding scale and people spend way too much time arguing about it.

Oh, yes it is. Certain classes of languages can be used in systems programming. Others can't. These facets of a language relate to how well they scale in constrained environments:

C is appropriate for programming microcontrollers with small memories. Ruby is not. This is a factor of how well the compiled code of a program can fit in a constrained space and maintain enough control over its memory usage to be able to operate. In cases like these, compiled languages are often a must.

By the same token, it can also be a factor in how stable or secure a language can be in certain usage scenarios. Haskell is appropriate in finance (where determinism is everything), for example, but PHP most certainly is not. (In this case, by "stable" I mean "has strong guarantees of consistency and isolation.")

a lot of new languages reject GC for its failings. I say we should fix those failings instead.

What would be the workflow, exactly? How would I perform version control? How would I diff and patch it?

Version control systems and diffing and patching tools for that format would have to be built. This takes time, but a version control system and diff/patch system that understands the structure of code can do better than text based ones. For example, suppose programmer A renamed function Foo to Bar everywhere in his version of the code. Meanwhile programmer B added a new call to Foo in his version of the code. When you merge with language agnostic text based tools, there is no way that this can be resolved automatically. A system that understands the structure of the code can automatically rename B's call to Foo.
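One way such a merge could work, sketched in Python under a big assumption: that each side's changes are recorded as structured edits (a rename, an added call) rather than as text diffs. The Rename, AddCall, transform and merge names are hypothetical, and the "program" is reduced to a list of callee names to keep the sketch short.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class Rename:
    """Programmer A's kind of edit: rename a function everywhere."""
    old: str
    new: str


@dataclass(frozen=True)
class AddCall:
    """Programmer B's kind of edit: add a call to a function."""
    callee: str


Edit = Union[Rename, AddCall]


def apply(calls: List[str], edit: Edit) -> List[str]:
    """Apply one edit to the list of callee names."""
    if isinstance(edit, Rename):
        return [edit.new if name == edit.old else name for name in calls]
    return calls + [edit.callee]


def transform(edit: Edit, earlier: Edit) -> Edit:
    """Rewrite `edit` so it keeps its meaning after `earlier` has been applied."""
    if (isinstance(edit, AddCall) and isinstance(earlier, Rename)
            and edit.callee == earlier.old):
        return AddCall(earlier.new)
    return edit


def merge(base: List[str], edits_a: List[Edit], edits_b: List[Edit]) -> List[str]:
    """Apply A's edits, then B's edits transformed through A's."""
    merged = list(base)
    for edit in edits_a:
        merged = apply(merged, edit)
    for edit in edits_b:
        for earlier in edits_a:
            edit = transform(edit, earlier)
        merged = apply(merged, edit)
    return merged


base = ["Foo"]
edits_a = [Rename("Foo", "Bar")]
edits_b = [AddCall("Foo")]

merged = merge(base, edits_a, edits_b)
naive = apply(apply(base, edits_a[0]), edits_b[0])  # what a blind replay gives
print(merged)  # ['Bar', 'Bar']
print(naive)   # ['Bar', 'Foo']
```

The naive replay leaves a dangling call to Foo, which is roughly what a text based merge produces; the transformed replay renames B's new call as well.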

It's not hard to imagine really. What are we doing in today's editors? Mostly (1) inserting a character at the cursor (2) deleting a character next to the cursor (3) moving the cursor left/right.

So what would a structural editor look like? Instead of the cursor being on some character, it will be on a subexpression. Of course you need to deal with incomplete programs, so there will be an additional node type for a missing subexpression. For example:

if(...){
    print("foo");
}else{
    print(...);
}

Here the ... indicates a missing node. So the corresponding commands in a structural editor would be (1) inserting a subexpression into a ... node (2) deleting the subexpression under the cursor, i.e. replacing it with a ... node (3) moving the cursor to a different subexpression (for example next, previous, parent, child, etc.).
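A small Python sketch of those three commands, to show how little machinery is needed. Everything here (Node, Cursor, the method names, and rendering the if as a three-argument call) is a simplification invented for illustration, not a description of any existing structural editor.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import List

HOLE = "..."  # label used for a missing subexpression


@dataclass
class Node:
    """Hypothetical expression node: a label plus child subexpressions."""
    label: str
    children: List[Node] = field(default_factory=list)

    def render(self) -> str:
        if not self.children:
            return self.label
        inner = ", ".join(child.render() for child in self.children)
        return f"{self.label}({inner})"


class Cursor:
    """Points at one subexpression via a path of child indexes from the root."""

    def __init__(self, root: Node) -> None:
        self.root = root
        self.path: List[int] = []

    def _walk(self, path: List[int]) -> Node:
        node = self.root
        for index in path:
            node = node.children[index]
        return node

    def node(self) -> Node:
        return self._walk(self.path)

    def _replace(self, new: Node) -> None:
        if not self.path:
            self.root = new
        else:
            self._walk(self.path[:-1]).children[self.path[-1]] = new

    def insert(self, new: Node) -> None:
        """(1) Fill the missing node under the cursor."""
        if self.node().label != HOLE:
            raise ValueError("insert only fills a missing node")
        self._replace(new)

    def delete(self) -> None:
        """(2) Replace the subexpression under the cursor with a missing node."""
        self._replace(Node(HOLE))

    def to_child(self, index: int = 0) -> None:
        """(3) Move to a child, if it exists."""
        if index < len(self.node().children):
            self.path.append(index)

    def to_parent(self) -> None:
        """(3) Move to the parent, if any."""
        if self.path:
            self.path.pop()

    def to_next(self) -> None:
        """(3) Move to the next sibling, if any."""
        if self.path:
            siblings = self._walk(self.path[:-1]).children
            if self.path[-1] + 1 < len(siblings):
                self.path[-1] += 1


# if(cond, then, else), with the condition and one argument still missing.
program = Node("if", [Node(HOLE),
                      Node("print", [Node('"foo"')]),
                      Node("print", [Node(HOLE)])])
cursor = Cursor(program)
before = cursor.root.render()

cursor.to_child(0)
cursor.insert(Node("ready"))
after_insert = cursor.root.render()

cursor.to_next()
cursor.delete()
after_delete = cursor.root.render()

print(before)
print(after_insert)
print(after_delete)
```

Note that the tree is well formed after every command; there is no intermediate state with an unbalanced parenthesis.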

You know, looking at your examples (and Josh's ideas) reminded me of Lisp and the s-expression commands in Emacs...
Perhaps we should mix those great ideas and concepts from the 70s with our new shiny fast micro machines and have the best of both?

You are thinking about functions as names. When you call a function "Foo" you are not intending to call any function with name "Foo". You are intending to call that specific functionality that was called Foo before he renamed it and that functionality is now called Bar. So if somebody renames it, and you merge their changes into your code, the meaning preserving thing to do is to rename all the calls to Foo that you inserted.

With an AST representation that would be a no-op. When you insert a call to Foo, the representation of that is not "Foo", but a reference to the function definition of Foo. When the IDE displays such a node that references a function definition, it looks up the name from the definition. So if somebody alters the name property of the definition, all references to that definition automatically display the new name.
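In Python terms, a minimal sketch of that representation might look like the following. FunctionDef and Call are hypothetical classes; a real on-disk format would presumably need stable IDs instead of in-memory object references.

```python
from dataclasses import dataclass


@dataclass
class FunctionDef:
    """A function definition; the name is just a property of it."""
    name: str


@dataclass
class Call:
    """A call node that references the definition itself, not a name string."""
    target: FunctionDef

    def render(self) -> str:
        # The displayed name is looked up from the definition at render time.
        return f"{self.target.name}()"


foo = FunctionDef("Foo")
call_from_b = Call(foo)          # programmer B inserts a call to Foo

shown_before = call_from_b.render()
foo.name = "Bar"                 # programmer A renames the definition
shown_after = call_from_b.render()

print(shown_before, "->", shown_after)  # Foo() -> Bar()
```

Nothing in the call node changed; only the name property of the definition did.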

This was a "cool" essay for a far future, but it lacked any serious backing, does not address potential issues, and so on. Also, on multiple occasions you seem unaware of essential tooling that already exists, or you are blatantly ignoring the impact on the ENORMOUS amount of tooling that exists. It might be because you are biased toward a particular language, I don't know.

Yet you wrote it like you are calling for that ASAP with not much else changing in the landscape. This obviously won't do.

Also some of your cows aren't. Are you really surrounded by people arguing over their formatting indentation? Shoot them in the head then. It would be more effective than pretending the whole world is.

I actually agree with some of the things you write. Of course I'm biased on those points, and I fail to see how this can be controversial (but I also honestly don't hear much controversy about that anywhere...)

On other occasions I don't even understand the practical implications of what you are trying to argue. For example, "7 Compiled vs Interpreted. Static vs Dynamic. Nobody cares.": well, maybe, or maybe not, but what concrete propositions are you trying to make regarding language design in this case?

In 8, yet again I fail to understand why you think most language designers think "Garbage collection is bad."
Plus, this is such a false dichotomy. Plus, GC is such a small part of resource handling. And don't get started on the "in the general case" thing, or anything like that: the general case is typically never what people talking about the general case think it is. The general case might not even exist.

What your essay made me think is that I doubt you are a language designer. I even doubt you have experience in a sufficient number of the areas in which programming is used, and I doubt you provide much insight for a language designer.

If we switch to using something other than text files, say XML or JSON or maybe a database, then many interesting things become possible. The IDE doesn't have to do a crappy job of divining the structure of your code. The structure is already saved.

I have a surprise for you! Programming language syntax is meant to encode the structure of the program, even in Perl!

Yes, but as any compilers class would teach you, getting structure (and most importantly: semantics!) from a language may be VERY hard. It would still need to be done if we're to keep using languages as we know them to write programs, but it would need to be done only once, not many times (once by the IDE, again by the compiler, etc.).