Saturday, May 12, 2007

It's difficult to design a syntax for functions (blocks, closures, anonymous inner classes) as first-class values which looks natural in traditional (aka Algol-like) syntax. There have been three solutions:

Abandon traditional syntax altogether in favour of one where functions are lightweight - Scheme, Smalltalk

Use traditional syntax with an explicit lambda construct - Python

Use traditional syntax, but add some sugar for functions which take just one functional argument - Ruby

Languages from the first group were never too popular, and languages from the second were never too "functional" (I'm not talking about purity, just about being able to use higher-order functions and the like naturally). Right now Ruby is the most popular kinda-functional language.

| Syntax      | No block arguments | One block argument | Multiple block arguments |
|-------------|--------------------|--------------------|--------------------------|
| Scheme-like | OK                 | OK                 | OK                       |
| Ruby-like   | Great              | Great              | Ugly                     |
| Python-like | Great              | Ugly               | Ugly                     |

So Ruby-like syntax is a pure win over Python-like syntax - better for one-block functions, and no worse for the rest: both are great with no blocks and ugly with multiple blocks.

Compared to non-traditional syntax, there's a trade-off. It looks better for no-block and one-block functions, but loses for multiple-block functions.
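To make the trade-off concrete, here's a quick Ruby sketch (my illustration, not from the post): the one-block case where the sugar shines, and the lambda-argument workaround forced on you once a method needs two blocks, since a call can carry only one literal block.

```ruby
# One block: the sugar is at its best.
doubled = [1, 2, 3].map { |x| x * 2 }   # => [2, 4, 6]

# Two blocks: only one literal block per call, so extra blocks must be
# passed as explicit lambda arguments, which reads much worse.
def bracket(setup, teardown)
  setup.call
  yield
ensure
  teardown.call
end

log = []
bracket(lambda { log << :open }, lambda { log << :close }) { log << :body }
# log is now [:open, :body, :close]
```

Scheme-like syntax treats all three cases uniformly, which is exactly why it stays merely "OK" everywhere in the table above.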

To find out how common functions taking various numbers of blocks are, I grepped OCaml library interface files. I took OCaml because, as a language with non-traditional syntax, it shouldn't have the anti-functional bias typical of Algol-like languages, and unlike Scheme/Smalltalk/etc. it is statically typed, so it wouldn't take too much work to find out how many blocks functions take. SML and Haskell would probably work here too.
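The survey can be sketched in a few lines of Ruby (my reconstruction of the idea, not the original command, and run over a toy sample rather than the real OCaml library's *.mli files). The regexes are rough and, as noted below, produce false positives on nested arrow types.

```ruby
# Toy stand-in for an OCaml interface file; one declaration per line.
sample = <<'MLI'
val length : 'a list -> int
val map : ('a -> 'b) -> 'a list -> 'b list
val sort : ('a -> 'a -> int) -> 'a list -> 'a list
val protect : (unit -> unit) -> (unit -> unit) -> unit
MLI

decls    = sample.lines
one_plus = decls.count { |l| l =~ /\([^)]*->/ }            # >= 1 block argument
two_plus = decls.count { |l| l =~ /\([^)]*->.*\([^)]*->/ } # >= 2 block arguments
puts "#{decls.size} declarations, #{one_plus} with blocks, #{two_plus} with 2+ blocks"
```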

And the last one was a false positive of the regular expression ((unit -> (unit -> unit) * (unit -> unit)) -> unit), but I kept it in the statistics, as filtering out only false positives and not false negatives would bias the results more than no filtering (and it doesn't change the conclusion).

Perl-compatible regular expressions

In the code above I used gr, which isn't a standard Unix command. I'm simply sick of non-Perl-compatible regular expressions. grep should die, die, die a horrible death. Unfortunately the pcregrep program has an annoyingly long name, so I aliased it to gr. Do yourself a favour and put the same alias in your shell rc.

As a previous poster mentioned, Smalltalk probably deserves some consideration here. Multi-block methods seem more common than in other languages. Especially if instead of "number defined", you look at "number of calls."

Not to mention, Smalltalk syntax scales very elegantly from the no-block to the one-block and finally to the multi-block.

The claim that Ruby is the most popular "kinda functional" language is, I believe, not true. Javascript is almost trivially demonstrable as being more "popular", and Javascript supports true first-class functions.

Now, it's probably the case that only a small minority of those who make Javascript so popular know that it is in fact a "kinda functional" programming language. And perhaps it gets "ugly" marks across the board ...

This is some great analysis, but, despite being a big fan, I think Ruby's syntax still leaves something to be desired. For example, binding precedence can be confusing due to optional parentheses.
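One concrete instance of that confusion (my example, not the commenter's): {...} binds to the nearest call while do...end binds to the outermost one, so two visually similar lines parse differently.

```ruby
# {...} binds tightly: the block belongs to map.
doubled = [1, 2, 3].map { |x| x * 2 }
puts doubled.inspect                  # prints [2, 4, 6]

# do...end binds loosely: this block attaches to puts, not map, so map
# gets no block and returns an Enumerator instead of an array.
puts [1, 2, 3].map do |x| x * 2 end
```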

I often find myself missing the simplicity of Lisp syntax when dealing with a lot of higher-order functions. In fact, my experience isn't that Ruby's syntax is ideal in 99% of circumstances, as your OCaml investigation might suggest - though I'm not sure what the percentage actually is.

Mike McNally: First-class function values are certainly required for a language to be "kinda-functional", but that's a very low standard for being "kinda-functional" - even Perl has them. Javascript code I've seen tends to contain rather few higher-order functions, so I don't really count it here.

I dislike your whole "traditional syntax" viewpoint. There's no such thing; it's a popular syntax, but calling it traditional implies it's better and older, and neither is the case.

You attach too much value to "traditional syntax" as if it's worth a lot to try and keep it. It's not. The C'ish syntax mostly sucks, is totally inelegant, full of exceptions and stuff you have to memorize, and hard to parse and generate.

Smalltalk's and Lisp's syntaxes, on the other hand, are elegant, simple, easy to parse and generate, and simply better by every objective measure of merit than those of C'ish languages.

"Traditional syntax" doesn't deserve your adulation, rather, it needs to be abolished in favor of the superior alternatives already available.

Ramon Leon: We can call alternative syntaxes "superior" or "more elegant", but languages are much more likely to become popular if they follow established traditions.

Traditions can be broken if you get something more valuable in exchange - like macros. I'm just pointing out that supporting block arguments and functional-style programming is not one of those things, as the one-block case is handled by fairly traditional Ruby syntax, and the multiple-block case is too rare to matter.

taw, I understand the focus of your analysis, and agree with where you're coming from. What I'm saying is, that agreement already granted, I think Ruby syntax has some potentially rough patches with nested invocations of functions that accept a single block.

A more idle consideration is whether or not functions that take multiple blocks will become more common. The (somewhat) interesting point being that such functions are less necessary (using that word loosely) if you have elegant support for nested invocation of functions that accept a single block.
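That point can be sketched with a hypothetical with_resource helper: nesting two single-block calls does the job a two-block method would otherwise do.

```ruby
# A resource wrapper taking one block; nesting two of these replaces a
# hypothetical two-block method like with_both(setup1, setup2) { ... }.
def with_resource(name)
  resource = "#{name}:acquired"   # stand-in for real acquisition
  yield resource
ensure
  # release of the resource would go here
end

pairs = []
with_resource("db") do |db|
  with_resource("file") do |file|
    pairs << [db, file]
  end
end
# pairs == [["db:acquired", "file:acquired"]]
```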

On another note, I disagree with your characterization of Javascript as not containing many higher order functions. Callback-type designs are so common in Javascript that higher order functions are all over the place. There are also a good number of functional programming academics porting techniques to Javascript, which I think will encourage a continued uptake of functional style in the Javascript community.

andre: grep -P is non-standard, not everyone uses bloated crapware filled with excess "features". And there's absolutely no need to use PCR for this (not PCRE - they are not regular expressions at all); grep is fine. Perl's regexes are very bad; you should learn actual regular expressions instead: http://swtch.com/~rsc/regexp/regexp1.html

And taw, you win the useless use of cat award. Why is it that nobody can be bothered to learn the basics of unix anymore? You aren't concatenating anything, so don't invoke cat!

Anonymous: There are some extreme cases where grep or pcregrep is much faster than another, but nobody cares because they never happen in real life.

On the other hand limitations of grep are a big problem all the time.

"cat FILE | complicated_command" is idiomatic Unix usage. Shells should have learned to handle such simple cats internally by now. If they haven't (and they haven't), that's a bug. I'm hopeful, however - someone finally started caring about shell usability. Recently shells started telling you which package needs to be installed when an unknown command is issued. Some day maybe cp, mv and > will stop overwriting files without being asked to (you can get there today with alias cp='cp -i', libtrash and so on). It's not like 1980s Unix is the best we can do.

What limitations of grep? Give an actual example. There is no case where pcregrep is significantly faster than grep, ever. Naive backtracking regex implementations perform exponentially worse in many situations, and if you've had any serious experience with these bad regex implementations you would be familiar with the common workarounds you need to use to create equivalent but non-pathological patterns so performance doesn't deteriorate exponentially as the input grows. You should actually read that link instead of dismissing it with nonsense like that.

cat somefile | some_command is NOT idiomatic unix, it's idiotic unix. grep takes a file as its argument. grep 'pattern' file, not cat file | grep 'pattern'. And if something doesn't take a file as an argument, then redirect its stdin: grep 'pattern' < somefile.

What do you want your shell to do about the fact that you run commands for no reason? It's not a bug, it's doing exactly what you asked it to do. It's not the shell's fault you told it to do something dumb. Unix is very user friendly and easy to learn and use. Just because you don't like the way it works doesn't mean unix needs to change; perhaps you just need to find an OS that suits your desires better. If you want needless questioning when you command (look up the definition of command) the system to do something, then tell it to ask you. mv means move the file. That's what it does. mv -i means move the file but ask me questions first. And that's what it does. 1980's unix isn't bad, it's far better than the unfortunate messes that linux distros are. plan9 is even nicer than unix, but its mouse-centric interface is a real disappointment.

Anonymous: Some limitations of grep are not supporting character classes like \d and \p{Hiragana}, ()s, backreferences, and all other useful PCRE features. It's not true that GNU grep is always faster - GNU grep used to be painfully slow in UTF-8 mode, something like 10x slower than pcregrep, but they apparently fixed that already.

"mv" should mean "move file". Unfortunately Unix is braindead enough to interpret it as "remove whatever was there first and move file". The removing part is intended behaviour in maybe 1% of cases - there you could confirm or use -f. In 99%, if something is there it means a mistake was made, and you want to cancel. It's one of the worst cases of Unix brain damage ever - causing severe loss of user data just because the design fuckup is old and Unix fanboys consider all old design fuckups sacrosanct.

In any case if you want 1980s Unix, you know where to find it. The rest of us moved on since then.

Of the "limitations" you mentioned, only one is real, and that's backreferences. The rest are just you not wanting to learn regular expressions. Not coincidentally, backreferences are the reason pcre is so horribly slow. Of course, you didn't use backreferences, and they are very rarely needed. If you really wanted a good grep that has backreferences, you could install one. There are grep implementations that support backreferences, but still implement any regexes without backreferences in the smart, efficient way. But hey, just use the lowest common denominator shitty software written by morons who still haven't figured out how to copy 40 year old algorithms, that sounds smart.

And while it may be nice to try to dismiss anyone who disagrees with you as a fanboy, it makes you look pretty dumb when you ignore what I've said while you do it. If I thought unix's flaws were sacrosanct, I wouldn't suggest that the modern replacement for unix which fixes many of those flaws is superior would I? I guess the cynics are right, you're not really a troll, you actually are that dumb.

Just because *you* think 99% of the time mv should behave like you are an idiot doesn't mean everyone thinks that. 99% of the time *I* want mv to behave just as it does, and the times I don't I use mv -i. I've never seen anyone think otherwise except for linux noobs like yourself. You know, the ones who pretend they know unix, and that unix is so great, but they haven't figured out the basics of how to use their shell yet.

If you want a horrible, bloated shitware OS that pretends to be unix but ignores every single fundamental aspect of unix, then you should really be looking forward to the hurd, so your kernel can suck as badly as your userland. Hopefully it will implement nagging directly in the filesystem so you have an extra layer of stupid confirmations to protect you from yourself.


Creative Commons

Unless otherwise expressly stated, all original material of whatever nature created by Tomasz Węgrzanowski and included in this blog, is licensed under a Creative Commons License. It is also licensed under GFDL (for Wikipedia compatibility).