Or the Actor model for that matter. Although I don't think that any of the existing Actor language implementations ever took concurrency to the theoretical extremes envisioned by Hewitt and his students.

Purity is a double-edged sword: it imposes a strong invariant so that certain kinds of programs and reasoning become easier, but others become harder. There are many different forms of purity, which are useful for different problems. The form of purity depends on what concepts are in your language, i.e., what your kernel language is, as explained in CTM.

The solution, in my experience, is for the language to be subsettable so that you can use the form of purity you want when you want it. For example, when writing constraint programs you want propagators to have a logical semantics (they should implement constraints correctly). When writing multiagent programs you might want a framework of asynchronous messages with no global state. Ideally, the form of purity you choose should be enforced by the language, so that you don't break the invariants.

Imposing the same form of purity on a complete language can be interesting from a theoretical point of view (to explore the limits of what can be done) but it is impractical. Neither Haskell nor Smalltalk does this, by the way. Neither is completely pure, and for good reason. Haskell must use global state in some cases (I see programs that use unsafePerformIO published in the JFP) and Smalltalk must use data abstractions other than objects in some cases. Both languages try hard to "contain" these departures from their form of purity.

I'm thinking more in terms of popularity than in terms of quality. Everyone and their momma has designed a multiparadigm programming language, so the space is crowded - just try sorting out all the different better-than-Java languages that run on the JVM. But there seem to be only a handful of languages that make an attempt to be pure, so each can have its own sphere of influence.

That some multiparadigm languages are better than others is a useful fact. That compartmentalized purity makes programming easier is an important insight. That the pure subsets of languages portrayed as pure are usually inadequate for real programming is informative. But I'm asking about popularity here! :-)

The languages you listed have purity in different places. Referential transparency in Haskell is purity of variables; s-expressions in LISP are purity of syntax; objects in Smalltalk are purity of data structure.

It means that all three purities can be put together. A language can look like LISP, which consists of s-expressions, run like Haskell, which is referentially transparent, and handle data like Smalltalk, which packs data into objects.

Another way of putting it is that Common Lisp says 'compiler is an abbreviation for parser-compiler, just as 'cello is an abbreviation for violin-cello. Thus Common Lisp lets you write (defmacro ...) and operate on the data structures built by the parser before the compiler compiles them.
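To make the parser-then-compiler split concrete, here is a rough Python sketch of the idea, not Common Lisp itself. Nested Python lists stand in for the data structures the parser builds, `expand_unless` plays the role of a macro that rewrites them, and `evaluate` is a stand-in for the compiler. All three names, and the `unless` construct, are made up for illustration.

```python
# Hypothetical mini-pipeline: parse -> macro-expand -> "compile" (here: evaluate).
# Parsed programs are plain nested lists, standing in for Lisp lists.

def expand_unless(form):
    """Rewrite (unless cond body) into (if cond None body), recursively."""
    if not isinstance(form, list):
        return form
    form = [expand_unless(f) for f in form]
    if form and form[0] == "unless":
        _, cond, body = form
        return ["if", cond, None, body]
    return form

def evaluate(form):
    """Stand-in for the compiler: it only knows 'if', '<' and literals."""
    if not isinstance(form, list):
        return form
    head = form[0]
    if head == "if":
        _, cond, then, other = form
        return evaluate(then) if evaluate(cond) else evaluate(other)
    if head == "<":
        return evaluate(form[1]) < evaluate(form[2])
    raise ValueError(f"unknown form: {head!r}")

parsed = ["unless", ["<", 3, 2], 42]   # what the parser would hand over
expanded = expand_unless(parsed)       # the macro works on data, not text
result = evaluate(expanded)
```

The "compiler" never sees `unless`; it only sees the `if` form the macro produced from the parsed list.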

The parentheses are there to let the human reader and the parser (the Common Lisp function READ) parse data structures without having to know the arity of the keywords. Other languages use parentheses in this way for functions. You write f(x, y, z) and the C compiler uses the parentheses to parse the text. This frees you of the need to tell the compiler that the function has three arguments prior to the compiler encountering a function call. Lisp's purity lies in applying the same principle to the control constructs too (because it lets you define your own, just as other languages let you define functions).
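A toy reader shows the arity point. This is a minimal Python sketch, not the real READ: it handles only parentheses and bare tokens, and `my-while` is a hypothetical user-defined control construct. The reader has no table of keywords or argument counts; the parentheses alone determine the tree.

```python
def tokenize(text):
    """Split on whitespace, treating each parenthesis as its own token."""
    return text.replace("(", " ( ").replace(")", " ) ").split()

def read(tokens):
    """Consume one expression from the front of the token list."""
    token = tokens.pop(0)
    if token == "(":
        items = []
        while tokens[0] != ")":
            items.append(read(tokens))
        tokens.pop(0)  # drop the closing ")"
        return items
    if token == ")":
        raise SyntaxError("unexpected )")
    return token       # atoms are kept as strings in this sketch

def parse(text):
    return read(tokenize(text))

# The reader has never heard of my-while, yet it parses it like anything else.
tree = parse("(my-while (< i 10) (print i) (incr i))")
```

Here `tree` comes out as a list of four elements: the symbol followed by three sub-lists, however many arguments `my-while` turns out to want.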

Another perspective is to note that a computer programming language has three syntaxes.

Base Language

The syntax that you use when you are coding with the language as it comes, straight out of the box.

Meta Language

The language you use for adding source-to-source transformations in order to extend the language.

Extended Language

What the language looks like after you have extended it. For example, if you extend C with macros, the source acquires more parentheses and has fewer brackets and braces.

Using fully parenthesised prefix notation is arguably the only choice for the Extended Language. How else are members of the programming team going to be able to parse each other's code? However, Lisp could have used an Algol style for the Base Language and the Meta Language. So the relevant purity is best called Tri-syntactic Identity - rejecting features that are mildly helpful or at least familiar for Base and Meta Languages because they are not appropriate to include in the Extended Language.

My working theory is that choosing purity wins a language a lot of zealots as its fans.

But, how do you form that theory, given that all languages have a lot of zealots?

I admire choosing purity (I'd have said 'consistency') because that's a good way to discover the weaknesses as well as strengths of a paradigm. If you can't see a reason why you shouldn't use s-expressions for everything, or make everything an object, why not do it?

Would we have discovered such a use for monads, had Haskell not forced us? A compromising language seems to offer the best of everything, but it could suffer from an identity crisis in its design patterns, frameworks, etc. as well.

But, how do you form that theory, given that all languages have a lot of zealots?

Now I'm not sure what I really mean by "zealot". I was thinking of someone who not only thinks that, say, Java is the bee's knees, but also that

Java is the solution to all problems

Java can only be improved by making it more Java-like

Java is the most powerful and expressive language yet to be invented

And so on. What I see as keeping a language alive is the belief among adherents that they are the only ones who get it. Yeah, it's arrogant, but sometimes it's true. And, regardless of the truth, it keeps people using the language, which is really what I'm trying to ask: what kind of language will find and keep users?

That phrase has spawned an uncountably infinite set of alternatives such as "the shiznit" and "the shizzle", as in "Lambda is the shiznit, yo". If you find this particular mechanism for encoding sentences attractive or amusing, you can read this page using Gizoogle (yo).

is that for all its warts (mostly piss-poor closure syntax, and a weaker type parameterization model than necessary), Java really is best-of-breed on a fair number of axes. On portability, ease of deployment, breadth and quality of libraries, tooling, ease of integration, and depth of recruiting pool, Java is an easy order of magnitude beyond any of the current alternatives, and probably two orders of magnitude better than prior art (bar Smalltalk and Lisp). In terms of actually getting apps shipped, those sorts of things matter.

If your question is actually, "Why do some languages create and retain zealots?" then maybe it's not actually a question of purity. Maybe it's related to language features: a programmer decides that some particular language feature is absolutely essential to development, and then becomes zealous about the language that contains that feature until either another feature becomes their feature of choice, or until a new language implements that feature better.

So, in your original table Haskell zealots are actually saying 'How can you program without referential transparency?' and so on...

As most programmers encounter language features through an implementation of that feature in a particular language, the zealotry attaches to the language rather than the feature.

I believe it would be more useful to be able to switch between 'everything is an X' views than to just use a language based on a single paradigm. Then again, is a language itself actually a single way of viewing a system, and do multiple paradigms equate to multiple languages?

I'm sure there is some "everything-is-a-" in Forth as it's another one of those languages along the lines of Lisp which take a basic principle and apply it to everything no matter how painful it is to use. ;)

If instead of taking a stack and returning a stack, a concatenative language's operators took a tuple of say (Stack, Environment, Code, Dump) and returned a transformed version of same, then the operators become the code of a Lisp virtual machine.
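For what it's worth, here is how I'd sketch that in Python, under my own assumptions: every word is a function from a (stack, env, code, dump) tuple to another such tuple. The word names (`ldc`, `ld`, `add`, `ldf`, `ap`, `rtn`) are borrowed loosely from SECD-style machines, and the argument order on the stack is my choice, not taken from any particular implementation.

```python
# State is a 4-tuple (stack, env, code, dump); every word maps state -> state.

def ldc(value):
    """Word that pushes a constant."""
    def word(state):
        stack, env, code, dump = state
        return (stack + (value,), env, code, dump)
    return word

def ld(name):
    """Word that pushes the value bound to `name` in the environment."""
    def word(state):
        stack, env, code, dump = state
        return (stack + (env[name],), env, code, dump)
    return word

def add(state):
    stack, env, code, dump = state
    *below, a, b = stack
    return (tuple(below) + (a + b,), env, code, dump)

def ldf(param, body):
    """Word that pushes a closure capturing the current environment."""
    def word(state):
        stack, env, code, dump = state
        return (stack + ((param, body, env),), env, code, dump)
    return word

def ap(state):
    """Apply closure to argument; park the caller's state on the dump."""
    stack, env, code, dump = state
    *below, (param, body, closure_env), arg = stack
    saved = (tuple(below), env, code)
    return ((), {**closure_env, param: arg}, body, dump + (saved,))

def rtn(state):
    """Restore the caller's state and push the result onto its stack."""
    stack, env, code, dump = state
    saved_stack, saved_env, saved_code = dump[-1]
    return (saved_stack + (stack[-1],), saved_env, saved_code, dump[:-1])

def run(code, env=None):
    """Repeatedly take the next word off `code` and apply it to the state."""
    state = ((), env or {}, tuple(code), ())
    while state[2]:
        stack, env_, (word, *rest), dump = state
        state = word((stack, env_, tuple(rest), dump))
    return state[0]

# Roughly ((lambda (x) (+ x 1)) 41), written as a sequence of words.
program = [ldf("x", (ld("x"), ldc(1), add, rtn)), ldc(41), ap]
final_stack = run(program)
```

The program is still just a concatenation of words; what changed is that each word now sees the environment, the remaining code and the dump as well as the stack.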

Similarly if in an actor language, the actors all took a message which was a tuple of (Stack, Environment, Code, Dump), and sent a transformed version along to another actor, then the chain of actors become the code of a Lisp virtual machine (capable of running concurrently any number of Lisp machines).

Reminds me of the reflective procedures in 3-Lisp (ACM link), which take 3 arguments: the environment (of the caller), the current continuation, and then a list of the actual arguments (unevaluated, IIRC). You can build up a large number of different constructs from this.
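A small continuation-passing evaluator in Python gives the flavour. This is only my approximation: there is no reflective tower here, and the `Reflective` wrapper and the constructs defined with it (`quote`, `if`, `abort`) are illustrative names rather than 3-Lisp's own. A reflective procedure is called with the caller's environment, the current continuation, and the unevaluated argument list.

```python
class Reflective:
    """Wraps fn(env, cont, args); args arrive unevaluated."""
    def __init__(self, fn):
        self.fn = fn

def evaluate(expr, env, cont):
    if isinstance(expr, str):            # variable reference
        return cont(env[expr])
    if not isinstance(expr, list):       # literal
        return cont(expr)
    def with_operator(op):
        args = expr[1:]
        if isinstance(op, Reflective):   # hand over env, continuation, raw args
            return op.fn(env, cont, args)
        return eval_args(args, env, lambda values: cont(op(*values)))
    return evaluate(expr[0], env, with_operator)

def eval_args(args, env, cont):
    """Evaluate arguments left to right, passing the list of values to cont."""
    if not args:
        return cont([])
    return evaluate(args[0], env,
                    lambda first: eval_args(args[1:], env,
                                            lambda rest: cont([first] + rest)))

# Constructs defined as reflective procedures, not baked into evaluate():
def quote(env, cont, args):
    return cont(args[0])                 # never evaluates its argument

def if_(env, cont, args):
    test, then, other = args
    return evaluate(test, env,
                    lambda t: evaluate(then if t else other, env, cont))

def abort(env, cont, args):
    return evaluate(args[0], env, lambda v: v)   # discards the continuation

env = {
    "quote": Reflective(quote), "if": Reflective(if_), "abort": Reflective(abort),
    "+": lambda a, b: a + b, "<": lambda a, b: a < b, "x": 5,
}

branch = evaluate(["if", ["<", "x", 10], ["quote", "small"], ["quote", "big"]],
                  env, lambda v: v)
escaped = evaluate(["+", 1, ["abort", 99]], env, lambda v: ("done", v))
```

`if` needs the unevaluated arguments, `abort` needs the continuation, and both fall out of the same three-argument calling convention.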

XY is a concatenative language in which everything is a word which takes and returns a pair of objects (stack queue). stack is a list which represents the evaluation of the computation so far, and queue is a list which represents the unevaluated future of the computation.
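A rough Python sketch of that shape, with the caveat that the words here (`push`, `add`, `dup`, `double`) are made up for illustration and are not XY's actual vocabulary. Each word receives the stack and the queue and returns a new pair, so a word can rewrite the future of the computation as easily as its past.

```python
def push(value):
    """Word that appends a value to the stack and leaves the queue alone."""
    def word(stack, queue):
        return stack + [value], queue
    return word

def add(stack, queue):
    *rest, a, b = stack
    return rest + [a + b], queue

def dup(stack, queue):
    return stack + [stack[-1]], queue

def double(stack, queue):
    """Touches only the queue: schedules 'dup add' to run next."""
    return stack, [dup, add] + queue

def run(queue, stack=None):
    """Pop the next word off the queue and let it transform (stack, queue)."""
    stack = list(stack or [])
    queue = list(queue)
    while queue:
        word, queue = queue[0], queue[1:]
        stack, queue = word(stack, queue)
    return stack

result = run([push(3), push(4), add, double])
```

Tracing it by hand: the stack goes from 3 to 3 4 to 7, then `double` prepends `dup add` to the queue, which leaves 14.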