I have only a limited knowledge of Lisp (I'm trying to learn a bit in my free time), but as far as I understand, Lisp macros allow one to introduce new language constructs and syntax by describing them in Lisp itself. This means that a new construct can be added as a library, without changing the Lisp compiler / interpreter.

This approach is very different from that of other programming languages. E.g., if I wanted to extend Pascal with a new kind of loop or some particular idiom I would have to extend the syntax and semantics of the language and then implement that new feature in the compiler.

Are there other programming languages outside the Lisp family (i.e. apart from Common Lisp, Scheme, Clojure (?), Racket (?), etc) that offer a similar possibility to extend the language within the language itself?

EDIT

Please, avoid extended discussion and be specific in your answers. Instead of a long list of programming languages that can be extended in some way or another, I would like to understand from a conceptual point of view what is specific to Lisp macros as an extension mechanism, and which non-Lisp programming languages offer some concept that is close to them.



Lisp has another trick besides regular macros. "Reader Macros" allow the parser syntax to be captured and extended at runtime, so even the basic token structure of the language is under control.
– ddyer Sep 12 '12 at 19:46


Your question is contradictory: first you ask for a list, then you ask for conceptual information. Which is it?
– Jarrod Roberson Sep 13 '12 at 20:17

Haskell

Haskell's Template Haskell and quasiquotation features allow users to dramatically add to the syntax of the language outside of normal means. They are resolved at compile time as well, which I think is a big plus (for compiled languages at least).

I have used quasiquotation in Haskell once before, to create an advanced pattern matcher over a C-like language.
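A minimal quasiquoter, for flavor (this trivial str quoter only builds a string literal; an advanced matcher would parse the quoted text into richer expressions and patterns):

```haskell
{-# LANGUAGE TemplateHaskell #-}
module Str where

import Language.Haskell.TH (litE, stringL)
import Language.Haskell.TH.Quote (QuasiQuoter (..))

-- [str|anything here|] becomes a string literal at compile time.
-- A real quasiquoter would run a parser over the quoted text and
-- build Haskell expressions or patterns from the result.
str :: QuasiQuoter
str = QuasiQuoter
  { quoteExp  = \s -> litE (stringL s)
  , quotePat  = error "str: no pattern quoting"
  , quoteType = error "str: no type quoting"
  , quoteDec  = error "str: no declaration quoting"
  }
```

A module enabling the QuasiQuotes extension can then use the new bracket syntax as if it were part of the language.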

There are also other features of Haskell that essentially allow for custom syntax, like creating your own operator or the automatic currying coupled with lambda functions (consider e.g. the typical usage of forM).
– Xion Sep 13 '12 at 9:22


"Else the following qualifies as syntax extension...": very good point.
– Giorgio Sep 13 '12 at 16:56

Tcl has a long history of supporting extensible syntax. For example, one can implement a loop that iterates three variables (until stopped) over the cardinals, their squares and their cubes.
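A sketch of such a construct (the name loop3 and the details are illustrative):

```tcl
# loop3 binds three of the caller's variables to successive
# cardinals, squares and cubes, and runs the body in the caller's
# scope until the body executes `break`.
proc loop3 {aVar bVar cVar body} {
    upvar 1 $aVar a $bVar b $cVar c
    for {set i 1} 1 {incr i} {
        set a $i
        set b [expr {$i * $i}]
        set c [expr {$i * $i * $i}]
        set rc [catch {uplevel 1 $body} msg]
        if {$rc == 3} break                  ;# body said "break"
        if {$rc != 0 && $rc != 4} {          ;# propagate errors, return, ...
            return -code $rc $msg
        }
    }
}

# usage: prints "1 1 1", "2 4 8", "3 9 27", then stops
loop3 x y z {
    puts "$x $y $z"
    if {$x >= 3} break
}
```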

This sort of technique is used extensively in Tcl programming, and the keys to doing it sanely are the upvar and uplevel commands (upvar binds a named variable in another scope to a local variable, and uplevel runs a script in another scope; in both cases, the 1 indicates that the scope in question is the caller's). It's also used a lot in code that couples with databases (running some code for each row in a result set), in Tk for GUIs (binding callbacks to events), etc.

However, that's only a fraction of what is done. The embedded language doesn't even need to be Tcl; it can be virtually anything (as long as it balances its braces, which is true of the enormous majority of programs; things get syntactically horrible otherwise), and Tcl can just dispatch to the embedded foreign language as necessary. Examples of doing this include embedding C to implement Tcl commands, and the equivalent with Fortran. (Arguably, all Tcl's built-in commands are done this way in a sense, in that they're really just a standard library and not the language itself.)

This is partly a question of semantics. The basic idea of Lisp is that the program is data that can itself be manipulated. Commonly-used languages in the Lisp family, like Scheme, don't really let you add new syntax in the parser sense; it's all just space-delimited parenthesized lists. It's just that since the core syntax does so little, you can make almost any semantic construct out of it. Scala (discussed below) is similar: the variable-name rules are so liberal that you can easily make nice DSLs out of it (while staying within the same core syntax rules).
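For instance, in Scheme a new control construct is just a library macro assembled from the tiny core:

```scheme
;; A C-style while loop, defined as an ordinary library macro
;; rather than as parser-level syntax.
(define-syntax while
  (syntax-rules ()
    ((_ test body ...)
     (let loop ()
       (if test
           (begin body ... (loop)))))))

(define i 0)
(while (< i 3)
  (display i)
  (set! i (+ i 1)))   ; prints 012
```

Nothing about the parser changed; the "new syntax" is still just a parenthesized list, interpreted differently.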

These languages, while they don't actually let you define new syntax in the sense of Perl filters, have a sufficiently flexible core that you can use it to build DSLs and add language constructs.

The important common feature is that they let you define language constructs that work as well as built-in ones, using features exposed by the languages. The degree of support for this feature varies:

Many older languages provided built-in functions like sin(), round(), etc., without any way to implement your own.

Java has built-in control structures (for(;;){}, while(){}, if(){}else{}, do{}while(), synchronized(){}, strictfp{}) and doesn't let you define your own. Scala instead defines an abstract syntax that lets you call functions using a convenient control-structure-like syntax, and libraries use this to effectively define new control structures (e.g. react{} in the actors library).
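A sketch of the Scala pattern (the repeat method is my own, not from any library): a by-name parameter in a curried argument list lets an ordinary method read like a built-in control structure.

```scala
object Demo {
  // `body: => Unit` is a by-name parameter: it is re-evaluated on
  // each use, like the body of a built-in loop.
  def repeat(n: Int)(body: => Unit): Unit =
    for (_ <- 1 to n) body

  def main(args: Array[String]): Unit =
    repeat(3) {          // call site looks like a native construct
      println("hi")
    }
}
```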

Also, you might look at Mathematica's custom syntax functionality in the Notation package. (Technically it is in the Lisp family, but has some extensibility features done differently, as well as the usual Lisp extensibility.)

Rather than defining specific syntax, Rebol makes everything a function call; there are no keywords. (Yes, you can redefine if and while if you truly desire to.) For example, this is an if statement:

if now/time < 12:00 [print "Morning"]

if is a function that takes 2 arguments: a condition and a block. If the condition is true, the block is evaluated. Sounds like most languages, right? Well, the block is a data structure; it's not restricted to code. A block can just as well hold other blocks, for example, which makes for a quick demonstration of the flexibility of "code is data".
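A sketch (standard Rebol semantics assumed; the details are illustrative):

```rebol
data: [[1 2 3] [4 5 6]]        ; a block of blocks: plain data
code: [print "Morning"]        ; the same kind of block, holding code

if now/time < 12:00 code       ; hand any block to `if` for evaluation
do code                        ; or evaluate it directly
append code [print "again"]    ; code is data: it can be manipulated
do code                        ; now prints both lines
```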

As long as you can stick to the syntax rules, extending this language is, for the most part, going to be nothing more than defining new functions. Some users have been backporting features of Rebol 3 into Rebol 2, for example.

"flexible syntax" is VERY different from "extensible syntax". It's been a long while since I've programmed in Ruby, but Rake just looks like a well designed use of built-in syntax. In other words, this is a non-example.
– Thomas Eding Sep 12 '12 at 20:50


Isn't that a matter of degree, not of kind? How would you distinguish between an "extensible syntax" language that can extend some aspects of its syntax but not others, and a "flexible syntax" language?
– comingstorm Sep 12 '12 at 21:13


If the line is blurry, then let's push back the line so that C is considered to support extensible syntax.
– Thomas Eding Sep 12 '12 at 22:04

Extending the syntax the way you're talking about allows you to create domain specific languages. So perhaps the most useful way to rephrase your question is, what other languages have good support for domain specific languages?

Ruby has very flexible syntax, and a lot of DSLs have flourished there, such as rake. Groovy includes a lot of that goodness. It also includes AST transforms, which are more directly analogous to Lisp macros.
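A sketch of the kind of DSL rake enables (the task and invoke helpers here are my own, not rake's API): blocks and optional parentheses make ordinary Ruby read like a declarative language.

```ruby
# A minimal rake-like DSL: `task` registers a named block,
# `invoke` runs it later. Plain Ruby, no parser changes.
TASKS = {}

def task(name, &block)
  TASKS[name] = block
end

def invoke(name)
  TASKS[name].call
end

task :greet do
  puts "hello"
end

invoke(:greet)   # prints "hello"
```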

R, the language for statistical computing, allows functions to get their arguments unevaluated. It uses this to create a DSL for specifying regression formulas. For example:

y ~ a + b

means "fit a line of the form k0 + k1*a + k2*b to the values in y."

y ~ a * b

means "fit a line of the form k0 + k1*a + k2*b + k3*a*b to the values in y."
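The underlying mechanism can be sketched as follows (substitute() is the standard way a function captures an argument unevaluated):

```r
# grab returns its argument as an unevaluated expression, so calling
# it with undefined variables is not an error: nothing is evaluated.
grab <- function(x) substitute(x)

e <- grab(a + b)
deparse(e)        # "a + b" -- the expression itself, not a value

# A formula is likewise an unevaluated description that fitting
# functions such as lm() inspect and interpret themselves:
f <- y ~ a * b
```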

Groovy's AST transforms are very verbose compared to Lisp or Clojure macros. For example the 20+ line Groovy example at groovy.codehaus.org/Global+AST+Transformations could be rewritten as one short line in Clojure, e.g. `(this (println ~message)). Not only that but you also wouldn't have to compile the jar, write the metadata, or any of the other stuff on that Groovy page.
– Vorg van Geir Sep 13 '12 at 9:13

Edit: I forgot about the most interesting example: JetBrains MPS. Not only is it very distant from anything Lispish, it is not even a textual programming system; its editor operates directly at the AST level.

Edit2: To answer the updated question: there is nothing unique or exceptional about Lisp macros. In theory, any language can provide such a mechanism (I even did it with plain C). All you need is access to your AST and the ability to execute code at compile time. Some reflection may help (querying the types, the existing definitions, etc.).

Prolog allows defining new operators, which are translated to compound terms of the same name. For example, one can define a has_cat operator and use it as a predicate that checks whether a list contains the atom cat.
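A sketch of such a definition (the priority 700 is an arbitrary choice; member/2 comes from the standard list library):

```prolog
% Declare has_cat as a postfix (xf) operator, then give the
% resulting term has_cat(List) a meaning with an ordinary clause.
:- op(700, xf, has_cat).

List has_cat :- member(cat, List).

% ?- [dog, cat, fish] has_cat.
% true.
```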

The xf means that has_cat is a postfix operator; using fx would make it a prefix operator, and xfx would make it an infix operator, taking two arguments. Check this link for more details about defining operators in Prolog.

TeX is totally missing from the list. You all know it, right? It looks something like this:

Some {\it ``interesting''} example.

… except that you can redefine the syntax without restrictions. Every (!) token in the language can be assigned a new meaning. ConTeXt is a macro package which has replaced curly braces with square braces:

Some \it[``interesting''] example.

The more common macro package LaTeX also redefines the language for its purposes, e.g. adding the \begin{environment}…\end{environment} syntax.

But it doesn’t stop there. Technically, you could just as well redefine the tokens to parse the following:

Some <it>“interesting”</it> example.

Yes, absolutely possible. Some packages use this to define small domain-specific languages. For instance, the TikZ package defines a concise syntax for technical drawings.
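A small, self-contained illustration of the TikZ syntax (this particular picture is my own, not from the original answer):

```latex
\documentclass{article}
\usepackage{tikz}
\begin{document}
\begin{tikzpicture}
  % axes
  \draw[->] (0,0) -- (3,0) node[right] {$x$};
  \draw[->] (0,0) -- (0,2) node[above] {$y$};
  % a curve, described declaratively rather than point by point
  \draw[thick, domain=0:3] plot (\x, {0.2*\x*\x});
\end{tikzpicture}
\end{document}
```

The coordinate and option syntax inside the environment is nothing like standard TeX; it is all implemented with TeX macros.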

Furthermore, TeX is Turing complete, so you can literally do everything with it. I have never seen this used to its full potential, because it would be pretty pointless and very convoluted, but it's entirely possible, just by redefining tokens, to make TeX parse code written in a completely foreign syntax (though this would probably push the parser to its physical limits, due to the way it's built).

Boo lets you customize the language heavily at compile-time through syntactic macros.

Boo has an "extensible compiler pipeline". That means the compiler can call your code to do AST transformations at any point during the compiler pipeline. As you know, things like Java's Generics or C#'s Linq are just syntax transformations at compile time, so this is quite powerful.

Compared to Lisp, the main advantage is that this works with any kind of syntax. Boo is using a Python-inspired syntax, but you could probably write an extensible compiler with a C or Pascal syntax. And since the macro is evaluated at compile time, there's no performance penalty.

Downsides, compared to Lisp, are:

Working with an AST is not as elegant as working with s-expressions

Since the macro is invoked at compile time, it doesn't have access to runtime data.

(Unfortunately, Boo's online documentation is always hopelessly outdated and doesn't even cover advanced stuff like this. The best documentation for the language I know is this book: http://www.manning.com/rahien/)

The web documentation for this feature is currently broken, and I don't write Boo myself, but I thought it would be a pity if it was overlooked. I appreciate the mod feedback and will reconsider how I go about contributing free information in my spare time.
– Dan Finch Sep 14 '12 at 7:44

Evaluation in Mathematica is based on pattern matching and replacement. That allows you to create your own control structures, change existing control structures, or change the way expressions are evaluated. For example, you could implement "fuzzy logic" (a bit simplified).
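A sketch of what such definitions could look like (numeric truth values in [0, 1] assumed; the exact rules here are illustrative):

```mathematica
(* Attach fuzzy rules to the built-in operators for numeric
   arguments; non-numeric arguments still use the normal rules. *)
Unprotect[And, Or, Not, If];
And[a_?NumericQ, b_?NumericQ] := Min[a, b];
Or[a_?NumericQ, b_?NumericQ]  := Max[a, b];
Not[a_?NumericQ]              := 1 - a;
If[c_?NumericQ, t_, e_]       := If[c > 0.5, t, e];
Protect[And, Or, Not, If];
```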

This overrides the evaluation of the predefined logical operators &&, ||, ! and the built-in If clause.

You can read these definitions like function definitions, but the real meaning is: if an expression matches the pattern described on the left side, it's replaced with the expression on the right side. You could define your own If-clause in the same way.
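A sketch, with a hypothetical myIf:

```mathematica
(* HoldRest: evaluate the condition, but keep the two branches
   unevaluated until the pattern has selected one of them. *)
SetAttributes[myIf, HoldRest];
myIf[True,  then_, else_] := then;
myIf[False, then_, else_] := else;
```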

SetAttributes[..., HoldRest] tells the evaluator that it should evaluate the first argument before the pattern matching, but hold evaluation for the rest until after the pattern has been matched and replaced.

This is used extensively inside the Mathematica standard libraries to e.g. define a function D that takes an expression and evaluates to its symbolic derivative.

Metalua is a language and a compiler compatible with Lua that provides this.

- Full compatibility with Lua 5.1 sources and bytecode: clean, elegant semantics and syntax, amazing expressive power, good performances, near-universal portability.

- A complete macro system, similar in power to what's offered by Lisp dialects or Template Haskell; manipulated programs can be seen as source code, as abstract syntax trees, or as an arbitrary mix thereof, whichever suits your task better.

- A dynamically extensible parser, which lets you support your macros with a syntax that blends nicely with the rest of the language.

- A set of language extensions, all implemented as regular Metalua macros.

Differences with Lisp:

- Don't bother developers with macros when they aren't writing one: the language's syntax and semantics should be best suited for the 95% of the time when we aren't writing macros.

- Encourage developers to follow the conventions of the language: not only with "best practices" rants nobody listens to, but by offering an API that makes it easier to write things the Metalua way. Readability by fellow developers is more important and more difficult to achieve than readability by compilers, and for this, having a common set of respected conventions helps a lot.

- Yet provide all the power one is willing to handle. Neither Lua nor Metalua is into mandatory bondage and discipline, so if you know what you're doing, the language won't get in your way.

- Make it obvious when something interesting happens: all meta-operations happen between +{...} and -{...}, and visually stick out of the regular code.
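For flavor, a tiny sketch of those operators as described in the Metalua manual (illustrative, not tested here):

```lua
-- +{ ... } quotes code into an AST value; -{ ... } runs code at
-- compile time and splices the resulting AST back into the program.
x = -{ +{ 1 + 1 } }   -- compiles to exactly: x = 1 + 1
```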

An example application is the implementation of ML-style pattern matching.

If you're looking for languages that are extendable, you ought to take a look at Smalltalk.

In Smalltalk, the only way to program is to actually extend the language. There is no difference between the IDE, the libraries or the language itself. They're all so entwined that Smalltalk is often referred to as an environment rather than as a language.

You don't write stand-alone applications in Smalltalk, you extend the language-environment instead.

There are tools that allow making custom languages without writing a whole compiler from scratch. For example, there is Spoofax, a code transformation tool: you provide an input grammar and transformation rules (written in a very high-level declarative way), and then you can generate Java source code (or another language, if you care enough) from a custom language designed by you.

So, it would be possible to take grammar of language X, define grammar of language X' (X with your custom extensions) and transformation X'→X, and Spoofax will generate a compiler X'→X.

Currently, if I understand correctly, the best support is for Java, with C# support being developed (or so I heard). This technique could be applied to any language with a static grammar (so, e.g., probably not Perl), though.

Funge-98

Funge-98's fingerprint feature would allow a complete restructuring of the entire syntax and semantics of the language, but only if the implementor supplies a fingerprint mechanism that lets the user programmatically alter the language (this is theoretically possible to implement within normal Funge-98 syntax and semantics). If so, one could literally make the rest of the file (or whatever parts of the file) act as C++ or Lisp (or whatever one wants).