Every time I define a new function, I wonder which construct I should use: true functions obtained by using Function, or rule-based syntax. For example, these are two ways of defining a square function:
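The two definitions in question were omitted above; they presumably look something like this (a minimal reconstruction):

```mathematica
square1 = Function[x, x^2];   (* a true (pure) function; equivalently #^2 & *)
square2[x_] := x^2            (* rule-based (down-value) definition *)
```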

I don't think there's much of a difference. If a function is sufficiently complicated, I might prefer the second construction over the first one, though. Note that recursive functions are easily done with the second construct, but are a bit tricky to do with the first, as in this Fibonacci example: fib = (If[#1 == 1 || #1 == 2, 1, #0[#1 - 1] + #0[#1 - 2]]) &.
–
Guess who it is.♦ Jan 26 '12 at 1:10

But using the pure function gives you a packed list, which means faster evaluation and less memory for storage:
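The setup code was omitted; a hypothetical reconstruction, comparing the pattern-based definition against the pure function, could be:

```mathematica
ClearAll[square2];
square2[x_] := x^2;
tmp1 = Map[square2, N@Range[10^6]];   (* down-value based: Map cannot auto-compile, result unpacked *)
tmp2 = Map[#^2 &, N@Range[10^6]];     (* pure function: auto-compiled, result packed *)
Developer`PackedArrayQ /@ {tmp1, tmp2}   (* {False, True} *)
```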

N@ByteCount[tmp2]/ByteCount[tmp1]
3.49456

This example used Map, but you would observe the same thing with Table, Nest, Fold and so on. As to why this is the case (@David's question), the only answer I have is the circular one that autocompilation for functions with down values hasn't been implemented. I haven't found out what the difficulties are in implementing this, i.e. whether it hasn't been done because it can't be done or simply because it hasn't been done yet. Someone else may know and can comment.

2) Functions with down values may (and in all likelihood will) trigger a security warning when present in an embedded CDF.

I'm sure others will be able to expand on this and add many more differences.

Of course, if the application will involve compiled functions anyway (e.g. numerical work), one would want to use Compile[] directly instead of Function[].
–
Guess who it is.♦ Jan 26 '12 at 1:55

What's the reason for 1), or do you have something for further reading? This kind of looks like a good reason to use Function in many cases.
–
David Jan 26 '12 at 2:19

3

@Mike Compilation of pattern-based functions would require compiling the entire pattern-matcher. While theoretically this might be possible, the task is extremely difficult, because the pattern-matcher has very complex semantics (Attributes, infinite evaluation, various types of global rules, defaults, ...), and Mathematica is an untyped language. So, for all practical purposes, this is plain impossible to do. IMO, an intermediate (perhaps more strongly typed) language layer would be needed to make some subset of the rule-based code compilable.
–
Leonid Shifrin Jan 26 '12 at 12:38

1

@kirma And I am interested. At some point long ago, as my first project at WRI, I wrote a simplistic interpreter / compiler from a subset of Matlab to Mathematica. I can now tell that much, because that project is likely to be open-sourced some time soon. The compiler was generated from the interpreter, using partial evaluation. I think this is somewhat similar in spirit to what PyPy does. Anyway, I'd be very happy to dig deeper into PyPy, but so far I just didn't have the time for that.
–
Leonid Shifrin May 8 at 12:22

1

@kirma In any case, I am very interested in this topic, and particularly the one you mentioned. Just hope I will get some good chunk of free time rather soon, and will be able to dig deeper into this.
–
Leonid Shifrin May 8 at 12:24

These two forms may be similar on the surface, but they are very different in terms of the underlying mechanisms involved. In a sense, Function represents the only true (but leaky) functional abstraction in Mathematica. Functions based on rules are not really functions at all; they are global versions of replacement rules which look like function calls.

One big difference is in the semantics of parameter-passing. For rules (and therefore functions based on rules), it is more intrusive, in the sense that rules don't care about inner scoping constructs, while Function (with named arguments only) will. Here is an example of what I mean:

Function[{x}, Module[{x = 2}, x]][3]
(*
==> 2
*)

while

In[46]:=
ClearAll[fn,a];
fn[x_]:=Module[{x=2},{Hold[x],x}]
fn[3]
During evaluation of In[46]:= Module::lvset: Local variable specification
{3=2} contains 3=2, which is an assignment to 3; only assignments to symbols are allowed. >>
Out[48]= Module[{3=2},{Hold[3],3}]

and especially

In[49]:=
Clear[a];
fn[a]
Out[50]= {Hold[a$840],2}

Another difference I want to mention is that, due to Function with named arguments being a leaky abstraction, passing it as an argument to another function is risky. In addition to the post by @WReach mentioned above, which exposes the essence of the problem, see this post for an example of the trouble one can get into in a more "realistic" situation.

That said, in many cases one may think of "functions" defined by rules as normal functions. Because of the powerful built-in rule-dispatching mechanism of rule application, functions based on rules can be overloaded in very powerful ways, which adds expressive power, as noted by @Sal.

Additionally, while this is not often used, one can often use global pattern-based function definitions as local rules, to achieve rather non-trivial things. In this answer, I use this technique to illustrate how one can dynamically generate Mathematica code at run-time, with the ability to test it in the "interpreter" mode. To illustrate the power of this approach here, I will write a (simplistic, sloppy) macro which will remove the Map auto-compilation limitation mentioned by @Mike Honeychurch, at least for simple pattern-based functions:
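The macro code itself did not survive in this copy; as a rough, hypothetical sketch of the idea, one can inline a simple one-definition, pattern-based function into a pure function by letting it evaluate on a fresh local symbol (the names toPure and square2 are illustrative):

```mathematica
ClearAll[square2, toPure];
square2[x_] := x^2;

(* Hypothetical sketch: f[x] evaluates on the fresh Module symbol x,
   so Function @@ {x, f[x]} builds Function[x$..., x$...^2] *)
toPure[f_Symbol] := Module[{x}, Function @@ {x, f[x]}]

toPure[square2]   (* a pure function of the form Function[x$..., x$...^2] *)
Map[toPure[square2], N@Range[10^6]];   (* now eligible for auto-compilation *)
```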

What happened is that my macro expanded the call to square2 before Map was used on it. This is quite non-trivial, because it was able to expand the call square2[#] inside a pure function. This is actually a constructive use of the leaky functional abstraction mentioned above: were Function a complete black box, this would not be possible. Note that the transformation f -> f[#]& is not always correct, since it leaks evaluation. I used it here as an example, but if one were to do it for real, more care must be taken.

Functions are more concise and generally faster but patterns are a lot more expressive. When you don't need the expressive power of patterns you should probably use functions. I use down values more to set up the high level structure of my program and functions to implement the algorithms. But often I am lazy and use down values out of habit. When I am in exploratory mode I pretty much don't worry about the difference.

This example from the Mathematica docs shows one way down values are more expressive (dynamic programming):
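The example itself is missing here; the standard dynamic-programming (memoization) idiom from the docs looks like this:

```mathematica
(* Each computed value is cached as a new down value f[n] = ... *)
fib[1] = fib[2] = 1;
fib[n_] := fib[n] = fib[n - 1] + fib[n - 2]

fib[10]   (* 55 *)
```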

Of course, you can add conditional logic to functions and get a similar effect.

It's also a good idea to use functions when they are side-effect free and down values when there are possible side-effects, but this is not something Mathematica enforces; it is more a philosophical preference of mine that comes from exposure to functional programming languages.

I am not sure why down values cause problems in CDF's and I would hope this gets fixed or lessened in future versions. It seems overly restrictive.

For novice readers: there is a fair bit of difference between Sal's implementation of the Fibonacci recurrence here, and the pure function implementation I gave in the comments. Sal's implementation caches values by virtue of the f[i_] := f[i] = (* stuff *) construction. This is more efficient for large argument values, but at the expense of some storage.
–
Guess who it is.♦ Jan 26 '12 at 5:56

Note that the second snippet can alternatively be done in terms of any of If[], Which[], Switch[], or Piecewise[], but the approach Sal gives might be better if the cases involved are a bit more elaborate.
–
Guess who it is.♦ Jan 26 '12 at 5:57

The first example (using Function) is called a "pure function" or an "anonymous function". It is used (for example) when you do not want to bother to give a special name to your function. You just describe what it does when applied to its arguments. This is very similar to the pure mathematics

$x\mapsto f(x)$

notation. The emphasis here is on the "abstract" functional aspect (i.e. you think of the function as an object in a functional space).

Typical use is with some kind of options (often from built-in functions) as in:
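The example was dropped; a typical hypothetical use, passing an anonymous function directly as an argument to a built-in, might be:

```mathematica
Sort[{3, 1, 2}, #1 > #2 &]              (* {3, 2, 1} *)
Select[Range[10], EvenQ[#] && # > 4 &]   (* {6, 8, 10} *)
```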

For my purposes I typically use the second form because it is easier to set up multiple definitions given the same number of inputs and cleaner to perform validation on the inputs. I also find the code easier to understand and debug.
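The definitions being discussed are not shown; a hypothetical pair along these lines (multiple pattern-based definitions plus a catch-all for validation, consistent with f[1] returning $Failed below) illustrates the point:

```mathematica
ClearAll[f];
f[x_Integer, y_Integer] := x + y;   (* two-argument case *)
f[x_String] := StringLength[x];     (* overload on argument type *)
f[___] := $Failed;                  (* catch-all input validation *)

f[2, 3]    (* 5 *)
f["abc"]   (* 3 *)
f[1]       (* $Failed *)
```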

I'm at a loss at the moment for how to code that BlankNullSequence bit that would allow g[1] to return $Failed as f[1] would.

One place I often do use the "pure function" style (outside of say graphics options) is when programmatically building up functions.

Here is another toy example that will create a function on the fly that takes 2 or more arguments depending on the inputs to h and when applied to the same number of args, multiplies them. Silly, but you get the idea.
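The toy example itself is omitted; a hypothetical version of h that builds an n-argument multiplier on the fly could read:

```mathematica
ClearAll[h];
(* Generate n fresh parameter symbols and splice them into a Function *)
h[n_Integer /; n >= 2] :=
  With[{vars = Table[Unique["x"], {n}]},
    Function @@ {vars, Times @@ vars}]

h[3][2, 3, 4]   (* 24 *)
```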

The second form is usually preferred over the first. This is because it is easier to write functions with multiple parameters (rather than #1, #2, ...), it is more straightforward to associate patterns and conditions on the parameters, and to define functions with variable numbers of parameters.

Preamble

There are some important differences (or, more precisely, features of Function which can't be reproduced with symbols and rules), that have not been reflected in answers here, but that I think deserve a separate answer. These are related to some more advanced uses, involving evaluation control, closures, and garbage collection.

Emulating Hold attributes for SubValue - type definitions

Normally, you can't hold arguments sub__ in a call

fn[args___][sub___]

if your definitions are given as SubValues, like

fn[args___][sub___]:=Hold[args, sub]

because of the way the evaluation process works. However, you can define instead

fn[args___] := Function[Null, Hold[args, ##], HoldAll]

which would effectively also hold sub. You can't make this work without using a pure function, AFAIK.
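As a quick illustration (assuming fn is also given the HoldAll attribute so that args is held as well):

```mathematica
ClearAll[fn];
SetAttributes[fn, HoldAll];
fn[args___] := Function[Null, Hold[args, ##], HoldAll];

fn[1 + 1][2 + 2]   (* Hold[1 + 1, 2 + 2] -- both argument groups held *)
```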

Closures and symbol management

Basically, Function is indispensable for creating (nested) closures, and the reason for that is that it spares you from manual symbol management, since there are no explicit names / symbols, to which the action is attached.

It is not very easy to find a good example, because the style of programming based on closures is not very widely used in Mathematica. I will use a much-simplified example of something I needed to do at some point. Imagine that you have some data which you want to split into chunks and present in a form where you can apply transformations to the chunked data in a delayed (lazy) fashion, so that they are only carried out when some specific chunk of data is requested. Here is one way to create such a construct:
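The original code is missing here; a hypothetical reconstruction, wrapping each chunk in a pure function (the "pointer" to the data) inside an inert head LL, might be:

```mathematica
ClearAll[LL, makeLL, takeChunk];
(* Each chunk is captured in the held body of a pure function;
   calling that function with no arguments returns the chunk *)
makeLL[data_List, chunkSize_Integer] :=
  LL @@ Map[Function[chunk, Function[chunk]], Partition[data, chunkSize]];

(* Extract (and thereby evaluate) the i-th chunk *)
takeChunk[l_LL, i_Integer] := l[[i]][]

ll = makeLL[Range[30], 10];
takeChunk[ll, 2]   (* {11, 12, ..., 20} *)
```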

The advantage of the nested function construction above is that the inner function serves as a pointer to the data - I can operate on it without extracting the data itself. Now, I can implement a lazy Map operation, for example:
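The implementation is not shown; a hypothetical version, assuming each chunk inside LL is stored as a pure function, attaches an UpValue to LL and merely rewraps the inner functions:

```mathematica
(* Lazy Map: the actual mapping happens only when a chunk is extracted,
   because Map[f, ch[]] sits inside the held body of a new pure function *)
LL /: Map[f_, LL[chunks___Function]] :=
  LL @@ Map[Function[ch, Function[Map[f, ch[]]]], {chunks}]
```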

You can see that this version of Map isn't doing anything when applied; it just transforms the inner function (a "pointer" to the actual piece of data), so that Map will actually be applied only when we extract the data.

Indeed, we can do:

llm = Map[#^2 + 1 &, ll];
llm1 = Map[Sin, llm];

and no actual work was yet done by Map. When we extract the data, we get:
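Assuming an extractor that simply calls the stored chunk-function (a hypothetical takeChunk, matching the LL representation above), the extraction could look like:

```mathematica
(* Calling the stored chunk-function finally runs all the accumulated
   Map operations, on just this one chunk *)
takeChunk[l_LL, i_Integer] := l[[i]][]
takeChunk[llm1, 2]
```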

and both Map operations are only executed now, when we extract the data, and only on the specific piece of data we wanted.

Returning to the original question, the above functionality does not need any symbol management, since all functions used here were pure functions. This is a big advantage. It means, in particular, that we can pass expressions involving lists constructed this way (LL[...]) anywhere we like, and can construct a large number of them if needed; in all cases, the destruction of their inner state (when they are no longer used / referenced) is handled automatically by the garbage collector.

Another very interesting, and in fact useful, related feature is that when we leak Module - variables into the closures (creating closures with a state), then those leaked Module variables are garbage-collected once the closure is no longer referenced.
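The defining code is omitted here; a minimal version consistent with the outputs below could be:

```mathematica
ClearAll[myFun];
(* The Module variable y leaks into the pure function's body
   as a unique renamed symbol, e.g. y$324 *)
myFun = Module[{y = 0}, (y++; 2 #) &];
myFun   (* the leaked variable is visible in the body, e.g. (y$324++; 2 #1) & *)
```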

Here we can see that a specific Module-generated variable, y$324, is now part of the function's body. Each time we call this function on some argument, that variable gets incremented:

myFun /@ {1, 2, 3}
(* {2, 4, 6} *)

So, we have constructed a closure that was closed over a mutable variable. We can inspect the names:

Names["Global`*"]
(* {"myFun", "var", "var$", "y", "y$324"} *)

to confirm that the variable is in fact visible on the top level. Now, let us call Remove on the myFun variable, which stores (references) the function, and inspect the names of global variables again:

Remove[myFun];
Names["Global`*"]
(* {"var", "var$", "y"} *)

What you see is that the variable y$324 has been garbage-collected, which is exactly the behavior we want.

The myFun variable was a proxy, which I used to illustrate the mechanism. In practice it won't be there; we would just pass the pure function directly. So, as soon as it is no longer part of any expression, the leaked variables will be automatically garbage-collected. Had we used named symbols instead, that would not be the case, simply because there is no automatic garbage collection for named symbols, so we'd have to do it manually. In that case, not only would the symbols used for function names hang around, but so would all the internal stateful symbols.

Summary

I tried to illustrate a few less obvious advantages of having pure functions (Function) in the language, related to the use of closures and garbage collection.

"You can't make this work without using a pure function, AFAIK." Technically you do use a pure function here but not as you do here. Did you forget that or do you mean something else?
–
Mr.Wizard♦ May 8 at 11:54

@Mr.Wizard Well, that's not really a proper way to do that. Perhaps, I should have put it differently. But that hack based on Stack isn't something I'd put into production code.
–
Leonid Shifrin May 8 at 12:16

Sorry for the code dump, I just wanted to show this possibility: SetAttributes[fn, HoldAll]; Module[{fnHelpVar, fnHelpFu}, SetAttributes[fnHelpFu, HoldAll]; fn[args___] := (fnHelpVar = Hold[args]; fnHelpFu); fnHelpFu[sub___] := Join[fnHelpVar, Hold[sub]]]; fn[c++][d++]
–
Jacob Akkerboom May 8 at 17:55

@JacobAkkerboom Well, but this relies on a global variable. So, in my book this doesn't count, I think. Will revisit this later, right now this is just my first impression.
–
Leonid Shifrin May 8 at 20:02
