Answer: Disorder is order (14 Votes)

You write: "How can this paradigm be used to build predictable software that works as intended, when we have no guarantee when and where an expression will be evaluated?"

When expressions are side-effect free, the order in which they are evaluated does not affect their values, so the behavior of the program is not affected by evaluation order either. The behavior is perfectly predictable.

Now, side effects are a different matter. If side effects could occur in any order, the behavior of the program would indeed be unpredictable. But this is not actually the case. Lazy languages like Haskell make it a point to be referentially transparent, i.e. they make sure that the order in which expressions are evaluated never affects their result. In Haskell this is achieved by forcing all operations with user-visible side effects to occur inside the IO monad. This ensures that all side effects occur exactly in the order you'd expect.

Answer: Haskell and databases (14 Votes)

If you are familiar with databases, a very frequent way to process data is:

Ask a question like select * from foobar

While there is more data, do: get next row of results and process it

You do not control how the result is generated (indexes? full table scans?) or when (is all the data generated at once, or incrementally as it is asked for?). All you know is: if there is more data, you will get it when you ask for it.
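The same pull-based pattern can be sketched with a Python generator; here `fetch_rows` is a stand-in for a real database cursor:

```python
def fetch_rows(query):
    # Stand-in for a real cursor: rows are produced only as the
    # consumer asks for them, never all at once.
    for i in range(1_000_000):
        yield {"id": i, "query": query}

def process_first(query, n):
    # The consumer decides how much work actually happens:
    # only n rows are ever generated, not a million.
    rows = fetch_rows(query)
    return [next(rows) for _ in range(n)]

first = process_first("select * from foobar", 3)
```

The generator is never asked whether it is "done" up front; like the database, it just hands over the next row on demand.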

Lazy evaluation is pretty close to the same thing. Say you have an infinite list, e.g. the Fibonacci sequence—if you need five numbers, five numbers get calculated; if you need 1000, you get 1000. The trick is that the runtime knows what to provide, where, and when. It is very, very handy.

(Java programmers can emulate this behavior with Iterators—other languages may have something similar)
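In Python, for instance, a generator emulates the infinite Fibonacci list, with `itertools.islice` playing the role of Haskell's `take`:

```python
from itertools import islice

def fibs():
    # An "infinite list": values exist only when someone asks for them.
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

five = list(islice(fibs(), 5))         # five numbers calculated
thousand = list(islice(fibs(), 1000))  # a thousand, same definition
```

The definition of the sequence never changes; only the consumer decides how much of it is realized.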

Answer: Modularity (23 Votes)

A lot of the answers are going into things like infinite lists and performance gains from unevaluated parts of the computation, but this is missing the larger motivation for laziness: modularity.

The classic argument is laid out in the much-cited paper "Why Functional Programming Matters" by John Hughes. The key example in that paper (Section 5) is playing Tic-Tac-Toe using the alpha-beta search algorithm. The key point is (p. 9):

[Lazy evaluation] makes it practical to modularize a program as a generator that constructs a large number of possible answers, and a selector that chooses the appropriate one.

The Tic-Tac-Toe program can be written as a function that generates the whole game tree starting at a given position, and a separate function that consumes it. Thanks to laziness, this does not actually generate the whole game tree, only those sub-parts that the consumer needs. We can change the order and combination in which alternatives are produced by changing the consumer; there is no need to change the generator at all.

In an eager language, you can't write it this way, because you would probably spend too much time and memory generating the tree. So you end up either combining the generation and consumption into the same function, or implementing your own version of laziness.
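Hughes makes the same generator/selector split with a smaller example in the same paper: Newton-Raphson square roots, where one function generates an infinite stream of approximations and another selects the first sufficiently converged one. A Python sketch using generators (the function names here are mine):

```python
def newton(n):
    # Generator: an infinite stream of ever-better approximations
    # to sqrt(n), via Newton-Raphson iteration.
    x = 1.0  # initial guess
    while True:
        yield x
        x = (x + n / x) / 2

def within(eps, stream):
    # Selector: pull approximations until two successive ones agree.
    prev = next(stream)
    for cur in stream:
        if abs(cur - prev) <= eps:
            return cur
        prev = cur

root2 = within(1e-12, newton(2))
```

Neither function knows about the other's termination logic: the generator describes all the approximations, the selector decides how many are actually computed.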

Think you know what makes lazy evaluation useful? Disagree with the opinions expressed above? Bring your expertise to the question at Stack Exchange, a network of 80+ sites where you can trade expert knowledge on topics like web apps, cycling, patents, and (almost) everything in between.

This function produces an infinite list of successive applications of a function. (A similar but differently implemented function, iterate, is in the standard library.) A related built-in is the fix function:

Code:

fix f = let x = f x in x

Lazy recursion allows you to separate logic defining data and logic consuming it: simply define data to extend infinitely and consume as much of it as you want. So long as you consume finitely much of it, you're okay. (Technically, how you consume the data structure matters as well, but I won't get into that here.)
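For comparison, the "infinite list of successive applications" function (Haskell's iterate) can be sketched as a Python generator, where the deferral is explicit rather than free:

```python
from itertools import islice

def iterate(f, x):
    # The infinite list [x, f(x), f(f(x)), ...]; nothing is computed
    # until a consumer pulls on the generator.
    while True:
        yield x
        x = f(x)

# Consume just five elements of the infinite structure.
powers = list(islice(iterate(lambda n: n * 2, 1), 5))
```

As long as you take only finitely many elements, the infinite definition is harmless, which is exactly the point made above.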

A deeper and very cool application of this pattern is an "exhaustive" search over the Cantor space, or the space of infinite binary sequences (with a certain topology), which takes an arbitrary predicate on its domain and determines whether the predicate is ever true. While this task seems impossible, it is possible because the space is compact and thus each predicate must depend on a bounded number of digits of its input. The implementation is strikingly simple.

It presupposes some knowledge of Haskell (it was part of a discussion within the Haskell community), so don't go there if you're still at "what is this Haskell and should I be looking at it?".

the most important point Lennart makes is that laziness is good for function reuse. (this is also mentioned in one of the SO answers.) a basic example: to find the minimum of a list of numbers, you can just compose the "sort" function and the "head" (first element of the list) function. in a strict language, this is a terrible way (algorithmically) to get the minimum element, as it does all the work of sorting the list (which is unnecessary). but in a lazy language, it only does enough work to find the minimum, i.e. O(n). in haskell you write this as

min = head . sort

lennart gives another, maybe better example, which i'll repeat: suppose i have a list of values, and i want to find out if at least one satisfies some boolean property (which is expensive to compute). in java you'd write this as a for loop with early termination (to prevent yourself from doing unnecessary work). but in haskell, you just compute the predicate on every element (lazily!) using the "map" function

map pred lst

and then check whether any of the results is true using "or"

any pred lst = or (map pred lst)
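Python's built-in `any` over a lazy `map` behaves the same way. A sketch with a counting stand-in for the expensive predicate, to show that it really does stop at the first match:

```python
calls = 0

def expensive_pred(x):
    # Pretend this is costly; count invocations to demonstrate
    # that the composition short-circuits.
    global calls
    calls += 1
    return x > 2

# map is lazy in Python 3, so `any` stops pulling results
# as soon as one is True.
found = any(map(expensive_pred, [1, 2, 3, 4, 5]))
```

After this runs, the predicate has been called only three times, even though the expression reads as "map over everything, then combine".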

read lennart's post for more details. by the way, this is a general strategy in lazy programming: write down an expression which computes everything you might ever want, and then write a selector which only pulls out the values you need. laziness ensures that the composition of the two does no extraneous work. a mind-blowing example of this is richard bird's sudoku solver

the first time i read this, i think brain fluid dripped out of my ear.

the functional modularity that laziness gives you is so powerful that in haskell you almost never need to resort to explicit recursion (as you do in scheme or ML). most recursive "patterns" are encoded in basic higher-order functions (map, foldl, mapAccumL, etc.) imagine being able to write all your java code without any loops, or ML without any recursion!
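the point can be made in miniature in any language with higher-order functions; a small Python sketch contrasting the explicit loop with the fold that replaces it:

```python
from functools import reduce

def total_loop(xs):
    # the explicit-loop "pattern", written out by hand
    acc = 0
    for x in xs:
        acc += x
    return acc

def total_fold(xs):
    # the same recursion pattern captured by a higher-order function
    return reduce(lambda acc, x: acc + x, xs, 0)
```

the loop's shape (initial accumulator, combine step, traversal) is exactly what the fold abstracts away, which is why so few explicit loops survive in idiomatic haskell.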

In the .NET world, the delegate is used to build up or compose the final expression tree. So something is executing, it just doesn't happen to be on your data until you ask it to fill a particular data structure. So it's a function you are defining through metaprogramming, and you shouldn't expect an output until you define the complete mapping F(A)->B, not just F(x). Not sure what Haskell does, but I'm sure it's a lot more deterministic than the question makes it out to be.

A lot of smart people, including my very modest self, have looked at pure functional programming.

It's the glass bead game of programming languages. It's something so pure and yet so remote from actual programming that it's hard to know where to begin critiquing it.

The fundamental problem is that the functional model of computation bears little relation to the actual means of computation using CPU and memory.

What this means is that programmers of the functional model of computation need PhDs in functional programming in order to explain why their programs aren't performing as expected. This bears repeating: it takes a PhD-level brain to explain the performance of a lazy functional program. One misplaced comma or ill-defined type can turn your program from O(N) to O(N^2), and normal human beings will have a snowflake's chance in hell of explaining why.

The most common data structure of our time - the most powerful and the most used - the hash table - doesn't fit in the functional programming paradigm. It just can't be represented in functional programming - not without a massive loss of efficiency, to the point of being unusable. What this tells us is that there is a mismatch between the functional programming paradigm and the actual computational substrate we all use.

I like functional programming at a high level - it's great to use pure functions when coding, and I hope languages like Java, C#, C, and C++ come to support pure functions more and more - but using an absolutely pure functional programming language is like trying to run with your shoelaces tied together. In this case, purity is like virginity - great until you try to use it.

One misplaced comma or ill-defined type can turn your program from O(N) to O(N^2)

At least in a strongly-typed functional language like Haskell, unintended grouping/ordering of function arguments (though not done with a comma) is almost always caught at compile time.

And although lazy evaluation can cause unexpected space complexity (sometimes O(N) to O(N^2) as you say), the time complexity of algorithms has in my experience always been easier to eyeball in Haskell than in strict, imperative languages.

I agree that there's a lot of work to be done in making the costs of lazy functional programming decisions very visible upfront. But diagnosing the unexpected memory usage of a lazy functional program is a skill no more difficult to learn than debugging a program with multi-threaded communication. Even a computer scientist with a PhD would need practice to diagnose and solve these problems quickly; but with enough practice, anyone can learn to debug these problems skillfully.

One of the very common uses of lazy evaluation is staring you right in the face every time you use a computer that relies on overlapping windows. Just about every modern OS with a GUI interface uses full or partial lazy evaluation when it comes to performing draw calculations on windows that are hidden (or parts of windows thereof). If they didn't do this, your processor would be pegged with calculations from window draws that don't matter. Some implementations are buggier than others. In and of itself though, the idea that everything doesn't HAVE to be evaluated at THIS second in time is a very necessary part of programming for exactly this reason: it saves CPU cycles. At times this isn't, or doesn't seem to be, very relevant. But as you scale up from your tiny piece of code to multiple pieces of code running concurrently, and then requiring GUI output, it becomes more and more relevant. Yes, we have faster processors than ever before. We are able to do more with them though in part thanks to lazy evaluation.

While it's a nice idea, and it can generally be a good thing, I have to say that I'm not all that comfortable with relying on a compiler or runtime to decide on deferral. I generally prefer to defer things manually, as aside from being much clearer what's going on, it also means I can ensure that a complex expression isn't triggering a bunch of complex deferred operations at the same time.

That said, deferral can be great. My favourite example is a turn-based strategy game; in many such games all processing is done when the player hits "next turn", however human players aren't usually all that demanding on the game during their turns, leaving comparatively long periods of inactivity between their actions. So a great way to shorten the time a new turn takes to execute is to defer most of the work and evaluate it in the background later, or immediately if the data is required for something. This way you can use up all the wasted time between turns and shave a significant amount of wasted processing from new turn execution; though of course it adds a lot of complexity so you need to have a good idea of what you're doing when you begin, as bolting it on later isn't easy.
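Deferring work manually like this can be as simple as a memoized thunk; a minimal Python sketch (the names and the stand-in calculation are illustrative):

```python
class Lazy:
    # Memoized thunk: the computation runs at most once,
    # and only when .value() is first called.
    def __init__(self, compute):
        self._compute = compute
        self._done = False
        self._result = None

    def value(self):
        if not self._done:
            self._result = self._compute()
            self._done = True
        return self._result

# e.g. defer an expensive end-of-turn calculation:
turn_result = Lazy(lambda: sum(range(10_000)))
# ...nothing has run yet; evaluate in the background later,
# or immediately when the data is actually required:
answer = turn_result.value()
```

The "evaluate in the background" part would sit on top of this: a worker thread can call `value()` during the player's idle time, and any code that needs the result immediately gets the memoized answer.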

The real question is whether a language with lazy expressions could achieve the same (or even greater) benefit without introducing performance spikes when an expression has a lot of (or several complex) dependencies. Personally I'm not convinced, though this is a good example of when the ability to choose what is or is not deferred can be beneficial: you can potentially just leave everything for deferral, then tweak statements after profiling, but this requires a good profiler that is able to show which deferred statements are executing and when.

I will be very impressed when someone manages to write an article on Haskell that deals with some sort of actual practical example that is accomplished in a smart/intuitive way: show me how to do something with it that does not primarily focus on a long theoretical description of why it is a good idea to use Haskell!

The most common data structure of our time - the most powerful and the most used - the hash table - doesn't fit in the functional programming paradigm. It just can't be represented in functional programming - not without a massive loss of efficiency, to the point of being unusable. What this tells us is that there is a mismatch between the functional programming paradigm and the actual computational substrate we all use.

First, hash tables are not the "most common data structure", arrays are, by an order of magnitude, especially since you build hash tables from arrays.

But what you should be measuring is the most common abstract data type, not the implementation of that type. And that type would be maps (or associative arrays or whatever) and those can be expressed just fine using red-black trees, as is done in functional languages.

And if you're avoiding a functional language because of uncertain efficiency... do you have any idea how inefficient hash tables are? Load a million records using hash tables (assume you've interned the keys), and you're commonly wasting huge amounts of space because all the tables are initialized to hold 16 items. Trees, of course, use only one node if they're empty. Second, the hashing functions tend to be complete crap, so your hash table is either 3/4 full at most, or you may as well have put everything in a list and do a linear search on it.

What this means is that programmers of the functional model of computation need PhDs in functional programming in order to explain why their programs aren't performing as expected. This bears repeating: it takes a PhD-level brain to explain the performance of a lazy functional program. One misplaced comma or ill-defined type can turn your program from O(N) to O(N^2), and normal human beings will have a snowflake's chance in hell of explaining why.

Of course, that's also true with any language that allows concurrent threads of execution. And even with a PhD-level brain, explaining the performance of any non-trivial program is completely impossible, so everyone just uses a profiler.

I will be very impressed when someone manages to write an article on Haskell that deals with some sort of actual practical example that is accomplished in a smart/intuitive way: show me how to do something with it that does not primarily focus on a long theoretical description of why it is a good idea to use Haskell!

Funny, I just wrote a simple but nontrivial program in Haskell to solve a geometric puzzle. I could have written it in Java but it would have been at least 4 times as long. If you're tied to enterprise infrastructure, go with Java/.Net. If you can throw off the shackles of over-engineering, Haskell is a refreshing and productive alternative.

My limited understanding is that lazy evaluation is more like "deferred" evaluation where evaluating everything up-front would be "expensive", so you evaluate on demand.

edit: Based on my understanding of "lazy" from outside of Haskell.

Some constructs look like lazy evaluation but aren't. A Future object, in Java, isn't lazy but asynchronous. You schedule it with an executor and processing starts right away.

a && b, assuming common C notation, is "lazy" in that it will only evaluate b when a is true, but that is really a control flow construct. Any &&, || or ? : expression, after all, can be refactored as an if / else.

An iterator is a genuinely lazy construct. Proxy objects can be lazy constructs, e.g. if I have a class in Python:

What database_fetch can do is rewrite the object (including type, interestingly enough) to be an actual Record.
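A hypothetical reconstruction of such a proxy; `database_fetch` and `Record` here are stand-ins for the poster's real code, and this sketch delegates attribute access rather than rewriting the object's type as described:

```python
class Record:
    # Stand-in for the real record type.
    def __init__(self, key):
        self.key = key
        self.name = f"record-{key}"

def database_fetch(key):
    # Stand-in for a real database lookup.
    return Record(key)

class RecordProxy:
    # Lazy proxy: the database is hit only on first attribute access.
    def __init__(self, key):
        self._key = key
        self._real = None

    def __getattr__(self, name):
        # Called only for attributes not found normally (i.e. not
        # _key or _real), so there is no recursion here.
        if self._real is None:
            self._real = database_fetch(self._key)
        return getattr(self._real, name)

p = RecordProxy(42)
# No fetch has happened yet; touching an attribute triggers it.
name = p.name
```

The poster's version goes further: instead of delegating forever, `database_fetch` swaps the proxy's class for the real `Record` type in place, so subsequent accesses pay no indirection cost.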

A lazy language, however, is one where there's no (logical) distinction between evaluated and unevaluated expressions. So you can write (and this is in the standard library) a function hGetContents that gets an entire file as a string. "The rest of the string that we haven't read yet" is not some special data structure, it's just the tail of the list, as with any list.

Good call. More specifically, in Haskell a program written in the following form is a streaming program:

1. read entire contents of a file into a string
2. split the string into a list of lines
3. transform the lines into a new list via a one-pass algorithm
4. concatenate the transformed list into a string
5. write the entire string to a file
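The same pipeline can be sketched with Python generators, which make the incremental evaluation explicit (upper-casing stands in for the one-pass transform):

```python
import io

def transform(lines):
    # One-pass transformation; each line is processed as it arrives.
    for line in lines:
        yield line.upper()

def stream_copy(src, dst):
    # src is read line by line; nothing is buffered beyond one line,
    # even though the code reads like "load, transform, write".
    for line in transform(src):
        dst.write(line)

src = io.StringIO("hello\nworld\n")
dst = io.StringIO()
stream_copy(src, dst)
```

In Haskell the same streaming behavior falls out of laziness without any explicit generators: the "whole file as a string" is consumed incrementally because it is never demanded all at once.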

When dealing with these pure functional languages, Dan Friedman (of Scheme fame) always maintained "don't debug, rewrite". This is the guy who (literally) wrote the book on Scheme and re-implemented Prolog in Scheme (miniKanren) for fun.

This has always struck me as true with functional languages. Threaded applications sometimes, as well. After all, it's usually a typo that causes the problem, not a logic error.

First, hash tables are not the "most common data structure", arrays are, by an order of magnitude, especially since you build hash tables from arrays.

In a lot of languages, every single object is a hash table.

I'm not going to get into a debate of "what is the most common data structure", because that would need a research study. But they are certainly used all over the place.

scooby509 wrote:

And if you're avoiding a functional language because of uncertain efficiency... do you have any idea how inefficient hash tables are? Load a million records using hash tables (assume you've interned the keys), and you're commonly wasting huge amounts of space because all the tables are initialized to hold 16 items. Trees, of course, use only one node if they're empty. Second, the hashing functions tend to be complete crap, so your hash table is either 3/4 full at most, or you may as well have put everything in a list and do a linear search on it.

Really? If hash tables are so slow, then why does Objective-C use them to associate a method in a class definition to the memory address of the code it will execute? Seems like calling a method on a class would need to be pretty fast - it happily runs millions of times every second even on ARM devices.

Hash tables are plenty fast enough.

Not only that, but compilers do it the same way. They call it a symbol table, but it's the same data structure.

Hash tables are plenty fast and in practice are probably the fastest general purpose lookup tables. Purely functional languages can suffer by (I believe) at most a logarithmic overhead to account for inability to modify state. They sometimes require clever techniques to efficiently update data structures, such as zippers. But in practice it's unlikely to hurt performance much, and Haskell does provide a back door to mutate state if you really need it (thus why there are hash table libraries for Haskell). Choosing a language for this reason is ridiculous unless your problem domain has special requirements.

Really? If hash tables are so slow, then why does Objective-C use them to associate a method in a class definition to the memory address of the code it will execute? Seems like calling a method on a class would need to be pretty fast - it happily runs millions of times every second even on ARM devices.

I didn't say they were slow, I accused them of "uncertain efficiency", which was the criticism of functional languages that I was addressing. Purpose-built implementations of hash tables can be highly tuned: they may be rarely updated, or you know the key distribution, or you can do perfect hashing. In cases where they're used ubiquitously in the language you'll often see there are constraints, e.g. Perl 5 hashes only allow string keys.

But when you're dealing with hash tables as a generic implementation of associative arrays, the downsides include things like periodically recopying the entire table, wasted memory for small tables, etc. The worst problem is that when hash tables are small they really do perform beautifully. Determining what the performance will be when they are large or heavily used requires statistical analysis of key distribution, and you have to amortize the cost of recopying and collision resolution! Be honest, have you ever done that, outside of class?

Look at systems where indexing needs to be used in a concurrent environment with large amounts of data, like DBMSs. Most indexes are B-trees, partly because they can be sorted, but also because they play nice with concurrency: updating won't require holding locks while potentially megabytes of index is recalculated. (The exception is some temporary indexes used for hash-joins, but in those cases the number of buckets can be estimated beforehand.)

I remember that in a report to a conference on 'real world' use of Haskell, there was a complaint that performance was unpredictable due to Haskell's laziness by default. IMHO laziness has interesting properties, but it should only be an option.

I think that an interesting use of functional programming would be in the Big Data space. There are so many questions being asked of massive amounts of data (streaming data may not have even arrived yet) that lazy evaluation would seem to be the way to go.

A very interesting concept - this is the first time I've heard about it. The devil must be in the details. After all, the compiler applies some algorithm to convert this to a sequence of instructions, which cannot be universally good for all problems. Not knowing up front whether things will work well enough does make me uncomfortable, but when they work, that is all that matters.

If anyone knows a couple good references about how some of these compilers work, please post them here.