Monday, December 5, 2011

2 Misconcepts About Functional Programming (relating to context and monad)

It's interesting that almost every old and new language that claims to
have some functional support has to go through the trial of its purity,
like Perl, Scala,
Erlang,
Ruby,
JavaScript and Python.
Candle is no
exception. Candle's last release announcement to
LtU triggered some interesting discussion about whether the functional
features in Candle are pure. I finally have sometime to share more
about my thoughts on this issue. I'd like to talk about 2 misconcepts
about functional programming. The first misconcept is that dynamic scope variables are not functional.
The second misconcept is that monads
are functional. I know the following writing is going to
kick up the dust, but it is better than we settle without thorough
thinking.
Of course, to continue the discussion, we must first define the judging
criteria(s) to tell whether a programming feature is functional or not.
I think everyone would agree the single most important criteria is referential transparency. Other
salient features, like function
composition, lazy evaluation and anonymous
functions, are more prominent in functional programming, but they
are not exclusive to just functional programming; imperative
programming can sometimes exhibit or exploit these features as well.

1st Misconcept: Dynamic Scoped Variables are not Functional

This question arise due to the discussion whether the put statement in
Candle is functional or not. Yes, by strict definition, dynamic
scoped variables are not referentially transparent. But I'd argue
it is still functional. I'd like to introduce a new concept here,
so as to extend the scope of functional programming, a concept called contextually functional.

In other words, features in a functional programming language can be
divided into two categories: purely
or strictly functional and contextually
functional. Strictly functional features are those obey the
referential transparency rule. And a contextually
functionalfeature is a
feature that follows these rules:

Some API function or language construct is defined to allow user
to retrieve information about the context at certain place in a
program.

Some corresponding API function or language construct is defined
to allow user to push value unto the context, which can be retrieved
later by the previous API function or language construct. The
pushed value must be hold constant
when it is in scope. When the
pushed value goes out of scope, they are popped out of the stack
automatically.

The pushed value can shadow or
occlude previous values of same
identity. This is very important. Certain features like global
variables, current date and time function (like in XQuery), data input
functions are contextual information, but as they cannot be occluded,
they can still be classified as strictly functional.

Some contextual functional features that I can think of are:

Dynamical scoped variable,
as in Common Lisp, Candle and XSLT (which is named as tunneled
parameters);

The context item in
XPath/XQuery and current item
in XSLT. Most people take for granted that XSLT and XQuery are purely
functional. They are not.

Closure.

'this' and 'super' accessors in functional OOP.

I won't argue that contextual functional features are RT (referentially
transparent). As pointed out in the LtU
discussion, the Wikipedia article
is wrong or inaccurate in classifying dynamic scoped variable as RT.
But I'd argue that they are still functional.
The justification I can put forward is that a program
containing contextual functional features can be transformed into
a program containing just pure RT functions. The brute force way of
transformation is to pass all context information as parameters to all
functions in the program.

As can be seen from the LtU discussion, some people might not accept
that as valid proof. They would argue that even procedural program can
be transformed into functional program. But that's a misconception,
which I'll elaborate later. The truth is that a procedural program with
external side effects cannot be transformed into pure functional program. By this
proof, I believe, Candle's put statement, even in its earlier version put
expr = value { ... } is still functional. In the latest release,
I change the put statement to put qname = value { ... },
not because I think the earlier version would render Candle impure, but
because I think that the later version is more friendly to static
analysis.

But I do acknowledge the concern that contextual functional features,
if misused, can make a program look less functional. For example, say
if a language goes to the extreme to allow any addressable data to be
occluded, then a programmer can essentially program in a procedural
style by occluding any data when he wants an assignment and passing
on the continuation.

But when contextual functional features are carefully defined, we can
come up with idiomatic usages, and our programs will not look any less
functional then pure RT programs. Just look at XQuery and
XSLT. We accept them as 'pure' functional programming languages without
second thoughts.

Programming is both science and engineering. When used in science,
which often deals with complicated problems, it is often enough to just
use RT functions. But when used in engineering and real life
programming, which often deals with complex problems, there is often a
lot of context information. In such application system, it is often
cumbersome to pass all necessary contextual information to a function.
Contextual functional features are very important in making these large
and complex application systems more modular and maintainable. The
price that we have to pay is that contextual functions are harder to
analyze and process than pure functions, for example, in caching the
output of a function.

As programming evolves, I'm sure we'll see more contextual functional
features. CSS (cascading style sheet) and SVG drawing context are good
examples of functional contextual dependency in complex application
modeling. And we can come up with new idiomatic usages when we
selectively allow some context information to be shadowed. For example,
if we allow the current date-time to be shadowed, then we can emulate a
past or future time, which can be very useful in automatic testing. If
we allow the OS specific information to be shadowed, then we may
emulate a UNIX environment on a Windows platform.

If we restrict functional programming to just mean RT programming, then
we lose all those rich contextual programming features.

2nd Misconcept: Monads are Functional

Monad is probably one of the mostly misunderstood programming feature.
I myself have attempted very hard to understand it by carefully
studying different tutorials for over 10 times at least, and I think I
still just have elementary understanding of it.

Most people think monad is a 'functional' programming feature. But this
really depends on how we define functional.
If it is defined to be RT, as above, then I claim that monads are not
functional when they are used to allow functional programming languages
to carry out side effects.

Monad itself is just a way of computation. It transform a program into
another program. Monad itself does not introduce additional
side-effects. But if the result program is procedural, I cannot accept
that monad used in such way can be called functional.

Monad doesn't have the magic power of transforming a procedural program
into a purely functional program, as some might have mis-believed. This
misconcept might be related to the misconcepts that "all programs (whether functional or
procedural) are essentially the same" based on the Church-Turing
thesis (for example in this LtU post). But this is
wrong interpretation of the thesis. I don't think the thesis is
actually of any relevance in judging functional programming, because
the effective method in the
thesis can be functional or non-functional (although the descriptions
sound like functional computation). So we can put the thesis aside.

Here, I'll put it very clearly that pure
functional program is distinctively different from procedural program.
The fundamental assumption of pure functional program is that the
inputs to a function must not change during the evaluation of that
function. If the inputs might change, then it loses RT and all
other fancy functional features. One obvious derivation of the
assumption is that the output of a
pure function must not overlap its input. Thus a pure function
can never to used to directly perform the external side-effects.

A pure function can be used to compute the desired side-effects,
but that is just a pending list. The actual side-effects have to be
carried out by a procedural program. One good example to understand
this is the XQuery Update Facility. This extension to XQuery only
generates pending update lists. And the lists can be further processed
(merged, validated) in a functional manner. The final 'blast' is left
to the underlying engine. So XQuery Update can be considered purely
functional, but the final 'blast' is not.

If a pure functional program can never carried out effective side
effects, then a true procedural program (which performs side effects)
can never be transformed into a pure functional program. Of course,
there are many procedural programs out there that can be completely
replaced by pure functional program, for example those programs that
perform static analysis, report generation and program compilation. But
it doesn't mean all procedural programs can be replaced by pure
functional programs.

This misconcept is also probably due to the way monad is advocated in
Haskell.
People advocate Haskell as a pure functional
language even when monads are used to perform side effects, which I don't accept.
I think the classification of functional languages into pure and impure
(e.g. in this wiki
article) is simply wrong.
You can say XQuery or XSLT are purely
functional, but if you say Haskell is still purely functional when it
is used to perform side effects, that is simply cheating. When you
start mixing black with white, you get all shades of gray; you can't
classify a gray color as white, no matter how light it is.

Furthermore, I don't think Haskell's monads when used to perform side
effects are in any way superior than other ways of mixing functional
and procedural programming. The model to carry out side-effects by
monads is different from XQuery Update. XQuery Update only produces
pending update lists, and the update commands in these PULs are
unordered, and the PULs are mergable. But the side-effects performed by
monads are threaded just like procedural program. They can't be
reordered or merged. When you use monads to perform side effects, you
are just writing a so called 'functional' program to generate a
traditional procedural program. If that can be called 'pure' functional
programming, then any
C program can be called pure functional as well.

I don't doubt
monads usefulness. I think it is a powerful meta-programming mechanism.
But when monads are use to mimic every ways of procedural programming,
like I/O monad, state monad, async monad, concurrent monad, I don't see
any extra values that they bring into programming than the procedural
counterparts. They are more like the
Emperor's New Clothes, sophisticated enough that whoever don't praise
them looks like a fool.

In the next blog article, I'll talk about the procedural design in
Candle. The mechanisms in Candle to allow procedural programming to be
cleanly mixed with functional programming.

(Update: I've posted a new blog article in response to the comments and feedback.)

7 comments:

I can understand your reasoning here, and agree somewhat with *what* you say (the ideas) but I disagree with *how* you have said it (the terminology). Normally I wouldn't pick up on problems with terminology, but since that's part of the reason for the blog post I think it's justified.

Dynamic scope: My problem here is that it seems you're moving the goalposts of "functional programming" such that it includes Candle's non-functional aspects. I think that it would be simpler just to state that Candle does contain some non-functional aspects, but these are invaluable when programming real-world systems. It doesn't matter if a language is or is not purely functional; it matters whether it is good at its job. If these features make Candle better at its job then they are good features to have, but that still doesn't make them (purely) functional.

Monads: There's more to monads than IO! Your argument is sort-of valid when it comes to (in the case of Haskell) a "main" function in the IO monad, but that's only because *in that particular case* there is an implied expectation that it will be run by a procedural program, as you say. However the actual *concept* of monads makes has no such expectation, and it's perfectly possible to use all kinds of monads in a purely functional way. For example, a library which makes use of the Maybe monad is still purely functional (libraries don't have a main function). There is an argument against this, namely that libraries only matter in the context of programs (main functions), but that would also apply to every other pure function and there would be no reason to single out monads over, say, addition.

I explained at the above link, that I think you (and others) are conflating imperative vs. declarative with functional programming.

Afaics, the axis of your taxonomy is incorrect. Declarative vs. imperative is only concerned with the immutability of stored values, or more generally the referential transparency (RT) of the functions (sub-expressions) invoked in the declarative expression.

As the link below confirms, if all the expressions inside a function are declarative, then the containing function is RT. But it is also possible for the containing function to RT, even if the contained expressions are not declarative and only mutate _LOCAL_ state.

I also explained at the first link above, that monadic composition is _always_ functional and declarative, although the IO monad is arguably not w.r.t. state external to the program (i.e. the World).

Ehud Lamm censored me from LtU so I can't post there, but the following comment is incorrect. The "within the monad" is declarative RT functional composition. Although the order of execution is sequential because the functions are nested, this does remove the RT in any way at any scope.

The author of the comment linked below (dmbarbour) is conflating how in C the expressions are not usually RT w.r.t. to the state in the program.

It is wrong to define imperative as "order of execution dependent" and it is wrong to conflate RT with imperative. (Computer) Science requires we have precise definitions and do not conflate the terms.

I'm not trying to move goalposts of "functional programming" just for Candle, but to help clarify what is "functional programming". I found it funny that people accept languages like XQuery/XSLT as pure functional language, but don't accept dynamic scope variable as a functional feature. That's simply self-contradictory.

Regarding the concept of 'contextual functional', I think it is more than just an idea or terminology, there can be a formal mathematical model or type theory behind it. However, I'm not an academic guru to formulate it. It can help classify a set of programming features that are not RT, but also distinctively different from procedural, destructive state-mutating programming features.

If we classify these non-RT features as 'contextual procedural features', I think that would be even less appropriate and against most people's intuitive feeling.

shelby wrote:>> "The author of the comment linked below>> (dmbarbour) is conflating how in C the>> expressions are not usually RT w.r.t. to the>> state in the program.">> And the key point is that when C expression> are not RT, then it is impossible to> encapsulate them in RT functions in the> 'environment passing style'. So that is why> dmbarbour's analogy is not valid.

The difference is that each C statement mutates the global state, whereas "inside" the state monad each corresponding statement generates a _COPY_ of the state (so modified). The latter is RT-- former is not. Monadic composition is always RT (referentially transparent).