bug in language definition (strictness)

It has been brought to my attention (as errata editor of the revised
H'98 report) that there is a bug in the language definition,
concerning strictness annotations on datatypes.

In section 4.2.1, the translation of strict components of a data
constructor is defined as
> (\ x1 ... xn -> ( ((K op1 x1) op2 x2) ... ) opn xn)
>
> where opi is the non-strict apply function $ if si is of the form
> ti, and opi is the strict apply function $! (see Section 6.2) if si
> is of the form ! ti. Pattern matching on K is not affected by
> strictness flags.
>

yet, because of the definition of $!, this applies the constructor to
its arguments right-to-left instead of the intuitive left-to-right.
All extant compilers in fact evaluate the strict fields left-to-right
in violation of the Report.
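To make the right-to-left reading concrete, here is a small standalone sketch (mine, not part of the original message). It restates the Report's definition of strict application under the name ($!!) to avoid clashing with the Prelude's ($!), and models a two-field strict constructor with a pair for brevity:

```haskell
-- The H'98 Report (Section 6.2) defines strict application as:
--   f $! x  =  x `seq` f x
-- Renamed ($!!) here so it does not clash with the Prelude's ($!).
infixr 0 $!!
($!!) :: (a -> b) -> a -> b
f $!! x = x `seq` f x

-- For  data T = MkT !Int !Int  (modelled with a pair for brevity), the
-- Section 4.2.1 translation of the constructor reads:
mkT :: Int -> Int -> (Int, Int)
mkT = \x1 x2 -> ((,) $!! x1) $!! x2

-- Expanding with the definition above:
--   ((,) $!! x1) $!! x2
--     =  x2 `seq` (((,) $!! x1) x2)
--     =  x2 `seq` ((x1 `seq` (,) x1) x2)
-- The seq on x2 ends up outermost, which is the right-to-left reading
-- the message complains about; the resulting value is unaffected:
main :: IO ()
main = print (mkT 1 2)  -- (1,2)
```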

The same non-intuitive behaviour can be seen more clearly in the
simple expression:

(f $! x) $! y

in which you might expect x to be evaluated before y, but in fact it
is the other way round. (And here, the compilers do follow the Report.)

The fix I propose for H'98 (and equally for Haskell Prime) is to
change the definition of $! as follows

Re: bug in language definition (strictness)

On 2009-08-06 11:08, Malcolm Wallace wrote:
> yet, because of the definition of $!, this applies the constructor to
> its arguments right-to-left instead of the intuitive left-to-right.

I do not think that there is a bug: x `seq` y `seq` e has the same
denotation as y `seq` x `seq` e.

--
/NAD


Re: bug in language definition (strictness)

On 6 Aug 2009, at 14:37, Nils Anders Danielsson wrote:

> On 2009-08-06 11:08, Malcolm Wallace wrote:
>> yet, because of the definition of $!, this applies the constructor
>> to its arguments right-to-left instead of the intuitive left-to-
>> right.
>
> I do not think that there is a bug: x `seq` y `seq` e has the same
> denotation as y `seq` x `seq` e.

Not if one considers the "kind" of bottom one receives:

undefined `seq` error "it exploded" `seq` e will print "Prelude.undefined"
while
error "it exploded" `seq` undefined `seq` e will print "Error: it exploded"

Re: bug in language definition (strictness)

>
> On 6 Aug 2009, at 14:37, Nils Anders Danielsson wrote:
>
>> On 2009-08-06 11:08, Malcolm Wallace wrote:
>>> yet, because of the definition of $!, this applies the constructor to
>>> its arguments right-to-left instead of the intuitive left-to-right.
>>
>> I do not think that there is a bug: x `seq` y `seq` e has the same
>> denotation as y `seq` x `seq` e.
>
> Not if one considers the "kind" of bottom one receives:
>
> undefined `seq` error "it exploded" `seq` e will print "Prelude.undefined"
> while
> error "it exploded" `seq` undefined `seq` e will print "Error: it exploded"

There's only one kind of bottom in Haskell 98. And even with the
imprecise exceptions extension, both expressions still have the same
denotation - they denote the same set of exceptions, one of which is
non-deterministically picked when the program is run.
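A quick way to see this point (my sketch, using GHC's Control.Exception machinery, which goes beyond plain H'98): both orderings denote bottom, so both raise *some* exception when forced; only the particular message differs, and under the imprecise-exceptions semantics even that choice is not fixed.

```haskell
import Control.Exception (SomeException, evaluate, try)

main :: IO ()
main = do
  r1 <- try (evaluate (undefined `seq` error "it exploded" `seq` ()))
          :: IO (Either SomeException ())
  r2 <- try (evaluate (error "it exploded" `seq` undefined `seq` ()))
          :: IO (Either SomeException ())
  -- Both expressions are bottom, so both attempts fail (Left ...);
  -- which exception each Left carries is the only, non-semantic,
  -- difference between the two orderings.
  print (either (const "bottom") (const "value") r1)
  print (either (const "bottom") (const "value") r2)
```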

Re: bug in language definition (strictness)

On 06/08/2009, at 10:59 PM, Simon Marlow wrote:

> On 06/08/2009 13:49, Thomas Davie wrote:
>> On 6 Aug 2009, at 14:37, Nils Anders Danielsson wrote:
>>
>>> On 2009-08-06 11:08, Malcolm Wallace wrote:
>>>> yet, because of the definition of $!, this applies the
>>>> constructor to
>>>> its arguments right-to-left instead of the intuitive left-to-right.
>>>
>>> I do not think that there is a bug: x `seq` y `seq` e has the same
>>> denotation as y `seq` x `seq` e.
>>
>> Not if one considers the "kind" of bottom one receives:
>>
>> undefined `seq` error "it exploded" `seq` e will print
>> "Prelude.undefined"
>> while
>> error "it exploded" `seq` undefined `seq` e will print "Error: it
>> exploded"
>
> There's only one kind of bottom in Haskell 98. And even with the
> imprecise exceptions extension, both expressions still have the same
> denotation - they denote the same set of exceptions, one of which is
> non-deterministically picked when the program is run.

If the FFI Addendum is considered part of Haskell 98, then we have
unsafePerformIO, and so an appeal to denotational equivalence is not
sufficient. When grafting a pure interface onto a notionally-pure
library (specifically a BDD library), I often used seq to get these
effects buried in "pure" values under control.
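The kind of observability being pointed at here can be sketched as follows (my illustration, not the BDD library; all names are made up). Both effects are forced, but the recorded order is exactly what the Report's seq declines to promise:

```haskell
import Data.IORef (IORef, modifyIORef, newIORef, readIORef)
import System.IO.Unsafe (unsafePerformIO)

-- A hypothetical effect log, in the style of an unsafePerformIO-wrapped
-- foreign library.
events :: IORef [String]
events = unsafePerformIO (newIORef [])
{-# NOINLINE events #-}

effectful :: String -> ()
effectful tag = unsafePerformIO (modifyIORef events (tag :))
{-# NOINLINE effectful #-}

main :: IO ()
main = do
  effectful "x" `seq` effectful "y" `seq` return ()
  seen <- readIORef events
  -- Both effects have run by now, but the Report does not fix which
  -- ran first; print the order your compiler happened to choose.
  print seen
```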

I also think the principle of least surprise is clearly violated here.

Re: bug in language definition (strictness)

On 06/08/2009 14:20, Peter Gammie wrote:

> On 06/08/2009, at 10:59 PM, Simon Marlow wrote:
>> On 06/08/2009 13:49, Thomas Davie wrote:
>>> On 6 Aug 2009, at 14:37, Nils Anders Danielsson wrote:
>>>
>>>> On 2009-08-06 11:08, Malcolm Wallace wrote:
>>>>> yet, because of the definition of $!, this applies the constructor to
>>>>> its arguments right-to-left instead of the intuitive left-to-right.
>>>>
>>>> I do not think that there is a bug: x `seq` y `seq` e has the same
>>>> denotation as y `seq` x `seq` e.
>>>
>>> Not if one considers the "kind" of bottom one receives:
>>>
>>> undefined `seq` error "it exploded" `seq` e will print
>>> "Prelude.undefined"
>>> while
>>> error "it exploded" `seq` undefined `seq` e will print "Error: it
>>> exploded"
>>
>> There's only one kind of bottom in Haskell 98. And even with the
>> imprecise exceptions extension, both expressions still have the same
>> denotation - they denote the same set of exceptions, one of which is
>> non-deterministically picked when the program is run.
>
> If the FFI Addendum is considered part of Haskell 98, then we have
> unsafePerformIO, and so an appeal to denotational equivalence is not
> sufficient. When grafting a pure interface onto a notionally-pure
> library (specifically a BDD library), I often used seq to get these
> effects buried in "pure" values under control.

That sounds like a very dangerous use of seq and unsafePerformIO to me!

The presence of unsafePerformIO doesn't change the meaning of the rest
of Haskell. You can use it to write programs that don't behave
according to the denotational semantics if you want, but if you do that
it's considered an unsafe use of unsafePerformIO.

What semantics would you like Haskell to have, in which (x `seq` y `seq`
e) and (y `seq` x `seq` e) are not equal?

> I also think the principle of least surprise is clearly violated here.

I do have some sympathy with that. The fact that we're having this
discussion is evidence that something is wrong - and indeed it took
quite a while before we noticed that seq doesn't actually enforce a
sequential evaluation order. And seq was originally introduced to fix
space leaks, but sometimes it can't be used for this because it doesn't
provide enough operational guarantees (I haven't seen such cases myself,
but Malcolm W tells me it really happens).

I'm against the change originally proposed in this thread, because on
its own it doesn't make any difference. But by all means let's think
about ways to restore the lack of surprise, but which hopefully don't
curtail the compiler's ability to optimise, or run programs in parallel.

Re: bug in language definition (strictness)

> What semantics would you like Haskell to have, in which (x `seq` y
> `seq` e) and (y `seq` x `seq` e) are not equal?

I can easily imagine that (x `seq` y `seq` e) might have *two*
semantic denotations: bottom (Exception: stack overflow), and e. And
I would like to be able to choose which one I get (please). This is
the declared purpose of seq, namely "to improve performance by
avoiding unneeded laziness".

Re: bug in language definition (strictness)

On Thu, Aug 06, 2009 at 03:33:40PM +0100, Malcolm Wallace wrote:
> >What semantics would you like Haskell to have, in which (x `seq` y
> >`seq` e) and (y `seq` x `seq` e) are not equal?
>
> I can easily imagine that (x `seq` y `seq` e) might have *two*
> semantic denotations: bottom (Exception: stack overflow), and e.
> And I would like to be able to choose which one I get (please).
> This is the declared purpose of seq, namely "to improve performance
> by avoiding unneeded laziness".

There is no stack overflow in the denotational semantics. Either you
would get an answer with a bigger stack (and that's the denotation),
or you wouldn't (in which case the value is bottom). As Simon said,
the denotational semantics of seq is very limited for specifying
performance.
_______________________________________________
Haskell-prime mailing list
[hidden email]
http://www.haskell.org/mailman/listinfo/haskell-prime

Re: bug in language definition (strictness)

On Thu, Aug 6, 2009 at 11:00 AM, Ross Paterson <[hidden email]> wrote:
> There is no stack overflow in the denotational semantics.

There is currently no denotational semantics simulator with an
infinite stack. Until there is, Haskell programs will have to be
executed on lowly physical computers, with all the limitations that
implies. Can we please make things work well on them?

Re: bug in language definition (strictness)

On 6 Aug 2009, at 17:34, Mathias Stearn wrote:

> On Thu, Aug 6, 2009 at 11:00 AM, Ross Paterson<[hidden email]>
> wrote:
>> There is no stack overflow in the denotational semantics.
>
> There is currently no denotational semantics simulator with an
> infinite stack. Until there is, Haskell programs will have to be
> executed on lowly physical computers, with all the limitations that
> implies. Can we please make things work well on them?

When writing a document that defines the denotation of values, no.
Perhaps when considering operational semantics, but that's an entirely
different matter.

Re: bug in language definition (strictness)

On 06/08/2009 15:33, Malcolm Wallace wrote:
>> What semantics would you like Haskell to have, in which (x `seq` y
>> `seq` e) and (y `seq` x `seq` e) are not equal?
>
> I can easily imagine that (x `seq` y `seq` e) might have *two* semantic
> denotations: bottom (Exception: stack overflow), and e. And I would like
> to be able to choose which one I get (please). This is the declared
> purpose of seq, namely "to improve performance by avoiding unneeded
> laziness".

I'm afraid I don't really comprehend what you're getting at. What do
you mean by an expression having two semantic denotations, and how would
you like to choose which one you get? And I'm baffled by the mention of
stack overflow, where does that come in?

Re: bug in language definition (strictness)

>
>>> What semantics would you like Haskell to have, in which (x `seq` y
>>> `seq` e) and (y `seq` x `seq` e) are not equal?
>>
>> I can easily imagine that (x `seq` y `seq` e) might have *two*
>> semantic
>> denotations: bottom (Exception: stack overflow), and e. And I would
>> like
>> to be able to choose which one I get (please). This is the declared
>> purpose of seq, namely "to improve performance by avoiding unneeded
>> laziness".
>
> I'm afraid I don't really comprehend what you're getting at. What
> do you mean by an expression having two semantic denotations, and
> how would you like to choose which one you get? And I'm baffled by
> the mention of stack overflow, where does that come in?

Whether it is a stack overflow, or some other kind of resource-related
exception like out-of-memory, or too-many-open-file-handles, does not
really matter. As you were saying, semantically they are all just
bottom.

My real point is that seq looks for all the world like it is intended
to affect the operational behaviour of a program, yet some people
insist that it is a purely denotational device, and that operational
questions are not relevant. The Report hints that using seq can
"improve performance" (although the opposite is also true), and
changes in performance are certainly an operational notion. Yet I
think it would be valid to say that seq can turn a non-terminating
(exceptioning) program into a terminating one.

Re: bug in language definition (strictness)

Simon Marlow wrote:
> I'm against the change originally proposed in this thread, because on
> its own it doesn't make any difference.

I wonder whether, in practice, if we modified the definition of $! in the
GHC libraries, GHC would compile things differently enough that we would
notice the difference. Probably not often enough to really notice...

Re: bug in language definition (strictness)

> On 06/08/2009 14:20, Peter Gammie wrote:
>> On 06/08/2009, at 10:59 PM, Simon Marlow wrote:
>>> On 06/08/2009 13:49, Thomas Davie wrote:
>>>> On 6 Aug 2009, at 14:37, Nils Anders Danielsson wrote:
>>>>
>>>>> On 2009-08-06 11:08, Malcolm Wallace wrote:
>>>>>> yet, because of the definition of $!, this applies the
>>>>>> constructor to
>>>>>> its arguments right-to-left instead of the intuitive left-to-
>>>>>> right.
>>>>>
>>>>> I do not think that there is a bug: x `seq` y `seq` e has the same
>>>>> denotation as y `seq` x `seq` e.
>>>>
>>>> Not if one considers the "kind" of bottom one receives:
>>>>
>>>> undefined `seq` error "it exploded" `seq` e will print
>>>> "Prelude.undefined"
>>>> while
>>>> error "it exploded" `seq` undefined `seq` e will print "Error: it
>>>> exploded"
>>>
>>> There's only one kind of bottom in Haskell 98. And even with the
>>> imprecise exceptions extension, both expressions still have the same
>>> denotation - they denote the same set of exceptions, one of which is
>>> non-deterministically picked when the program is run.
>>
>> If the FFI Addendum is considered part of Haskell 98, then we have
>> unsafePerformIO, and so an appeal to denotational equivalence is not
>> sufficient. When grafting a pure interface onto a notionally-pure
>> library (specifically a BDD library), I often used seq to get these
>> effects buried in "pure" values under control.
>
> That sounds like a very dangerous use of seq and unsafePerformIO to
> me!

Without the seq, a recursive use of a quantifier will screw up the
effect of bdd_assoc. How is this unsafe?
IMHO I'm using seq in the standard way, to shuffle the data
dependencies to get better (correct) operational behaviour.

I grant that it is dodgy in a concurrent setting, but the library
itself is not thread safe.

> What semantics would you like Haskell to have, in which (x `seq` y
> `seq` e) and (y `seq` x `seq` e) are not equal?

As Malcolm observes, an operational one. seq is difficult to motivate
without such considerations anyway. I'm sure you could imagine an
example where space use differs between the two.

My point was that in the presence of unsafePerformIO, even if the
denotational values are the same for the two seq variants, the end-
user observable effects are not. I meant this as an example of
violating least-surprise in a context a bit larger than pure Haskell
semantics.

BTW I have no opinion on the change Malcolm proposes, but would like
to see the issue resolved.

Re: bug in language definition (strictness)

>>
>>>> What semantics would you like Haskell to have, in which (x `seq` y
>>>> `seq` e) and (y `seq` x `seq` e) are not equal?
>>>
>>> I can easily imagine that (x `seq` y `seq` e) might have *two* semantic
>>> denotations: bottom (Exception: stack overflow), and e. And I would like
>>> to be able to choose which one I get (please). This is the declared
>>> purpose of seq, namely "to improve performance by avoiding unneeded
>>> laziness".
>>
>> I'm afraid I don't really comprehend what you're getting at. What do
>> you mean by an expression having two semantic denotations, and how
>> would you like to choose which one you get? And I'm baffled by the
>> mention of stack overflow, where does that come in?
>
> Whether it is a stack overflow, or some other kind of resource-related
> exception like out-of-memory, or too-many-open-file-handles, does not
> really matter. As you were saying, semantically they are all just bottom.
>
> My real point is that seq looks for all the world like it is intended to
> affect the operational behaviour of a program, yet some people insist
> that it is a purely denotational device, and that operational questions
> are not relevant.

The fact remains that seq *is* defined denotationally, and any
implementation that respects its semantics is legal. If you want to
change that, you need to make a Haskell Prime proposal.

I think it might be difficult to do that. The Haskell report has no
framework for talking about operational semantics at all. The pure
subset of Haskell is timeless, there's no specified evaluation order.
Before we think about changing that, let's remember why it's that way:
one reason is that evaluation order doesn't affect the denotational
semantics, so it's unnecessary. Another reason is that it lets Haskell
admit many different implementations, including things like automatic
speculative parallelism. If you start nailing down evaluation orders,
you rule out interesting implementations. (this is why I've complained
about lazy I/O in the past - it starts to creep in this direction).

There's nothing stopping you from making a compiler in which seq has the
behaviour you want. Indeed, Control.Parallel.pseq is the kind of seq
you want (I think - we haven't put any thought into precisely specifying
what it means). pseq is a valid implementation of seq, it's just not
the one we normally want to use, because it restricts the compiler's
ability to optimise. And pseq isn't something we want to mandate in the
language, because it only makes sense in certain kinds of
implementations. I'm completely happy with asking users to use pseq
when they really want an evaluation order guarantee.
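For reference, the distinction in GHC terms can be sketched like this (my sketch; pseq is exported from GHC.Conc and Control.Parallel, and, as noted above, its ordering guarantee has not been precisely specified):

```haskell
import GHC.Conc (pseq)

-- seq  :: a -> b -> b  -- denotational: bottom if the first argument is
--                      -- bottom; no promise about which is forced first.
-- pseq :: a -> b -> b  -- same denotation, but GHC additionally intends
--                      -- the first argument to be evaluated first.

-- Force x before y, then combine; with plain seq the compiler could
-- legally force y first.
orderly :: Int -> Int -> Int
orderly x y = x `pseq` (y `seq` x + y)

main :: IO ()
main = print (orderly 1 2)  -- 3
```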

> Yet I think it would be
> valid to say that seq can turn a non-terminating (exceptioning) program
> into a terminating one.

Re: bug in language definition (strictness)

> On 07/08/2009, at 12:00 AM, Simon Marlow wrote:
>
>> On 06/08/2009 14:20, Peter Gammie wrote:
>>> On 06/08/2009, at 10:59 PM, Simon Marlow wrote:
>>>> On 06/08/2009 13:49, Thomas Davie wrote:
>>>>> On 6 Aug 2009, at 14:37, Nils Anders Danielsson wrote:
>>>>>
>>>>>> On 2009-08-06 11:08, Malcolm Wallace wrote:
>>>>>>> yet, because of the definition of $!, this applies the
>>>>>>> constructor to
>>>>>>> its arguments right-to-left instead of the intuitive left-to-right.
>>>>>>
>>>>>> I do not think that there is a bug: x `seq` y `seq` e has the same
>>>>>> denotation as y `seq` x `seq` e.
>>>>>
>>>>> Not if one considers the "kind" of bottom one receives:
>>>>>
>>>>> undefined `seq` error "it exploded" `seq` e will print
>>>>> "Prelude.undefined"
>>>>> while
>>>>> error "it exploded" `seq` undefined `seq` e will print "Error: it
>>>>> exploded"
>>>>
>>>> There's only one kind of bottom in Haskell 98. And even with the
>>>> imprecise exceptions extension, both expressions still have the same
>>>> denotation - they denote the same set of exceptions, one of which is
>>>> non-deterministically picked when the program is run.
>>>
>>> If the FFI Addendum is considered part of Haskell 98, then we have
>>> unsafePerformIO, and so an appeal to denotational equivalence is not
>>> sufficient. When grafting a pure interface onto a notionally-pure
>>> library (specifically a BDD library), I often used seq to get these
>>> effects buried in "pure" values under control.
>>
>> That sounds like a very dangerous use of seq and unsafePerformIO to me!
>
> How so? Take this code:
>
> newtype BDD = BDD (ForeignPtr Int)
>
> exists :: Group BDD -> BDD -> BDD
> exists group bdd = bdd `seq` unsafePerformIO $
>   withGroup group $ \gid ->
>     do bdd_assoc bdd_manager gid
>        withBDD bdd ({#call unsafe bdd_exists#} bdd_manager) >>= addBDDfinalizer
>
> Without the seq, a recursive use of a quantifier will screw up the
> effect of bdd_assoc. How is this unsafe?
> IMHO I'm using seq in the standard way, to shuffle the data dependencies
> to get better (correct) operational behaviour.
>
> I grant that it is dodgy in a concurrent setting, but the library itself
> is not thread safe.

If, as I understand it, you are relying on the fact that seq's first
argument is evaluated before its second, then you really want pseq rather
than seq.

This is the sense in which I mean it's a dangerous use of seq and
unsafePerformIO - because the compiler is within its rights to evaluate
the second argument to seq "first".

In GHC we provide a way to do what you want (pseq), I'm just not
convinced it should be the required behaviour of seq.

Re: bug in language definition (strictness)

On 07/08/2009, at 6:04 PM, Simon Marlow wrote:

> If, as I understand it, you are relying on the fact that seq's first
> argument is evaluated before its second, then you really want pseq
> rather than seq.
>
> This is the sense in which I mean it's a dangerous use of seq and
> unsafePerformIO - because the compiler is within its rights to
> evaluate the second argument to seq "first".
>
> In GHC we provide a way to do what you want (pseq), I'm just not
> convinced it should be the required behaviour of seq.

OK, thanks for pointing that out. I would have thought that what I was
doing (or trying to do) was a reasonable use of the FFI and
unsafePerformIO - from the library user's POV the interface really is
pure, at least in a sequential setting. So should there be a mechanism
for guaranteeing left-to-right evaluation ala pseq in Haskell Prime?

Re: bug in language definition (strictness)

The only difference between foldl and foldl' is strictness (the use of
seq). By "non-terminating (exceptioning)" I of course really meant
"terminating with bottom" as opposed to "terminating with a defined
value", but since non-termination and exceptions are semantically both
bottom, you won't mind that slip. :-)
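The foldl/foldl' contrast can be sketched like so (standard library functions; whether the lazy version actually overflows in practice depends on list length and stack size):

```haskell
import Data.List (foldl')

-- foldl builds a chain of suspended (+) applications; on a long enough
-- list, forcing the result can overflow the stack. foldl' seq's the
-- accumulator at every step, so it runs in constant stack space.
sumLazy, sumStrict :: [Int] -> Int
sumLazy   = foldl  (+) 0
sumStrict = foldl' (+) 0

main :: IO ()
main = print (sumStrict [1 .. 1000000])  -- 500000500000
```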

Re: bug in language definition (strictness)

> If, as I understand it, you are relying on the fact that seq's first
> argument is evaluated before its second, then you really want pseq
> rather than seq.
>
> In GHC we provide a way to do what you want (pseq), I'm just not
> convinced it should be the required behaviour of seq.

Whilst looking for something else, I have discovered that the official
list of changes from Haskell 1.2 to Haskell 1.3 has this explicit
statement:

"The seq function evaluates its first argument before returning the
second one."
<http://haskell.org/definition/from12to13.html>

In addition, the 1.3 and 1.4 Reports, in describing how seq should be
understood for any particular type, state:

"The _case_ is used to force evaluation of the first argument to `seq`
before returning the second argument."

The Haskell'98 committee decided to remove the Eval class (that had seq
as one of its methods), see
<http://www.cs.chalmers.se/~rjmh/Haskell/Messages/Decision.cgi?id=204>

I assume that it was simply inattention whilst editing the Report
text, that caused the removal of the official requirement that seq
evaluate its first argument before its second, because I can find no
discussion by the committee that suggested such an important change.

If you can point me to the committee discussion that led to changing
this requirement, I would be grateful.

Re: bug in language definition (strictness)

I wonder if this discussion has veered too far into legalistic
reasoning (of various sorts). From where I'm standing the
state-of-play is this:

1. Compiler writers (and some users) want a "liberal" version of seq
(that's slippery about evaluation order) for optimizations and better
correspondence with the denotational semantics.
2. Other users want a version of seq that guarantees evaluation order
for use cases where that matters.

Is there any deep reason why we don't just figure out what the best
way to give people both versions of seq is and do that?

>> If, as I understand it, you are relying on the fact that seq's first
>> argument is evaluated before its second, then you really want pseq rather
>> than seq.
>>
>> In GHC we provide a way to do what you want (pseq), I'm just not convinced
>> it should be the required behaviour of seq.
>
> Whilst looking for something else, I have discovered that the official list
> of changes from Haskell 1.2 to Haskell 1.3 has this explicit statement:
>
> "The seq function evaluates its first argument before returning the second
> one."
> <http://haskell.org/definition/from12to13.html>
>
> In addition, the 1.3 and 1.4 Reports, in describing how seq should be
> understood for any particular type, state:
>
> "The _case_ is used to force evaluation of the first argument to `seq`
> before returning the second argument."
>
> The Haskell'98 committee decided to remove the Eval class (that had seq as
> one of its methods), see
> <http://www.cs.chalmers.se/~rjmh/Haskell/Messages/Decision.cgi?id=204>
> I assume that it was simply inattention whilst editing the Report text, that
> caused the removal of the official requirement that seq evaluate its first
> argument before its second, because I can find no discussion by the
> committee that suggested such an important change.
>
> If you can point me to the committee discussion that led to changing this
> requirement, I would be grateful.
>
> Regards,
> Malcolm
>