However we know that really the parameters to this continuation could
really be unboxed since we construct all these continuations explicitly
in this module. We could re-write the monad type by manually explicitly
unboxing the Buffer argument:

Note that we could still get a continuation in from the 'outside' with
the original type and convert it to one with the above type, though
obviously that involves unpacking the arguments. This unpacking is
basically what the wrapper of a worker/wrapper pair does of course.

Obviously this is ugly. We'd much rather write the original and with
just a couple annotations get the above representation. This is what I
would like to write:

So I'm declaring a type and a data constructor that contains a function
that is strict in one of it's arguments.

I do not wish to distinguish the strictness in the type however, that is
it should be perfectly ok to do this:

foo :: Foo -> Buffer -> [B.ByteString]

Put foo newBuffer :: Put Foo

Suppose that in this case we do not know locally that foo is strict and
can take it's args unboxed then upon applying the Put constructor we
just have to apply a wrapper function that boxes up the args and
supplies them to the wrapped function.

Obviously that'd be a pessimisation, having to re-box args, however we
can expect programmers to only use that UNPACK pragma when they know
that it is going to be a win overall.

So the ! on the arg is a semantic change and the pragma is a
representation change but not a semantic change. The ! means the obvious
thing, that the function is strict in that arg. So either the caller or
calle must ensure the arg is evaluated to WHNF.

Hmm, is the UNPACK actually needed then? Normally when ghc determines
that a func is strict in an arg it make the worker unbox that arg if
possible and that always seems to be a win. Mine you, in that case we
know what the function is going to do with each arg, where as here we do
not so perhaps a separate UNPACK makes sense.

This issues crops up quite a bit with Monads I think. For example GHC
defines it's IO monad to return an unboxed tuple:

newtype IO a = IO (State# RealWorld -> (# State# RealWorld, a #))

This is fine for low level internal code, but for everyday monads it'd
be nice to be able to write nice code and still get great performance.
This applies to monads particularly since they very often fully
encapsulate the function used as the internal representation, that is
all the sites where the representation function is constructed and taken
apart are in a single module. So we can 'see' that all uses are strict
in some arg. Or if they're not we might want them to be.

So instead of doing an analysis to figure out if we're always using it
strictly (which is something that jhc/grin might be able to do?) we can
just declare the data representation to be strict like we do with
ordinary algebraic data types, that way it propagates strictness to the
usage sites which is much more convenient than going around strictifying
all the usage sites in an attempt to make an analysis notice that it's
safe to do a representation change.

RE: More speed please!

Can you give me a small program that demonstrates the issue? (With any support modules it needs, but the less the better.)

Thanks

S

| -----Original Message-----
| From: [hidden email] [mailto:[hidden email]] On
| Behalf Of Duncan Coutts
| Sent: 16 March 2007 00:56
| To: [hidden email]| Subject: More speed please!
|
| All,
|
| So here's the Put monad for the binary serialisation stuff:
|
| newtype Put a = Put {
| runPut :: (a -> Buffer -> [B.ByteString])
| -> Buffer -> [B.ByteString]
| }
|
| data Buffer = Buffer {-# UNPACK #-} !(ForeignPtr Word8)
| {-# UNPACK #-} !Int -- offset
| {-# UNPACK #-} !Int -- used bytes
| {-# UNPACK #-} !Int -- length left
|
| This is all good, and pretty quick. Code like the following gets turned
| into great low level code:
|
| foo :: Word8 -> Put ()
| foo !n = do
| word8 n
| word8 (n+1)
| word8 (n+17)
|
| It does a straight line sequence of writes into memory (there are rules
| that combine the bounds checking of adjacent writes). After that it
| calls the continuation with a new buffer.
|
| While almost everything is unboxed, the parameters to the continuation
| are not unboxed. So we have to allocate a new Buffer structure and pass
| that to the continuation:
|
| let {
| sat_s1F8 =
| NO_CCS PutMonad.Buffer! [ww1_s1Es
| ww2_s1Et
| ww3_s1Ew
| sat_s1F4
| sat_s1F6];
| } in
| w_s1DY
| GHC.Base.()
| sat_s1F8;
|
| w_s1DY being the continuation here.
|
| However we know that really the parameters to this continuation could
| really be unboxed since we construct all these continuations explicitly
| in this module. We could re-write the monad type by manually explicitly
| unboxing the Buffer argument:
|
| newtype Put a = Put {
| runPut :: (a -> Addr# -> ForeignPtrContents -> Int# -> Int# -> Int# -> [B.ByteString])
| -> Addr# -> ForeignPtrContents -> Int# -> Int# -> Int# -> [B.ByteString]
| }
|
| Then we'd get no allocations and no heap checks in the fast path.
|
| Note that we could still get a continuation in from the 'outside' with
| the original type and convert it to one with the above type, though
| obviously that involves unpacking the arguments. This unpacking is
| basically what the wrapper of a worker/wrapper pair does of course.
|
| Obviously this is ugly. We'd much rather write the original and with
| just a couple annotations get the above representation. This is what I
| would like to write:
|
| newtype Put a = Put {
| runPut :: (a -> {-# UNPACK #-} !Buffer -> [B.ByteString])
| -> {-# UNPACK #-} !Buffer -> [B.ByteString]
| }
|
| So I'm declaring a type and a data constructor that contains a function
| that is strict in one of it's arguments.
|
| I do not wish to distinguish the strictness in the type however, that is
| it should be perfectly ok to do this:
|
| foo :: Foo -> Buffer -> [B.ByteString]
|
| Put foo newBuffer :: Put Foo
|
| Suppose that in this case we do not know locally that foo is strict and
| can take it's args unboxed then upon applying the Put constructor we
| just have to apply a wrapper function that boxes up the args and
| supplies them to the wrapped function.
|
| Obviously that'd be a pessimisation, having to re-box args, however we
| can expect programmers to only use that UNPACK pragma when they know
| that it is going to be a win overall.
|
| So the ! on the arg is a semantic change and the pragma is a
| representation change but not a semantic change. The ! means the obvious
| thing, that the function is strict in that arg. So either the caller or
| calle must ensure the arg is evaluated to WHNF.
|
| Hmm, is the UNPACK actually needed then? Normally when ghc determines
| that a func is strict in an arg it make the worker unbox that arg if
| possible and that always seems to be a win. Mine you, in that case we
| know what the function is going to do with each arg, where as here we do
| not so perhaps a separate UNPACK makes sense.
|
| This issues crops up quite a bit with Monads I think. For example GHC
| defines it's IO monad to return an unboxed tuple:
|
| newtype IO a = IO (State# RealWorld -> (# State# RealWorld, a #))
|
| This is fine for low level internal code, but for everyday monads it'd
| be nice to be able to write nice code and still get great performance.
| This applies to monads particularly since they very often fully
| encapsulate the function used as the internal representation, that is
| all the sites where the representation function is constructed and taken
| apart are in a single module. So we can 'see' that all uses are strict
| in some arg. Or if they're not we might want them to be.
|
| So instead of doing an analysis to figure out if we're always using it
| strictly (which is something that jhc/grin might be able to do?) we can
| just declare the data representation to be strict like we do with
| ordinary algebraic data types, that way it propagates strictness to the
| usage sites which is much more convenient than going around strictifying
| all the usage sites in an attempt to make an analysis notice that it's
| safe to do a representation change.
|
| Ok, I'm done. :-)
|
| Duncan
|
| _______________________________________________
| Glasgow-haskell-users mailing list
| [hidden email]| http://www.haskell.org/mailman/listinfo/glasgow-haskell-users_______________________________________________
Glasgow-haskell-users mailing list
[hidden email]http://www.haskell.org/mailman/listinfo/glasgow-haskell-users

> | foo !n = do
> | word8 n
> | word8 (n+1)
> | word8 (n+17)
> |
> | It does a straight line sequence of writes into memory (there are rules
> | that combine the bounds checking of adjacent writes). After that it
> | calls the continuation with a new buffer.
> |
> | While almost everything is unboxed, the parameters to the continuation
> | are not unboxed. So we have to allocate a new Buffer structure and pass
> | that to the continuation:
> |
> | let {
> | sat_s1F8 =
> | NO_CCS PutMonad.Buffer! [ww1_s1Es
> | ww2_s1Et
> | ww3_s1Ew
> | sat_s1F4
> | sat_s1F6];
> | } in
> | w_s1DY
> | GHC.Base.()
> | sat_s1F8;
> |
> | w_s1DY being the continuation here.
> |
> | However we know that really the parameters to this continuation could
> | really be unboxed since we construct all these continuations explicitly
> | in this module. We could re-write the monad type by manually explicitly
> | unboxing the Buffer argument:
> |
> | newtype Put a = Put {
> | runPut :: (a -> Addr# -> ForeignPtrContents -> Int# -> Int# -> Int# -> [B.ByteString])
> | -> Addr# -> ForeignPtrContents -> Int# -> Int# -> Int# -> [B.ByteString]
> | }
> |
> | Then we'd get no allocations and no heap checks in the fast path.
> |
> | Note that we could still get a continuation in from the 'outside' with
> | the original type and convert it to one with the above type, though
> | obviously that involves unpacking the arguments. This unpacking is
> | basically what the wrapper of a worker/wrapper pair does of course.
> |
> | Obviously this is ugly. We'd much rather write the original and with
> | just a couple annotations get the above representation. This is what I
> | would like to write:
> |
> | newtype Put a = Put {
> | runPut :: (a -> {-# UNPACK #-} !Buffer -> [B.ByteString])
> | -> {-# UNPACK #-} !Buffer -> [B.ByteString]
> | }
> |
> | So I'm declaring a type and a data constructor that contains a function
> | that is strict in one of it's arguments.
> |
> | I do not wish to distinguish the strictness in the type however, that is
> | it should be perfectly ok to do this:
> |

| foo :: Foo ->> Buffer -> [B.ByteString]

> |
> | Put foo newBuffer :: Put Foo
> |
> | Suppose that in this case we do not know locally that foo is strict and
> | can take it's args unboxed then upon applying the Put constructor we
> | just have to apply a wrapper function that boxes up the args and
> | supplies them to the wrapped function.
> |
> | Obviously that'd be a pessimisation, having to re-box args, however we
> | can expect programmers to only use that UNPACK pragma when they know
> | that it is going to be a win overall.
> |
> | So the ! on the arg is a semantic change and the pragma is a
> | representation change but not a semantic change. The ! means the obvious
> | thing, that the function is strict in that arg. So either the caller or
> | calle must ensure the arg is evaluated to WHNF.
> |
> | Hmm, is the UNPACK actually needed then? Normally when ghc determines
> | that a func is strict in an arg it make the worker unbox that arg if
> | possible and that always seems to be a win. Mine you, in that case we
> | know what the function is going to do with each arg, where as here we do
> | not so perhaps a separate UNPACK makes sense.
> |
> | This issues crops up quite a bit with Monads I think. For example GHC
> | defines it's IO monad to return an unboxed tuple:
> |
> | newtype IO a = IO (State# RealWorld -> (# State# RealWorld, a #))
> |
> | This is fine for low level internal code, but for everyday monads it'd
> | be nice to be able to write nice code and still get great performance.
> | This applies to monads particularly since they very often fully
> | encapsulate the function used as the internal representation, that is
> | all the sites where the representation function is constructed and taken
> | apart are in a single module. So we can 'see' that all uses are strict
> | in some arg. Or if they're not we might want them to be.
> |
> | So instead of doing an analysis to figure out if we're always using it
> | strictly (which is something that jhc/grin might be able to do?) we can
> | just declare the data representation to be strict like we do with
> | ordinary algebraic data types, that way it propagates strictness to the
> | usage sites which is much more convenient than going around strictifying
> | all the usage sites in an attempt to make an analysis notice that it's
> | safe to do a representation change.
> |
> | Ok, I'm done. :-)
> |
> | Duncan
> |
> | _______________________________________________
> | Glasgow-haskell-users mailing list
> | [hidden email]> | http://www.haskell.org/mailman/listinfo/glasgow-haskell-users> _______________________________________________
> Glasgow-haskell-users mailing list
> [hidden email]> http://www.haskell.org/mailman/listinfo/glasgow-haskell-users

RE: More speed please!

On Fri, 2007-03-16 at 08:33 +0000, Simon Peyton-Jones wrote:
> Can you give me a small program that demonstrates the issue? (With
> any support modules it needs, but the less the better.)

Ok, so the main thing to notice in this example is that the function
that we keep in the Put (newtype) constructor is encapsulated in this
module. The Put constructor is not exported and all the functions
wrapped in Put are explicitly constructed in this module, we don't take
in unknown functions from the outside. So it's an example where we could
safely change the data representation with no external effect.

Interesting. Is that what you meant? An undesirable consequence would be that
case (MkBar bot) of MkBar f -> f `seq` 0
would return 0, because the MkBar constructor puts a lambda inside. This seems bad. Maybe you can only put a ! inside the function type if you have a bang at the top (like MkBaz).

> Interesting. Is that what you meant? An undesirable consequence would be that
> case (MkBar bot) of MkBar f -> f `seq` 0
> would return 0, because the MkBar constructor puts a lambda inside.
> This seems bad. Maybe you can only put a ! inside the function type
> if you have a bang at the top (like MkBaz).

Hmm, yes I see.

Well that seems like a reasonable restriction. In my original example I
was using newtype rather than data (which of course is like data with !
on the only component).

Afterall, in practise I think the main use of this semantic change will
be to take advantage of faster calling conventions and so we'd be
perfectly happy with being strict in function itself.

RE: More speed please!

I'm not sure how to make progress with this thread (see below). On the one hand there is an interesting idea here. On the other, I don't want to put more ad-hoc-ery into GHC.

What this cries out for is a notion of strict function *in the types*. So if
f :: !Int -> Int
then you know that f is strict, and you can use call-by-value. GHC has no such notion at the moment. The bangs in constructors are very specific to constructors, and handled in an ad-hoc way.

Duncan wants to make them more first class, which is good. But that would mean making !T into a Core type, not just a source-language annotation on data constructors. Doing this in a systematic way is attractive, but slippery. Ben Rudiak-Gould has spent quite a bit of time thinking about it. There are many questions; e.g: can ! appear to the right of an arrow? inside tuples (!a,!b)? inside lists [!a]? Can a polymorphic function be called at a bang-type? etc

Anyway, I'm inclined to make haste slowly on this one. If someone feels like working out the details, the way lies open. Alternatively, the ad-hoc solution might be so important that it's worth implementing despite its ad-hocery.

Re: More speed please!

I'm replying to a rather old thread here, about unboxing in functions. Duncan
had a continuation monad which passed around some data type that would be nice
to unbox. You discussed strictness annotations in function types as a potential
solution. I have a different tack on the problem which seems potentially
useful. I've experimented with doing local defunctionalization on the module.
This is a long mail as I will try to explain in some detail what it is that I
have done. Please be patient.

Normal defunctionalization is about replacing the primitive function type
"a -> b" with an algebraic data type which I'll call "Fun a b". Not all
functions will be eliminated as we will see but the program will be first
order after the transformation. The core of the transformation is that every
lambda in the program gives rise to a new constructor in the Fun data type and
whenever we apply a function we instead call a newly created "apply function"
with the following type "Fun a b -> a -> b". This is basically what JHC does.

Defunctionalization is normally a whole program transformation (which is why
JHC is a whole program compiler). But sometimes it can be done on a per module
basis. This is where *local* defunctionalization comes in. The key to local
defunctionalization is that we often can divide the data type Fun into several
disjoint data types. We can do this whenever there are several different
function spaces that never get mixed up. And sometimes we're even so lucky
that a function space is totally contained in one module. Then we can do
local defunctionalization of that particular function space only and
completely within that module without changing it's interface. This case often
comes up when using the continuation monad and Duncan's code is not an
exception.

So, I've manually done local defunctionalization on Duncan's code. It gives
rise to two types which I've called Fun1 and Fun2. They look like follows
(including the Put monad):

Now, the cool thing is that once GHC starts optimizing away on these apply
functions they will be unboxed and no Buffer will ever be created or passed
around. Here is the core type for apply1:
\begin{core}
$wapply1_r21p :: forall a_aQu.
PutMonad.Fun1 a_aQu
-> a_aQu
-> GHC.Prim.Addr#
-> GHC.ForeignPtr.ForeignPtrContents
-> GHC.Prim.Int#
-> GHC.Prim.Int#
-> GHC.Prim.Int#
-> [Data.ByteString.Base.ByteString]
\end{core}
This is exactly what Duncan wanted, right? I declare victory :-)

However, things are not all roses. There are some functions that will
not be unboxed as we hope for with this approach, for instance the function
flushOld (see Duncan's code). To achieve the best possible optimization I
think one would have to perform strictness analysis and the worker-wrapper
transformation twice, once before doing local defunctionalization and then
again on the apply functions generated by the defunctionalization process.
This should give the code that Duncan wants I believe.

I think it should be relatively straightforward to implement local
defunctionalization in GHC but it should not be turned on by default as the
number of modules where it is beneficial is rather few.

The complete defunctionalized version of Duncan's module is attached.

I'm sure there are a lot of things that are somewhat unclear in this message.
Feel free to ask and I'll do my best to clarify.

RE: More speed please!

| I'm replying to a rather old thread here, about unboxing in functions. Duncan
| had a continuation monad which passed around some data type that would be nice
| to unbox. You discussed strictness annotations in function types as a potential
| solution. I have a different tack on the problem which seems potentially
| useful. I've experimented with doing local defunctionalization on the module.

Interesting suggestion, Josef. In general, local defunctionalisation would be an intersting transformation to try. I'm not sure how well it would scale: the larger the scope, the bigger the more distinct functions and the bigger the dispatch table.

Also your transformation is semantically transparent (no effect) whereas Duncan is prepared to add ! annotations that really make things stricter, just as ! annotations in data type decls do today. So presumably he will get further than you will, because he is making more assumptions.

Meanwhile, I've thought a bit more about Duncan's idea. One attractive aspect is that you can regard it as a direct extension of Haskell's existing mechanism of ! on data types, making the {-# UNPACK #-} pragma look inside function types as well as looking inside data types. I like that. It makes it sounds less ad hoc than I previously thought. I'll open a Trac ticket for this thread, http://hackage.haskell.org/trac/ghc/ticket/1349

Re: More speed please!

On 5/11/07, Simon Peyton-Jones <[hidden email]> wrote:
> | I'm replying to a rather old thread here, about unboxing in functions. Duncan
> | had a continuation monad which passed around some data type that would be nice
> | to unbox. You discussed strictness annotations in function types as a potential
> | solution. I have a different tack on the problem which seems potentially
> | useful. I've experimented with doing local defunctionalization on the module.
>
> Interesting suggestion, Josef. In general, local defunctionalisation would be an intersting transformation to try. I'm not sure how well it would scale: the larger the scope, the bigger the more distinct functions and the bigger the dispatch table.
>
Indeed the dispatch table could grow big, but I'm not sure it would be
a scalability problem. Note that all the code that goes in to these
dispatch tables (I call them apply functions) are ripped out from
other places in the program. So there is really no new code being
added, it's only shuffled around.

On the other hand I don't know how GHC deals with large case
expressions and if they are a problem, be it that they can increase
the compilation time or the runtime of the program, then there might
of course be a problem.

> Also your transformation is semantically transparent (no effect) whereas Duncan is prepared to add ! annotations that really make things stricter, just as ! annotations in data type decls do today. So presumably he will get further than you will, because he is making more assumptions.
>
Indeed. But I think the main advantage for Duncan's approach, over
local defunctionalization, is its general applicability. Local
defunctionalization only kicks in under very special circumstances and
even then isn't always a net win (or so my intuition tells me). The
bang annotations otoh can be inserted wherever you like and would
presumably work transparently across module borders.

>
> Meanwhile, I've thought a bit more about Duncan's idea. One attractive aspect is that you can regard it as a direct extension of Haskell's existing mechanism of ! on data types, making the {-# UNPACK #-} pragma look inside function types as well as looking inside data types. I like that. It makes it sounds less ad hoc than I previously thought. I'll open a Trac ticket for this thread, http://hackage.haskell.org/trac/ghc/ticket/1349>
Sounds good! It would be a cool thing to have. I'm looking forward to
seeing it implemented in GHC :-)