Summary: I have made the simplified version Duncan speculates about...
Duncan Coutts wrote:
> Yep, definitely interested. Sounds like we could make something that
> would satisfy the needs of existing users of the binary and
> binary-strict packages.
I see this as incremental development of incremental get. The comments in
binary-strict's code point to one additional useful feature: When the parser
suspends and asks for more input it could potentially also return a list or
Sequence of the output-so-far.
I believe this is possible to add to my existing code (even the simplified
version below), so there will be an even fancier version eventually. This would
make the parser a controllable part of a pipeline.
> We'll have to look closely at the performance costs of the new features
> but my intuition is that a non-transformer but continuation based
> version that has error handling (and plus/alternative) and can request
> more input should have minimal cost.
>> Duncan
>
To modify down "MyGet.hs" to produce that type is a matter of using the delete
key (which I have done, see below). I only have my Apple powerpc/G4 laptop to
run Haskell, this and the lack of either need or available time means I will not
be making performance measurements. I have written for the language shootout
before and I had binary's Get.hs to look at, and so I claim my code has no
show-stopping performance killers in it. A bit of !strictness and even more
INLINE pragmas should be all it needs.
And so I just took such a knife to MyGet.hs to make MyGetSimplified.hs.
It is at
http://darcs.haskell.org/packages/protocol-buffers/Text/ProtocolBuffers/
with the other files.
The simplified form of the data definitions mostly fits in this email:
newtype Get a = Get {
unGet :: forall b. -- the forall hides the CPS style
Success b a -- main continuation
-> S -- parser state
-> FrameStack b -- error handler stack
-> Result b -- operation
}
type Success b a = (a -> S -> FrameStack b -> Result b)
data S = S { top :: {-# UNPACK #-} !S.ByteString
, current :: {-# UNPACK #-} !L.ByteString
, consumed :: {-# UNPACK #-} !Int64
} deriving Show
data FrameStack b = [snip details]
data Result a = Failed {-# UNPACK #-} !Int64 String
-- the Int64 is amount consumed successfully
| Finished {-# UNPACK #-} !L.ByteString {-# UNPACK #-} !Int64 a
-- the bytestring is the unconsumed part, the Int64 is amount consumed
| Partial (Maybe L.ByteString -> Result a)
-- passing Nothing indicates that there will never be more input
This could be streamlined further by making (FrameStack b) a field of S.
The MonadError/Plus/Alternative should still work. It will still suspend and
resume.
A few more short functions and this will have the same exported signatures as
binary's Get.hs.
Cheers,
Chris