As someone who's not used these library methods before, I would expect splitBy
and splitLines to work differently from each other. When splitting into lines,
I would assume that it repeatedly applies the regular expression
"([^t]*)(t|$)", where t is the line terminator. You return the first group each
time and discard the rest. The second group also handles the end-of-string
boundary condition.
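To make that reading concrete, here is a minimal sketch of terminator-style
splitting. The name splitLinesOn is hypothetical (it is not a proposed library
function), and I am assuming that matching stops once the input is exhausted,
so that a trailing terminator does not produce a final empty element.

```haskell
-- Hypothetical helper: terminator-style splitting, mirroring "([^t]*)(t|$)".
splitLinesOn :: Char -> String -> [String]
splitLinesOn _ "" = []                   -- assumption: stop on exhausted input
splitLinesOn t s  =
  let (chunk, rest) = break (== t) s     -- the "[^t]*" group, which is returned
  in  chunk : case rest of
        []       -> []                   -- "$": end of string, nothing left
        (_ : s') -> splitLinesOn t s'    -- "t": discard the terminator, go on
```

Under these assumptions splitLinesOn '\n' agrees with Prelude.lines on the two
examples quoted below.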
As others have said, I would expect splitBy to return all of the zero-length
matches as well, interleaving a "[^t]*" match-and-return with a "t"
match-and-discard. The collapsed form of the output is the same as
interleaving a "[^t]+" match-and-return with a "t*" match-and-discard.
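As a rough illustration of the two behaviours, here is a sketch with
hypothetical names (splitByKeep and splitByCollapse are mine, not an existing
API). The first treats the matched element as a separator and keeps empty
pieces; the second is the collapsed form, obtained here by simply dropping the
empty pieces.

```haskell
-- Hypothetical: separator-style split that keeps zero-length pieces.
splitByKeep :: (a -> Bool) -> [a] -> [[a]]
splitByKeep p xs =
  case break p xs of
    (chunk, [])       -> [chunk]                      -- no separator left
    (chunk, _ : rest) -> chunk : splitByKeep p rest   -- drop one separator

-- Hypothetical: the collapsed form, with empty pieces removed.
splitByCollapse :: (a -> Bool) -> [a] -> [[a]]
splitByCollapse p = filter (not . null) . splitByKeep p
```

With these definitions, splitByKeep (=='a') "aabbaca" gives
["","","bb","c",""] and splitByCollapse (=='a') "aabbaca" gives ["bb","c"],
the two forms under discussion below.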
Matthew
On Thursday 13 July 2006 10:16, Jon Fairbairn wrote:
> On 2006-07-12 at 23:24BST "Brian Hulley" wrote:
> > Christian Maeder wrote:
> > > Donald Bruce Stewart schrieb:
> > >> Question over whether it should be:
> > >> splitBy (=='a') "aabbaca" == ["","","bb","c",""]
> > >> or
> > >> splitBy (=='a') "aabbaca" == ["bb","c"]
> > >>
> > >> I argue the second form is what people usually want.
> > >
> > > Yes, the second form is needed for "words", but the first form is
> > > needed for "lines", where one final empty element needs to be removed
> > > from your version!
> > >
> > > Prelude> lines "a\nb\n"
> > > ["a","b"]
> > > Prelude> lines "a\n\nb\n\n"
> > > ["a","","b",""]
> >
> > Prelude.lines and Prelude.unlines treat '\n' as a terminator instead of a
> > separator. I'd argue that this is poor design, since information is lost
> > ie lines . unlines === id whereas unlines . lines =/= id whereas if '\n'
> > had been properly conceived of as a separator, the identity would hold.
>
> Hooray! I've been waiting to ask "Why aren't we asking what
> laws hold for these operations?" but now you've saved me the
> effort. I've been bitten by unlines . lines /= id already;
> it's something we could gainfully change without wrecking
> too much code, methinks.
>
> > So I vote for the first option ie:
> >
> > splitBy (=='a') "aabbaca" == ["","","bb","c",""]
>
> Seconded.
>
> As far as naming is concerned, since this is a declarative
> language, surely we shouldn't be using active verbs like
> this? (OK I lost that argument way back in the mists of
> Haskell 0.0 with take. Before then I called "take" "first":
> "first n some_list" reads perfectly well).
>
> Jón