This page lists proposed extensions to the Haskell list functions, whether in the [http://www.haskell.org/ghc/docs/latest/html/libraries/base/Prelude.html Prelude] or [http://www.haskell.org/ghc/docs/latest/html/libraries/base/Data-List.html Data.List].

+

Please discuss the proposals on the Talk Page or the libraries list, and use this page to record the results of discussions.

+

However, since the advent of [[Hackage]]DB and [[Cabal-Install]] it is preferred to provide such functionality in specialised packages, rather than extending the already large base library.

−

We need these useful functions in Data.List; I'll call them 'split' (and variants) and 'replace'. These are easily implemented but everyone always reinvents them. The goal is clarity/uniformity (everyone uses them widely and recognizes them) and portability (I don't have to keep reimplementing these or copying that one file UsefulMissingFunctions.hs).

+

== Splitting on a separator, etc ==

−

Use this page to record consensus as reached on the Talk Page. (Use four tildes to sign your post automatically with your name/timestamp.) Diverging opinions welcome! Note: a lot of good points (diverging opinions!) are covered in the mailing lists, but if we include all these various cases, split* will have 9 variants! I'm working on trying to organize all this into something meaningful.

+

We need these useful functions in Data.List; I'll call them 'split' (and variants) and 'replace'. These are easily implemented but everyone always reinvents them. Various versions have been proposed, but there was no consensus on which was best, e.g.

Note: a lot of good points (diverging opinions!) are covered in the mailing lists, but if we include all these various cases, split* will have 9 variants! The goal is to reach some kind of reasonable consensus, specifically on naming and semantics, even if we need pairs of functions to satisfy various usage and algebraic needs. Failing to accommodate every possible use of these functions should not be a sufficient reason to abandon the whole project.

−

Hacking up your own custom split (or a tokens/splitOnGlue) must be one

+

−

of the most common questions from beginners on the IRC channel.

+

−

Anyone remember what the result of the "let's get split into the base

+

The goal is clarity/uniformity (everyone uses them widely and recognizes them) and portability (I don't have to keep reimplementing these or copying that one file UsefulMissingFunctions.hs).

−

library" movement's work was?

+

−

+

−

ISTR there wasn't a consensus, so nothing happened. Which is silly,

+

−

really - I agree we should definitely have a Data.List.split.

+

−

</i>

+

−

A thread July 2006

+

Note: I (Jared Updike) am working with the belief that efficiency should not be a valid argument to bar these otherwise universally useful functions from the libraries; regexes are overkill for 'split' and 'replace' for common simple situations. Let's assume people will know (or learn) when they need heavier machinery (regexes, ByteString) and will use it when efficiency is important. We can try to facilitate this by reusing any names from ByteString, etc.

First of all: check whether the [http://hackage.haskell.org/cgi-bin/hackage-scripts/package/split split] package provides what you need.

−

+

−

http://www.haskell.org/pipermail/libraries/2004-July/thread.html#2342

+

−

+

−

== Goal ==

+

−

+

−

The goal is to reach some kind of reasonable consensus, specifically on naming and semantics. Even if we need pairs of functions to satisfy various usage and algebraic needs. Failing to accommodate every possible use of these functions should not be a sufficient reason to abandon the whole project.

+

−

+

−

Note: I (Jared Updike) am working with the belief that efficiency should not be a valid argument to bar these otherwise universally useful functions from the libraries; regexes are overkill for 'split' and 'replace' for common simple situations. Let's assume people will know (or learn) when they need heavier machinery (regexes, FPS/ByteString) and will use it when efficiency is important. We can try to facilitate this by reusing any names from FastPackedString and/or ByteString, etc.

Implemented for instance in [http://hackage.haskell.org/packages/archive/utility-ht/0.0.1/doc/html/Data-List-HT.html#v%3Areplace utility-ht].

=== join (working name) ===

=== join (working name) ===

Line 112:

Line 94:

join sep = concat . intersperse sep

join sep = concat . intersperse sep

</haskell>

</haskell>

+

+

Note: this function has been implemented as 'intercalate' in Data.List.
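A minimal sketch of that equivalence, assuming only <code>Data.List</code> from base. The names <code>samples</code> and <code>agreeOnSamples</code> are illustrative and not part of any library; the check covers a handful of inputs and is not a proof.

```haskell
import Data.List (intercalate, intersperse)

-- The working definition of join given on this page.
join :: [a] -> [[a]] -> [a]
join sep = concat . intersperse sep

-- Illustrative inputs (hypothetical, chosen to include empty cases).
samples :: [[String]]
samples = [[], ["a"], ["a", "b", "c"], ["", "x", ""]]

-- True when join and Data.List.intercalate agree on every sample.
agreeOnSamples :: Bool
agreeOnSamples = all (\xs -> join ", " xs == intercalate ", " xs) samples
```

For example, <code>intercalate ", " ["a", "b", "c"]</code> evaluates to <code>"a, b, c"</code>.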

'''TODO: copy-paste things from threads mentioned above'''

'''TODO: copy-paste things from threads mentioned above'''

Line 117:

Line 101:

'''TODO: list names and reasons for/against'''

'''TODO: list names and reasons for/against'''

−

===Filter===

+

== Sorted lists ==

−

Am I the ''only'' person who thinks 'filter' is a misleading name? For example:

+

The following are versions of standard prelude functions, but intended for sorted lists. The advantage is that they frequently reduce execution time by a factor of O(n) (nub, for instance, is quadratic, while a nub on sorted input needs only one linear pass). The disadvantage is that the elements have to be members of Ord, and the lists have to be already sorted.

−

filter odd

+

<haskell>

+

-- Eliminates duplicate entries from the list, where duplication is defined

+

-- by the 'eq' function. The last value is kept.

+

sortedNubBy :: (a -> a -> Bool) -> [a] -> [a]

+

sortedNubBy eq (x1 : xs@(x2 : _)) =

+

if eq x1 x2 then sortedNubBy eq xs else x1 : sortedNubBy eq xs

+

sortedNubBy _ xs = xs

−

Now, to me, that looks like it ought to ''filter out'' all odd values, leaving only even ones. In fact (as you all know) it does precisely the opposite.

+

sortedNub :: (Eq a) => [a] -> [a]

+

sortedNub = sortedNubBy (==)

−

I would suggest that 'select' would be an infinitely better name. (It works for SQL!)

+

-- Merge two sorted lists into a new sorted list. Where elements are equal

<hask>mergeBy</hask> is implemented in [http://hackage.haskell.org/packages/archive/utility-ht/0.0.1/doc/html/Data-List-HT.html#v%3AmergeBy utility-ht].

−

[[Category:Standard libraries]]

+

−

[[Category:Idioms]]

+

== Generalize groupBy and friends ==

+

+

In the Haskell 98 List library, <hask>groupBy</hask> assumes that its argument function defines an equivalence, and the reference definition returns sublists where each element is equivalent to the first. The following definition, comparing adjacent elements, does the same thing on equivalence relations:

Latest revision as of 11:14, 16 June 2012

This page lists proposed extensions to the Haskell list functions, whether in the Prelude or Data.List.
Please discuss the proposals on the Talk Page or the libraries list, and use this page to record the results of discussions.
However, since the advent of HackageDB and Cabal-Install it is preferred to provide such functionality in specialised packages, rather than extending the already large base library.

We need these useful functions in Data.List; I'll call them 'split' (and variants) and 'replace'. These are easily implemented but everyone always reinvents them. Various versions have been proposed, but there was no consensus on which was best, e.g.

Note: a lot of good points (diverging opinions!) are covered in the mailing lists, but if we include all these various cases, split* will have 9 variants! The goal is to reach some kind of reasonable consensus, specifically on naming and semantics, even if we need pairs of functions to satisfy various usage and algebraic needs. Failing to accommodate every possible use of these functions should not be a sufficient reason to abandon the whole project.

The goal is clarity/uniformity (everyone uses them widely and recognizes them) and portability (I don't have to keep reimplementing these or copying that one file UsefulMissingFunctions.hs).

Note: I (Jared Updike) am working with the belief that efficiency should not be a valid argument to bar these otherwise universally useful functions from the libraries; regexes are overkill for 'split' and 'replace' for common simple situations. Let's assume people will know (or learn) when they need heavier machinery (regexes, ByteString) and will use it when efficiency is important. We can try to facilitate this by reusing any names from ByteString, etc.
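To make the discussion concrete, here is one possible reading of 'split' and 'replace', written against base only. The names <code>splitOnSep</code> and <code>replaceAll</code> and the chosen semantics (empty fields are kept, an empty separator is rejected) are assumptions of this sketch, not a consensus result; other variants discussed on the lists behave differently.

```haskell
import Data.List (intercalate, isPrefixOf)

-- Hypothetical: split a list at every occurrence of a non-empty separator.
-- Empty fields between adjacent separators are kept.
splitOnSep :: Eq a => [a] -> [a] -> [[a]]
splitOnSep sep xs
  | null sep  = error "splitOnSep: empty separator"
  | otherwise = go [] xs
  where
    -- acc holds the current field in reverse order.
    go acc [] = [reverse acc]
    go acc rest@(y : ys)
      | sep `isPrefixOf` rest = reverse acc : go [] (drop (length sep) rest)
      | otherwise             = go (y : acc) ys

-- Hypothetical: replace every non-overlapping occurrence of old by new,
-- scanning left to right.
replaceAll :: Eq a => [a] -> [a] -> [a] -> [a]
replaceAll old new = intercalate new . splitOnSep old
```

With these definitions, <code>splitOnSep "," "a,b,,c"</code> gives <code>["a","b","","c"]</code> and <code>replaceAll "ab" "X" "cabbab"</code> gives <code>"cXbX"</code>. Building 'replace' from 'split' and 'intercalate' keeps the two functions consistent with each other, at the cost of some efficiency.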

The following are versions of standard prelude functions, but intended for sorted lists. The advantage is that they frequently reduce execution time by a factor of O(n) (nub, for instance, is quadratic, while a nub on sorted input needs only one linear pass). The disadvantage is that the elements have to be members of Ord, and the lists have to be already sorted.
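A sketch of the idea: on sorted input, duplicates are always neighbours, so a single pass comparing adjacent elements is enough. <code>sortedNubBy</code> and <code>sortedNub</code> follow the definitions proposed on this page; <code>mergeSortedBy</code> and <code>agreesWithNub</code> are hypothetical names, and the tie-breaking rule of the merge (first list wins) is an assumption of this sketch.

```haskell
import Data.List (nub, sort)

-- Neighbour-comparing nub for sorted input; of a run of equal elements,
-- the last one is kept.
sortedNubBy :: (a -> a -> Bool) -> [a] -> [a]
sortedNubBy eq (x1 : xs@(x2 : _))
  | eq x1 x2  = sortedNubBy eq xs
  | otherwise = x1 : sortedNubBy eq xs
sortedNubBy _ xs = xs

sortedNub :: Eq a => [a] -> [a]
sortedNub = sortedNubBy (==)

-- Hypothetical merge of two sorted lists. On ties the element from the
-- first list is emitted first (an assumption, not taken from the page).
mergeSortedBy :: (a -> a -> Ordering) -> [a] -> [a] -> [a]
mergeSortedBy _ [] ys = ys
mergeSortedBy _ xs [] = xs
mergeSortedBy cmp (x : xs) (y : ys)
  | cmp x y == GT = y : mergeSortedBy cmp (x : xs) ys
  | otherwise     = x : mergeSortedBy cmp xs (y : ys)

-- After sorting, the linear-time sortedNub and the quadratic nub agree.
agreesWithNub :: [Int] -> Bool
agreesWithNub xs = sortedNub (sort xs) == nub (sort xs)
```

For instance, <code>sortedNub [1,1,2,3,3]</code> gives <code>[1,2,3]</code>, and <code>mergeSortedBy compare [1,3,5] [2,3,6]</code> gives <code>[1,2,3,3,5,6]</code>.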

In the Haskell 98 List library, groupBy assumes that its argument function defines an equivalence, and the reference definition returns sublists where each element is equivalent to the first. The following definition, comparing adjacent elements, does the same thing on equivalence relations:
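One way such an adjacent-comparing definition might look is sketched below. The name <code>groupByAdjacent</code> is hypothetical, chosen here to avoid clashing with <code>Data.List.groupBy</code>; the helper <code>spanAdjacent</code> is likewise illustrative.

```haskell
import Data.List (groupBy)

-- Group a list so that each element is related to its immediate
-- predecessor, rather than to the first element of its group.
groupByAdjacent :: (a -> a -> Bool) -> [a] -> [[a]]
groupByAdjacent _   []       = []
groupByAdjacent rel (x : xs) = (x : ys) : groupByAdjacent rel zs
  where
    (ys, zs) = spanAdjacent x xs
    -- Extend the current group while each element is related to the
    -- previous one; stop at the first pair that is not related.
    spanAdjacent prev (y : rest)
      | rel prev y = let (ys', zs') = spanAdjacent y rest in (y : ys', zs')
    spanAdjacent _ rest = ([], rest)

-- On an equivalence relation such as (==), both definitions agree.
agreesOnEquality :: String -> Bool
agreesOnEquality s = groupByAdjacent (==) s == groupBy (==) s
```

The two differ once the relation is not an equivalence. With <code>(<=)</code> on <code>[1,2,2,3,1,2,0,4,5,2]</code>, <code>groupByAdjacent</code> yields the non-decreasing runs <code>[[1,2,2,3],[1,2],[0,4,5],[2]]</code>, whereas <code>Data.List.groupBy</code> compares against the first element of each group and yields <code>[[1,2,2,3,1,2],[0,4,5,2]]</code>.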