Peter De Rijk (2003): Several of the commands moved to sets here still offer distinct functionality on lists, because sets are implemented as hashes and the order of elements in a set will not necessarily be retained. Things like removing a set of elements from a list, or checking for the presence of an element, while retaining the list order are useful.

A function treating a list as list of lists (a matrix) and returning a specified column of that matrix (items of the outer list are rows). This is a projection operation.
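A minimal sketch of such a projection, assuming 0-based column indices. The name lcolumn is hypothetical, not an existing command:

```tcl
# Hypothetical helper: return column $col (0-based) of a list of rows.
proc lcolumn {matrix col} {
    set result {}
    foreach row $matrix {
        # lindex yields an empty string if the row is too short
        lappend result [lindex $row $col]
    }
    return $result
}

set m {{a b c} {d e f} {g h i}}
puts [lcolumn $m 1]   ;# prints: b e h
```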

A function computing the length of the longest item in a list, with the items treated as strings.
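This could look roughly as follows. The name lmaxlen is made up for illustration, and an empty list is assumed to give 0:

```tcl
# Hypothetical helper: length of the longest item, items taken as strings.
proc lmaxlen {items} {
    set max 0
    foreach item $items {
        set len [string length $item]
        if {$len > $max} {set max $len}
    }
    return $max
}

puts [lmaxlen {a bcd ef}]   ;# prints: 3
```

Such a function is handy for padding items to a common width when formatting columns.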

List universal/existential quantifiers: An expression has to hold for all items or item sequences in a list (universal), or for at least one item/item sequence (existential).
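One possible shape for the quantifiers, as a sketch only. The names lall and lany and the argument order (variable name, list, expression) are assumptions. This version handles single items, not item sequences:

```tcl
# Hypothetical: true if $expression holds for every item (universal).
# The expression is evaluated in the caller's scope with $varName set.
proc lall {varName items expression} {
    upvar 1 $varName v
    foreach v $items {
        if {![uplevel 1 [list expr $expression]]} {return 0}
    }
    return 1
}

# Hypothetical: true if $expression holds for at least one item (existential).
proc lany {varName items expression} {
    upvar 1 $varName v
    foreach v $items {
        if {[uplevel 1 [list expr $expression]]} {return 1}
    }
    return 0
}

puts [lall x {2 4 6} {$x % 2 == 0}]   ;# prints: 1
puts [lany x {1 3 5} {$x % 2 == 0}]   ;# prints: 0
```

As usual for quantifiers, lall over an empty list is true and lany over an empty list is false.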

Searching in lists.

A function extending the known kernel command lsearch to return all matching elements from a list. Should it also allow negation, i.e. returning all non-matching elements?
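For comparison, later Tcl releases (8.4 onward) cover this in the core: lsearch accepts -all, -inline and -not. A short illustration, with made-up sample data:

```tcl
set files {main.c util.c README main.h}

# All elements matching the glob pattern
set matching [lsearch -all -inline -glob $files *.c]
puts $matching      ;# prints: main.c util.c

# Negation: all elements NOT matching the pattern
set nonMatching [lsearch -all -inline -not -glob $files *.c]
puts $nonMatching   ;# prints: README main.h
```

Without -inline, the same calls return the indices of the elements instead of the elements themselves.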

Extend the above to allow multiple patterns.
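A pure-Tcl sketch of the multiple-pattern variant, assuming glob-style patterns and "matches any pattern" semantics. The name lmatchany is hypothetical:

```tcl
# Hypothetical helper: return all items matching at least one glob pattern,
# in their original order.
proc lmatchany {items patterns} {
    set result {}
    foreach item $items {
        foreach pattern $patterns {
            if {[string match $pattern $item]} {
                lappend result $item
                break   ;# one match is enough, avoid duplicates
            }
        }
    }
    return $result
}

puts [lmatchany {main.c util.c README main.h} {*.c *.h}]
;# prints: main.c util.c main.h
```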

Special functions, not fitting anywhere else

A function to merge two or more lists into one, by interleaving the contents. Now to have lazy lists too :-).
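A sketch of interleaving. The name linterleave is hypothetical, and the choice to skip lists that have run out of elements (instead of padding) is an assumption:

```tcl
# Hypothetical helper: merge any number of lists by taking one element
# from each in turn. Shorter lists are skipped once exhausted.
proc linterleave {args} {
    set maxlen 0
    foreach l $args {
        if {[llength $l] > $maxlen} {set maxlen [llength $l]}
    }
    set result {}
    for {set i 0} {$i < $maxlen} {incr i} {
        foreach l $args {
            if {$i < [llength $l]} {lappend result [lindex $l $i]}
        }
    }
    return $result
}

puts [linterleave {a b c} {1 2 3}]   ;# prints: a 1 b 2 c 3
```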

A function to split a single list into two or more. The reverse of the last function.
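A sketch of the reverse operation, dealing elements round-robin into n lists. The name ldeinterleave is hypothetical:

```tcl
# Hypothetical helper: distribute the items of one list round-robin
# over $n result lists; returns a list of $n lists.
proc ldeinterleave {items n} {
    for {set k 0} {$k < $n} {incr k} {set part($k) {}}
    set i 0
    foreach item $items {
        lappend part([expr {$i % $n}]) $item
        incr i
    }
    set result {}
    for {set k 0} {$k < $n} {incr k} {lappend result $part($k)}
    return $result
}

puts [ldeinterleave {a 1 b 2 c 3} 2]   ;# prints: {a b c} {1 2 3}
```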

A function to set several places in a list to an item, controlled by a list of indices. (Basically the second function in 3., extended to work for multiple items).
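A sketch built on the core command lset (Tcl 8.4 and later), taking the list by name. The name lsetmany and its argument order are assumptions:

```tcl
# Hypothetical helper: store $value at every index in $indices.
# The list is passed by variable name and modified in the caller.
proc lsetmany {listVar indices value} {
    upvar 1 $listVar l
    foreach i $indices {
        lset l $i $value   ;# raises an error for an out-of-range index
    }
    return $l
}

set letters {a b c d e}
lsetmany letters {0 2 4} X
puts $letters   ;# prints: X b X d X
```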

A function to turn a flat list into a nested list by creating a sublist of every n elements where n is known as the stride.
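A sketch of the stride conversion. The name lstride is hypothetical, and leaving a shorter final sublist when the length is not a multiple of n is an assumption:

```tcl
# Hypothetical helper: group a flat list into sublists of $n elements.
proc lstride {items n} {
    if {$n < 1} {error "stride must be at least 1"}
    set result {}
    set len [llength $items]
    for {set i 0} {$i < $len} {incr i $n} {
        lappend result [lrange $items $i [expr {$i + $n - 1}]]
    }
    return $result
}

puts [lstride {a 1 b 2 c 3} 2]           ;# prints: {a 1} {b 2} {c 3}
puts [lstride [split "a:1:b:2" :] 2]     ;# prints: {a 1} {b 2}
```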

An option which can be added (in a consistent manner) to list functions that return a flat list, to post-process the flat list by converting it into a nested list using a stride value. Two candidates for this new option are 'array get' and 'split'.

It goes without saying that we would like the list manipulation routines to be fast and efficient. Manipulation routines (sometimes called mutators) like insert, replace, remove (described above) change the state of objects. Unfortunately, the core list routines almost always require the list to be passed by value. For example, to simply replace a single element in a long list, you have to copy the list like this:

set myList [lreplace $myList 20 20 "new Element"]

Tcl's object model is smart enough that it doesn't copy all elements in the list, but it does have to make a new list object of pointers to the list elements. If the list is short (100 elements) then this isn't a big deal. If your list is long (1000+) then list manipulation will be slow.

One way to avoid extraneous list copies is to offer additional list functions which can manipulate the list contents "in place," without the extra overhead of list copies. It is necessary to refer to the lists by name instead of by value, just as the core command lappend does. As an example (although we're not specifying syntax here!), we could replace a single element in a long list with a command like

replace-in-place myList 20 20 "new Element"

Donal Fellows suggests a K() operator on the Tcl Performance page which can help with pure-Tcl implementations of these "in place" operations.
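A sketch of how K is commonly used for this, assuming Tcl 8.5 or later for the {*} syntax. The name lreplace_inplace is hypothetical and is not a proposed syntax. The idea is that K returns the old list value while the variable is emptied, so the value may be unshared by the time lreplace sees it and can then be modified without a copy:

```tcl
# The K combinator: return the first argument, discard the second.
proc K {x y} {set x}

# Hypothetical by-name wrapper around lreplace.
proc lreplace_inplace {listVar first last args} {
    upvar 1 $listVar l
    # [set l {}] drops the variable's reference to the list before
    # lreplace runs; K hands the original value through unchanged.
    set l [lreplace [K $l [set l {}]] $first $last {*}$args]
    return
}

set myList {a b c d}
lreplace_inplace myList 1 1 "new Element"
puts $myList   ;# prints: a {new Element} c d
```

For replacing a single element, the core command lset (Tcl 8.4 and later) already works on a variable by name.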

RWT

DKF: As an aside, there are four classic combinators from theoretical CS: o, I, K and S.

AK: Not quite. A list of this type can actually be used when managing caches (LRU, i.e. Least Recently Used) to determine which part of the cache to overwrite if the cache is full and something is added: namely the elements at the end of the list, as they were used least recently.

RS 2004-11-17: See A size-bounded cache for an implementation of that - and here's the LRU functionality for a list alone, in case someone needs it (e.g. for a list of the n files most recently opened):