[Ocaml-lib-devel] Data structures table

Nicolas Cannasse came up with the idea of making a table of all data
structures, all operations on those data structures, and the big-O cost of
performing that operation. What I'm thinking of is something like:
Data Structure
| | | | D |
| | | H | y |
| | | a | n |
| | A | s | a |
| L | r | h | r |
| i | r | t | r |
| s | a | b | a |
Operation | t | y | l | y |
---------------+-----+-----+------+-----+
Find first elem| 1 | 1 | N/A | 1 |
---------------+-----+-----+------+-----+
Find last elem | N | 1 | N/A | 1 |
---------------+-----+-----+------+-----+
Find any elem | N | N | lg N | N |
---------------+-----+-----+------+-----+
Prepend elem | 1 | N | N/A | N |
---------------+-----+-----+------+-----+
Append elem | N | N | N/A | 1 |
---------------+-----+-----+------+-----+
Insert elem | N | N | lg N | N |
---------------+-----+-----+------+-----+
Obviously, using HTML and tables would be a better idea. How you read
this is if you're interested in a given operation- say Append (add an
element at the end of the structure), then doing that with a list is O(N)
(you have to duplicate the list), doing that with an array is O(N) (you
have to allocate a new array and copy the data over), the operation is not
meaningful with a hash table (where conceptually the data is unordered),
and doing that with a Dynarray is O(1).
Hmm. Actually, to prevent loads of N/As we probably want to break
operations up into (at least) four different categories:
- Basic operations all data structures support, for example searching for
an element the fullfills a given criteria.
- Operations that depend upon ordered data- for example finding the first
or last element. Examples of data structures which are ordered include
lists, arrays, and dynarrays.
- Operations that depend upon having keys associated with the data- for
example, searching for the element that has is associated with a given
key. Examples of data structures which have keys are hash tables and
priority search queues.
- Operations that depend upon having a priority relationship among
elements, for example finding the highest priority element. Examples of
data structures which have priorities are queues, priority queues, and (of
course) priority search queues.
Only data structures which "naturally" support those operations would be
listed. Alternatively, we could provide reasonable estimates on how
expensive that operation would be to implement with the given data
structure, and mark it somehow as needing to be implemented (different
color?).
Thoughts? Comments?
Brian

Thread view

Nicolas Cannasse came up with the idea of making a table of all data
structures, all operations on those data structures, and the big-O cost of
performing that operation. What I'm thinking of is something like:
Data Structure
| | | | D |
| | | H | y |
| | | a | n |
| | A | s | a |
| L | r | h | r |
| i | r | t | r |
| s | a | b | a |
Operation | t | y | l | y |
---------------+-----+-----+------+-----+
Find first elem| 1 | 1 | N/A | 1 |
---------------+-----+-----+------+-----+
Find last elem | N | 1 | N/A | 1 |
---------------+-----+-----+------+-----+
Find any elem | N | N | lg N | N |
---------------+-----+-----+------+-----+
Prepend elem | 1 | N | N/A | N |
---------------+-----+-----+------+-----+
Append elem | N | N | N/A | 1 |
---------------+-----+-----+------+-----+
Insert elem | N | N | lg N | N |
---------------+-----+-----+------+-----+
Obviously, using HTML and tables would be a better idea. How you read
this is if you're interested in a given operation- say Append (add an
element at the end of the structure), then doing that with a list is O(N)
(you have to duplicate the list), doing that with an array is O(N) (you
have to allocate a new array and copy the data over), the operation is not
meaningful with a hash table (where conceptually the data is unordered),
and doing that with a Dynarray is O(1).
Hmm. Actually, to prevent loads of N/As we probably want to break
operations up into (at least) four different categories:
- Basic operations all data structures support, for example searching for
an element the fullfills a given criteria.
- Operations that depend upon ordered data- for example finding the first
or last element. Examples of data structures which are ordered include
lists, arrays, and dynarrays.
- Operations that depend upon having keys associated with the data- for
example, searching for the element that has is associated with a given
key. Examples of data structures which have keys are hash tables and
priority search queues.
- Operations that depend upon having a priority relationship among
elements, for example finding the highest priority element. Examples of
data structures which have priorities are queues, priority queues, and (of
course) priority search queues.
Only data structures which "naturally" support those operations would be
listed. Alternatively, we could provide reasonable estimates on how
expensive that operation would be to implement with the given data
structure, and mark it somehow as needing to be implemented (different
color?).
Thoughts? Comments?
Brian

Yeah yeah, replying to myself is bad. Anyways, I wanted to send out an
example of HTML for people to look at and critique.
Having played with this, I definately want to write a HTML generator for
this page. In Ocaml, natch. Otherwise maintainance will be nigh unto
impossible.
Brian

Brian Hurt wrote:
> Yeah yeah, replying to myself is bad. Anyways, I wanted to send out an
> example of HTML for people to look at and critique.
>
> Having played with this, I definately want to write a HTML generator for
> this page. In Ocaml, natch. Otherwise maintainance will be nigh unto
> impossible.
>
Use interscript!
http://interscript.sourceforge.net
Here is some code:
------
@set_title('Extlib Performance')
@head(1,'Operation costs')
Here is a performance table for
various data structures
@p()
@begin_table('oper','list','array''hash')
@table_row('append','O(1)','O(N)','N/A')
@table_row('prepend','O(1)','O(N)','N/A')
@table_row('magicr','O(1)','O(N)','O(1)')
@end_table()
@p()
--------------
This input can be used to simultaneously
generate HTML (in two formats), LaTeX, Lout,
and plain text (and even nroff if someone
would care to write an nroff weaver).
BTW: the stuff after the @ sign is not
magical interscript stuff. No,
it is any Python code you want to write
(the functions such as 'begin_table' are just part of the
standard library).
This implies you can generate documentation easily
with Python script. Indeed, you can generate
code too (since its actually a literate programming
tool).
--
John Max Skaller, mailto:skaller@...
snail:10/1 Toxteth Rd, Glebe, NSW 2037, Australia.
voice:61-2-9660-0850