Introduction

Each of these implements the same interface as the STL class in the second part of its name. In addition,
each one also implements the STL interface for the vector<T> class, providing fast array access via
operator[] and at(). T is the type of the contained class. Key is the type of the
lookup key for pair-associative containers. Cmp is the comparison predicate and must be a binary predicate
with arguments of type T for set and type Key for map. If you do not provide a comparison
predicate, it will default to std::less<Arg>.
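
As a rough illustration of the Cmp parameter, here is a minimal sketch that uses std::set as a stand-in, since the article's containers are described as sharing its interface. The ByLength predicate and the smallest() helper are hypothetical names introduced only for this example.

```cpp
#include <set>
#include <string>

// Hypothetical predicate (not from the article): orders strings by length,
// breaking ties alphabetically. It is a binary predicate over the element type.
struct ByLength {
    bool operator()(const std::string& a, const std::string& b) const {
        if (a.size() != b.size()) return a.size() < b.size();
        return a < b;
    }
};

// Omitting the predicate falls back to std::less<std::string>.
typedef std::set<std::string> DefaultSet;
// Supplying it changes the ordering of the container.
typedef std::set<std::string, ByLength> LengthSet;

// Returns the first element in the container's own ordering.
// Precondition: the container is not empty.
template <class Set>
std::string smallest(const Set& s) {
    return *s.begin();
}
```

With the same three strings, the default ordering puts the alphabetically first string at the front, while ByLength puts the shortest one there.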

The classes are all derived from an AVL tree that I borrowed from Andreas Jager (copyright included) and modified
to allow indexed access. Each tree node contains an extra long integer that holds the number of child nodes
below it. The algorithm that updates these counts does not add to the complexity of any of the original tree algorithms.
Since the interfaces are documented in the STL, I will simply provide the complexity analysis below. I believe I
have covered the most-used functions of these containers, but some functions may not be included. Please contact
me if you would like another STL function added and I will be glad to update the source.
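
To make the idea of per-node counts concrete, here is a minimal sketch of indexed lookup in a binary search tree. It is not the article's code: the Node layout, the field names, and the node_at() function are assumptions, and the count here includes the node itself, which may differ from the convention the actual source uses.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical node layout: 'count' is the number of nodes in the subtree
// rooted at this node, including the node itself (an assumed convention).
struct Node {
    int value;
    std::size_t count;
    Node* left;
    Node* right;
};

inline std::size_t subtree_size(const Node* n) {
    return n ? n->count : 0;
}

// Returns the node at 0-based in-order position 'index'.
// Each loop iteration descends one level, so the cost is proportional to the
// tree height, which is O(log N) when the tree is kept balanced.
inline const Node* node_at(const Node* root, std::size_t index) {
    assert(index < subtree_size(root));
    const Node* n = root;
    while (n) {
        const std::size_t left = subtree_size(n->left);
        if (index < left) {
            n = n->left;              // target lies in the left subtree
        } else if (index == left) {
            return n;                 // exactly 'left' nodes precede this one
        } else {
            index -= left + 1;        // skip the left subtree and this node
            n = n->right;
        }
    }
    return nullptr;                   // not reached when the counts are consistent
}
```

Keeping the counts correct during insertion, deletion, and rotation only touches nodes the balancing code already visits, which is consistent with the claim that the bookkeeping does not change the complexity of the tree operations.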

Complexity Analysis

index_list<T>

An index_list is a sequence which combines the interfaces of array, deque,
and list.

Here are the complexity comparisons with other sequence containers:

                   array   deque   list   index_list
Indexed access     1       1       (N)    lgN
Insert/delete      N       N       1      1+
Front insertion    (N)     1+      1      1+
Back insertion     1+      1+      1      1+

It tends to work best for large sets (N >> 1).

It also has the extra function insert(const T&), usually used for sets,
which chooses an insert position that keeps the tree balanced with a
minimal number of rotation operations.

These classes implement the STL interfaces for set, multiset, map, and multimap respectively, along with the indexed access (operator[], at()).
Here are the complexity comparisons with other associative containers:

Comments and Discussions

Hello, just wondering if these classes are still being supported. I just tried to compile with g++ version 4.2.3 on kubuntu 8.04, and I got some errors. The first few are:

index_set/iterator_base.h:42: error: to refer to a type member of a template parameter, use ‘typename I::value_type’
index_set/iterator_base.h:42: error: to refer to a type member of a template parameter, use ‘typename I::distance_type’
index_set/iterator_base.h:75: error: type ‘I’ is not derived from type ‘stl_bidirectional_iterator<I>’

I am not uploading a new version, but you can fix this issue by adding the "typename" keyword in front of the class type definitions, as reported for line 42. That may also get rid of the error on line 75. There may be several other places where "typename" must be added.

I compiled on MSVC, which does not enforce the typename rule that I believe ANSI requires.

Thanks Ray. I actually wasn't able to get it to compile, as I'm not well versed in typenames, class templates, and the other tools I need to get this to work. (I also got an error about an expected "nested-name-specifier".) I mainly use STL classes, but don't add anything to the implementation.

Basically, I need a data structure that is an STL set, but where I can select elements at random. So, your index_set would be perfect. Can you recommend another implementation that compiles readily in Ubuntu? Or an implementation of a red-black tree that allows random selection of elements that compiles readily on Ubuntu?

Sorry, I have never seen another container like the one you describe. I think you should get a friend who knows STL pretty well and have him/her make the correction. It should be easy. Wherever you see a class definition that has "typedef X::<name1> <name2>", you replace it with "typedef typename X::<name1> <name2>". For example:

typedef T::iterator_type iterator_type;

should be changed to:

typedef typename T::iterator_type iterator_type;

I hope that helps. It would be a shame not to be able to use the class just because of a simple compiler error. I'm sorry I did not compile under ANSI restrictions myself.
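
For readers unfamiliar with the typename rule, here is a small self-contained sketch of the kind of change described above. IndexedView is a hypothetical class made up for this illustration and does not appear in the index_set sources.

```cpp
#include <cassert>
#include <vector>

// Hypothetical wrapper used only to show where 'typename' is required.
template <class Container>
struct IndexedView {
    // Container::iterator and Container::value_type depend on the template
    // parameter, so a conforming compiler needs 'typename' to know that
    // they name types. Without it, g++ reports the errors quoted earlier.
    typedef typename Container::iterator iterator_type;
    typedef typename Container::value_type value_type;

    Container* c;

    explicit IndexedView(Container& container) : c(&container) {}

    iterator_type begin() const { return c->begin(); }

    // Returns a copy of the first element. Precondition: not empty.
    value_type first() const {
        assert(!c->empty());
        return *c->begin();
    }
};
```

Removing either typename keyword should reproduce an error of the same kind on a conforming compiler, which is a quick way to check which lines in the real sources need the fix.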

I just can't understand why one would want to map linear structures onto binary search trees. I can understand if one wants to track the order items were put into a tree, but as far as I can tell, that's not possible using these structures.

Search trees are used when there is a functional relation between the domain and range (key/value pairs), and linear structures when there exists no such relation.

If I understand the code correctly, an index into a tree is not fixed. By that I mean tree[1] may not necessarily be the same value as tree[1] after an insertion or deletion of a value in the tree due to rotations. That doesn't seem too advantageous to me. It looks like a linear structure, but doesn't behave like one.

I cannot see what problems indexed access to tree structures can solve. Am I missing something here?

I can understand your puzzlement, but you don't have to use the code if you don't need it.

But to offer a suggestion, what this container does is allow lists, sets, and maps to have a random access iterator. Normally, they only have bi-directional iterators. This allows them to work inside algorithms that require random access iterators. You can do a lot of things with an indexed list that you could normally only do with an array. As for sets and maps, it is a little more obscure, as you have mentioned. The algorithms you use on these must be non-mutating.

Well, as Ray Virzi wrote, it seems that one has to add a lot of "typename" keywords, especially before "iterator" (e.g. "AVLTree<t>::iterator" -> "typename AVLTree<t>::iterator"), "const_iterator", "reference", "const_reference", "value_type", and "size_type".
With those added, it compiles under VS 2005.

Sometimes it can be useful to know the index where new elements are inserted or where elements are found. Some overloaded find() and insert() member functions can easily be implemented on top of the provided functionality, for example for index_set:

Interesting extensions to the standard library. I must point out that the GetIndex() function does not work for the _endnode marker, so you must check for that condition. What is 'index' set to when the item is not found or cannot be inserted? Perhaps -1?

I think the conventional way to get the index is to take the iterator returned by both find() and insert() and subtract it from the beginning of the container, like so:

This should give you the 0-based index. If the function fails, at is set to myset.end() and index will be the total size of the container, so this condition must still be checked for.
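
A minimal sketch of this index-from-iterator idea, using std::set as a stand-in so that it compiles without the article's sources. The index_of() helper is a hypothetical name. With std::set the iterators are only bidirectional, so std::distance walks the range in linear time; with a random access iterator, as these containers are described as providing, the same distance would presumably be obtained by plain iterator subtraction.

```cpp
#include <cstddef>
#include <iterator>
#include <set>

// Returns the 0-based position of 'key' in 's'.
// If 'key' is absent, find() returns s.end(), so the result equals s.size();
// callers must check for that value.
inline std::size_t index_of(const std::set<int>& s, int key) {
    std::set<int>::const_iterator it = s.find(key);
    return static_cast<std::size_t>(std::distance(s.begin(), it));
}
```

Comparing the result against the container's size is the check for the not-found case mentioned above.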

But I have also noticed a bug in my own code. The insert function won't work this way for set and map if the item cannot be inserted. This is because the end marker must point to _endnode. The last line of my insert function should read:

I tried the sources on Linux with gcc 3.0.1.
First I needed to make some adjustments to the sources in order to get them to work.

I made 3 containers: one std::map, one index_map, and one std::hash_map, each filled with 1000000 elements, with a key type of long int and a value type of my class A (which consists of an int, a char, and a std::string).
I ran tests in which I generated random numbers with random() to use as keys, and read the data from the containers using the at() or find() methods.

Filling the index_map was a little faster than filling std::map but a lot slower than filling std::hash_map.
Finding an element in index_map was about 1.3 times slower than finding an element in std::map and about 5 times slower than finding an element in std::hash_map.

These results are absolutely obvious. std::hash_map works faster than std::map because it uses a hash table algorithm with an estimated O(1) time for search, add, and delete operations, while map uses a "red-black tree" data structure that takes O(log N) for these operations. The map, however, can store data in sorted order, while hash_map cannot. In conclusion, the "red-black tree" is the fastest structure in the general case among all the widely used balanced tree structures.

I use an AVL tree for this implementation. If your point is that a red-black tree is faster than an AVL tree, my research indicates that they are essentially the same average speed. But these containers could be just as easily implemented with a red-black tree. It's just a matter of choice.

The unique thing about it is that I have added fast indexing capability to the tree, which is not possible for a normal tree structure and has no meaning for a hash map. So until someone does this with a red-black tree instead of AVL, it's the best one here.

By the way, a red-black tree is in fact an AVL tree, but more advanced. It uses the same idea of subtree rotation, but the red-black tree is optimized for best performance with a large number of random additions to and removals from the tree, because it can be more unbalanced than a simple AVL tree. So it requires fewer rotations in the common case, but search operations can take somewhat longer at the same time. Anyway, the speed difference between these implementations is minimal.