polygonal – Blog of Michael Baczynski
http://lab.polygonal.de

Reassembling the polygonal library (25 Jul 2012)

I've just finished restructuring my library into smaller parts – this was necessary because the previous monolithic approach had become really hard to maintain. Now I can push updates more frequently, and adding new features should also be much easier.
The various parts of the library are now hosted on github: https://github.com/polygonal.
I apologize for any inconvenience and hope the reorganization does not break that much code.
BTW, the library now fully supports Haxe 2.10 :)
Using sprintf with Haxe (30 Apr 2012)

In contrast to ActionScript, Haxe allows you to customize the trace() function to better fit your requirements. For example, you can format the input or delegate it to your favorite logger, resulting in less verbose code.
My library supports the sprintf syntax, so let's take a look at how to use it seamlessly with trace().

First, we need to redefine the behavior of the trace() function, which is done by calling Root.init():

From now on, trace() calls are redirected to Root.log().debug(…), and the default log handler writes to flashlog when targeting Flash. Compiling and running the above example, the log file should look like this:

And if you want to get rid of all trace statements, you simply compile with --no-traces ;)

A fast notification system for event-driven design (24 Apr 2012)

Back in 2009 I started working on a custom solution for dispatching and listening to events. At that time I switched to the Haxe language, so my main motivation was to write a cross-platform system. I also never really liked the event dispatcher in Flash – too Java-ish and boilerplate-heavy for my taste. I guess this is why other libraries like AS3 Signals or HSL (Haxe Signaling Library) appeared.

In short, here are the key facts:

I’m using the classic Observer pattern, because it’s simple and widely known.

Instead of listening to single events, I'm registering with multiple related events at once (an "event group").

Updates are not identified by strings, but are represented as integers.

The observers are managed in a linked list.

Notice that in the following text, the terms event/update, listener/observer and event dispatcher/observable are used interchangeably. The source code is hosted on github.
Let’s break down each point:

1. The interface

My design follows the observer pattern – chances are it's the first design pattern covered in most books, so I won't go into much detail.
Actually, it’s quite simple: An observable manages a list of observers. Whenever the state of the observable changes, the observable iterates over all observers and calls observer.update(). The entire system consists of three components:

de.polygonal.core.event.IObserver, an interface that every observer has to implement, defining a single update() method.

de.polygonal.core.event.IObservable, an interface which merely defines three methods: attach(), detach() and notify().

de.polygonal.core.event.Observable, a ready-to-use implementation of IObservable.

Thinking in Flash terms, 2. and 3. correlate with flash.events.IEventDispatcher and flash.events.EventDispatcher. There is, however, no flash.events.Event object. Instead, the observer has to pull this information from the source parameter (which always points to the object that triggered the update) or from the userData parameter, which is often used for convenience (e.g. keyboard events store the current key code in it). Since I prefer short names, addEventListener(), removeEventListener() and dispatchEvent() are simply called attach(), detach() and notify().
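To make the contract concrete, here's a minimal Python sketch of the attach()/detach()/notify() interplay described above (not the library's code; the source/userData parameters are modeled as keyword arguments):

```python
# Minimal observer pattern with the short method names described above.
class Observable:
    def __init__(self):
        self._observers = []

    def attach(self, o):
        if o not in self._observers:
            self._observers.append(o)

    def detach(self, o):
        self._observers.remove(o)

    def notify(self, type_, user_data=None):
        # each observer can pull extra information from 'source' (the
        # object that triggered the update) or from 'user_data'
        for o in list(self._observers):
            o.update(type_, source=self, user_data=user_data)

class KeyLogger:
    def __init__(self):
        self.keys = []

    def update(self, type_, source, user_data):
        if type_ == "KEY_DOWN":
            self.keys.append(user_data)

keys = KeyLogger()
input_events = Observable()
input_events.attach(keys)
input_events.notify("KEY_DOWN", user_data=65)
```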

2. Event groups

We often have to set up listeners for similar events. For example, let’s assume we want to download a file from a server – in most cases a simple “onLoadComplete” event is not enough: we also need to handle errors or visualize the download progress. So the main idea is to replace multiple calls to addEventListener() with a single call to attach(), as you need those events anyway. This drastically reduces boilerplate code.
A good example is the Resource class from my library, responsible for loading all kinds of external assets. By attaching to a resource object, you get all important updates at once:

So far, so good. But what about a situation where you don't use all of those updates? Let's consider the case where we have tons of observers listening to user-input events. We are only interested in keyboard updates, but both key and mouse events are defined in a single event group. Mouse position updates are fired at a high rate and might slow down your application. The solution is to register with only a subset of events by explicitly naming them:

myObserver.attach(myObservable, UIEvent.KEY_DOWN | UIEvent.KEY_UP);

This will only invoke myObserver.update() for those two event types. If we don’t need an event at all (usually application-wide), we can blacklist the event on the dispatcher and disable it completely, further improving performance:

myObservable.mute(UIEvent.MOUSE_MOVE);
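The filtering and muting described above can be sketched like this (Python, with hypothetical bit values – one bit per update type):

```python
# Event-group filtering sketch: observers attach with a bit mask and
# the observable can mute events globally.
KEY_DOWN, KEY_UP, MOUSE_MOVE = 1, 2, 4  # one bit per update type

class Observable:
    def __init__(self):
        self._entries = []   # (observer, mask) pairs
        self._muted = 0

    def attach(self, observer, mask=~0):
        # by default the observer receives the whole event group
        self._entries.append((observer, mask))

    def mute(self, mask):
        # blacklist events on the dispatcher, disabling them completely
        self._muted |= mask

    def notify(self, type_):
        if type_ & self._muted:
            return
        for observer, mask in self._entries:
            if type_ & mask:
                observer(type_)

received = []
o = Observable()
o.attach(received.append, KEY_DOWN | KEY_UP)  # keyboard updates only
o.notify(KEY_DOWN)
o.notify(MOUSE_MOVE)  # filtered out by the observer's mask
o.mute(MOUSE_MOVE)    # now skipped before the observer loop even runs
o.notify(KEY_UP)
```

Note the difference: the mask filters per observer, while mute() disables the event for the whole dispatcher.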

3. Why integers?

Using integers for identifying events has some advantages:

We can pack multiple properties – in this case a group id and an update id.

We can efficiently check against multiple updates by combining them with bitwise OR or test if an update belongs to a particular group:

Comparing integers is wickedly fast. It won’t make a big difference, but it’s good to know ;).

However, there is a major drawback: one has to make sure the integers are unique across the application! Assigning ids by hand is error-prone and dangerous, so I hand it over to the compiler and let a macro do all the work. In short, a macro is a piece of code written in the Haxe language itself that modifies existing code or generates new code at compile time. This way, creating an event boils down to this:

The syntax might look a bit strange at first, but it's just an array of field names that gets passed to the build method. The macro then generates those fields, encodes and assigns integer values, and adds some convenience methods, such as one for printing out the human-readable name of an update. Before Nicolas added the macro system, I defined them manually, which was a total mess.

Speaking of the format: each 32-bit integer stores a group id and an update type. 5 bits are reserved for the group id, while the remaining bits represent a bit field for the update type. Therefore, a total of 32 groups (2^5) and 27 individual types per group (32-5) can be encoded, so the system supports 32×27=864 unique events, which should be enough for any application. By sacrificing one update bit in favor of the group id, we get 64×26=1664 different events.
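As a sketch of the packing scheme (Python; whether the group id lives in the high or low bits is an implementation detail – here I put it in the top 5 bits):

```python
# Pack a group id (5 bits) and an update type (one of 27 one-hot bits)
# into a single 32-bit integer, as described above.
GROUP_BITS, TYPE_BITS = 5, 27

def make_event(group, type_index):
    assert 0 <= group < 2 ** GROUP_BITS and 0 <= type_index < TYPE_BITS
    return (group << TYPE_BITS) | (1 << type_index)

def group_of(event):
    return event >> TYPE_BITS

def same_group(a, b):
    return group_of(a) == group_of(b)

UI_GROUP = 3
KEY_DOWN = make_event(UI_GROUP, 0)
KEY_UP = make_event(UI_GROUP, 1)

# combine updates with bitwise OR and test membership cheaply
mask = KEY_DOWN | KEY_UP
assert KEY_UP & mask
assert group_of(KEY_DOWN) == UI_GROUP
```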

4. Observer storage

Another decision I made is to use linked lists instead of arrays for storing observers. This way there is no need to make a copy of the observer list before dispatching updates. Arrayed implementations do this, because the observer list can be modified by observers themselves:

The above code updates the observer list instantly, but due to the nature of a linked list, the update phase won't break.
Notice that when an observer calls notify() on its observable, the current update stops while the new update is carried out to all observers, and is resumed as soon as the nested iteration has completed. Let's assume we have 3 observers: [A, B, C] – when B invokes an update, the order will be: [A, B, [A, B, C], C]. Now, when C detaches from its source (when C.update() is called for the first time), C will disappear from all "levels", and the last C at the root level won't receive an update. In practice this shouldn't be a problem, and it's usually the desired behavior.

Using linked lists requires that observer objects are "boxed": a helper object has to be allocated for storing the observer and the list pointers. To help with performance, I allocate and reuse those helper structures once an observer gets detached from its observable:

The cache size is set to 10 by default, but you can easily customize this value by providing a different size to the constructor:

var x = new Observable(50);

Performance

First: I can’t imagine a situation where dispatching events could be the bottleneck of an application. That said, it still does not hurt to provide an efficient solution. The most important thing for Flash is to use typed function calls, which are roughly 10x faster. This is achieved by forcing each observer to implement the IObserver interface:
Also, iterating a linked-list is very fast in Flash. Compared to flash.events.EventDispatcher, the overall result is pretty good:

To sum it up, the system is optimized for updates fired at a high rate and listeners with a short life span, e.g.
an AI system, where lots of agents are interacting with other agents by sending messages.

Tips

Observable.dump() prints out all observables in use along with their number of observers. Useful for profiling.

Observable.release() destroys all observable objects and removes all observers to make sure that the garbage collector can free up all used resources.

Observable.bind(func) can be used to bind a function to an update. The binding is active as long as the closure returns true:
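The bind() behavior can be sketched as follows (Python, hypothetical – the function stays bound only for as long as it returns true):

```python
# Sketch of Observable.bind(func): the closure stays attached for as
# long as it returns True, then it is dropped automatically.
class Observable:
    def __init__(self):
        self._bound = []

    def bind(self, func):
        self._bound.append(func)

    def notify(self, type_):
        # keep only the closures that ask to stay bound
        self._bound = [f for f in self._bound if f(type_)]

seen = []
o = Observable()
o.bind(lambda t: (seen.append(t), len(seen) < 2)[1])  # detach after 2 calls
o.notify("a")
o.notify("b")
o.notify("c")  # the binding is gone by now
```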

Introduction to ds (11 Jan 2012)

"Introduction to ds" is a slide presentation/manual about ds which I started last year. I never found the time to finish it in one go, so I worked on it now and then and just recently completed it.

The presentation explains the concepts and design decisions behind ds, provides short code examples to demonstrate basic usage and covers each data structure in a few words. I tried to squeeze in as much information as possible but due to the complexity of the topic it’s impossible to go into detail – each data structure alone could easily fill a presentation…

Code examples are all in the haXe language, but converting them to AS3 is simply a matter of removing type parameters <T> and renaming Int to int, Float to Number and so on.

I hope you find it useful! (If you are an experienced game programmer you will probably be bored ;))

Linked list performance (19 Dec 2011)

It's a fact that haXe creates swf files that run faster than those compiled with mxmlc. Amazingly, haXe is also faster at compile time. But to what extent are haXe-generated swf files faster? Optimizing code especially pays off for "low-level stuff" like data structures, the building blocks of many algorithms. With this in mind, I picked the doubly linked list class from my ds library (extensively used in almost all my projects) and played around with different optimization levels to figure out how they translate into run-time improvements on the Flash platform.

So, which options do we have?
First, we can create “true generic classes” at compile time by adding -D generic to the haXe command line. We gain almost 40%. Why? Because we get rid of dynamic access (*) – perfect for AVM2, which is a statically typed runtime. You can read more about this here.
Next, let's turn on function inlining, so we don't have to deal with the overhead of calling many small functions. Inlining is enabled by default, but you can turn it off by compiling with --no-inline, just to see the difference. It increases performance by another 70%.
So far we have twice the performance, thanks to smart compiler optimizations. Regarding linked lists (and many other linked structures in ds) we have one secret weapon left: “node caching”, a feature that was introduced in ds version 1.12 back in 2010. By passing a size parameter to the constructor we can reserve additional memory to minimize costly instantiation of linked list nodes:

From now on, every time you remove an element from the list, the node object storing that element is kept for later reuse – ready to store another incoming element. The resulting behavior is an incrementally filled object pool. And don't worry if the list grows beyond the predefined size – in this case nodes are simply created on the fly. And here are the final numbers:
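The node-caching behavior described above can be sketched like this (Python, hypothetical names – removed nodes go to a free list and are reused by the next insertion):

```python
# Node-caching sketch: removed list nodes are kept in a free list,
# forming an incrementally filled object pool.
class Node:
    __slots__ = ("val", "next")
    def __init__(self):
        self.val = None
        self.next = None

class List:
    def __init__(self, reserved=0):
        self.head = None
        self._pool = None          # free list of spare nodes
        self._pool_size = 0
        self._reserved = reserved

    def _get_node(self):
        if self._pool is not None:
            n, self._pool = self._pool, self._pool.next
            self._pool_size -= 1
            n.next = None
            return n
        return Node()              # pool empty: create on the fly

    def push(self, val):
        n = self._get_node()
        n.val, n.next = val, self.head
        self.head = n

    def pop(self):
        n = self.head
        self.head = n.next
        val = n.val
        if self._pool_size < self._reserved:   # keep the node for reuse
            n.val, n.next = None, self._pool
            self._pool = n
            self._pool_size += 1
        return val

lst = List(reserved=10)
lst.push(1); lst.push(2)
first_node = lst.head
assert lst.pop() == 2
lst.push(3)                        # reuses the cached node object
assert lst.head is first_node
```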

Notice that due to additional bytecode instructions for debugging, input validation and assert statements, the debug build is awfully slow, so better not use it for deploying your application ;). The release build is effectively what you get when compiling against ds_release.swc with mxmlc, on par with other libraries like the LinkedList class found in the as3commons Collections package. Actually, haXe-compiled SWC files are a bit faster because the generated bytecode is more efficient and private methods are still inlined, but in order to unleash the full potential of the Flash Player there is currently no way around haXe.

About “-D generic”

This is a custom conditional compilation switch which is available in ds. If defined, classes defining linked structures will implement the marker interface haxe.rtti.Generic. This tells the compiler to create specialized classes for each type parameter. Here is a simplified example:

Now everything is strictly typed and we don’t need slow dynamic access.

A* pathfinding with ds (22 Jul 2011)

I recently did a complete rewrite of my graph-based A* pathfinder example because I received a lot of questions on how to implement pathfinding with the new ds library. So here is an updated demo that works with ds 1.32:

Sources should now be in {haxe_install_dir}/lib/polygonal/1,18/src/impl/sandbox/ds/astar, where {haxe_install_dir} is usually C:/Motion-Twin/haxe on Win7.
The demo can then be compiled with:

$ cd C:\Motion-Twin\haxe\lib\polygonal\1,18\build
$ haxe.exe compile-ds-examples.hxml

Extending the Graph class

You basically have two options for extending the functionality of the Graph object: composition or inheritance. While I highly recommend using composition whenever possible, I've also included a version using inheritance – just so you can see the difference.

The composition version looks like this:

The Graph object manages GraphNode objects, and each GraphNode holds a Waypoint object, which defines the world position of the waypoint as well as intermediate data used by the A* algorithm. Notice that GraphNode and Waypoint are cross-referencing each other as a Waypoint object has to query the graph for adjacent nodes. As a result, you have a clean separation between the data structure (Graph, GraphNode) and the algorithm (AStar, Waypoint) and don’t need object casting, which is good news because casting is a rather slow operation.
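A minimal sketch of this composition (Python, hypothetical field names) – note how the node/waypoint cross-references avoid any casting:

```python
# Composition sketch: Graph/GraphNode form the data structure, while
# Waypoint carries the position and A*'s intermediate data; node and
# waypoint cross-reference each other, so no down-casting is needed.
class Graph:
    def __init__(self):
        self.nodes = []

    def add(self, waypoint):
        node = GraphNode(waypoint)
        waypoint.node = node       # cross-reference
        self.nodes.append(node)
        return node

class GraphNode:
    def __init__(self, waypoint):
        self.waypoint = waypoint   # the "cargo"
        self.arcs = []             # adjacent nodes

class Waypoint:
    def __init__(self, x, y):
        self.x, self.y = x, y
        self.node = None
        self.distance = float("inf")  # intermediate A* data
        self.parent = None

    def neighbors(self):
        # query the graph structure for adjacent waypoints
        return [n.waypoint for n in self.node.arcs]

g = Graph()
a, b = Waypoint(0, 0), Waypoint(1, 0)
na, nb = g.add(a), g.add(b)
na.arcs.append(nb)
```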

Now let’s look at the version using inheritance:
Here, Waypoint directly subclasses GraphNode. Since the Graph is defined to work with GraphNode objects, we need a lot of (unsafe) down-casts to access the Waypoint class. Furthermore, the use of haxe.rtti.Generic will be very restricted or even impossible (implementing this marker interface generates unique classes for each type to avoid dynamic access).

Heaps and Priority Queues in ds 1.31 (11 Apr 2011)

I recently revised and improved the Heap and PriorityQueue classes, mainly because the difference between the two has always been somewhat blurry. The new implementations are included in ds 1.31, so let's talk a bit about the changes.

The Heap class

By definition, a heap is a dense binary tree in which every parent node has a value that is less than (or greater than; henceforth I assume elements are sorted in ascending order) or equal to that of any of its children. A heap defines a set of simple operations:

Insert element

Query smallest element

Remove smallest element

In the old implementation, insertion was done by calling enqueue(), and removal by calling dequeue(). This was a bit misleading because a heap is not a queue, instead it’s used as an efficient data structure for implementing priority queues. Thus the existing heap API has been modified and now consists of the methods add(), top() and pop(), corresponding to the old methods enqueue(), front() and dequeue(), respectively. I’ve also added some additional methods:

replace() – replaces the smallest element with a new element

change() – updates an existing element after it has been changed (restores heap condition)

Note that replace() and change() are a lot faster than a combined remove() and add() operation. Also, bottom() runs in O(n) whereas top() is O(1). This could be further improved by a min-max heap, which I may add in a future release.
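Here's a rough Python sketch of how a position field on each element makes change() possible without searching the heap (not the ds implementation; 1-based array storage keeps the index math simple):

```python
# Binary min-heap sketch with a 'position' field on each element, so
# change() can restore the heap property in O(log n) without a search.
class Item:
    def __init__(self, priority):
        self.priority = priority
        self.position = 0    # maintained by the heap, never by the user

class Heap:
    def __init__(self):
        self.a = [None]      # 1-based storage

    def add(self, item):
        self.a.append(item)
        item.position = len(self.a) - 1
        self._up(item.position)

    def top(self):
        return self.a[1]     # O(1)

    def pop(self):
        smallest = self.a[1]
        last = self.a.pop()
        if len(self.a) > 1:
            self.a[1] = last
            last.position = 1
            self._down(1)
        return smallest

    def change(self, item):  # call after item.priority was modified
        self._up(item.position)
        self._down(item.position)

    def _swap(self, i, j):
        self.a[i], self.a[j] = self.a[j], self.a[i]
        self.a[i].position, self.a[j].position = i, j

    def _up(self, i):
        while i > 1 and self.a[i].priority < self.a[i // 2].priority:
            self._swap(i, i // 2)
            i //= 2

    def _down(self, i):
        n = len(self.a) - 1
        while 2 * i <= n:
            c = 2 * i
            if c < n and self.a[c + 1].priority < self.a[c].priority:
                c += 1
            if self.a[i].priority <= self.a[c].priority:
                break
            self._swap(i, c)
            i = c

h = Heap()
x, y, z = Item(5), Item(1), Item(3)
for it in (x, y, z):
    h.add(it)
x.priority = 0
h.change(x)          # restores the heap condition after the edit
```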

Last but not least, I did some optimizations, lowering memory requirements and almost doubling performance. And you no longer have to enable element-removal capabilities through the constructor, as the heap now always supports removal of arbitrary elements.

The PriorityQueue class

As mentioned above, heaps are used for implementing priority queues, and as the name implies, a priority queue is a queue ADT, so PriorityQueue now implements the Queue interface, defining the following methods:

The Heap class requires every element to provide a comparison function by implementing the Heapable interface. This is not needed for the PriorityQueue class, because here the elements are sorted using integer keys, resulting in much better performance. So if you are only interested in managing prioritized data, this is the perfect choice, while the Heap class is best used as an all-round tool for solving common problems like finding the min, max or k-th largest element.

Usage

Elements to be inserted into a heap have to implement the Heapable interface, which is defined in haXe like so:

interface Heapable implements Comparable
{
    /**
     * Tracks the position inside a binary heap.
     * This value should never be changed by the user.
     */
    var position:Int;
}

Declaring a field in an interface is not supported in ActionScript 3.0, so when you inspect the compiled class from an SWC file, it has become an empty marker interface. So don't forget to define the position field, as you won't get compile-time errors. I could have defined position as a getter/setter style function, but I don't want to give up all those nice haXe features, as it has been my primary language for a long time now. So the AS3 implementation of a heap element would look like this:

What is a deque?

A deque is a double-ended queue and pronounced “deck”. In contrast to a queue, which allows insertions at one end and removals at the opposite end, a deque allows insertions and removals at both ends. The new Deque interface defines four methods to modify the Deque along with two methods to access the element at the front (head of the list) and back (tail of the list):

There are two common ways to implement a deque – by using a doubly-linked list or by using arrays. While the first option is very simple and straightforward, the latter one is more of a challenge because there are many approaches which all have their pros and cons.

Linked implementation – de.polygonal.ds.LinkedDeque

This is basically just a stripped-down, lightweight version of the doubly linked list class DLL. A doubly linked list is capable of doing insertions & removals in constant time, but the operation itself is rather slow, since we need to instantiate a node object that stores the element and manage pointers to keep the list from falling apart.
Another important thing to watch out for is that the LinkedDeque class requires considerably more memory – for each and every element we need a container object for storing the "cargo" (+16 bytes), a reference to the previous and next node (+4 bytes each) and finally a reference that points to the element (+4 bytes). So in Flash we need a total of 28 extra bytes per element. This means that 86% of the space is wasted on nodes and pointers!
Despite those shortcomings it's still useful for small to average sizes; and to minimize costly node object instantiation, the LinkedDeque class (and all linked structures in ds) comes with built-in node pooling – all you have to do is create a LinkedDeque object with a reserved size greater than zero:

var deque = new LinkedDeque<Int>(100); //reuses up to 100 nodes

This is very useful if the maximum size is known in advance, as performance nearly doubles. The benchmarks below were all done with “node-caching” turned on.

Arrayed implementation – de.polygonal.ds.ArrayedDeque

While most sites explain how a doubly linked list relates to a deque, there is not much information available on how to efficiently implement a deque on top of an array. My first attempt used a circular array similar to the ArrayedQueue class. It seemed like a good solution until I realized that resizing the deque would be slow and difficult to implement, so I discarded the idea and started from scratch. This time I decided to use an array of arrays (it turned out the C++ STL deque uses this approach) and I was very happy with the result.
Let me briefly explain how it works. Data is organized in chunks or blocks of memory. A block is simply a fixed-size array, and all blocks are stored in a separate, dynamic array. Upon initialization the deque contains only a single block, but once it fills up, an additional block is allocated: adding elements to the back calls something like "blockListArray.push(newBlock)" and adding elements to the front calls "blockListArray.unshift(newBlock)".
Inserting elements at the beginning of an array is slow but in this case it doesn’t matter because the list of blocks is usually very small and it doesn’t happen very often. Therefore we can say that the ArrayedDeque performs modifications at both ends in amortized constant time.
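A bare-bones sketch of the block scheme (Python; the tiny block size is just for illustration, a real implementation would pick a much larger one):

```python
# Array-of-blocks deque sketch: fixed-size blocks held in a small
# dynamic list, so both ends grow in amortized constant time.
BLOCK_SIZE = 4  # tiny on purpose, to exercise block allocation

class BlockDeque:
    def __init__(self):
        self.blocks = [[None] * BLOCK_SIZE]
        self.front = BLOCK_SIZE // 2   # index of the first element
        self.size = 0

    def push_back(self, v):
        i = self.front + self.size
        b, j = divmod(i, BLOCK_SIZE)
        if b == len(self.blocks):      # back block full: append a new one
            self.blocks.append([None] * BLOCK_SIZE)
        self.blocks[b][j] = v
        self.size += 1

    def push_front(self, v):
        if self.front == 0:            # front block full: prepend a new one
            self.blocks.insert(0, [None] * BLOCK_SIZE)
            self.front = BLOCK_SIZE
        self.front -= 1
        self.blocks[self.front // BLOCK_SIZE][self.front % BLOCK_SIZE] = v
        self.size += 1

    def get(self, i):
        b, j = divmod(self.front + i, BLOCK_SIZE)
        return self.blocks[b][j]

d = BlockDeque()
for v in (3, 4, 5):
    d.push_back(v)
for v in (2, 1, 0):
    d.push_front(v)
```

The rare list insert at the front is exactly the "blockListArray.unshift(newBlock)" case mentioned above.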

Performance

So how do both classes perform? Showing raw numbers would be pointless without being able to compare them to some reference values. For this reason I’ve created a minimal deque implementation called “NaiveDeque”:

As you see, it just uses what the Flash language has to offer. Clearly, the problem with the code above is that modifications at the beginning of the array are slow because a lot of memory is shifted around, either to fill a gap or to make room for a new element. So we kind of have the worst case at the front and the best case at the back. It's not very fair to use this as a reference, but on the other hand it shows how things can be drastically improved by using some elbow grease to write clever data structures. Here are the results:

size           1000   2000   3000   4000   5000
NaiveDeque     1.0    1.0    1.0    1.0    1.0
LinkedDeque    1.65   4.4    6.6    7.2    7.7
ArrayedDeque   5.0    9.3    11.7   14.8   19.3

Speed relative to the "naive deque"; higher is better.

The benchmark results were taken by equally distributing n elements at both ends, like this:

Usage

One real-world application that comes to mind is a software application's list of undo operations; but as a deque is a very flexible abstract data structure, it can be used in countless algorithms. The deque classes are all documented, so I hope it's clear how to use them. If you run into problems, feel free to contact me.

Simple 2D Molehill Example (27 Feb 2011)

Here is a minimalistic FlashDevelop AS3 project that demonstrates how to use the Molehill API to draw a single 2D sprite on the screen. You can get the project here: fdproject-mole2d.zip.
Assuming you have followed the instructions below the result should look like this:

To compile the example you need the Incubator version of the Flash Player, a new FlexSDK (“Hero”) and an updated playerglobal.swc which includes the Molehill API.
All files can be downloaded from Adobe Labs.
After downloading and unzipping the SDK to the folder {flexSDK}, create a directory named "10.1" in {flexSDK}\frameworks\libs\player and copy the new playerglobal.swc into it. Next, fire up FlashDevelop and in Settings → AS3Context update the Flex SDK Location so it points to the new SDK. Finally, just unzip & extract the FD project and hit F5 to compile.

Here are the basic steps to draw a textured quad:

Create a vertex buffer with 4 points to define a quad.

Create an index buffer that keys into this vertex buffer to define two triangles.

Create and upload a texture.

Compile the vertex and fragment programs (I’m using the AGAL mini-compiler from Adobe).

Then on every frame:

Assign the vertex and fragment program.

Transform the sprite and create a model-view-projection matrix for the vertex program. Since this is a 2D example we have to set up an orthographic projection.

Draw the triangles.

The mvp transformation could be done in the vertex shader, but to keep it stupid simple I'm creating the matrix in AS3. Also note that we need to flip the y-axis and shift the origin to the upper-left corner.
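For reference, an orthographic matrix with the y-flip and origin shift can be sketched like this (Python, row-major, only the scale and translation terms that actually matter here):

```python
# Orthographic 2D projection sketch: maps pixel coordinates (origin at
# the upper-left corner, y pointing down) to clip space (-1..1, y up),
# i.e. it scales, flips the y-axis and shifts the origin.
def make_ortho(width, height):
    return [
        [2.0 / width, 0.0, 0.0, -1.0],
        [0.0, -2.0 / height, 0.0, 1.0],  # minus: flip the y-axis
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]

def transform(m, x, y):
    # apply only the 2D-relevant part of the 4x4 matrix
    return (
        m[0][0] * x + m[0][1] * y + m[0][3],
        m[1][0] * x + m[1][1] * y + m[1][3],
    )

m = make_ortho(640, 480)
```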

Recently I started adjusting my code with the C++ and JavaScript haXe targets in mind. I have to say it's really amazing to see your code running on different platforms – this offers an incredible amount of flexibility! While almost everything worked out of the box, the biggest hurdle was finding a replacement for the flash.utils.Dictionary class, which I was using extensively in my library.
Although the haXe API already offers a Hash and an IntHash class (both using the Dictionary for the Flash target), I've decided to implement my own hash tables so they blend in nicely with my ds library.

What are hash tables?

A hash table is a very important and fundamental data structure in computer science. It offers rapid insertion, access and deletion of sparse data. This is achieved by distributing the data amongst a fixed number of slots using a hash function.

There are many ways of implementing hash tables, but from my experience (and from reading dozens of papers) the easiest solution is something called a 'standard-chain hash table', which simply uses linked lists as the collision resolution scheme. When the linked lists are replaced with dynamic arrays, the result is called an 'array hash table', which offers a smaller memory footprint because it eliminates nodes and pointers altogether and is more cache-friendly.

Other implementations include 'bucketized cuckoo hash tables' or 'open-addressing hash tables' using techniques like linear probing or double hashing for collision resolution. Those are interesting if you are into computer science, but from the point of view of a game programmer they don't offer any advantages while having drawbacks of some kind (problems with high load factors, difficult to implement, complex hashing and so on).

The hash table implementation I've included in the ds library is somewhere in between a standard-chain hash table and an array hash table – the data is stored in an array while still offering the basic advantages of a linked list, namely removal in constant time and move-to-front-on-access by pointer manipulation.

So the procedure for searching a key in my code is:

Hash the key to acquire a slot.

Check if the slot is empty. If it’s empty, the key does not exist and the search fails.

Otherwise, scan the array one key at a time until a match is found or the array is exhausted (search fails).

If the key was found, move it to the front of the list by pointer manipulation. This way further queries to the same key take less time (optional).
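The four steps can be sketched as follows (Python, hypothetical names; the move-to-front is done on plain lists here instead of pointer manipulation):

```python
# Chained hash table sketch with move-to-front-on-access, following
# the four search steps above.
NUM_SLOTS = 8

class HashTable:
    def __init__(self):
        self.slots = [None] * NUM_SLOTS  # each slot: list of [key, val]

    def set(self, key, val):
        i = key % NUM_SLOTS              # 1. hash the key to get a slot
        if self.slots[i] is None:
            self.slots[i] = []
        self.slots[i].append([key, val])

    def get(self, key):
        chain = self.slots[key % NUM_SLOTS]
        if chain is None:                # 2. empty slot: search fails
            return None
        for j, pair in enumerate(chain): # 3. scan the chain sequentially
            if pair[0] == key:
                chain.insert(0, chain.pop(j))  # 4. move to front
                return pair[1]
        return None

t = HashTable()
t.set(1, "a")
t.set(9, "b")            # collides with key 1 (9 % 8 == 1)
assert t.get(9) == "b"
# key 9 was moved to the front of its chain, speeding up repeated access
assert t.slots[1][0][0] == 9
```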

Implementation overview

The latest version of my ds library now contains a bunch of additional interfaces and classes:
Classes implementing de.polygonal.ds.Map:

IntIntHashTable – A hash table using 32-bit integers for both keys and values.

HashMap – The existing wrapper for the Dictionary class, which was updated so it now implements Map instead of Collection. Flash only!

Classes implementing de.polygonal.ds.Set:

IntHashSet – Similar to IntIntHashTable, this is a set for integer values.

HashSet<T> – A generic set. Values have to implement Hashable.

ListSet<T> – A simple set backed up by a list; this is a replacement for the former Set class that now defines an interface.

IntIntHashTable and IntHashSet are the most important ones since they provide the core functionality used in other classes. IntHashTable, HashTable and HashSet extend their behavior via composition.

The core classes use 'alchemy' memory and lots of inlining to achieve very good performance. I've documented all classes, so I hope they're not too difficult to understand. Some notes:

Although hash tables implement a Map interface, they allow multiple entries of the same key following the FIFO rule. If you want to make sure that the hash table contains unique keys, use the setIfAbsent(key, val) method or manually check for existence using hasKey(key) before calling set(key, val). Example:
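The semantics can be sketched like this (Python, with snake_case stand-ins for setIfAbsent() and hasKey(); not the library's code):

```python
# Multi-entry map sketch: set() always appends (duplicate keys allowed,
# FIFO per key), set_if_absent() only writes when the key isn't there.
class MultiMap:
    def __init__(self):
        self.pairs = []               # insertion-ordered (key, val) list

    def set(self, key, val):
        self.pairs.append((key, val)) # duplicate keys are allowed

    def has_key(self, key):
        return any(k == key for k, _ in self.pairs)

    def set_if_absent(self, key, val):
        if self.has_key(key):
            return False              # keep the key unique
        self.pairs.append((key, val))
        return True

    def get(self, key):
        for k, v in self.pairs:       # FIFO rule: oldest entry wins
            if k == key:
                return v

m = MultiMap()
m.set(1, "a")
m.set(1, "b")                 # allowed: two entries for key 1
assert m.get(1) == "a"        # FIFO: the oldest entry is returned
assert not m.set_if_absent(1, "c")
```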

As you know, the Dictionary lets you use an object's 'identity' as a key, rather than the value returned from calling toString() on it. This behavior is not available in JavaScript and other targets, so objects used as keys in a HashTable or as values in a HashSet have to implement the Hashable interface or extend the abstract class HashableItem. Hashable defines a single method getKey() (or just a field key if using haXe) returning a unique integer value that identifies the object. This is a little awkward, but I currently don't see any other (simple) solution. Example:

For the AS3 users out there, I’m providing the following swc files (xxx stands for debug or release):

ds_xxx_alchemy.swc: ds version with alchemy support for FP10.

ds_xxx_.swc: ds version without alchemy support but using Vectors for FP10.

ds_xxx_fp9.swc: ds version without alchemy support but using Arrays for FP9.

When using the alchemy version, it’s important to call free() to release all used memory once the object is no longer needed.

The Dictionary class

If you want to take a look at the hash table implementation used in the Dictionary, you can do so by downloading the Tamarin source code and opening the file core/MultinameHashtable.cpp or core/avmplusHashtable.cpp. From my understanding (please correct me if I'm wrong!), Flash uses an open-addressing hash table with quadratic probing for resolving collisions. This means that keys are stored directly within the hash slots, whereas my hash table uses chaining: in the case of a hash collision, a single slot stores multiple entries, which must be searched sequentially. Also, the Tamarin implementation automatically rehashes the hash table when it's full (load factor > 0.75), while I'm only allocating more capacity and leave rehashing to the user, because it's an expensive operation and performance only slightly degrades when the load factor goes beyond one.

Performance

So here are the numbers for the IntIntHashTable. All benchmarks were run with the latest Flash Player for Windows. You can compile & run the benchmark on your own, but I guess you'll have to adjust the number of iterations or you'll get a script timeout.

The chart below shows how the HashTable behaves when a slot contains more than one key. A load factor of 1 means that each slot contains only one key/value pair. A load factor of 2 means each slot has two pairs that have to be searched sequentially and so on.

So it’s very predictable. A load factor of 10 makes the hash table 10x slower.

The chart and table below compare read/write/access operations between an IntIntHashTable and the Dictionary. Both structures are preallocated to minimize the time it takes to find a key.

size             1024   2048   4096   8192   16384   32768   65536   131072
IntIntHashTable  0.08   0.14   0.29   0.57   1.15    2.29    4.75    10.03
Dictionary       0.84   1.54   3.3    6.7    16.5    44.76   97.72   218.32
x times faster   10x    11x    11x    11x    14x     19x     20x     21x

Results are in milliseconds.

The last chart compares the generic HashTable implementation against the Dictionary class. The difference is not as big, but the HashTable is still about 20-30% faster.

size             1024   2048   4096   8192   16384   32768   65536   131072
HashTable        0.28   0.57   1.15   2.3    4.7     9.28    20.55   39.45
Dictionary       0.37   0.72   1.35   2.73   5.6     13.48   27.45   52.48
x times faster   1.32x  1.26x  1.17x  1.19x  1.19x   1.45x   1.34x   1.33x

Results are in milliseconds.

Conclusion

While in most cases you won't notice a big difference, I have some applications in mind – like hierarchical spatial hash tables for collision pruning – where the extra speed should be very noticeable. So what's next for ds? Probably a deque structure and lots of bug fixing, of course ;).