Thursday, December 10, 2009

I've been experimenting with some purely functional collections, also known as persistent collections, and began to wonder precisely how much overhead they incur as compared to the standard framework collections on a VM platform such as .NET.

My open source Sasa library has long had a persistent stack type, considered a list in functional languages, and I was considering adding some additional persistent collections. I therefore wanted a better understanding of the various costs involved.

I also wanted to test an interesting design alternative to standard class-based collections. One of the consistent nuisances on widely deployed VM platforms is the pervasive presence of null values. C#'s extension methods somewhat mitigate this problem, because an extension method can be called on a null receiver and handle that case itself, but calling an instance method on a null value throws a NullReferenceException. Therefore, you cannot invoke ToString or Equals without first checking for null.

This is particularly annoying for data types where null is in fact a valid value, as it is in linked lists, where null denotes the empty list. The idea I had was to make the data type a struct that wraps an inner reference holding the actual collection contents, which moves the null into a non-nullable type that I control.

A struct type can never be null, so I can override Equals and ToString, yet still treat null as a valid value: a null inner reference indicates an empty list, for example.
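As a rough illustration of the struct-wrapper idea, here is a minimal sketch of a persistent stack. The names Seq2<T>, Node, Push, Peek and Pop are chosen for this example and are assumptions; they are not necessarily the names or the exact shape used in Sasa.

```csharp
using System;
using System.Text;

// Hypothetical sketch of a struct-based persistent stack (not the Sasa source).
public struct Seq2<T>
{
    // Immutable heap-allocated cell; a null reference means "empty list".
    sealed class Node
    {
        public readonly T Head;
        public readonly Node Tail;
        public Node(T head, Node tail) { Head = head; Tail = tail; }
    }

    // default(Seq2<T>) leaves this field null, which is the empty list.
    readonly Node node;

    Seq2(Node node) { this.node = node; }

    public bool IsEmpty { get { return node == null; } }

    // Returns a new list that shares the existing cells; the receiver is unchanged.
    public Seq2<T> Push(T value) { return new Seq2<T>(new Node(value, node)); }

    public T Peek()
    {
        if (node == null) throw new InvalidOperationException("Empty list.");
        return node.Head;
    }

    public Seq2<T> Pop()
    {
        if (node == null) throw new InvalidOperationException("Empty list.");
        return new Seq2<T>(node.Tail);
    }

    // Safe to call on the empty list because a struct receiver is never null.
    public override string ToString()
    {
        var sb = new StringBuilder("[");
        for (var n = node; n != null; n = n.Tail)
        {
            sb.Append(n.Head);
            if (n.Tail != null) sb.Append(", ");
        }
        return sb.Append("]").ToString();
    }
}

public static class Program
{
    public static void Main()
    {
        var empty = default(Seq2<int>); // no allocation and no null receiver
        var one = empty.Push(1);
        var two = one.Push(2);
        Console.WriteLine(empty); // prints "[]"
        Console.WriteLine(two);   // prints "[2, 1]"
        Console.WriteLine(one);   // prints "[1]", since pushing did not modify it
    }
}
```

The struct is a single reference wide, so copying it costs about the same as copying the class reference it replaces.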

I copied over the Sasa Seq type, implemented a struct-based version of it, and similarly implemented two persistent versions of a queue type. I then ran some quick benchmarks against the system collection classes List<T>, LinkedList<T>, Stack<T> and Queue<T>, using a random sequence of 10,000,000 enqueues and dequeues. This test was repeated 10 times in each run, and the average runtime and memory use were taken.

This process was then repeated 13 times for each collection type, and I threw away the two highest and two lowest values, leaving nine per collection. The results follow, where all values are in ticks as measured by System.Diagnostics.Stopwatch. Each column is sorted independently, with the fastest run at the top and the slowest at the bottom, so a row does not correspond to a single shared run.
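The measurement procedure can be sketched roughly as follows. This is not the original harness: the Bench class, the TrimmedSamples helper, the fixed seed and the smaller operation count are all assumptions made for illustration, and the inner 10-repetition average is omitted for brevity.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

public static class Bench
{
    // Times `body` `runs` times and returns the elapsed Stopwatch ticks in
    // ascending order, with the `trim` lowest and `trim` highest samples dropped.
    public static long[] TrimmedSamples(Action body, int runs, int trim)
    {
        var samples = new List<long>();
        for (int i = 0; i < runs; i++)
        {
            var sw = Stopwatch.StartNew();
            body();
            sw.Stop();
            samples.Add(sw.ElapsedTicks);
        }
        samples.Sort();
        return samples.Skip(trim).Take(runs - 2 * trim).ToArray();
    }
}

public static class Program
{
    public static void Main()
    {
        // Pre-generate one random enqueue/dequeue sequence so every run replays it.
        // 1,000,000 operations keeps the sketch quick; the post used 10,000,000.
        const int ops = 1000000;
        var rng = new Random(42);
        var enqueue = new bool[ops];
        for (int i = 0; i < ops; i++) enqueue[i] = rng.Next(2) == 0;

        Action body = () =>
        {
            var q = new Queue<int>();
            for (int i = 0; i < ops; i++)
            {
                // Enqueue when asked to, or when there is nothing to dequeue.
                if (enqueue[i] || q.Count == 0) q.Enqueue(i);
                else q.Dequeue();
            }
        };

        // 13 runs, dropping the 2 fastest and 2 slowest, leaves 9 samples.
        var kept = Bench.TrimmedSamples(body, 13, 2);
        Console.WriteLine("kept {0} samples, mean {1:F2} ticks", kept.Length, kept.Average());
    }
}
```

Stopwatch ticks are in units of the underlying timer, so dividing by Stopwatch.Frequency converts them to seconds when comparing across machines.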

Queue Operations

Legend:

PQueue<T>: persistent class-based queue.

LinkedList<T>: System.Collections.Generic.LinkedList.

PQueue2<T>: persistent struct-based queue.

Queue<T>: System.Collections.Generic.Queue.

| PQueue<T> (ticks) | LinkedList<T> (ticks) | PQueue2<T> (ticks) | Queue<T> (ticks) |
|---|---|---|---|
| 7699324462.55 | 7890322380.36 | 7679249692.36 | 7761780264 |
| 7718837154.91 | 7983445563.64 | 7687912510.55 | 7803131562.18 |
| 7740207043.64 | 8023998067.64 | 7703769050.18 | 7810452602.91 |
| 7828445642.91 | 8032055720.73 | 7800245839.27 | 7811040914.91 |
| 8451346377.45 | 8201663652.36 | 7834188839.27 | 7850104080 |
| 8589153123.64 | 8282536765.09 | 7955433135.27 | 7935923176 |
| 8707905445.82 | 8484668378.91 | 8178008811.64 | 8118527255.27 |
| 8764140750.55 | 8742958103.27 | 8323106361.45 | 8348756291.64 |
| 8979406774.55 | 8752582049.45 | 8327369725.09 | 8350023904 |
| Average: 8275418530.67 | Average: 8266025631.27 | Average: 7943253773.9 | Average: 7976637783.43 |

Stack Operations

Legend:

Seq<T>: persistent class-based stack.

List<T>: System.Collections.Generic.List.

Seq2<T>: persistent struct-based stack.

Stack<T>: System.Collections.Generic.Stack.

| List<T> (ticks) | Seq<T> (ticks) | Stack<T> (ticks) | Seq2<T> (ticks) |
|---|---|---|---|
| 7710263064 | 7826307262.55 | 7802069434.91 | 7690359998.55 |
| 7762791045.82 | 7870793184 | 7808408813.82 | 7770721440.73 |
| 7849427981.09 | 8035572504.73 | 7840059360 | 7989185347.64 |
| 7905690538.18 | 8099787029.09 | 7883823349.09 | 8201203589.82 |
| 7912993893.82 | 8315915441.45 | 7907107597.82 | 8208994094.55 |
| 7947893535.27 | 8349995466.18 | 7914102997.82 | 8239609125.82 |
| 7952697770.18 | 8396224841.45 | 7927640684.36 | 8352575028.36 |
| 8131449759.27 | 8694619845.82 | 8155345951.27 | 8376635294.55 |
| 8388583725.09 | 8738260760 | 8294985193.45 | 8540294446.55 |
| Average: 7951310145.86 | Average: 8258608481.7 | Average: 7948171486.95 | Average: 8152175374.06 |

Analysis

The persistent collections perform rather well, with averages within 5% of their mutable counterparts (about 4% slower in the worst case) given this set of test data. While I didn't calculate it, you can see from the spread of the runs that the persistent collections have a wider standard deviation, so their performance is ever so slightly less predictable, though not drastically so.
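To make the "within 5%" figure concrete, the relative differences can be computed directly from the reported averages. The helper name OverheadPercent is an assumption for this sketch; the numbers are the averages listed in the results.

```csharp
using System;

public static class Program
{
    // Relative difference of `persistent` against `baseline`, in percent.
    // Positive means the persistent collection was slower.
    public static double OverheadPercent(double persistent, double baseline)
    {
        return (persistent - baseline) / baseline * 100.0;
    }

    public static void Main()
    {
        // Average ticks copied from the queue and stack results.
        double pQueue = 8275418530.67, pQueue2 = 7943253773.9, queue = 7976637783.43;
        double seq = 8258608481.7, seq2 = 8152175374.06, stack = 7948171486.95;

        Console.WriteLine("PQueue  vs Queue: {0:F1}%", OverheadPercent(pQueue, queue));   // roughly 3.7
        Console.WriteLine("PQueue2 vs Queue: {0:F1}%", OverheadPercent(pQueue2, queue)); // roughly -0.4
        Console.WriteLine("Seq     vs Stack: {0:F1}%", OverheadPercent(seq, stack));     // roughly 3.9
        Console.WriteLine("Seq2    vs Stack: {0:F1}%", OverheadPercent(seq2, stack));    // roughly 2.6
    }
}
```

On these averages the struct-based queue comes out marginally faster than Queue<T>, while the other three persistent types are a few percent slower than their mutable baselines.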

Interestingly enough, the struct-based persistent collections outperformed the class-based versions. I expected the reverse, considering struct operations are not always properly optimized by the JIT. Even though the entire struct would fit into a register, the JIT may still allocate a stack slot for it, which would be more expensive than the guaranteed register-sized operations of a class type. Upon further thought, I suspected that the struct versions might be faster simply because the VM doesn't need to perform a null check on dispatch. That explanation doesn't hold up, however, because the class-based versions use all-static method calls, which don't perform null checks as far as I know.

I don't yet have a good explanation for this, but given the results, I believe I will move all Sasa collections to struct implementations, since doing so provides more flexibility, immunity from null errors, and no appreciable runtime overhead.

If I had more time I would make these tests a little more rigorous by varying the test vectors in a more controlled fashion instead of just randomly, i.e. by using stepped, random, and other types of enqueue/dequeue sequences to determine exactly how persistent and mutable collections behave for different inputs. I suspect the performance profiles will differ more drastically across such scenarios.

This test was enough to demonstrate to me that persistent collections are sufficiently performant to be used in daily code, particularly given their numerous other advantages.