Rants of an ADO.NET Dev

Tag Archives: Collections

“The concept of a stack in programming is very similar to a stack of plates in real life, except that you can cheat a little – for instance, if you’re willing to accept that plates can float, then you can ignore gravity.”

One of the fantastic things about using a rich framework like .Net is that many of the basic data structures that programmers require already exists in a efficient and easy to use form. We also make sure to update these pre-built data structures with the new features that we introduce in the language, for instance in .Net 2.0 we introduced generics, and so the System.Collections.Generic namespace was created. With.Net 4.5 we are introducing new async APIs and the async\await keyword pair, meaning that programmers will now need to deal with concurrency and multithreading more often especially if their application has any shared data structures. Luckily enough, we already have the appropriate APIs that were introduced in 4.0: the System.Collections.Concurrent namespace.

Stack it. Queue it. Bag it.

The first couple of APIs I’d like to introduce you to is the ConcurrentStack and ConcurrentQueue. These are exactly as they sound: A stack and a queue that permit concurrent operations. One common trap when writing a multithreaded application is the pattern of checking the Count of the collection to see if an item is available, before attempting to take an item – which is fine with single threading, but in a multithreaded environment it is possible to have another thread jump in between your check and getting the item which then takes the last item before you can. Instead, the concurrent collections have the the TryPop and TryDequeue methods, which will atomically check the size of the structure and return an item to you if there is one available.

The other data structure I hinted at is the ConcurrentBag. Unlike the Stack or Queue, the ConcurrentBag has no guarantees about the order of output versus input – it’s an “Any In, Any Out” collection. This allows ConcurrentBag can have a much more efficient implementation when there is contention, since the collection can return any object that it currently holds, rather than having to coordinate with any of the threads accessing it in order to return objects in the correct order. One of the best uses of a ConcurrentBag is for a non-time sensitive resource cache, like a buffer pool – where you want to have the best performance even with contention, but it is ok to not return the most recently used object (as you would want to do with a connection pool).

Performance

One of the issues with writing multithreaded applications is attempting to measure performance, especially when you have a shared resource. If you are running a single threaded test with the above data structures, then you may notice that simply putting a standard Stack or Queue inside of a lock gives better performance than the Concurrent equivalent. However, introduce some contention (i.e. have multiple threads attempting to access the same object), and the Concurrent structures begin to shine. Additionally, you need to be careful when doing multithreaded micro-benchmarks as you may introduce too much contention (since a “real” application is likely to do some work with the object is just obtained, rather than handing it back to the collection).that would then skew your results.

However, unless you have a high-performance single threaded application or a multithreaded application with no shared resources, then a concurrent collection will be your best bet. It may be slower in an application with little load, but it will be much easier to scale it to a larger application if needed.