Delphi 2009 – StringPerformance Redux

It looks like I may have jumped the gun with my conclusions from the previous exercise to benchmark string performance in Delphi 2009. Following a useful exchange in the comments with Kryvich I corrected a small discrepancy in the tests and made some changes to the performance testing subsystem within the SmokeTest framework. I then re-ran my string performance benchmarks with some significant – and more encouraging – differences in the results.

I shall not go into the specifics of the tests again – if you’re interested and missed it the first time around you can read my previous post.

The major changes were made in the SmokeTest performance testing subsystem itself:

1. Corrected a bug in the CPU affinity code (oops!)

2. Implemented CPU instruction cache flushing

The CPU affinity fix now correctly ensures that the code under test is executed on the 2nd processor in any N-core or N-processor hardware. The main thread of the SmokeTest framework application itself is assigned to the 1st CPU.

This is intended to eliminate context switching artefacts and main thread impact from the code under test. The bug was that the main thread was inadvertently allowed to run on all available CPUs. The impact on the test results is negligible, I think, but even so.
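For anyone who has not pinned threads before, the arrangement described above can be sketched roughly as follows. This is not the SmokeTest code: it assumes the Win32 SetThreadAffinityMask call is the mechanism, and TTestThread is a hypothetical stand-in for whatever actually runs the code under test.

```delphi
program AffinitySketch;

{$APPTYPE CONSOLE}

uses
  Windows, SysUtils, Classes;

type
  // Hypothetical worker thread standing in for the code under test.
  TTestThread = class(TThread)
  protected
    procedure Execute; override;
  end;

procedure TTestThread.Execute;
begin
  // Bit 1 of the mask selects the 2nd logical processor.
  // The call returns 0 on failure, e.g. on a single-CPU machine or when
  // the process is not allowed to run on that processor.
  if SetThreadAffinityMask(GetCurrentThread, 1 shl 1) = 0 then
    RaiseLastOSError;

  // ... the code under test would run here ...
end;

var
  Worker: TTestThread;
begin
  // Bit 0 of the mask pins the main thread to the 1st logical processor.
  if SetThreadAffinityMask(GetCurrentThread, 1 shl 0) = 0 then
    RaiseLastOSError;

  Worker := TTestThread.Create(False);  // False = start running immediately
  try
    Worker.WaitFor;
  finally
    Worker.Free;
  end;
end.
```

Each thread sets its own mask here, so a mistake in one place (such as forgetting the main thread) leaves that thread free to run on every CPU, which is the kind of bug described above.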

The CPU instruction cache flushing has a much more significant effect. It is performed following each execution of a method under test, and it has substantially changed the raw numbers coming out of the test results (i.e. the number of executions completed per second), but it should ensure that the performance of the code itself is tested, not the efficiency of the CPU cache.
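As a rough illustration of "executions per second with a flush after each execution", here is a minimal sketch. It makes several assumptions that are not taken from SmokeTest: that the Win32 FlushInstructionCache call is the flushing mechanism, that only the method call itself is timed (the flush is kept outside the timed region), and that ConcatTest is a representative method under test. All names are hypothetical.

```delphi
program ExecPerSecSketch;

{$APPTYPE CONSOLE}

uses
  Windows, SysUtils;

type
  TTestMethod = procedure;

// Hypothetical method under test: a simple concatenation loop.
procedure ConcatTest;
var
  s: String;
  i: Integer;
begin
  s := '';
  for i := 1 to 100 do
    s := s + 'x';
end;

// Calls Method repeatedly until roughly RunSeconds of timed work has
// accumulated, then returns the number of executions per second.
function ExecutionsPerSecond(Method: TTestMethod; RunSeconds: Double): Double;
var
  Freq, Start, Stop, Busy, Budget, Count: Int64;
begin
  Assert(RunSeconds > 0, 'RunSeconds must be positive');
  QueryPerformanceFrequency(Freq);
  Budget := Round(RunSeconds * Freq);  // time budget in counter ticks
  Busy := 0;
  Count := 0;
  while Busy < Budget do
  begin
    QueryPerformanceCounter(Start);
    Method;
    QueryPerformanceCounter(Stop);
    Inc(Busy, Stop - Start);
    Inc(Count);
    // Assumed flush step, performed after each execution and not timed.
    FlushInstructionCache(GetCurrentProcess, nil, 0);
  end;
  Result := Count * Freq / Busy;
end;

begin
  WriteLn(Format('%.0f executions/sec', [ExecutionsPerSecond(ConcatTest, 1.0)]));
end.
```

For very short methods the cost of reading the counter twice per execution becomes noticeable, so a real harness would probably need to account for that overhead.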

The source code being tested is essentially unchanged from that previously used, although it contains one minor correction to the Indexing test case, already documented in the comments of that previous post.

I have updated the results download with the new data. This time I have included the Excel spreadsheet with the pretty formatting of the results data comparison and also all three of the raw results files emitted in CSV format by the test app itself.

Observations On The New Results

The new test results actually fit more intuitively with what we might expect, albeit still with one or two noteworthy results.

1. Overall, Unicode string handling is about 5-10% slower than ANSI string handling in Delphi 2007, but is comparable to, and in many cases better than, Delphi 7.

2. Char-wise operations are the most adversely affected as we might expect but, somewhat surprisingly, simple assignment of strings actually comes out as the most badly affected of the basic operations.

3. ANSI string handling is generally as good as if not slightly better than Delphi 2007 overall, even more so when compared to Delphi 7. Notable exceptions to this remain in the form of the lack of an ANSI implementation for IntToStr() and a significantly slower Replace() implementation.

4. There is still a question mark over the raison d’être of TStringBuilder, although the difference in performance – whilst still very dramatic – is perhaps not as great as it first appeared.

Revised Conclusion

Concerns about the performance of ANSI strings were largely misplaced.

There remain a couple of potential gotchas in the form of IntToStr() and Replace() but, as previously noted, the FastReplace() implementation remains the gold standard for anyone concerned enough to use it (after taking care to ANSI-fy the API of FastStrings itself, of course).

Overall string handling performance at worst undoes some of the gains made in this area in Delphi 2007, but I think that is a reasonable trade for the Unicode capabilities added as a result.

I have also learned some valuable lessons that have improved the utility of my SmokeTest framework into the bargain, so thanks to all who questioned and probed the previous results.

Mmm, shouldn’t TStringBuilder be used in the following way to achieve actual performance benefits:

1) Create builder instance
2) Set builder Capacity, either as the real predicted size of the future string or just as a big enough amount.
3) Now use the convenient Append… methods
4) Get your result string.

Actually, this way you’re preallocating memory, preventing reallocations during additions. And this way you’re using nice methods instead of using SetLength(Result…) and manipulating the Result string directly, or using a memory stream.
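The four steps above might be sketched like this. The 8 KB capacity and the loop contents are arbitrary choices for illustration, and nothing here says whether this usage actually outperforms plain concatenation.

```delphi
program BuilderCapacitySketch;

{$APPTYPE CONSOLE}

uses
  SysUtils;

var
  Builder: TStringBuilder;
  i: Integer;
  s: String;
begin
  Builder := TStringBuilder.Create;     // 1) create the builder instance
  try
    Builder.Capacity := 8 * 1024;       // 2) reserve room up front (a guess,
                                        //    measured in characters)
    for i := 1 to 1000 do               // 3) use the Append methods
    begin
      Builder.Append(i);                //    Integer overload
      Builder.Append(';');              //    Char overload
    end;
    s := Builder.ToString;              // 4) get the resulting string
  finally
    Builder.Free;
  end;

  // The result fits comfortably within the reserved capacity, so the
  // builder should not have needed to grow its buffer while appending.
  WriteLn(Length(s));
end.
```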

Regarding the StringBuilder – AFAIK, the reason it was introduced in .NET was not (directly) performance, but rather the unsuitability of the .NET memory manager for frequent small changes to strings (i.e., changes in their sizes). Indirectly this, of course, affects performance too.

I don’t know how far this applies to Delphi and its memory manager, or whether it could be interpreted as preparation for the introduction of garbage collection in Delphi.

@Bruce – I also think there are very interesting times ahead on the .NET side of things… very interesting indeed. And in a good way. 🙂

@Kashmi – I once had the same thought and created my own string builder-like class a while ago specifically to handle specialised cases that I thought could be optimised. e.g. building a delimited list from a known set of elements, where the resulting string size could be pre-calculated and pre-set and then the contents placed directly into the resulting string buffer.

Intuitively this should give some improvement in performance.

In reality I found that the expected performance gains were simply not realised, and performance was actually worse than building the string up using regular concatenation.

I didn’t analyse it too closely. I put it down to the fact that using a class to encapsulate this stuff necessarily introduces its own overhead (method calls on top of the RTL code, as well as the construction and destruction of the class itself, etc).
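For reference, the specialised case mentioned above (a delimited list whose final size can be calculated in advance) can be sketched as a plain function rather than a class. JoinPresized is a hypothetical name, this is not the class described above, and no performance claim is attached to it.

```delphi
program PresizedJoinSketch;

{$APPTYPE CONSOLE}

uses
  SysUtils;

// Joins Items with Delim, allocating the result exactly once.
function JoinPresized(const Items: array of String; Delim: Char): String;
var
  i, Total, Idx: Integer;
begin
  Result := '';
  if Length(Items) = 0 then
    Exit;

  // Pre-calculate the final size: one delimiter between each pair of
  // items, plus the length of every item.
  Total := Length(Items) - 1;
  for i := Low(Items) to High(Items) do
    Inc(Total, Length(Items[i]));

  SetLength(Result, Total);             // the single allocation

  // Place the contents directly into the result buffer.
  Idx := 1;
  for i := Low(Items) to High(Items) do
  begin
    if i > Low(Items) then
    begin
      Result[Idx] := Delim;
      Inc(Idx);
    end;
    if Items[i] <> '' then
    begin
      // SizeOf(Char) keeps the byte count right for ANSI and Unicode alike.
      Move(Items[i][1], Result[Idx], Length(Items[i]) * SizeOf(Char));
      Inc(Idx, Length(Items[i]));
    end;
  end;
end;

begin
  WriteLn(JoinPresized(['alpha', 'beta', 'gamma'], ','));  // alpha,beta,gamma
end.
```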

@Daniel (Luyo) – I haven’t compared EXE sizes and I think doing so is a little tricky as it isn’t possible to compare a Delphi 2009 ANSI executable with a Delphi 2009 Unicode one since of course the former is not possible to produce. We could only compare a 2009 exe with a pre-2009 exe and I think there are too many other factors involved between versions for such a comparison to provide any useful measure of the impact in this area of Unicode specifically.

@Kryvich – Thanks. I feel these results have a more “truthy” feel about them too. 🙂

I don’t know what you know about .NET so I can’t know whether I know something you don’t know, cos I don’t know if you don’t know it or not. But if I did know that I knew something that you didn’t know, then self-evidently I must be able to tell that, so being simultaneously in a state where-at I could not tell would create an internal paradox and I would consequently most likely disappear in a puff of logic.

😉

But you only have to look at recent indications from CodeGear themselves that future Delphi.NET releases will not be so concerned with compatibility with Delphi.Win32 and will be more focused on leveraging the latest and greatest .NET technologies.