I originally titled this post NHibernate Stupid Perf Tricks, but decided to remove that. The purpose of this post is to show some performance optimizations that you can take advantage of with NHibernate. This is not a benchmark; the results aren’t useful for anything except comparing to one another. I would also like to remind you that NHibernate isn’t intended for ETL scenarios. If you need that, you probably want to look into ETL tools rather than an OR/M developed for OLTP scenarios.

There is wide scope for performance improvements outside what is shown here. For example, the database was not optimized, the machine was in use throughout the run, etc.

To start with, here is the context in which we are working. It will be used to run the different scenarios.

Note that we create a separate session for each element. This is probably the slowest way of doing things, since it means that we significantly increase the number of connections open/close and transactions that we need to handle.
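A minimal sketch of the session-per-row baseline described above, not the post's original listing. The `SessionPerRow` class and its `Insert` helper are hypothetical names, and the sketch assumes an already configured `ISessionFactory` and entities that have NHibernate mappings.

```csharp
using System.Collections.Generic;
using NHibernate;

// Hypothetical helper: not part of NHibernate or of the original post.
public static class SessionPerRow
{
    // Opens a new session and a new transaction for every entity.
    // Deliberately the slowest approach; it only serves as a baseline.
    public static void Insert<T>(ISessionFactory sessionFactory, IEnumerable<T> entities)
        where T : class
    {
        foreach (var entity in entities)
        {
            using (ISession session = sessionFactory.OpenSession())
            using (ITransaction tx = session.BeginTransaction())
            {
                session.Save(entity);
                tx.Commit(); // one commit per row
            }
        }
    }
}
```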

This is here to give us a baseline on how slow we can make things, to tell you the truth. Another thing to note is that this is simply serial, which is just another example of how this is not a true representation of how things happen in the real world. In real world scenarios, we are usually handling small requests, like the one simulated above, but we do so in parallel. We are also using a local database rather than the far more common remote DB, which skews the results even further.

Anyway, the initial approach took 21.1 minutes, or roughly a row every two and a half milliseconds, about 400 rows / second.

I am pretty sure most of that time went into connection & transaction management, though.

So the first thing to try was to see what would happen if I did the same work using a single session. That removes the cost of opening and closing the connection and of creating lots of new transactions.

I expect that this will be much faster, but I have to explain something. It is usually not recommended to use the session for doing bulk operations, but this is a special case. We are only saving new instances, so the flush does no unnecessary work and we only commit once, so the save to the DB is done in a single continuous stream.
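A minimal sketch of the single-session variant, under the same assumptions as before (a configured `ISessionFactory`, mapped entities). `SingleSession` and `Insert` are hypothetical names, not the post's original code.

```csharp
using System.Collections.Generic;
using NHibernate;

// Hypothetical helper: not part of NHibernate or of the original post.
public static class SingleSession
{
    // One session and one transaction for the whole set of entities,
    // so the connection is opened once and there is a single commit.
    public static void Insert<T>(ISessionFactory sessionFactory, IEnumerable<T> entities)
        where T : class
    {
        using (ISession session = sessionFactory.OpenSession())
        using (ITransaction tx = session.BeginTransaction())
        {
            foreach (var entity in entities)
            {
                session.Save(entity);
            }
            tx.Commit(); // single commit at the end
        }
    }
}
```

Keep in mind that a regular session tracks every saved instance in its first-level cache, which is one reason it is usually not recommended for bulk work.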

This version ran for 4.2 minutes, or roughly 2 rows per millisecond, about 2,000 rows / second.

Now, the next obvious step is to move to stateless session, which is intended for bulk scenarios. How much would this take?
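A minimal sketch of the stateless-session variant, again assuming a configured `ISessionFactory` and mapped entities. `StatelessInsert` is a hypothetical name. `OpenStatelessSession` and `IStatelessSession.Insert` are the NHibernate calls that take the place of `OpenSession` and `Save`.

```csharp
using System.Collections.Generic;
using NHibernate;

// Hypothetical helper: not part of NHibernate or of the original post.
public static class StatelessInsert
{
    // Same shape as the single-session version, but the stateless session
    // keeps no first-level cache and does no change tracking.
    public static void Insert<T>(ISessionFactory sessionFactory, IEnumerable<T> entities)
        where T : class
    {
        using (IStatelessSession session = sessionFactory.OpenStatelessSession())
        using (ITransaction tx = session.BeginTransaction())
        {
            foreach (var entity in entities)
            {
                session.Insert(entity);
            }
            tx.Commit();
        }
    }
}
```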

As you can see, the code is virtually identical. I expect the performance to be slightly improved, but on par with the previous version.

This version ran for 2.9 minutes, about 3 rows per millisecond and close to 2,800 rows / second.

I am actually surprised. I expected it to be slightly faster, but it was much faster.

There are still performance optimizations that we can make, though. NHibernate has a rich batching system that we can enable in the configuration:

<property name='adonet.batch_size'>100</property>

With this change, the same code (using stateless sessions) runs in 2.5 minutes, at 3,200 rows / second.

This doesn’t show as much improvement as I hoped it would. This is an example of how a real world optimization fails to show its promise in a contrived example. The purpose of batching is to make as few remote calls as possible, which dramatically improves performance. Since we are running on a local database, it isn’t as noticeable.
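The batch size can also be set in code instead of XML. The sketch below assumes a `hibernate.cfg.xml` (or equivalent) that `Configure()` can find. `BatchingConfig` is a hypothetical name, while `NHibernate.Cfg.Environment.BatchSize` is the constant for the `adonet.batch_size` key.

```csharp
using NHibernate;
using NHibernate.Cfg;

// Hypothetical helper: not part of NHibernate or of the original post.
public static class BatchingConfig
{
    public static ISessionFactory BuildSessionFactory()
    {
        var cfg = new Configuration();
        cfg.Configure(); // assumes a hibernate.cfg.xml is available

        // Equivalent to <property name='adonet.batch_size'>100</property>.
        // Fully qualified to avoid a clash with System.Environment.
        cfg.SetProperty(NHibernate.Cfg.Environment.BatchSize, "100");

        return cfg.BuildSessionFactory();
    }
}
```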

Just to give you some idea about the scope of what we did, we wrote 500,000 rows and 160MB of data in a few minutes.
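To compare the runs on one scale, the rows-per-second figures can be recomputed from the run times. This is a small sketch using only the numbers quoted in this post; the `Throughput` class is a hypothetical name. The minute figures are rounded, so the computed rates can differ slightly from the ones quoted.

```csharp
using System;

// Hypothetical helper for the arithmetic only.
public static class Throughput
{
    // Rows per second for a run that wrote `rows` rows in `minutes` minutes.
    public static double RowsPerSecond(int rows, double minutes)
    {
        return rows / (minutes * 60.0);
    }

    public static void Main()
    {
        const int rows = 500000; // total rows written, as stated in the post

        // Run times in minutes, as quoted in the post (rounded figures).
        var runs = new (string Name, double Minutes)[]
        {
            ("session per row", 21.1),
            ("single session", 4.2),
            ("stateless session", 2.9),
            ("stateless session + batching", 2.5),
        };

        foreach (var run in runs)
        {
            Console.WriteLine("{0}: {1:F0} rows / second",
                run.Name, RowsPerSecond(rows, run.Minutes));
        }
    }
}
```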

Now, remember, those aren’t numbers you can take to the bank. Their only use is to show that a few very simple steps improved performance in a really contrived scenario by 90% or so. And yes, there are other tricks that you can utilize (preparing commands, increasing the batch size, parallelism, etc.). I am not going to try to outline them, though, for the simple reason that this performance should be quite enough for everyone who is using an OR/M. That brings me back to my initial point: OR/Ms are not about bulk data manipulation. If you want to do that, there are better methods.

For the scenario outlined here, you probably want to make use of SqlBulkCopy, or the equivalent for doing this. Just to give you an idea about why, here is the code:
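A minimal sketch of what a `SqlBulkCopy` load can look like, not the post's original listing. It assumes SQL Server, a `DataTable` whose columns line up with the destination table, and a connection string and table name supplied by the caller. `BulkLoad` is a hypothetical name.

```csharp
using System.Data;
using System.Data.SqlClient;

// Hypothetical helper: not part of the original post.
public static class BulkLoad
{
    // Streams all rows of the DataTable to the destination table in bulk.
    public static void Insert(string connectionString, string destinationTable, DataTable rows)
    {
        using (var bulkCopy = new SqlBulkCopy(connectionString))
        {
            bulkCopy.DestinationTableName = destinationTable;
            bulkCopy.WriteToServer(rows);
        }
    }
}
```

There is no object mapping, change tracking, or per-row command involved here, which is why a bulk API is the better fit for this kind of load.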

So by this post, Oren, you have confirmed our tests for NH are near-optimal. We use almost identical code.

We show that our performance is 2 times higher, or just 1.5 times slower than SqlBulkCopy. And, as I've mentioned, today I'll explain how to get even higher performance (I expect we'll get at least 15-20% more) in my blog (http://blog.dataobjects.net).

I think being even 1.5 times slower than SqlBulkCopy is more than acceptable for complete storage independence.

Since when is bashing the competition a valid sales technique? The only thing is that you lose respect from potential customers. If this new tool is indeed so much better (in all aspects, because I only hear performance arguments, which is absolutely not the most important thing for an ORM) then the public will decide that for itself.

I also don't understand why Ayende is spending so much time on these silly things. Why not ignore it? Then a lot fewer people would even know about it, and it is not like you can ever change these persons' minds.

Yes... It was infected by a virus right after launch - our developers had forgotten to tune up the security properly. We resolved the issue almost immediately, but Google still remembers this, although the site is safe now.

Alex, is the source code of the benchmark itself publicly available? Even if we can argue all year long about how useful the benchmark itself is, there might be some people interested in actually profiling the frameworks to see where the 'bottleneck' is.

@ Oren: Which one? With hummer? Yes. If you're talking about this picture, I can only repeat the same ("I explained many many times why we don't test SqlBulkCopy"). I agree with your point: appropriate tool must be used for bulk insertions. But I wrote many many times in fact we didn't measure perf. of bulk insertions.

Ok, 100 insertions can be considered as bulk insertion operation (think about many-to-many rel. operations)? 10 insertions? Note that exact number does not matter much for the purpose of this test.

Btw, you still didn't answer my "bet". Sorry for pushing you, but since you're criticizing me publicly, I think you must follow the same rules here as well.

If you need access to source code repository (there is most current test code), please write to info @ ormbattle.net.

As for your bet, assuming you mean batching, it is meaningless. NH has had this for 4 years.

Well, I promised to describe our own batching & related techniques, and I'm writing the post about this ;) So I'm talking about the ideas I'm going to share.

I don't care about ADO.NET batching - i.e. obviously, I knew it exists, and I didn't mean it.

Moreover, I also wrote about materialization speed. If current materialization speed of NH is good enough, just say you won't optimize this further, and confirm that our results are meaningless in real life in action.

Ayende, I like reading your blog very much, but please, please stop blogging about performance and benchmarks for a while. I am really tired of seeing this one special face here over and over again. I miss posts like the one about the Erlang stuff. I would like to read something about the Axum Incubation Project, F#, or maybe about the sqlalchemy ORM for Python. It should work with IronPython now. What is possible with an ORM running on the DLR compared to CLR ones? Or is it more fun to use NH on IronPython? Thanks for your very informative posts. I have been in the IT business for 20 years and learn something new with every post from you. I really enjoy it ...but not this alex stuff...

I can totally understand why you don't want to take this guy's crap lying down, but I have to +1 as well - this post is a slam dunk. There just is no debating this fool if this doesn't make him realize the fallacy of his premise.

Alex I'm wondering, even if your orm does these kind of operations faster, the whole point is.. an orm isn't a batch processor, you can get as excited as you want that your orm is king of batching, but put in the ring with batching systems your orm won't stand a chance.

These sort of benchmarks just don't show a reality, maybe the reality is that your orm is faster, but you aren't convincing anyone with those kinda tests.. as you can see you are just alienating yourself, and I don't buy the argument that any press is good, not in such a professional environment.

You'd better relax, man. I really think Ayende should just ban you on his blog. The reason he hasn't done this yet is that he is a respectful person. But you simply use another person's blog for promotion, and that is f..ing dirty. IMHO, nobody will take your product seriously, for a couple of reasons:

No tricks will force me to prefer a commercial product done by a few guys from the Urals over a well-known, open source, time-proven, well-supported system (NH). I am not that crazy.

The only company phone number I found on the web site is the cell which belongs to you personally (Megafon - Ural - cheapest russian network) - ridiculous

Your promotion is really dirty, you gain no respect from community. So I think your customer relation is the same.

I doubt that your company has any idea about delivery management. All that I see is that you keep saying - we fix this immediately. So I expect that the product lacks code coverage and so on.

You'd better show at least some respect. Judging ORM tools by phone numbers and geographical coordinates - unheard-of stupidity, for my money. BTW, I hope Ayende doesn't need any support from such emo boys. Anyway - that's ok! What is not ok is that LEXX represents the general level of the community here - aggressive, arrogant and disrespectful to other points of view.

No need to wonder if most of the guys bashing Alex have ever given a try to his tool.

Alex, I believe you're a good guy, with honest intentions and good product. But I also believe that you are doing yourself and your product a big disservice by crusading here. You obviously struck a nerve with Ayende, and vice versa, but stop this now, before it's too late.

for those of you who still don't know what the only intention of alex here is read this book:

Positioning: The Battle for Your Mind, 20th Anniversary Edition

...Positioning describes a revolutionary approach to creating a "position" in a prospective customer's mind that reflects a company's own strengths and weaknesses as well as those of its competitors...

@ALL: if you keep commenting alex posts he will continue posting and posting. Just stop doing this and in a while he will be forgotten....

Thanks, Ayende, for this post. NHibernate is giving me performance issues even on the most trivial of tasks, so I hope what you put in this article will help me. Regardless of how powerful an ORM tool is and how easy it can make your life, performance will always be at the top of the list. Great will be the day when an ORM can perform very close to straight ADO.NET. I find all your posts informative. Keep it up.