Thursday, January 2, 2014

Performance of Gatling vs. JMeter: FACT CHECKING

Fact

"Gatling has much better performances than JMeter, See for yourself!", and the following two graphs are shown:

Gatling 1.3.2:

JMeter 2.8:

Context

I have been following with interest the hype around Gatling that has been running for a year and a few months. The tool looked interesting (asynchronous system, reactor pattern...), and I used it a bit at home and on a non-critical project to see it at work. I did not continue using it in a professional context due to its highly unstable API, its limited features compared to JMeter, and its incomplete HTTP protocol implementation at the time I started with Gatling 1. The HTTP protocol implementation now seems close to complete, but the other issues I mentioned remain. Note that I wrote about my experience and thoughts on Gatling in a previous blog post.

One of the KILLER arguments for Gatling is that it is supposed to have much better performance than JMeter; if you look at their website, it is the very first argument, "High performances", which points to:

I don't know what the exact intention of this page was. Was it to somewhat discredit JMeter by claiming it did not inject the load on Tomcat in a stable way, the conclusion being "how can you trust the results, guys?" It seems so:

I recently exchanged tweets with Stéphane Landelle (the developer of Gatling) about it, and his answer was that he had run the test on an old machine but had forgotten to update the website. I must say this gave me the idea to take some time and work on the benchmark myself:

Back at the time of Gatling 1.3.2, I ran the same benchmark on my machine, which is similar to the one used in the JMeter benchmark, and what disturbed me was the different behavior of the Tomcat instance depending on whether Gatling or JMeter was running the test (far fewer connector threads were started when Gatling was the load test engine).

I ran the test again with 1.4.2 and still got this difference, and still that perfect graph.

So I said to myself: man, you're stupid, there's a mystery here you don't understand...

Mystery Uncovered:

This suddenly became clear to me with the release of Gatling 2 and its backport to Gatling 1.5.0:

Gatling: I added a disableCaching call to be in the same conditions as the JMeter benchmark, which does not use the caching feature (no HTTP Cache Manager in the Test Plan). The script on the Gatling website worked in Gatling 1.3.2 because there was also a bug in caching; currently, with 1.5.3, it fails on the status 200 check, as one GET gets cached (and so receives a 304 response).

Gatling script (in red, the only modification I made):
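The script itself appears as an image in the original post. As a rough, hypothetical sketch (not the author's exact script), a Gatling 1.5-style simulation with caching disabled might look like the following; the class name, base URL, load profile, and request names are illustrative assumptions:

```scala
// Hypothetical sketch of a Gatling 1.5-style simulation; names and values are illustrative.
import com.excilys.ebi.gatling.core.Predef._
import com.excilys.ebi.gatling.http.Predef._

class BenchmarkSimulation extends Simulation {
  def apply = {
    // disableCaching is the one modification: it puts Gatling in the same
    // conditions as the JMeter Test Plan, which has no HTTP Cache Manager.
    val httpConf = httpConfig
      .baseURL("http://localhost:8080") // illustrative target (the benchmarked Tomcat)
      .disableCaching

    val scn = scenario("Benchmark")
      .exec(
        http("home")
          .get("/index.html")
          // Without disableCaching, a repeated GET can be answered from cache
          // with a 304, which makes a status-200 check fail.
          .check(status.is(200)))

    List(scn.configure.users(100).ramp(10).protocolConfig(httpConf))
  }
}
```

This is a configuration fragment for the Gatling load test engine and only runs inside Gatling itself.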

The TRUTH

Results:

Gatling 1.5.3:

As you can see, the graph is a bit different from the one on the Gatling home page: less perfect!

Now let's zoom in to be in the same conditions as JMeter (no ramp-down):

Wow, the graph looks much less "PERFECT". Yes, the zoomed-out view is deceptive!

JMeter 2.11: This is the result of using the great jmeter-plugins. To be in the same display conditions as Gatling, which has one point per second, I set the option "Limit number of points in row" to 607 (the test lasts 10 minutes and 7 seconds, i.e. 607 seconds, hence 607 points at one point per second):

And here is what we get:

Conclusion:

Now let's scale the JMeter graph (bottom) to have the same display ratio as the Gatling one (top), and here is what we get:

WOW, neither graph is perfect, and they are pretty much the same!!!!

Lessons learned:

Beware of benchmark conditions, and of what you believe to be "quite similar conditions".