Software tester, tea drinker.

Recently I have been diving into optimising the performance of Selenium Cucumber tests. As there’s no single “make the tests faster” method, I thought it’d be interesting to document all the different approaches I took (partly for my own future reference, and partly for anybody else who may find it useful).

Some of these are common sense, some of them don’t relate to code at all, and some of them may be controversial. At the end of the day YMMV; these are what worked for this particular project:

1) Reduce the number of tests

Ultimately the biggest performance improvement came from something this simple. An initial pass over the entire test suite revealed there were a number of tests that were either duplicates in some subtle way, redundant as they weren’t doing any more than a unit test, or otherwise unneeded. Pruning out these tests not only had the benefit of reducing total time taken for a test run, but also made reading the test suite as a whole much easier, as it was now cleaner and only contained what was required.

2) Reduce redundant asserts in test steps

Again not really related to code, just common sense which often gets lost as a test suite builds up when there hasn’t been the time to go back over and rationalise everything as it grows. In the tests on this project there were many instances where we would assert something in one step and then assert the same thing again in a later step. Of course some of these were valid, but many were not; they were simply left in because the author didn’t realise that particular assert had already been done. There was also an example in this particular project of certain asserts being performed every single time a call was made to a page object. Changing this so that the asserts were only performed the first time the object was called, or whenever we were expecting the page to have changed, had quite a big impact on performance.
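
As a rough illustration of the page object change, here is a minimal, dependency-free sketch. `CheckedPage`, `markChanged()` and `readTitle()` are hypothetical names rather than anything from our project, and the check itself is passed in as a plain `BooleanSupplier` so the example runs without Selenium.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical page object that verifies the page once, not on every call. */
final class CheckedPage {
    private final BooleanSupplier pageCheck; // e.g. "is the expected heading displayed?"
    private boolean verified = false;
    private int checksRun = 0;

    CheckedPage(BooleanSupplier pageCheck) {
        this.pageCheck = pageCheck;
    }

    /** Runs the expensive check only if the page has not been verified since the last change. */
    private void ensureVerified() {
        if (verified) {
            return;
        }
        checksRun++;
        if (!pageCheck.getAsBoolean()) {
            throw new IllegalStateException("Page is not in the expected state");
        }
        verified = true;
    }

    /** Call from any step that is expected to change the page (navigation, form submit). */
    void markChanged() {
        verified = false;
    }

    /** Example page object method: does its real work after the cached check. */
    String readTitle() {
        ensureVerified();
        return "title"; // placeholder for a real element lookup
    }

    /** Number of times the expensive check has actually run. */
    int checksRun() {
        return checksRun;
    }
}
```

Three calls to `readTitle()` in a row run the check once; it only runs again after a step calls `markChanged()`.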

3) Don’t use Thread.sleep()

This shouldn’t really need to be said, but don’t use Thread.sleep() anywhere in your tests; this is precisely what FluentWait or WebDriverWait is for.

4) FluentWait/WebDriverWait poll interval

Now we’re getting into the juicy details of WebDriver. We had quite a few instances in our tests of waiting for certain elements to appear on the page or contain certain text before continuing with the tests, nothing complicated:

Here we are specifying a timeout of 5 seconds in the wait, however we aren’t specifying what the polling interval is (i.e. how often it checks to see whether the condition is true). As it turns out, the default polling interval is 500ms, rather a long time to wait between checking. This 500ms rapidly builds into seconds and minutes if you’re extensively using waits throughout your test suite. By adding a third parameter we can specify our own polling interval:
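
In Selenium 2 and 3 that third parameter is the `sleepInMillis` argument of `new WebDriverWait(driver, timeOutInSeconds, sleepInMillis)`. To show why the interval matters without needing a browser, the sketch below reimplements a polling wait in plain Java. It is not the WebDriver API; `PollingWait` and `until` are hypothetical names, and the condition is a plain `BooleanSupplier`.

```java
import java.util.function.BooleanSupplier;

/** Minimal polling wait: a stand-in for what a WebDriver wait does internally. */
final class PollingWait {
    private PollingWait() {}

    /**
     * Re-checks the condition every pollMillis until it holds or timeoutMillis has passed.
     * Returns the elapsed milliseconds at the moment the condition was seen to hold.
     */
    static long until(BooleanSupplier condition, long timeoutMillis, long pollMillis) {
        long start = System.nanoTime();
        while (true) {
            if (condition.getAsBoolean()) {
                return (System.nanoTime() - start) / 1_000_000;
            }
            long elapsed = (System.nanoTime() - start) / 1_000_000;
            if (elapsed >= timeoutMillis) {
                throw new IllegalStateException("Condition not met within " + timeoutMillis + " ms");
            }
            try {
                // Sleeping between polls is what a wait does internally; the interval
                // decides how late we notice that the condition has become true.
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting", e);
            }
        }
    }
}
```

If the condition becomes true 120ms in, a 500ms interval only notices at the second check, roughly 500ms after starting, whereas a 50ms interval notices at around 150ms. Multiply that difference by every wait in the suite and it adds up quickly.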

5) Element locators

A lot has been written on this subject elsewhere so I won’t go into too much detail, but not all element locators perform in the same way. Browsers are optimised to locate elements by ID, so try to use ID wherever possible. Badger your developers to add unique IDs to all elements; it makes everything so much easier, I can’t stress this enough.

At the other end of the spectrum, XPath is often slow and unreadable (though not always, of course!). If you absolutely must use something other than an ID, we’ve found CSS selectors are at least a little better and perform more reliably across browsers.

6) @BeforeAll instead of @Before hook

There were a few instances in our project where we had a before hook running something before each test that only really needed to be run once before all the tests (for example starting an email server; we really didn’t need to stop and start it before every single test). Sadly, at the time of writing, cucumber-jvm has not implemented a global @BeforeAll hook. However, as that post points out, writing your own is relatively simple.
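
One common way to roll your own is a static flag checked from the ordinary per-scenario hook, with a JVM shutdown hook for the teardown. The sketch below is plain Java so it runs on its own. `GlobalHooks`, `beforeAll()` and `afterAll()` are hypothetical names, the counter stands in for starting the email server, and in a real project `beforeAll()` would be called from a method annotated with cucumber-jvm’s `@Before`.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical run-once setup, intended to be called from every per-scenario hook. */
final class GlobalHooks {
    private static final AtomicBoolean STARTED = new AtomicBoolean(false);
    private static int startCount = 0;

    private GlobalHooks() {}

    /** Does the expensive setup on the first call only; later calls return immediately. */
    static void beforeAll() {
        if (!STARTED.compareAndSet(false, true)) {
            return; // already set up by an earlier scenario
        }
        startCount++; // placeholder for e.g. starting the email server
        // Tear down once, when the JVM running the tests exits.
        Runtime.getRuntime().addShutdownHook(new Thread(GlobalHooks::afterAll));
    }

    /** Placeholder for the one-off teardown, e.g. stopping the email server. */
    static void afterAll() {
    }

    /** Number of times the expensive setup has actually run. */
    static int startCount() {
        return startCount;
    }
}
```

However many scenarios call `GlobalHooks.beforeAll()`, the setup runs once.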

7) Controversial – use JavaScript for form input rather than sendKeys()

sendKeys() is slow. However, this is for good reason: Selenium tries to emulate user interaction, pressing each key as it types into a field. If you wish, though, you can bypass this with JavaScript (setting the value of the element and triggering onchange for example):
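
A minimal sketch of the idea follows. To keep it runnable without Selenium, `ScriptRunner` is a hypothetical interface with the same shape as `JavascriptExecutor.executeScript(String, Object...)`; in a real test you would delegate to the driver cast to `JavascriptExecutor` and pass the `WebElement` as the first argument. `FastInput` and `setValue` are also made-up names, and the script assumes a browser that supports the `Event` constructor.

```java
/** Hypothetical stand-in with the same shape as Selenium's JavascriptExecutor.executeScript. */
interface ScriptRunner {
    Object executeScript(String script, Object... args);
}

/** Sets a field's value via JavaScript instead of typing it key by key. */
final class FastInput {
    // arguments[0] is the element, arguments[1] the text to enter.
    // The value is set directly, then a change event is fired so onchange handlers still run.
    static final String SET_VALUE_SCRIPT =
            "arguments[0].value = arguments[1];"
          + "arguments[0].dispatchEvent(new Event('change', { bubbles: true }));";

    private FastInput() {}

    /** Runs the script against the given element; no key events are generated. */
    static void setValue(ScriptRunner runner, Object element, String text) {
        runner.executeScript(SET_VALUE_SCRIPT, element, text);
    }
}
```

Note that only `change` is fired here; anything listening for `keydown`, `keyup` or `input` will not be triggered.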

Whether you would want to, however, is an entirely different matter. Going down this route does give a quite visible performance improvement, but at the cost of risking not triggering certain things if your website is expecting a user to type keys in a certain way. What if not everything is triggered onchange? This is a controversial one, and the purists amongst you will certainly not advocate this approach.

8) Use a different webdriver

In many cases this isn’t possible, for instance when you’re using Selenium Grid to conduct your testing across multiple browsers. But where you do have a choice of WebDriver, remember that Firefox isn’t the only one available!

On our own project we noticed, for example, that ChromeDriver performed our tests about 10% faster than the FirefoxDriver. Going beyond that you could even look into a headless browser driver such as GhostDriver, which uses PhantomJS. Unfortunately, for whatever reason, some of our tests just refused to run with GhostDriver and there wasn’t much time to investigate. However, as an aside, we are using PhantomJS in conjunction with Jasmine for some of our unit tests, and just over 500 tests run in around 5 seconds there.

For those that really want to stick with the FirefoxDriver then there is always the following snippet of code that may speed up things:

While this does speed things up by having the driver not wait for pages to finish loading before proceeding, it can introduce instability and unpredictable behaviour in tests. Use with care!

9) Use selenium grid

One final simple way to speed up things is to parallelise tests via Selenium Grid. There is a vast array of documentation on this elsewhere, but essentially you can have your tests running across multiple machines simultaneously.

In order to make this work of course you need to be following one of the most fundamental principles of writing automated tests, that your tests are independent. If there is any coupling between scenarios then clearly splitting up your scenarios to run across multiple machines isn’t going to work at all.
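
To make the independence point concrete, here is a small sketch that uses local threads as a stand-in for grid nodes (it does not talk to Selenium Grid at all). `ParallelScenarios` and `runAll` are hypothetical names. Each scenario is a `Callable` that builds its own state, which is what lets them run in any order on any worker.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Runs independent scenarios on a fixed pool of workers (a local stand-in for grid nodes). */
final class ParallelScenarios {
    private ParallelScenarios() {}

    /** Executes every scenario on one of `workers` threads and returns how many passed. */
    static int runAll(List<Callable<Boolean>> scenarios, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            int passed = 0;
            // invokeAll blocks until every scenario has finished.
            for (Future<Boolean> result : pool.invokeAll(scenarios)) {
                if (result.get()) {
                    passed++;
                }
            }
            return passed;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while running scenarios", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A scenario threw an exception", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

If one scenario relied on data left behind by another, the result would depend on which worker happened to run first, which is exactly the coupling that stops a suite from being split up.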

After going through the code and implementing the above, we had reduced the time spent running the entire test suite by 60%. The best thing about it all: none of this is complex! While there’s no catch-all “make the tests run faster”, a few simple steps can make all the difference.

As I’m sure everybody involved in the tech industry is aware by now, there was a rather important security vulnerability in OpenSSL made public today, the so-called “Heartbleed Bug”. Two factors come together to make this a particularly nasty incident: the impact of the bug, which allows an attacker to view the contents of the memory on the compromised machine, and the widespread nature of vulnerable systems. With numbers such as “500,000” websites, and “millions of vulnerable systems” being thrown around, it’s easy for anybody to see how serious this is.

…or is it?

Have you heard anyone outside of the tech industry exhibit any concern? While it was mentioned on popular news outlets such as the BBC and The Guardian, it didn’t ever feature in the most viewed articles of the day. Heck, it didn’t even make the top 3 technology articles on The Guardian.

For the general public the layout of Twitter is of more importance than the mysterious technobabble of “encryption” and “servers”. This is something that we as an industry need to address. How do we get users to engage with technology? How do we get users to engage with technological security? How could we have got the word out quickly and widely enough that a majority of people stopped logging into Yahoo, for example?

For a start, I believe, there is terminology. Does the average person on the street know what a server is? Encryption? A bug? This is a genuine question; I don’t know. Perhaps we need to start out with some research into how far tech words have reached into the general population. We cannot make any assumptions on this front as we all exist within the bubble. Assuming people do not understand those words, what would be a better headline? What gets people’s attention?

Is it worth being a little alarmist in such situations? I mean, all the advice this morning was to avoid critical systems online until websites were patched. Did anybody outside of tech circles follow that advice? Did anybody actually see that advice? Certainly all the people still logging into Yahoo didn’t.

Secondly, beyond terminology, there is the problem that those in prominent communication positions (news outlets, politicians etc.) largely do not seem to understand technology themselves. If somebody from the tech department were chief editor of a newspaper, then I’d be willing to bet this would have been their front page story of the day.

What can we do? As individuals all we can do is continue to proselytise, keep your non-tech friends informed, keep writing to your politicians about technology issues. The more voices we add to the mix the more we will be listened to in future. Change in mindset will happen, not quickly but eventually.

But hold on a second… what if… maybe we’re looking at it from the wrong angle, maybe it isn’t about users not understanding technology.

Perhaps ultimately users shouldn’t have to care about technology.

Much of Apple’s success globally is down to the very principle that they shouldn’t. Are there technical measures that could have been implemented to safeguard customers without them needing to do anything? I realise this is drastic, but what would people have thought if an ISP blocked access to Yahoo for example until their vulnerability was fixed? Initial anger at not being able to access their mail, followed by a warm realisation that their ISP saved them from criminals stealing their passwords? As an industry we need to standardise on a single global voice for situations like this in future. If users do not understand enough to protect themselves then we need to do it for them. Far too many sites kept running instead of taking everything down until the problem was resolved. The action and onus should be on us to ensure end users are protected at all costs. If that means a sysadmin taking a unilateral decision to take down their multimillion dollar global website for a few hours then so be it. Explain it to the non-technical board/management afterwards. At the end of the day user safety and the reputation benefit is worth more to a company than a day of profit.

If users don’t care about technology, not a problem, we just have to make sure they don’t need to.

Set in the trendy tech area around Old Street (exposed brick and pipe work indoors abound!), the day started off with spotting a minor bug while walking towards the venue (it was an hour out). Arrived shortly after, checked into the conference, picked up the obligatory name badge and collection of stickers and grabbed a croissant and cup of tea to start the day (I must commend them on the selection available, I much prefer Twinings Assam to the standard English Breakfast).

The first talk was from the creator of Cucumber, Aslak Hellesøy, on where the tool is heading. I guess the big news was around the commercial side: Cucumber Ltd being formed to allow for more full time development, and Cucumber Pro (at this moment still in a sort of closed beta state). The Cucumber Pro tool looks quite interesting and I had come across it while looking for alternatives to Relish for making feature files easily accessible. Their goal with it seems to be more engagement with the BA side of the business by abstracting away from the dev world of source control and Jenkins reports. All of that will still exist under the covers of course, but this tool will hopefully put a pretty sheen over the top, make it easy to publish feature files as documentation, and make it easy to see which features are passing and failing. Everything relating to the features of the system all in one easily accessible and easily usable place. It’s certainly something I’d be interested in taking a look at when it’s ready, but I think the big test will be working with such a tool in practice: how well will it integrate into existing processes, and how well will this layer of abstraction above dev tools improve collaboration? All that is yet to be seen, but the future is promising.

The second talk of the day I attended was Liz Keogh giving a rapid fire run through of the history of BDD and the mistakes made throughout. It was useful to see how far we’ve come and what we’ve learnt, and hopefully we can take all that forward and make fewer mistakes in future.

The next few talks I listened to were mostly around collaboration and the value of user stories and documentation/readability. I particularly liked Seb Rose’s talk on the relationship between user stories and features and why there shouldn’t necessarily be a one to one mapping. I do strongly believe that ultimately features are the documentation of the system, not individual stories; stories get thrown away after they’re done! Multiple stories add more acceptance criteria to a “feature” of the system. If we stick to this then we can keep using the feature files as documentation and keep the number of scenarios to the minimum required (instead of a proliferation of near duplicate yet subtly tweaked ones where stories map one to one with feature files). Seb also announced “The Cucumber-JVM Book” will be released, which as far as I understood will be “The Cucumber Book” but specifically for Cucumber-JVM instead of Ruby.

The final talk of the day from Matt Wynne had his thoughts on the 6 stages a dev team goes through when implementing BDD. From burnt toast, to auto-burnt toast, to the three amigos, into eventual disillusionment, and some thoughts on what the final stage of transcendence might look like. I guess much like the Buddhist Nirvana, we all have to find our own way there. We can take common tips, but what works for one team may not work for another, and we may not all get there, but that is the ultimate goal.

Overall I thought it was well worth attending. There was at least something to take away from each talk, and definitely a few things that gave food for thought and that I’ll look at implementing as soon as I’m back in the office.

The book is a good overview of Selenium WebDriver, but it is a very dry and sometimes frustrating read.

There are grammatical mistakes throughout, the code examples are inconsistent (sometimes relevant lines are bolded, sometimes not, line numbers are mentioned however not printed next to the code, tab spacing on a couple of examples is out of alignment etc.).

However, those are minor things that can be ignored; more fundamentally the book suffers from a lot of writing that explains how to do everything, but almost never “why”. Why use Selenium’s file handling methods vs standard java.io (assuming Java is being used of course)? Why would you need to know how to replace the client libraries with your own? The book in fact simultaneously described this as “not an option”/“not the best idea” and as “definitely useful”, with no real explanation as to why this could be useful in a real world situation (does knowing how to do that help you write Selenium tests any better? I can’t think of any reason why it would…).

Not only that, large sections of most chapters are simply copy-pasted paragraphs with single word or single line changes to cover each of the methods being described. This comes across as lazy writing and is about as engaging as reading a javadoc.

Overall 3 stars because it does do a good job of covering most areas of Selenium you could want to know about. I just wish they’d spent more time editing/reviewing it before release, and included more real world examples covering more of the “why’s”.