An Overview of the Selenium Suite

First Things First

For those unfamiliar with Selenium, it is a suite of open-source browser automation tools consisting of three main products: Selenium IDE, Selenium Server, and Selenium WebDriver. Selenium IDE is a Firefox plugin that can be used to record and play back tests. Selenium Server is a Java application that is used to control browsers on remote machines and/or to create what is called a Selenium Grid. The final component of the suite is Selenium WebDriver, a browser automation API designed for creating tests or performing other required tasks. Each of the suite components serves a specific purpose in a web tester’s toolbox, although an organization may not need all of them.

The Pros and Cons of Selenium

As with any tool set, each Selenium component has positive and negative aspects. The most obvious benefit of Selenium is the raw cost of the software; you can’t beat free. While I have seen many arguments about free software not being free, most of the costs identified, such as training, are common to all software and tend to be a wash. With that said, I’ll offer some insight into each of the suite components.

Selenium IDE

Selenium IDE is a test development environment that integrates with Firefox as a plugin. It allows the user to record and play back onscreen actions for quick test creation. It also offers a way to build tests manually. Using one or a combination of these techniques, a user can produce an automated sequence for testing or for recreating a bug very efficiently. In addition to the ease of use provided by the IDE, its functionality can be expanded through plugins.

As I have found with all record/playback style test automation tools, the tests created in Selenium IDE are usually fragile and not scalable. Combine this shortcoming with the fact that Selenium IDE is supported only in Firefox, and it becomes clear that it isn’t the best solution for providing a robust test environment. Another drawback of using the IDE is that it has been marked for deprecation in Selenium 3.0, although there will be some support for running the scripts.

Selenium Server

As mentioned earlier, Selenium Server allows you to connect to remote systems and control the browsers installed there. While the server can be used as a single instance, the real power of this tool comes from establishing a Selenium Grid. A grid consists of several systems running the server software in a hub and node configuration. The hub provides a central connection for the tests and coordinates operations by distributing tests to the nodes according to the requested browser, version, and operating system. Each of the nodes is capable of controlling multiple browser instances and facilitates cross-browser testing as well as parallel runs to improve test run times.
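
The hub's matching step can be pictured with a toy model. The sketch below is not how Selenium implements it; the Node and Hub classes, their fields, and the first-fit rule are all assumptions made for illustration. It only shows the idea of routing a request for a given browser, version, and operating system to a node that offers that combination and still has a free slot.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """Hypothetical grid node offering one browser/version/OS combination."""
    name: str
    browser: str
    version: str
    platform: str
    max_sessions: int = 1
    active_sessions: int = 0

    def matches(self, requested: dict) -> bool:
        # A key missing from the request means "any value is acceptable".
        return all(
            getattr(self, key) == value
            for key, value in requested.items()
            if key in ("browser", "version", "platform")
        )

    def has_capacity(self) -> bool:
        return self.active_sessions < self.max_sessions


@dataclass
class Hub:
    """Hypothetical hub: hands each request to the first suitable node."""
    nodes: list = field(default_factory=list)

    def assign(self, requested: dict) -> Optional[Node]:
        for node in self.nodes:
            if node.matches(requested) and node.has_capacity():
                node.active_sessions += 1
                return node
        return None  # no free node offers the requested combination


hub = Hub([
    Node("node-a", "firefox", "45", "LINUX", max_sessions=2),
    Node("node-b", "chrome", "50", "WINDOWS"),
])
first = hub.assign({"browser": "chrome", "platform": "WINDOWS"})
second = hub.assign({"browser": "chrome", "platform": "WINDOWS"})
print(first.name if first else None)    # node-b
print(second.name if second else None)  # None: node-b is already busy
```

A real grid also queues requests and frees slots when sessions end; both are left out here to keep the matching idea visible.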

When used alone, Selenium Server can automate only Firefox, but there are additional drivers that can be used to control other browsers. These drivers can be run independently or connected to Selenium Server to make them part of a grid node. Among the browsers supported are Chrome, Internet Explorer, Safari, Opera, and PhantomJS (a headless browser). There are also drivers to control Android, iOS, and Windows Phone, among others.

The benefits of having an easy way to configure and control multiple browsers across different platforms via a common access point should be self-evident. Unfortunately, the Selenium Grid has limitations that aren’t always expected. For example, each node is capable of running multiple instances of each of the supported browsers, but tests that run in parallel on the same machine may interfere with each other. It is also a good idea to restart the grid machines occasionally to avoid memory issues and lockups during test runs.

Selenium WebDriver

The WebDriver API is a more powerful and flexible tool for browser automation than its IDE counterpart (and other GUI-based tools), but using it requires development knowledge. The API was originally developed in Java, and it has been ported to Ruby, PHP, Python, C#, and JavaScript (Node.js), making it available on most (if not all) popular platforms. When used in conjunction with a Selenium Grid (including additional browser drivers), WebDriver tests become extremely powerful. A single test, if written correctly, can be used to perform cross-browser and cross-platform testing without modification.
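
One way to picture "a single test for every browser" is to write the test against the driver interface only and pass the browser in from outside. The sketch below is a minimal illustration, not a recommended framework. It assumes the Selenium 4 Python bindings for make_grid_driver, and the grid URL, the function names, and the browser list are all hypothetical.

```python
def check_title(driver, url, expected_fragment):
    """The test itself: it only talks to the WebDriver interface."""
    driver.get(url)
    return expected_fragment in driver.title


def run_on_browsers(make_driver, browser_names, url, expected_fragment):
    """Run the same check once per browser and collect pass/fail results."""
    results = {}
    for name in browser_names:
        driver = make_driver(name)
        try:
            results[name] = check_title(driver, url, expected_fragment)
        finally:
            driver.quit()  # always release the browser, even on failure
    return results


def make_grid_driver(browser_name, grid_url="http://localhost:4444/wd/hub"):
    """Assumption: Selenium 4 Python bindings and a hub listening at grid_url."""
    from selenium import webdriver  # imported here so the rest runs without it

    option_classes = {
        "firefox": webdriver.FirefoxOptions,
        "chrome": webdriver.ChromeOptions,
    }
    options = option_classes[browser_name]()
    return webdriver.Remote(command_executor=grid_url, options=options)
```

With a grid running, calling run_on_browsers(make_grid_driver, ["firefox", "chrome"], some_url, "some text") would run the unchanged check_title against each browser. Because the check never names a browser, adding another one means adding an entry to option_classes rather than editing the test.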

One difficulty a tester will run into when starting out with Selenium WebDriver is the learning curve. Even if the person has development knowledge, there are many paths and variations that can be used to develop tests. A few searches online will provide numerous examples of how the API can be used, but many are simplistic and fragile, which usually results in frustration. The key to success lies in study and experimentation, otherwise known as experience. There are some excellent sources of information available—some of which I’ll list at the end of this post—but they will not always suit your particular style or situation, so you need to be flexible.

Weighing the Options

While there are many different applications that provide browser automation, I have found that they generally fall into two categories: those that do not run browsers natively and those that are built on top of Selenium WebDriver. While I prefer to avoid reinventing the wheel, I have yet to understand why someone would buy a tool built on top of a free tool, only to write tool-specific code that ends up using the free tool anyway.

Another consideration in favor of Selenium WebDriver is the W3C WebDriver Specification. This proposed standard is being promoted as the baseline set of requirements for supporting automation in modern browsers. The specification contains a subset of the Selenium WebDriver functionality and was based on the Selenium project. With that in mind, can you really say that there is a better option than using the industry standard to get your job done?
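
To make the specification a little more concrete: at its core it describes a set of HTTP commands that a client sends to a browser driver. The sketch below only builds a few of those requests (New Session, Navigate To, Get Title, Delete Session) and prints them; it sends nothing. The helper names and the session id are made up for illustration, and a real client would also need to handle responses and errors.

```python
import json


def new_session(browser_name):
    """New Session: ask the remote end to start a browser."""
    body = {"capabilities": {"alwaysMatch": {"browserName": browser_name}}}
    return ("POST", "/session", json.dumps(body))


def navigate_to(session_id, url):
    """Navigate To: load a URL in the session's current window."""
    return ("POST", f"/session/{session_id}/url", json.dumps({"url": url}))


def get_title(session_id):
    """Get Title: read the title of the current page."""
    return ("GET", f"/session/{session_id}/title", None)


def delete_session(session_id):
    """Delete Session: close the browser and end the session."""
    return ("DELETE", f"/session/{session_id}", None)


# "abc123" is a made-up session id; a real one comes back from New Session.
for method, path, body in (
    new_session("firefox"),
    navigate_to("abc123", "https://example.com/"),
    get_title("abc123"),
    delete_session("abc123"),
):
    print(method, path, body or "")
```

Language bindings such as the Java or Python WebDriver libraries wrap requests like these, which is why a test written against the API can run on any browser whose driver follows the specification.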