Why Automation Projects Fail and How to Avoid the Pitfalls

Overview

Automation remains one of the most contentious topics in software development. Put three engineers in a room to discuss automation and you will end up with four contradictory absolutes and definitions. So for the purposes of this post, we will place the following limits on the topic:

Automation in this post will refer specifically to Automation of Functional Tests for a UI or API.

Functional Tests will be defined as tests containing a prescribed set of steps, executed via an interface connected to consistent data, that produce an identical result each time they are executed.

Failure in this document will be defined as greater than 3 months of effort spent creating automation that is not acted upon or is determined to be too expensive or buggy to maintain after 12 months and is turned off or ignored.

This post will cover the three most common reasons automation fails:

Inability to describe a specific business need/objective that automation can solve.

Automation is treated as a time-boxed activity and not a business/development process.

Automation is created by a collective without a strong owner who is responsible for standards.

Wait, Who are You?

I am Michael Cowan, Senior QA Engineer at Jama Software. Over the past 20 years I have been a tester, developer, engineer, manager and QA architect. I have built automation solutions for Windows, Web, Mobile and APIs. I have orchestrated the creation of truly spectacular monstrosities that have wasted large sums of money/resources as well as designed frameworks that have enabled global teams to work together on different platforms, saving large(r) sums of money.

I have had the amazing opportunity to be the lead designer and implementer of automation for complex systems that handled millions of transactions a day (Comcast), dealt with 20-year-old systems managing millions of dollars (banking), worked in high-security/zero-tolerance environments (Homeland Security) and processed massive big data streams (Facebook partner). I have worked side by side with brilliant people, attended conferences and trainings, and given my own talks and lectures.

I have a very “old school”, business-focused philosophy when it comes to automation. To me it is not a journey or an exploration of cool technology. Automation is a tool to reduce development and operating costs, while freeing up resources to work on more complicated things. I strongly believe that automation is a value add for companies that correctly invest in it, and a time/money sink for companies that let it run wild in their organizations.

Failure Reason #1: Unable to describe a specific business need/objective that automation can solve

The harsh truth is that, by itself, clicking a button on a page (even a really cool/difficult/complex custom button) has no value to the business. The business doesn’t care if you click that button manually with the mouse, execute it with JavaScript, call the business logic via API or directly manipulate the database. What they care about is ensuring that a customer is not going to call up after the release to return the product, and that some blogger won’t discover a major issue and drive away investors with a scathing review.

Automation Projects fail when they are technical exercises that are not tied to specific business needs. If the ROI (Return on Investment) is not clearly understood, you are unlikely to get the funding to do automation correctly. Instead you will find your efforts rushed to just implement ‘Automation’ and move on. Months later, everyone is confused why automation hasn’t been completed, why it doesn’t do x, y, z and why all the things they assumed would be included were never planned.

Nothing is worse than a team of automation engineers thinking they are making great progress, only to have the business pull the team apart because it does not understand the value. If you are running automation directly tied to an understood business need, then the business leaders will be invested. You will find support because your metrics will clearly show the value being produced.

Another consequence of running automation as an engineering project is making decisions based on technology instead of business need. If you decide upfront to use some open source tool you read about, you will find yourself telling the business what it (you) can’t do. “Well no, our tool doesn’t hook into our build server, but we can stop writing tests and build a shim.” Pretty soon you are spending all your time making your project more feature rich, instead of creating the test cases the business needs. This is how teams can spend 6-12 months building a handful of simple automation scripts. Even worse, you end up with a large code base that now needs to be maintained. The majority of your time will have been spent building shims and hacks and adding complexity that has nothing to do with your business offering or domain.

Mitigation

It’s actually very easy to avoid this pitfall. Don’t start writing tests until you have a plan for what automation will look like when it’s fully implemented. If your team practices continuous integration (running tests as part of the build), don’t start off with a solution that doesn’t have built-in support for your CI/Build system. Find an industry standard tool or technology that meets the business needs, then create a POC (Proof of Concept) that proves your proposed solution integrates correctly and can generate the exact metrics the business needs.

Write a single test to showcase running through your system and generating metrics. Make sure the stakeholders take the time to understand the proposed output and that they know what decisions that information would impact. Get a documented agreement before moving forward and then drive everything you do to produce those metrics. If anything in the business changes, start with reevaluating the metrics and resetting expectations. Once everyone is on the same page start working backwards to update the code. Consistency and accuracy in your reports will be worth more to the business than any cool technical solution or breakthrough that you try to explain upwards.

If you are in management, you might consider asking for daily automation results with explanations of all test failures. If the team cannot produce that information, have them stop building test cases until the infrastructure is done.

Key deliverables that should be produced before building out automated scripts:

A documented business plan/proposal that clearly lays out the SMART goal you are trying to accomplish.

Your proposal should include a template with sample (fake) data for all reports.

A turnkey process that generates those reports and metrics from automation results data.

A single automated test that runs a simple check against a real system and generates a business report.
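
As a rough illustration of the last two deliverables, here is a minimal Python sketch of one check feeding a turnkey report. Everything in it is hypothetical: `check_login_page` stands in for a real test against a real system, and the report fields (passed, failed, pass rate, failure explanations) are placeholders for whatever metrics your stakeholders have actually signed off on.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class CheckResult:
    feature: str       # business-facing feature name used in the report
    name: str          # short description of the test
    passed: bool
    detail: str = ""   # explanation shown when the test fails


def check_login_page() -> CheckResult:
    """Hypothetical stand-in for one real automated test."""
    expected_title = "Login"
    actual_title = "Login"  # assumption: in a real suite this comes from the UI or API
    return CheckResult(
        feature="Authentication",
        name="login page loads",
        passed=(actual_title == expected_title),
        detail=f"expected title {expected_title!r}, got {actual_title!r}",
    )


def build_report(results: List[CheckResult]) -> Dict[str, dict]:
    """Group raw results by feature and compute the agreed metrics."""
    report: Dict[str, dict] = {}
    for r in results:
        row = report.setdefault(r.feature, {"passed": 0, "failed": 0, "failures": []})
        if r.passed:
            row["passed"] += 1
        else:
            row["failed"] += 1
            row["failures"].append(f"{r.name}: {r.detail}")
    for row in report.values():
        row["pass_rate"] = row["passed"] / (row["passed"] + row["failed"])
    return report


def format_report(report: Dict[str, dict]) -> str:
    """Render the report as plain text, one line per feature plus failure explanations."""
    lines = []
    for feature, row in sorted(report.items()):
        lines.append(
            f"{feature}: {row['passed']} passed, {row['failed']} failed ({row['pass_rate']:.0%})"
        )
        for failure in row["failures"]:
            lines.append(f"  FAILED {failure}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_report(build_report([check_login_page()])))
```

The point of the sketch is the shape, not the tooling: the report is generated from results data with no manual step, and every failure carries its own explanation, so the daily results management asks for come straight out of the run.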

Take away

The key takeaway is that business and project management skills are critical to the success of any automation initiative. Technical challenges pale in comparison to the issues you will have if you are not aligned with the business. Don’t start writing code until you have written approval and a turnkey mechanism to produce the metrics that will satisfy your stakeholders. Remember, your project will be judged by the actionable metrics it produces, not by demos of buttons being clicked.

Failure Reason #2: Automation is treated as a time-boxed project and not part of the software development process

Automation is not an 8 week project you can swarm on and then hand off to someone else to ‘maintain’. A common mistake is to take a group of developers to ‘build the automation framework’ and then hand it off to less technical QA to carry forward. Think about that model for your company’s application. Imagine hiring 6 senior consultants to build version 1 in 8 weeks and then handing the entire project off to a junior team to maintain and take forward.

Automation is a software project. It has the same needs for extensibility and maintainability as any other project. As automation is written for legacy and new features, constant work needs to be done to update the framework: new requirements for logging, reporting or handling new UI functionality all have to be absorbed. As long as you are making changes to your application, you will be updating the automation. Also keep in mind that most automation frameworks are tied into other systems (like build, metrics, cloud services) and everything needs to stay in sync as they evolve.

You quickly end up in a situation where junior engineers are in over their heads and either have to stop working on automation until expert resources free up, or they go in and erode the framework with patches and hacks. The end result is conflict which lowers ROI, generates a perception of complexity and difficulty and eventually leads to the failure of the project.

Mitigation

Again, this is an easy pitfall to avoid. Your business plan for automation should include long term resources that stay with automation through its initial lifecycle. It’s still beneficial to bring in experts during key parts of framework creation, but the owner(s) of the automation need to be the lead developers. They will build the intimate knowledge required to grow and refactor the framework as tests are automated.

Additionally, leverage industry standard technologies. Automation is not an area where you want to be an early adopter. If your organization is building a web application, you will want to pick a framework like Selenium instead of something like m-kal/DirtyBoots. A good rule of thumb for a manager: you should be able to search LinkedIn for the core technologies your team is proposing and find a number of people experienced in them. No matter how awesome a mid-level engineer tells you this new technology is, when that engineer leaves, the next person will insist on rewriting it.

Take away

If you are using standard technologies and industry best practices, you will not need an elite team of devs to build the framework for QA. The complexity of the automation project should remain fairly constant through your company’s application updates, new features and UI uplifts. The original creators of the framework should be the same ones automating the bulk of the tests. Additional, less experienced scripters can be added to increase velocity, but a consistent core group will produce the best results for the least investment.

Failure Reason #3: Automation is created by a collective without a strong owner who is responsible for standards

Making the automation framework a community project is a very expensive mistake. If your company created a new project initiative with the guideline of “Take 3 months to build a good CRM System in any language that we will use internally” and turned that over to 10 random devs to work on in their spare time, you would expect issues. Automation has the same limitations. A small dedicated team (with members who expect to carry Automation forward for at least a year or two) has the time to gather requirements, understand the business needs, build up the infrastructure and drive the project forward to success. An ad-hoc group with no accountability, especially one whose main members will not be creating the actual tests, is going to struggle.

Everyone wants to work on the fun POC stage of automation, hacking together technologies to do basic testing and reporting. Most QA has some experience with previous projects and they have their own ideas about what can and can’t work. Without strong leadership, an approved roadmap and strict quality controls you will end up with an ad-hoc project that does a variety of cool things, but you can never seem to tie it together to get the information you need for actionable metrics. The team always has low confidence in the tests or their ability to reproduce results reliably. There is always a reasonable sounding excuse of why. The fun drains away as weeks turn into months and your team finds other things to focus on, while automation stagnates.

Eventually it becomes apparent how little business value was produced for all the effort, how much work remains, and that there is no clear owner to hold accountable or to plan how to maintain the work and move it forward. The champions for the fun part have shifted their attention to the next cool project. Management will end up assigning people to sacrifice their productivity by manually generating reports, cleaning up scripts and trying to train others to use it. Eventually everyone agrees the process sucks. Then a new idea/technology surfaces and the cycle repeats itself.

Another common mistake is assuming that adding more engineers to aid in writing automation will increase ROI. Remember that ROI is measured against the business objective, not lines of code. Unlike application development, there are few established patterns when it comes to automation. This means 2 equally skilled automation engineers will write vastly different automated tests for the same features. Remember that adding less experienced engineers requires your experienced engineers to stop building automation and start a training, mentoring and code reviewing program. In order for this program to be successful, every test written needs to be reviewed by 1-2 people to ensure it fits. It will take months until the additional engineers are autonomous and able to contribute without degrading the framework. Additionally, the more complex the application, the more custom systems, features and controls it can contain. Each of these variations will need the more senior engineers to tackle it first. Even with these efforts, the business has to accept that new automation engineers will not write the best tests; it can take years to build the skills and apply the concepts correctly. This is a large factor in the constant ‘do-overs’ that automation projects suffer from.

I would assert that the ONLY business value from automation comes via the metrics and reports it produces. You could have the best automation in the world, but if it just clicks buttons and never produces an actionable report of its findings, then it has no value. Good automation will be structured in such a way as to produce a comprehensive report that shows test coverage, is easy to understand and is accurate release to release. Imagine having a large group of sales and marketing people, all working separately to generate their own KPIs from their own data. How cohesive would their reports be? Could the business make informed decisions with KPIs from different groups at different scopes? The skill to structure and create valuable automation is not the same as being able to read the Selenium documentation and click a button on a page.

We should always be working towards an approved business objective. Is the business objective to write test cases as fast as possible, even if they can’t be maintained? Or is it to “Automate the Regression to free up QA for other tasks”? Shifting your engineers’ time from running manual regressions to babysitting automation does not solve anything (and actually reduces the test coverage). In certain cases, slower test case development by a smaller team of experienced engineers is the way to go. As long as you build in redundancy and have their work open for review and feedback, you will produce value much faster.

Mitigation

Building automation that can generate reliable and actionable metrics is non-trivial and requires a lot of structure, discipline and previous experience. Automation projects should always be championed by 1-2 engineers experienced with setting up automation projects. They should make a compelling case to the business on what they want to build and what value it will bring. Once the business signs off, they should be given the space to build out the initial POC framework and sample test case(s). Once a working prototype is in place, feedback is solicited and then the project moves forward. The core team should be 2-3 engineers who are equals. Once all the critical areas are automated and the framework is hardened, you can begin training up interested individuals by pairing them with an experienced member of the team.

This initial work should be done by that core team of 2-3 engineers. They will be held accountable for its success or failure. It’s critical to make this group show working tests for all the main/critical areas of the product. It’s these initial tests that expose gaps in the framework. Once a working set of automated tests has been completed and tested from kickoff to report delivery, you can discuss training a small group to start building out test cases and moving automation into other teams.

Take away

When looking at an automation report, you need to be able to understand, at a glance, what was tested and what wasn’t. When you have questions on failed tests, you need to be able to quickly understand what the test did and the results. All tests should have the same scope and voice. Imagine if Feature X only has 5 tests with 100 steps each while Feature Y has 100 tests with 5 steps each. How do you combine those data points to understand the real state of the product? As the group gets larger and larger, it’s harder to maintain a single voice. You will move much faster by allowing your core group to solve these problems before introducing less experienced engineers.
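
To make the Feature X and Feature Y comparison concrete, here is a small Python sketch. The test and step counts come from the example above. The rest is assumed for illustration: one failing test per feature, and the simplification that a failed test leaves all of its steps unverified.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FeatureRun:
    name: str
    tests: int
    steps_per_test: int
    failed_tests: int  # assumption: a failed test leaves all of its steps unverified

    @property
    def total_steps(self) -> int:
        return self.tests * self.steps_per_test

    @property
    def unverified_steps(self) -> int:
        return self.failed_tests * self.steps_per_test


def pass_rate_by_test(runs: List[FeatureRun]) -> float:
    """Combined pass rate when every test counts the same, whatever its size."""
    total = sum(r.tests for r in runs)
    failed = sum(r.failed_tests for r in runs)
    return (total - failed) / total


def pass_rate_by_step(runs: List[FeatureRun]) -> float:
    """Combined pass rate when each test is weighted by the steps it covers."""
    total = sum(r.total_steps for r in runs)
    unverified = sum(r.unverified_steps for r in runs)
    return (total - unverified) / total


runs = [
    FeatureRun("Feature X", tests=5, steps_per_test=100, failed_tests=1),
    FeatureRun("Feature Y", tests=100, steps_per_test=5, failed_tests=1),
]

if __name__ == "__main__":
    print(f"by test: {pass_rate_by_test(runs):.1%}")
    print(f"by step: {pass_rate_by_step(runs):.1%}")
```

Under these assumptions both features cover 500 steps, yet the combined report reads about 98.1% when counted by test and 89.5% when counted by step. Neither number is wrong, but they tell different stories, which is why a consistent test scope has to be settled before the group grows.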

Summary

In this post I discussed the three most common reasons automation fails, ways to avoid them, and how to keep your projects focused on increasing ROI and business value.
