Software Testing

I recently met with an Australian Software Development company, PepperStack, and we got onto the subject of Software Testing. As someone who began their career as an Electronics Hardware Engineer, one of the things I learnt was that you have to test thoroughly to be sure everything is working as it should be. With Electronics, if you make a mistake with an Engineering Calculation you can easily destroy things. This is sometimes referred to as “letting the smoke out”. So it was good to meet with others who believe in the same level of rigorous software unit, module and system testing that we do.

Some Engineering Humour

Which reminds me of a joke I once heard:

There are 3 Engineers in a car going for a drive. The first is a Mechanical Engineer, the second an Electronics Engineer and the third is a Software Engineer. Fortunately the Mechanical Engineer is driving because the brakes fail and they are going downhill. The Mechanical Engineer eventually brings the car safely to a halt and gets out to examine the hydraulic systems. The Electronics Engineer gets out and checks and body computer, ABS system and the power train CAN bus. The Software Engineer stays in the car and when queried about it says that they should all just get back in the car and see if it happens again!

Now don’t get me wrong, I’m not having a go at Software Engineers. The process of finding and eliminating faults is a very important part of the development cycle and is something that needs up front thinking and not just responding to symptoms. And the more complex or sophisticated a system is, that harder it is to fully test every possible response to every possible stimuli and after a certain point it becomes impractical to have 100% Test Coverage (every line of code has been executed through all of the possible states). The reason this is a bigger problem withSoftware Development is that the flexibility of software means that it is inherently complex and it takes skill and planning to manage that complexity so it is testable.

So here is the issue. More than any other discipline, faults can be experienced by an end user of a product under a situation or scenario you could not have proactively tested against before release. There are many potential reasons for this including:

change of hardware or operating system environment

new standards or protocols

the sheer number of potential combinations of drivers, peripherals, software and users

the product being used for a purpose it wasn’t originally designed for

gamma ray corruption of a memory location – I am getting esoteric now but in some areas like avionics and space this is a big threat

So how do you reduce the likelihood of these problems occurring?

Improving Software Quality

With many new products having Electronics and Embedded Software and the Software Development requiring 80% of the effort, it is important to delivery it as quickly and fault free as you can. The main weapons in your Software Quality arsenal have been known about for a long time but are, in our experience, just not used. These are:

Architectural Design – work out how the data and execution flow will happen and how you will manage the constraints

Functional Decomposition – divide and conquer but with an emphasis on how each module fits into the system and how the interfaces work in detail

Error handling – who will decide what to do with response codes – again this is data and execution flow and part of the architecture. In many cases exception management is at least 50% of the project.

Have an Integration Test Plan – some thing that proves the data and execution flow matches the architectural design. Too often “it builds” seems to be good enough here.

Unit Test modules – so you remove all the issues before adding them to the integration

Do the Integration Tests before you try system testing

Design modules so you can integrate them as shells then add functionality down the track

Have NVM and configuration data available at the beginning of the project and not as an after thought at the end

Have a rationale for what level of Churn you can accept – Churn is the percentage of the lines of code that have changed in the past time period. Usually either a week or month depending on the size of the project.

Use automated software quality tools. For instance we use both PC-Lint and RSM to automated many software quality metrics which saves a lot of time in Code Reviews