Safety-critical software development surprisingly short on standards, analysis, and review

Recent survey data indicates that static code analysis, peer review, and basic coding standards are being neglected in the development of safety-critical connected embedded devices. In our monthly safety and security interview with Andrew Girson, Co-Founder and CEO of embedded consulting firm Barr Group, he picks apart the recent findings.

Safety and security are both critical to the development and deployment of connected systems, but while the two overlap, they are also distinct. Can you define safety and security in an embedded systems context?

GIRSON: Within an embedded systems context, safety and security become broad issues, crossing disciplines from mechanical to electronics to software. In contrast to other technology environments such as cloud computing and mobile apps, an organization developing an embedded device must incorporate disparate engineering disciplines into the process. Thus, safety and security must first be considered at the higher system level, somewhat independent of engineering discipline. It is critical during early requirements analysis and architectural design to incorporate security and safety expertise into the process.

There will be overlap in the design of systems that are both safe and secure. Perhaps the most obvious overlap is in the concept of reliability. Incorporating best practices into the design process that improve reliability reduces the potential for bugs/faults that cause improper operation of the device. This improves safety and also reduces the opportunities for hackers, since many security breaches occur via hackers exploiting system bugs/faults. For this reason, we strongly advocate the use of best practices that increase reliability. Some of these are universal across engineering disciplines, such as peer design reviews. Some are specific to an engineering discipline, such as signal integrity analysis or the use of coding standards or static analysis.

Beyond this, there are indeed distinctions between safety and security. Improving reliability is not enough to ensure adequate security. A secure design must consider how hackers think, how they might try to enter a device, how expensive it would be for them to do so, etc. Protecting data-at-rest and data-in-motion via encryption can be very important, as well as authenticating over-the-air updates. Providing anti-tamper mechanical features can be critical, too. And, one must consider the motivations and resources of the hacker. If the device you are designing is of interest to just a few, security levels may not need to be as high as if your device has super-secret or financially valuable information that might be of interest to a large group or nation-state.

Barr Group recently conducted a survey on safety and security for connected embedded systems, which had some astonishing results. Can you share some of the results around the (lack of) best practices being used in safety-critical, connected system development?

GIRSON: Our recent Embedded Systems Safety and Security Survey did uncover concerning trends around best practices for embedded software development. Of over 1,700 qualified respondents, we analyzed those who self-identified as currently involved in the design of embedded devices that could kill or injure if the device malfunctioned – so-called safety-critical devices. Approximately 28 percent are designing these safety-critical devices. It should be a foregone conclusion that well-known fault-reducing best practices in the development of embedded software – such as the use of code reviews, coding standards, and static analysis – are universally used in these device designs. Yet, we found:

17 percent are not using coding standards

41 percent are not performing comprehensive code reviews

32 percent are not incorporating static analysis

It seems like a no-brainer to follow these best practices. Studies have shown that they work. They are generally well-known and supported by a variety of third-party tools. And yet, we are not seeing universal support for them. Why? Perhaps it is time-to-market pressure. Or limited development budgets. Whatever the reason, it needs to change, and I encourage everyone to review the complete survey results, which are freely available as a report on our website.

In your opinion, which is more of a threat to safety-critical connected systems: hackers compromising a network and manipulating devices, or errors during the development process that make their way through test and into shipping products?

GIRSON: Both are significant concerns. But there is a difference, at least philosophically. Security is an arms race. Hackers will always come up with new and unique ways to compromise embedded devices. So, designers must always be vigilant and build an appropriate level of security measures into their devices, considering how they are used, the value of the device and its information, and the resources of the potential hackers.

That said, the biggest concern to me is the lack of recognition and use of best practices, as noted above. If design teams and their management are not willing to incorporate even lightweight best practices into their development, how can we as an industry be all that surprised at the safety and security issues that we see in Internet of Things (IoT) and embedded devices?