You know the manufacturing automation or process control errors we are talking about: The nagging errors in the log that don’t seem to go away. The system is “running fine” but there’s a tag in the PLC or other controller that doesn’t exist, or a link in the application that doesn’t compile completely. The lesson here is simple, but one that lots of engineers ignore:

NO ERRORS NO WARNINGS.

Anthony Baker recently received a call from a very stressed-out, very important client. The client had a human-machine interface (HMI) software system that would become unresponsive after 12 to 15 hours of uptime. The onsite engineers were unsure of the cause and had called Anthony as a last resort. The system was in a fragile state, and production was at risk.

This was a new site for Anthony and, despite having a deep understanding of this particular application, Anthony did what lots of engineers do: he went into the application error logs, created a list of errors, and one by one hunted down each root cause and eliminated the message. In many cases the error was a warning such as “Tagname ZZ does not exist in the PLC” or “Database connection Y has timed out.” All the errors could have been prevented with clean implementation and more thorough test plans that include a “NO ERRORS NO WARNINGS” policy.

By eliminating the errors in the logs one by one, Anthony eventually solved the issue. As a note, the most frequent and obvious error was actually not the root cause of the outage. The root cause was a subtle, one-time issue that cropped up in the logs and was dutifully logged by the HMI application. Unfortunately, that root-cause error was drowned out by a recurring issue that had been present in the system for years. Only by fixing the error that was flooding the logs could Anthony find and fix the real issue.
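One way to keep a one-time message from being buried by a recurring one is to group log lines by message text and look at the rare ones first. The following is a minimal Python sketch of that idea, not the method used in the story. It assumes a hypothetical log layout of date, time, level, and message, and the sample lines are invented for illustration (the message texts are borrowed from the examples above and say nothing about which error was the actual root cause).

```python
import re
from collections import Counter


def normalize(line):
    """Drop the assumed leading date and time fields and mask digits,
    so repeats of the same message are counted together."""
    parts = line.strip().split(" ", 2)
    message = parts[2] if len(parts) == 3 else line.strip()
    return re.sub(r"\d+", "#", message)


def rare_messages(lines, max_count=1):
    """Return the distinct messages that occur at most max_count times."""
    counts = Counter(normalize(line) for line in lines if line.strip())
    return [msg for msg, n in counts.items() if n <= max_count]


# Hypothetical sample log: one warning repeats, one error appears once.
sample_log = [
    "2024-01-01 00:00:01 WARN Tagname ZZ does not exist in the PLC",
    "2024-01-01 00:00:02 WARN Tagname ZZ does not exist in the PLC",
    "2024-01-01 00:00:03 WARN Tagname ZZ does not exist in the PLC",
    "2024-01-01 03:12:44 ERROR Database connection Y has timed out",
]

print(rare_messages(sample_log))
```

A filter like this only helps someone read a noisy log. It is no substitute for removing the recurring error at its source.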

4-point no-errors policy

As a culture, software engineers need to adopt a no errors/no warnings policy and adhere to it strictly. More specifically:

1) Maintaining zero errors throughout a project is much simpler than trying to fix them all at the end.

2) How do you know you are not creating issues if you have constant errors and warnings? If the logs are 100% clear, you make a change, and then something occurs, you know you have caused a problem. If, however, the logs are full of problems, you are likely to be unaware of any issues you inject into a system with your work.

3) It might seem like more work, but in the long run, taking care of errors as you go creates a higher quality, more stable end product with less overall effort.

4) Lastly, nobody would expect a bridge builder to leave a site saying: “Well, the bridge works. There are a few cracks here and there, but they shouldn’t matter.”
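Point 2 can be made concrete with a small check: capture the log before a change, capture it again afterward, and list any error or warning that was not there before. The sketch below is one hedged way to do that in Python. The log layout (date, time, level, message), the level names ERROR and WARN, and the sample lines are all assumptions made for illustration, not the format of any particular HMI or PLC product.

```python
def problem_lines(lines, levels=("ERROR", "WARN")):
    """Return 'LEVEL message' for each line whose third field is in levels.
    Assumes each line looks like: <date> <time> <LEVEL> <message>."""
    found = []
    for line in lines:
        fields = line.split()
        if len(fields) >= 3 and fields[2] in levels:
            found.append(" ".join(fields[2:]))
    return found


def new_problems(before, after):
    """Return problems present after a change that were absent before it."""
    seen = set(problem_lines(before))
    return [p for p in problem_lines(after) if p not in seen]


# Hypothetical logs captured before and after a change.
before = ["2024-01-01 08:00:00 INFO Startup complete"]
after = before + [
    "2024-01-01 09:30:00 WARN Tagname ZZ does not exist in the PLC",
]

print(problem_lines(before))
print(new_problems(before, after))
```

When the baseline is clean, anything `new_problems` reports points straight at the latest change. When the baseline is already full of problems, a new one can still be found this way, but it is far easier to overlook by eye.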

4 error-related answers you should know

If you understand the prior four points about errors and alarms, you will be able to answer the following questions quickly:

c) If so, am I very clear on their meaning and have a clear plan to remove them? (Such as, I have three warnings because these duplicate coils exist, but I will remove them at the end of the day when I remove my simulation logic.)

d) If a peer were to sit down and review my work thoroughly for errors and warnings would I be proud of my work? (Here, think of the civil engineer building a bridge.)