I've just been reading
Opinion - Taking Software Seriously
P. B. Ladkin
University of Bielefeld
Journal of System Safety 41(3), May-June 2005
(No, I don't subscribe to that Journal. Wish I did. I got it from the
Bielefeld web site.)
The topic of the paper is basically "How do we achieve, and know that we
have achieved, extremely low failure rates? We can't." Quotes:
Both these standards [for safety critical software] require that,
for the most critical systems, someone will have had to demonstrate
a dangerous failure rate of at most one failure in one hundred
million [one standard], respectively one billion [the other], hours
of operation (the fabled ten-to-the-minus-nine rate),
for the device running your software.
...
What do we know about defect rates or failure rates?
... current good practice achieves a defect rate in delivered
software of less than one per delivered KLOC.
Les Hatton ... using "Safer C" ... 0.24 per K executable LOC ...
Praxis ... 0.22 per KSLOC ... on [a] helicopter landing advisory system
[and] 0.04 defects per KSLOC on [a] smart card security project
[and] comparable or lower defect rates on more recent projects.
I know of no better results than those of Praxis.
Issue: defect rates are thought to scale worse than linearly with
SLOC; reducing the SLOC count by a factor of 6, say, can be expected
to reduce the number of defects by more than a factor of 6.
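A back-of-the-envelope model makes the point. The exponent below is an
assumption for illustration only, not a figure from the paper; any
exponent above 1 means a 6x reduction in SLOC buys more than a 6x
reduction in defects.

```python
# Toy model: defects grow superlinearly with SLOC, defects ~ c * SLOC**b.
# The exponent b = 1.2 and constant c are assumed illustrative values.
def expected_defects(sloc, c=0.001, b=1.2):
    """Hypothetical defect count for a program of `sloc` source lines."""
    return c * sloc ** b

before = expected_defects(60_000)   # original program
after = expected_defects(10_000)    # same program at 1/6 the SLOC
print(before / after)               # reduction factor: 6**1.2, about 8.6
```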
... what do we know about the efficacy of testing SW?
Quite a lot. There is a hard mathematical limit ... published by
Bev Littlewood and Lorenzo Strigini ... and Rick Butler and George
Finelli ... in 1993. It disturbs me that it is still not generally known.
...
how can you possibly attain that [] confidence when you know that the
very best practice only gets you down to one fault per few KLOC?
Answer: you can't. ... To summarise, the very best statistical-
testing regime will find you those faults which lead to failure at
a rate equal to or more frequent than one per hundred thousand
operational hours [that's 1 every 11 years; what if your software is
running on 11,000 sites? it's 1000 every year then]
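The bracketed arithmetic checks out, assuming round-the-clock operation:

```python
HOURS_PER_YEAR = 24 * 365          # ~8760, ignoring leap years

rate = 1 / 100_000                 # failures per operational hour: the
                                   # detection limit quoted above
years_between_failures = 1 / (rate * HOURS_PER_YEAR)
print(years_between_failures)      # ~11.4 years for a single site

sites = 11_000
failures_per_year = sites * rate * HOURS_PER_YEAR
print(failures_per_year)           # ~963, i.e. roughly 1000 per year
```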
How many failures are those? Edward Adams ... [found that] fully
one third of the [failure] reports [from the field] concerned faults
that could be expected to result in failure less frequently than
once every 60k operational months. [that means testing is not likely
to find them] Through statistical testing you can expect to find
those faults which cause at most two-thirds of your failures.
(What do you do about that remaining one-third?)
Issue: even when we do our best, we are still going to have defects.
We need design methods and supporting tools (like languages) that let
us cope with unreliable components. This has always been part of the
design ideas behind Erlang.
... some organisations develop their systems according to rigorous
development processes, such as SEI's Capability Maturity Model and
derivatives. The problem with this approach is that no reliable
correlation has ever been demonstrated between adherence to a
development-process model and the quality of the resulting product,
no matter how we might wish for one.
Oops.
There's no denying it, Java *is* more verbose than Erlang.
(There's also no denying that Erlang is verbose compared with, say,
Haskell. I have a Haskell-like notation for Erlang which reduces
the SLOC count by a factor of 1.6; I'm fairly confident that it is
possible to do better.) Using higher level tools which *generate*
Java rather than writing Java directly is an extremely sensible thing
to do. Ditto for Erlang, when appropriate, which is why Erlang has
Yecc and interface-generating tools for CORBA and ASN.1.
JUnit (originally designed for Smalltalk, which is also considerably
more concise than Java) is one of the great strengths of Java, but we
have testing tools for Erlang, including QuickCheck. There is also CppUnit
for C++.
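For readers who haven't met QuickCheck-style testing: the idea is to state
a property that should hold for all inputs, generate random inputs, and
report the first counterexample. A minimal sketch of the idea (not the
QuickCheck API, and the generator here is made up for illustration):

```python
import random

def check_property(prop, gen, trials=200):
    """Run `prop` on `trials` random inputs from `gen`; return a
    counterexample, or None if the property survived every trial."""
    for _ in range(trials):
        x = gen()
        if not prop(x):
            return x
    return None

def random_list():
    """Random integer list of random length, 0 to 20 elements."""
    return [random.randint(-100, 100) for _ in range(random.randint(0, 20))]

# A property that holds: reversing twice is the identity.
counterexample = check_property(
    lambda xs: list(reversed(list(reversed(xs)))) == xs,
    random_list)
print(counterexample)   # None: no counterexample found
```

A false property (say, "every list is already sorted") would instead come
back with a concrete failing input, which is where the real tools shine:
they also shrink that input to a minimal failing case.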
The way Praxis get their error rates down is by using a verifier (written
in Poplog Prolog, I believe; why else did the SPARK kit come with a Poplog
installation?). ESC/Java2 is a verifier for Java; I don't know of any
comparable free tool for C++. Thomas Arts and colleagues have been doing
impressive work verifying Erlang, and the Dialyzer is a relevant tool.
Programs are data. My big complaint about older IDEs is that they encouraged
people to think of programs as exclusively something human-written, when
our best hope of reliable software is to write as little of it as possible.
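A trivial illustration of the "programs are data" point: generate accessor
functions from a declarative field list instead of writing them by hand.
The record layout and names below are invented for the example.

```python
# The program text is itself computed data: one field list in, one
# accessor function out per field.
FIELDS = ["name", "arity", "module"]   # hypothetical record fields

source = "\n".join(
    f"def get_{field}(record):\n"
    f"    return record[{i}]\n"
    for i, field in enumerate(FIELDS)
)

namespace = {}
exec(source, namespace)                # compile the generated program
print(namespace["get_arity"](("foo", 2, "bar")))   # -> 2
```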
What would Template Haskell look like in Erlang?