Ada 2012 Language Standard Approved by ISO

December 18, 2012 - The Ada Resource Association
(ARA) and Ada-Europe today announced the approval and publication of
the latest version of the Ada programming language by the Geneva-based
International Organization for Standardization (ISO). The language
revision, known as Ada 2012, was developed under the auspices of
ISO/IEC JTC1/SC22/WG9 by the Ada Rapporteur Group (ARG), a subunit of
WG9, with sponsorship in part from the ARA and Ada-Europe.
The formal approval of the standard was issued on November 20 by
ISO/IEC JTC 1, and the standard was published on December 15.

I am glad to say that the major area of improvement is the ability to specify contracts, something I was urging the Ada community to do back in 2002. The new features include the ability to specify preconditions and postconditions for subprograms,
and invariants for private types. Another important area that received attention in this iteration is multi-core programming.
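Ada 2012 expresses these contracts declaratively as Pre, Post, and Type_Invariant aspects checked by the runtime. As a rough, executable sketch of the same idea (in Rust, using runtime assertions; the `Account`/`withdraw` names are illustrative, not from the standard):

```rust
// Sketch: contract-style checks in the spirit of Ada 2012's
// Pre/Post/Type_Invariant aspects, emulated with assertions.
#[derive(Debug)]
struct Account {
    balance: i64, // type invariant: balance >= 0
}

impl Account {
    fn invariant(&self) -> bool {
        self.balance >= 0
    }

    fn withdraw(&mut self, amount: i64) -> i64 {
        // Precondition: amount must be positive and covered by the balance.
        assert!(amount > 0 && amount <= self.balance, "precondition violated");
        let old_balance = self.balance;
        self.balance -= amount;
        // Postcondition: balance decreased by exactly `amount`.
        assert!(self.balance == old_balance - amount, "postcondition violated");
        // Invariant re-checked on exit, as Ada does for private types.
        assert!(self.invariant(), "invariant violated");
        self.balance
    }
}

fn main() {
    let mut acct = Account { balance: 100 };
    println!("{}", acct.withdraw(30)); // prints 70
}
```

In Ada the checks live in the subprogram's specification rather than its body, so they double as machine-checked documentation for callers.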

So if you are serious about mission critical software, head on to the web site and see what you have been missing.

I don't really have experience developing safe software for mission-critical systems, but my style of programming has adapted over the years to prefer patterns I deem safer; for example, preferring parametric polymorphism to inclusion polymorphism. Reading the Ada 2012 Rationale paper, focusing on the support for expressions, and also watching the Ada 2012 Features talk, I was surprised by two things:

In the talk, Ed mentions that they have added support for case expressions, but then skips over them; instead, he talks about extended set membership expressions.

In the paper, they concentrate on syntactic ambiguities and consistently use the catch-all "when others" alternative. They also write, curiously, "However, this is clearly very unlikely to be a problem. Case statements over Boolean types are pretty rare anyway." But Boolean types seem like just one example of deconstructing a sum type via cases.

For my tastes, case expressions over algebraic data types are one of the most important techniques for writing bug-free software, especially when maintaining large systems, because a language with a sound type system will warn the programmer of the following scenarios when extending a data type with more members:

Non-exhaustive cases

Redundant cases

Impossible cases

However, if using a catch-all, then the usefulness of such a coding pattern is greatly diminished. In fact, I would not emphasize the style of using extended set membership tests, and would instead ensure the compiler raises warnings for the three cases above, as ML-based languages do. (Maybe Ada 2012 does that, but it isn't clear and finding coverage of this feature is difficult.)
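As an illustration of what a compiler can do when there is no catch-all, here is a sketch in Rust, whose `match` enforces exactly this discipline (the `Shape` enum and `area` function are illustrative names):

```rust
// Sketch: exhaustiveness checking over a sum type, with no catch-all arm.
enum Shape {
    Circle(f64),
    Square(f64),
    Rect(f64, f64),
}

fn area(s: &Shape) -> f64 {
    // Because there is no `_ =>` catch-all, adding a fourth variant to
    // Shape makes this match a compile error (non-exhaustive cases),
    // and an arm that can never fire is reported as unreachable.
    match s {
        Shape::Circle(r) => std::f64::consts::PI * r * r,
        Shape::Square(w) => w * w,
        Shape::Rect(w, h) => w * h,
    }
}

fn main() {
    println!("{}", area(&Shape::Square(3.0))); // prints 9
}
```

With a `_ =>` arm, the compiler would stay silent after the enum grows, which is precisely the silent abstraction failure described above.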

A tangent, and something I have been thinking of lately, is how to make catch-all cases safer. The major problem is that static text cannot react to deltas to a sum type. A more interactive programming environment would treat the program not merely as flat text but as a realization of all dependencies. So if I extend a sum type with another variant, my functions that pattern match over that type should produce compiler warnings telling the user that some assumptions might be violated. These are the insidious types of errors we need to be most careful with, since they are silent abstraction failures.

Non-exhaustive, redundant and impossible cases (whether in a "case statement" or a "case expression") are not allowed in Ada, and any Ada compiler will raise an error on such inputs. John Barnes says so in the rationale chapter you quote: "It is always worth emphasizing that an important advantage of case constructions is that they give a coverage check."

The corresponding wording in the Ada Reference Manual is found in paragraph 19/3 of the section on conditional expressions: "The expected type for the selecting_expression and the discrete_choices are as for case statements (see 5.4)", and in paragraph 10 of the section on case statements: "Two distinct discrete_choices of a case_statement shall not cover the same value."

To make catch-alls safer, simply forbid them in your coding standard, and have that checked automatically by a tool. In Ada, this means no "when others", and there is a rule to forbid it in the coding-standard checker GNATcheck.

Discussing Ada 2012 on LinuxFr, I was surprised to learn that an 'in out' reference is unsafe when there is aliasing. I'm quite surprised that the Ada designers chose performance over "safety" here; I prefer C99's design: safe by default, with restrict as an opt-in.

It is not "unsafe" to pass aliased "in out" parameters in Ada. It is a "bounded error" as explained in the Ada Reference Manual: The possible consequences are that Program_Error is raised, or the newly assigned value is read, or some old value of the object is read.

Note that it is an error only when the language does not define whether the parameters are passed by copy or by reference, which depends in particular on the type of the parameters. In such a case, one compiler may pass the parameter by copy and another by reference, so the language defines it as a bounded error. But, in any case, there is no safety hole here.
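To see why the by-copy vs. by-reference distinction matters, here is a small sketch in Rust, which resolves the same tension the other way: its borrow checker statically rejects the aliased call instead of leaving its outcome bounded but unspecified (the `accumulate` function is an illustrative name):

```rust
// Sketch: a function writing through two "in out"-style references.
// With distinct arguments the result is well defined; with aliased
// arguments, the by-copy vs. by-reference choice would change it.
fn accumulate(dst: &mut i64, src: &mut i64) {
    *dst += *src;
    *src = 0;
}

fn main() {
    let mut a = 10;
    let mut b = 5;
    accumulate(&mut a, &mut b);
    println!("{} {}", a, b); // prints "15 0"

    // The aliased call below is the case Ada classifies as a bounded
    // error; Rust rejects it at compile time instead:
    // accumulate(&mut a, &mut a);
    //   error[E0499]: cannot borrow `a` as mutable more than once
}
```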

I've noticed Ada vendors have started promoting the language for security-critical software over the past few years. It certainly has features that would help in those areas. I've advocated using languages like Ada or Modula-3 over C/C++ for secure/reliable systems software (where usable) in the past. However, I have an important question for LTU language gurus.

Most exploits of C/C++ programs are control-flow integrity violations: they smash the stack, misdirect a pointer, or something like that. An article on Ada I read mentioned that, with safety features on, the runtime would detect many errors like this and raise exceptions. So, the key question is: are Ada programs immune to code-injection attacks? What specific avenues of attack (minus D.O.S.) are available against a pure Ada program?

If Ada is only DOSable, then that's a reason in and of itself to rewrite a lot of system software in Ada. Hackers at most being able to crash a program is a big improvement over the click-to-control-the-entire-system paradigm of current C/C++ code bases.

(I'm also open to being informed about other languages/tools that automatically prevent common CFI issues, so long as they're practical, have a foreign function interface, and have good tool/IDE support.)

Strongly typed languages like Ada and Modula-3 belong to my favorite set of secure languages, and the computing world would surely be safer if one of those languages, or something similar, had taken C/C++'s place.

The Pascal family of languages has a special place in my heart.

But they also need OS support to prevent rewriting of the generated code; otherwise the generated binaries can be exploited just as easily as those produced from C and C++. It is quite easy to NOP out assembly instructions in the language's runtime.

What I have started to advocate, if you really can only go with C or C++, is to make use of static analysis, enable all warnings, and possibly treat warnings as errors.

So, I guess that's the trick of it. The runtime must be able to load and run code, but must not allow code injection at runtime, by design. At the least, it must be configurable not to do that. I know the Java-based Sandia Secure Processor has this capability. I'm thinking more of a software solution: language, tools and runtime.

I looked at Modula-3; it's great and tools are still available, though fewer than for Ada. So, any ideas for a runtime that preserves runtime integrity? Any you know of, or any good research? (I recently got a half-dozen academic papers making claims like that and I'm working through them. I'm just asking in case I missed a good solution.)

Netscape, then Perl, then most of the dynamic languages have a very good idea for stopping these sorts of attacks, called "taint checking". The idea is that any variable that comes from an untrusted input source is marked "tainted". Any variable derived from a tainted variable is tainted. You have to perform a cleaning operation on a tainted variable before you can do anything system-oriented with it. This is part of the runtime engine, so the system doesn't have to check all possible pathways at compile time.
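In a statically typed language the same discipline can be encoded in the type system rather than the runtime. A minimal sketch in Rust, where all the names (`Tainted`, `sanitize`, the alphanumeric check) are illustrative assumptions:

```rust
// Sketch: taint checking as a type. Untrusted input is wrapped in
// `Tainted`; values derived from it stay tainted; only an explicit
// cleaning step releases the inner value.
struct Tainted<T>(T);

impl<T> Tainted<T> {
    fn from_untrusted(value: T) -> Tainted<T> {
        Tainted(value)
    }

    // Any value derived from a tainted value is itself tainted.
    fn map<U, F: FnOnce(T) -> U>(self, f: F) -> Tainted<U> {
        Tainted(f(self.0))
    }
}

impl Tainted<String> {
    // The cleaning operation: only sanitized data escapes the wrapper.
    // (Alphanumeric-only is just a stand-in validation policy.)
    fn sanitize(self) -> Option<String> {
        if self.0.chars().all(|c| c.is_alphanumeric()) {
            Some(self.0)
        } else {
            None // reject anything that could smuggle in metacharacters
        }
    }
}

fn main() {
    let input = Tainted::from_untrusted(String::from("user42"));
    let upper = input.map(|s| s.to_uppercase());
    // A system-oriented API would demand a plain String, which can
    // only be obtained by going through sanitize().
    match upper.sanitize() {
        Some(clean) => println!("ok: {}", clean),
        None => println!("rejected"),
    }
}
```

The compiler then enforces the "clean before use" rule: code that tries to hand a `Tainted<String>` to an API expecting `String` simply does not type-check.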

As for most system software and Ada... I don't think that's a good idea. The features that make Ada safe, like the absence of raw pointers, make it problematic for system software. Manipulating hardware is fundamentally a low-level activity and very performance-dependent.

However, there is an Ada-based OS: MaRTE OS. They use Ada more for RTOS functions than for security, but it might be worth checking out.

I've been thinking about your comment regarding tainting. Tainting strikes me as trivial in a good strongly typed language: it is just a monad with the typical monad operations. Where it is valuable is for dynamic languages and for programs that are already written. If I had a program with, say, 10,000 variables and 1,000 of them had I/O possibilities, adding taint as just a type is miserable. This may be an example of the advantages of code isolation and typing: what is a really cool feature in dynamic languages is trivial in strongly typed languages.


As for the low-level stuff, I stand corrected. I looked that up. I hadn't known about features like "use at" and interrupt handling.

I'll look into taint checking. At the least, it might help with provenance checks later on. MaRTE OS was interesting. The Ada language can handle system software fine, as another person noted. Embedded, real-time system software is actually what it was made for. I think, though, that I'll have to use asm/C at some point near the bottom of the stack for best results. That's no problem: I'll just use separate verification tech for the asm/C code and the Ada software. Verve gave us an idea of how to do that with low-level software.

This isn't an attack against C or Ada, but rather libc and x86 -- this is really about searching linked code for instruction sequences and building gadgets out of dangerous instruction sequences. The style is called return-oriented programming by the author. I haven't kept up with the long list of citations this paper has received, but one notable advancement I am aware of is Return-oriented Programming Without Returns, co-authored by Shacham.

MULTICS defeated stack overflows by using a reverse stack in the 1970s. Some students, in 2010 I think, modified GCC to use a reverse stack on x86 with a 10% maximum performance penalty. I think our stack-overflow issues are simply a case of modern INFOSEC people not learning the lessons the old guys taught us, or history repeating itself. Happens a lot in INFOSEC.

Regarding Ada, thanks for the reply. My plan with regard to libraries is to use Ada for most stuff and C/C++ libraries for the rest. I'll choose mature, safe libraries. I intend to attempt to wrap them with SFI-style protection, like NaCl. My alternative is full Java a la JX Operating System.

The attack outlined in Shacham's paper is a brilliant subversion of the x86 instruction set, and shows that the "write XOR execute" security model is utterly broken when applied to a machine which executes that instruction set.

The attack, as noted, would be more difficult on a RISC architecture, because in a RISC instruction set you can't find unintended instruction sequences or unintended RETURN instructions. But that would just require the attacker to search more code to find his "gadgets"; current instruction sets, whether RISC or CISC, simply aren't designed to be resistant to this attack.

But it's not at all difficult to design an instruction set to be resistant.

There is a simple addition to a RISC instruction set that could make the attack outlined in that paper impossible. What you need is a LAND instruction which is required to create a landing site for JUMPs. It could be a semantic no-op in itself, but would be a bit pattern which must be present at the target address for any flow of control transfer to succeed.

Instruction sets up to now have been designed without this attack in mind, and have used "implicit" landing sites which are simply addresses, and can contain any instruction. But in an instruction set where flow-of-control operators fail unless the targeted address of the control transfer is the address of a LAND instruction, you cannot execute less than one basic block of the program or library.

If you have different kinds of LAND corresponding to different kinds of transfer (CLAND, BLAND, and RLAND to serve as targets for CALLFN, within-segment BRANCH, and RETURN transfers respectively), you can further constrain the attack; with the attacker's code in a different segment it would be impossible to transfer control into the library in any way that caused the execution of less than a whole function.

In a RISC architecture where, additionally, all instructions are the same length and have a known alignment property, you would then have no opportunity for an "accidental" LAND instruction to appear as a subsequence of an intended instruction or spanning a boundary between intended instructions. That allows you to prove rigorously that this exploit could not be used.
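The check a LAND-aware machine would perform is easy to state precisely. A toy sketch in Rust, where the instruction encoding and all names (`Insn`, `transfers_are_valid`) are illustrative assumptions, not a real ISA:

```rust
// Sketch: validating control transfers against LAND landing sites.
// A fixed-width toy instruction set; a transfer is legal only if its
// target holds a Land instruction.
#[derive(Clone, Copy, PartialEq)]
enum Insn {
    Land,        // legal landing site for control transfers
    Add,         // stand-in for ordinary computation
    Jump(usize), // transfer control to the instruction at this index
    Ret,
}

// Returns true iff every Jump in the stream targets a Land
// instruction; this is the check the hardware would do per transfer.
fn transfers_are_valid(code: &[Insn]) -> bool {
    code.iter().all(|insn| match insn {
        Insn::Jump(target) => code.get(*target) == Some(&Insn::Land),
        _ => true,
    })
}

fn main() {
    let good = [Insn::Land, Insn::Add, Insn::Jump(0), Insn::Ret];
    // This jump lands mid-block on an Add, exactly the kind of entry
    // a gadget hunter needs; the checker rejects it.
    let bad = [Insn::Land, Insn::Add, Insn::Jump(1), Insn::Ret];
    println!("{} {}", transfers_are_valid(&good), transfers_are_valid(&bad));
    // prints "true false"
}
```

In hardware the same predicate would be evaluated per transfer at execution time rather than over the whole stream, but the invariant enforced is the same: no flow of control can enter anywhere except a declared landing site.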

Unlike most rigorous security-hardening measures, the cost to the defender (in terms of "what you can't do with the system") is near zero; it makes the executables very slightly longer, which means that where CPU/memory bandwidth is a bottleneck it would run very slightly slower. It would require a few hundred additional logic gates in the chip to check for the appropriate LAND instruction when processing transfers of control; and it would require all control transfers across segment boundaries to be via explicit function call and return. And that's it. People could still write assembly code with no additional complexity needed; the symbolic addresses people use anyway to designate jump targets in assembly language would just denote LAND instructions rather than being labels for arbitrary locations containing other instructions. Finally, it wouldn't require subverting the intent or limiting the authority of the owner/user in any unusual way; the only things it makes impossible are things no authorized user needs to do on purpose anyway.

Now all I need to do is convince hardware manufacturers. The tiny marginal costs would certainly be worth it in the design of a secure machine, but the attack isn't well known, and security hasn't been an apparent high priority in consumer hardware design. Convincing some manufacturer to sacrifice instruction-set backward compatibility for its sake would probably be hard.

Hey, should I write a paper about this idea? With a little editing, the above could serve as its abstract.

Unfortunately, hardware changes to improve security/semantics are very rare.
For example, a trap on signed integer overflow is very cheap and compatible with C's semantics, but except for MIPS, no mainstream CPU has it.
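Lacking a hardware trap, the check has to be done in software. A minimal sketch in Rust (whose debug builds already panic on overflow; the `add_trapping` name is illustrative):

```rust
// Sketch: a software "trap" on signed integer overflow, standing in
// for the hardware trap most ISAs don't provide. `checked_add`
// returns None on overflow instead of silently wrapping.
fn add_trapping(a: i32, b: i32) -> i32 {
    a.checked_add(b)
        .expect("signed integer overflow") // panic plays the role of the trap
}

fn main() {
    println!("{}", add_trapping(1, 2)); // prints 3
    // add_trapping(i32::MAX, 1) would panic: "signed integer overflow"
}
```

The cost of doing this in software (a branch per arithmetic operation) is exactly what a trapping add instruction would make free.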

I'm on top of issues above language. It's just that I keep seeing language-related implementation flaws pop up and would rather my systems not be written in 100% C/C++. Too much risk. Thanks for the link.