Chapter 10. Language-Specific Issues

Undoubtedly there are all sorts of languages in the world,
yet none of them is without meaning.

1 Corinthians 14:10 (NIV)

There are many language-specific security issues.
Many of them can be summarized as follows:

Turn on all relevant warnings and protection mechanisms available to you
where practical.
For compiled languages, this includes
both compile-time mechanisms and run-time mechanisms.
In general, security-relevant programs should compile cleanly with
all warnings turned on.

If you can use a ``safe mode'' (e.g., a mode that limits the activities
of the executable), do so.
Many interpreted languages include such a mode.
In general, don't depend on the safe mode to provide absolute protection;
most language's safe modes have not been sufficiently analyzed for their
security, and when they are, people usually discover many ways to exploit it.
However, by writing your code so that it's secure out of safe mode, and
then adding the safe mode, you end up with defense-in-depth (since in
many cases, an attacker has to break both
your application code and the safe mode).

Avoid dangerous and deprecated operations in the language.
By ``dangerous'', I mean operations which are difficult to use correctly.
For example, many languages include
some mechanisms or functions that are ``magical'', that
is, they try to infer the ``right'' thing to do using a heuristic -
generally you should avoid them, because an attacker may be able to
exploit the heuristic and do something dangerous instead of what was intended.
A common error is an ``off-by-one'' error, in which the bound is
off by one, and sometimes these result in exploitable errors.
In general, write code in a way that minimizes the likelihood of
off-by-one errors.
If there are standard conventions in the language (e.g., for writing loops),
use them.

Ensure that the languages'
infrastructure (e.g., run-time library) is available and secured.

Know precisely the semantics of the operations that you are using.
Look up each operation's semantics in its documentation.
Do not ignore return values unless you're sure they cannot be relevant.
Don't ignore the difference between ``signed'' and ``unsigned'' values.
This is particularly difficult in languages which don't support exceptions,
like C, but that's the way it goes.

It is possible to develop secure code using C or C++, but both
languages include fundamental design decisions that make it
more difficult to write secure code.
C and C++ easily permit buffer overflows, force programmers to do their
own memory management, and are fairly lax in their typing systems.
For systems programs (such as an operating system kernel),
C and C++ are fine choices.
For applications, C and C++ are often over-used.
Strongly consider using an even higher-level language,
at least for the majority of the application.
But clearly, there are many existing programs in C and C++
which won't get completely rewritten, and many developers may choose
to develop in C and C++.

One of the biggest security problems with C and C++ programs is
buffer overflow; see Chapter 6
for more information.
C has the additional weakness of not supporting exceptions, which makes
it easy to write programs that ignore critical error situations.

Another problem with C and C++ is that developers have to do their
own memory management (e.g., using malloc(), alloc(), free(), new, and delete),
and failing to do it correctly may result in a security flaw.
The more serious problem is that programs may erroneously
free memory that should not be freed (e.g., because it's already been freed).
This can result in an immediate crash or be exploitable, allowing
an attacker to cause arbitrary code to be executed; see
[Anonymous Phrack 2001].
Some systems (such as many GNU/Linux systems) don't protect
against double-freeing at all by default, and it is not clear that those
systems which attempt to protect themselves are truly unsubvertable.
Although I haven't seen anything written on the subject, I suspect that
using the incorrect call in C++ (e.g., mixing new and malloc()) could
have similar effects.
For example, on March 11, 2002, it was announced that the zlib
library had this problem, affecting the many programs that use it.
Thus, when testing programs on GNU/Linux,
you should set the environment variable
MALLOC_CHECK_ to 1 or 2, and you might consider executing your program
with that environment variable set with 0, 1, 2.
The reason for this variable is explained in GNU/Linux malloc(3) man page:

Recent versions of Linux libc (later than 5.4.23) and
GNU libc (2.x) include a malloc implementation which is tunable
via environment variables.
When MALLOC_CHECK_ is set, a special (less efficient) implementation
is used which is designed to be tolerant against simple errors,
such as double calls of free() with the same argument,
or overruns of a single byte (off-by-one bugs).
Not all such errors can be protected against, however, and memory leaks
can result.
If MALLOC_CHECK_ is set to 0, any detected heap corruption
is silently ignored;
if set to 1, a diagnostic is printed on stderr;
if set to 2, abort() is called immediately.
This can be useful because otherwise a crash may happen much later,
and the true cause for the problem is then very hard to track down.

There are various tools to deal with this, such as
Electric Fence and Valgrind;
see Section 11.7 for more information.
If unused memory is not free'd, (e.g., using free()), that unused memory
may accumulate - and if enough unused memory can accumulate, the
program may stop working.
As a result, the unused memory may be exploitable by attackers to
create a denial of service.
It's theoretically possible for attackers to cause memory to be
fragmented and cause a denial of service, but usually this
is a fairly impractical and low-risk attack.

Be as strict as you reasonably can when you declare types.
Where you can, use ``enum'' to define enumerated values (and not
just a ``char'' or ``int'' with special values).
This is particularly useful for values in switch statements, where
the compiler can be used to determine if all legal values have been covered.
Where it's appropriate, use ``unsigned'' types if the value can't be
negative.

One complication in C and C++ is that the character type ``char'' can be
signed or unsigned (depending on the compiler and machine).
When a signed char with its high bit set
is saved in an integer, the result will be a negative number;
in some cases this can be exploitable.
In general, use ``unsigned char'' instead of char or signed char for
buffers, pointers, and casts when
dealing with character data that may have values greater than 127 (0x7f).

C and C++ are by definition rather lax in their type-checking support, but
you can at least increase their level of checking so that some mistakes
can be detected automatically.
Turn on as many compiler warnings as you can and change the code to cleanly
compile with them, and strictly use ANSI prototypes in separate header
(.h) files to ensure that all function calls use the correct types.
For C or C++ compilations using gcc, use at least
the following as compilation flags (which turn on a host of warning messages)
and try to eliminate all warnings (note that -O2 is used since some
warnings can only be detected by the data flow analysis performed at
higher optimization levels):

gcc -Wall -Wpointer-arith -Wstrict-prototypes -O2

You might want ``-W -pedantic'' too.

Many C/C++ compilers can detect inaccurate format strings.
For example,
gcc can warn about inaccurate format strings for functions you create
if you use its __attribute__() facility (a C extension) to mark such functions,
and you can use that facility without making your code non-portable.
Here is an example of what you'd put in your header (.h) file:

The "format" attribute takes either "printf" or "scanf", and the numbers
that follow are the parameter number of the format string and the first
variadic parameter (respectively). The GNU docs talk about this well.
Note that there are other __attribute__ facilities as well,
such as "noreturn" and "const".

Avoid common errors made by C/C++ developers.
For example, be careful about not using ``='' when you mean ``==''.