GCC Coding Conventions

There are some additional coding conventions for code in GCC,
beyond those in the GNU Coding
Standards. Some existing code may not follow these conventions,
but they must be used for new code. When changing existing
code to follow these conventions, it is best to send those changes
separately from any other changes to the code.

Documentation, both of user interfaces and of internals, must be
maintained and kept up to date. In particular:

All command-line options (including all --param
arguments) must be documented in the GCC manual.

Any change to documented behavior (for example, the behavior of
a command-line option or a GNU language extension) must include the
necessary changes to the manual.

All target macros must be documented in the GCC manual.

The documentation of the tree and RTL data
structures and interfaces must be kept complete and up to date.

In general, the documentation of all documented aspects of the
front-end and back-end interfaces must be kept up to date, and the
opportunity should be taken where possible to remedy gaps in or
limitations of the documentation.

GCC requires ChangeLog entries for documentation changes; for the web
pages (apart from java/ and libstdc++/) the CVS
commit logs are sufficient.

See also what the GNU Coding
Standards have to say about what goes in ChangeLogs; in
particular, descriptions of the purpose of code and changes should go
in comments rather than the ChangeLog, though a single line overall
description of the changes may be useful above the ChangeLog entry for
a large batch of changes.

For changes that are ported from another branch, we recommend using
a single entry whose body contains a verbatim copy of the original
entries describing the changes on that branch, possibly preceded by a
single-line overall description of the changes.

There is no established convention on when ChangeLog entries are to
be made for testsuite changes; see messages 1 and 2.

If your change fixes a PR, put PR java/58 (where
java/58 is the actual number of the PR) at the top
of the ChangeLog entry.

There are strict requirements for portability of code in GCC to
older systems whose compilers do not implement all of the
latest ISO C and C++ standards.

The directories
gcc, libcpp and fixincludes
may use C++03.
They may also use the long long type
if the host C++ compiler supports it.
These directories should use reasonably portable parts of C++03,
so that it is possible to build GCC with C++ compilers other than GCC itself.
If testing reveals that
reasonably recent versions of non-GCC C++ compilers cannot compile GCC,
then GCC code should be adjusted accordingly.
(Avoiding unusual language constructs helps immensely.)
Furthermore,
these directories should also be compatible with C++11.

The directories libiberty and libdecnumber must use C
and require at least an ANSI C89 or ISO C90 host compiler.
C code should avoid pre-standard style function definitions, unnecessary
function prototypes and use of the now deprecated PARAMS macro.
See README.Portability
for details of some of the portability problems that may arise. Some
of these problems are warned about by gcc -Wtraditional,
which is included in the default warning options in a bootstrap.

The programs included in GCC are linked with the
libiberty library, which will replace some standard
library functions if not present on the system used, so those
functions may be freely used in GCC. In particular, the ISO C string
functions memcmp, memcpy,
memmove, memset, strchr and
strrchr are preferred to the old functions
bcmp, bcopy, bzero,
index and rindex; see messages 1 and 2. The
older functions must no longer be used in GCC; apart from
index, these identifiers are poisoned to prevent their
use.
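
As a rough illustration of the preferred functions, the sketch below uses memset, memcpy and strrchr where older code might have used bzero, bcopy and rindex. The helper names copy_name and last_component are hypothetical and exist only for this example.

```cpp
#include <cstddef>
#include <cstring>

/* Copy at most SIZE - 1 bytes of SRC into DST and NUL-terminate it.
   copy_name is a hypothetical helper, not a GCC function.  */
void
copy_name (char *dst, std::size_t size, const char *src)
{
  if (size == 0)
    return;
  std::memset (dst, 0, size);         /* older style: bzero (dst, size) */
  std::size_t len = std::strlen (src);
  if (len > size - 1)
    len = size - 1;
  std::memcpy (dst, src, len);        /* older style: bcopy (src, dst, len) */
}

/* Return the text after the last '/' in PATH, or PATH itself if it
   contains no '/'.  last_component is likewise hypothetical.  */
const char *
last_component (const char *path)
{
  const char *slash = std::strrchr (path, '/');  /* older style: rindex */
  return slash ? slash + 1 : path;
}
```

Note that the argument order differs between the families: bcopy takes the source first, while memcpy takes the destination first.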

Machine-independent files may contain conditionals on features of a
particular system, but should never contain conditionals such as
#ifdef __hpux__ on the name or version of a particular
system. Exceptions may be made to this on a release branch late in
the release cycle, to reduce the risk involved in fixing a problem
that only shows up on one particular system.
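
A minimal sketch of the distinction, assuming a feature macro named HAVE_DOS_BASED_FILE_SYSTEM that the build defines on hosts whose file names may use backslashes. The macro name and the function dir_separator_p are used here as illustrations only. The conditional tests the feature, not the name of a system.

```cpp
/* Return true if C separates directory components in a file name.
   The conditional is on a feature of the host (assumed to be set by
   the build machinery), never on a system name such as __hpux__.  */
bool
dir_separator_p (char c)
{
#ifdef HAVE_DOS_BASED_FILE_SYSTEM
  return c == '/' || c == '\\';
#else
  return c == '/';
#endif
}
```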

Function prototypes for extern functions should only occur in
header files. Functions should be ordered within source files to
minimize the number of function prototypes, by defining them before
their first use. Function prototypes should only be used when
necessary, to break mutually recursive cycles.
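
The ordering rule can be sketched as follows, with hypothetical function names. The helper square is defined before its first use and so needs no prototype; is_even and is_odd call each other, so exactly one forward declaration is unavoidable.

```cpp
/* Defined before its first use, so no prototype is needed.  */
static unsigned
square (unsigned x)
{
  return x * x;
}

static unsigned
sum_of_squares (unsigned a, unsigned b)
{
  return square (a) + square (b);
}

/* is_even and is_odd are mutually recursive, so one of them must be
   declared before the other is defined.  */
static bool is_odd (unsigned n);

static bool
is_even (unsigned n)
{
  return n == 0 ? true : is_odd (n - 1);
}

static bool
is_odd (unsigned n)
{
  return n == 0 ? false : is_even (n - 1);
}
```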

Every language or library feature, whether standard or a GNU
extension, and every warning GCC can give, should have testcases
thoroughly covering both its specification and its implementation.
Every bug fixed should have a testcase to detect if the bug
recurs.

The testsuite READMEs discuss the requirement to use abort ()
for runtime failures and exit (0) for success.
For compile-time tests, a trick taken from autoconf may be used to evaluate
expressions: a declaration extern char x[(EXPR) ? 1 : -1];
will compile successfully if and only if EXPR is nonzero.
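
A small sketch of both techniques, assuming a hypothetical function add as the code under test. The array declaration is rejected at compile time if its condition is zero, and the run-time check calls abort () on failure.

```cpp
#include <climits>
#include <cstdlib>

/* Compile-time check in the autoconf style: if the condition were
   zero, the array size would be -1 and the declaration would be
   rejected.  The condition itself is only an example.  */
extern char int_has_at_least_16_bits[(sizeof (int) * CHAR_BIT >= 16) ? 1 : -1];

/* Hypothetical function under test.  */
int
add (int a, int b)
{
  return a + b;
}

/* Run-time check: abort () reports failure.  */
void
check_add ()
{
  if (add (2, 3) != 5)
    std::abort ();
}
```

In a complete testcase, main would call check_add and then finish with exit (0).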

Where appropriate, testsuite entries should include comments giving
their origin: the people who added them or submitted the bug report
they relate to, possibly with a reference to a PR in our bug tracking
system. There are some copyright
guidelines on what can be included in the testsuite.

If a testcase itself is incorrect, but there's a possibility that an
improved testcase might fail on some platform where the incorrect
testcase passed, the old testcase should be removed and a new testcase
(with a different name) should be added. This helps automated
regression-checkers distinguish a true regression from an improvement
to the test suite.

Use of the input_location global, and of the
diagnostic functions that implicitly use input_location,
is deprecated; the preferred technique is to pass around locations
ultimately derived from the location of some explicitly chosen source
code token.

Diagnostics using the GCC diagnostic functions should generally
use the GCC-specific formats such as %qs or
%< and %> for quoting and
%m for errno numbers.

Identifiers should generally be formatted with %E or
%qE; use of identifier_to_locale is needed
if the identifier text is used directly.

Formats such as %wd should be used with types such as
HOST_WIDE_INT (HOST_WIDE_INT_PRINT_DEC is a
format for the host printf functions, not for the GCC
diagnostic functions).

error is for defects in the user's code.

sorry is for correct user input programs but
unimplemented functionalities.

warning is for advisory diagnostics; it
may be used for diagnostics that have severity less than an
error.

inform is for adding additional explanatory
information to a diagnostic.

internal_error is used for conditions that should not
be triggered by any user input, whether valid or invalid, including
invalid asms and LTO binary data (sometimes, as an exception, there is
a call to error before further information is printed and
an ICE is triggered). Assertion failures should not be triggered by
invalid input.

All diagnostics should be full sentences without English
fragments substituted in them, to facilitate translation.
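
To illustrate the translation point, the sketch below builds the same message in two ways using plain strings; it does not use the GCC diagnostic functions, and both function names are hypothetical. The first version splices an English fragment into a shared sentence, which a translator cannot reorder or inflect. The second keeps one complete sentence per case.

```cpp
#include <string>

/* Hard to translate: "argument" or "parameter" is substituted into a
   sentence whose grammar may depend on the word chosen.  */
std::string
describe_fragmented (bool is_argument, const char *name)
{
  return std::string ("invalid ")
	 + (is_argument ? "argument" : "parameter")
	 + " '" + name + "'";
}

/* Preferred: each case is a complete message that can be translated
   as a unit.  */
std::string
describe_whole (bool is_argument, const char *name)
{
  if (is_argument)
    return std::string ("invalid argument '") + name + "'";
  return std::string ("invalid parameter '") + name + "'";
}
```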

The following conventions of spelling and terminology apply
throughout GCC, including the manuals, web pages, diagnostics,
comments, and (except where they require spaces or hyphens to be used)
function and variable names, although consistency in user-visible
documentation and diagnostics is more important than that in comments
and code. The following table lists some simple cases:

Use... | ...instead of | Rationale
---------------------------------
American spelling (in particular -ize, -or) | British spelling (in particular -ise, -our) |
"32-bit" (adjective) | "32 bit" |
"alphanumeric" | "alpha numeric" |
"back end" (noun) | "back-end" or "backend" |
"back-end" (adjective) | "back end" or "backend" |
"bit-field" | "bit field" or "bitfield" | Spelling used in C and C++ standards
"built-in" as an adjective ("built-in function") or "built in" | "builtin" | "builtin" isn't a word
"bug fix" (noun) or "bug-fix" (adjective) | "bugfix" or "bug-fix" | "bugfix" isn't a word
"ColdFire" | "coldfire" or "Coldfire" |
"command-line option" | "command line option" |
"compilation time" (noun); how long it takes to compile the program | "compile time" |
"compile time" (noun), "compile-time" (adjective); the time at which the program is compiled | |
"dependent" (adjective), "dependence", "dependency" | "dependant", "dependance", "dependancy" |
"enumerated" | "enumeral" | Terminology used in C and C++ standards
"epilogue" | "epilog" | Established convention
"execution time" (noun); how long it takes the program to run | "run time" or "runtime" |
"floating-point" (adjective) | "floating point" |
"free software" or just "free" | "Open Source" or "OpenSource" |
"front end" (noun) | "front-end" or "frontend" |
"front-end" (adjective) | "front end" or "frontend" |
"GNU/Linux" (except in reference to the kernel) | "Linux" or "linux" or "Linux/GNU" |
"link time" (noun), "link-time" (adjective); the time at which the program is linked | |
"lowercase" | "lower case" or "lower-case" |
"H8S" | "H8/S" |
"Microsoft Windows" | "Windows" |
"MIPS" | "Mips" or "mips" |
"nonzero" | "non-zero" or "non zero" |
"Objective-C" | "Objective C" |
"prologue" | "prolog" | Established convention
"PowerPC" | "powerpc", "powerPC" or "PowerPc" |
"Red Hat" | "RedHat" or "Redhat" |
"run time" (noun), "run-time" (adjective); the time at which the program is run | "runtime" |
"runtime" (both noun and adjective); libraries and system support present at run time | "run time", "run-time" |
"SPARC" | "Sparc" or "sparc" |
"testcase", "testsuite" | "test-case" or "test case", "test-suite" or "test suite" |
"uppercase" | "upper case" or "upper-case" |
"VAX", "VAXen", "MicroVAX" | "vax" or "Vax", "vaxen" or "vaxes", "microvax" or "microVAX" |

"GCC" should be used for the GNU Compiler Collection, both
generally and as the GNU C Compiler in the context of compiling C;
"G++" for the C++ compiler; "gcc" and "g++" (lowercase), marked up
with @command when in Texinfo, for the commands for
compilation when the emphasis is on those; "GNU C" and "GNU C++" for
language dialects; and try to avoid the older term "GNU CC".

Use a comma after "e.g." or "i.e." if and only if it is appropriate
in the context and the slight pause a comma means helps the reader; do
not add them automatically in all cases just because some style guides
say so. (In Texinfo manuals, @: should be used after
"e.g." and "i.e." when a comma isn't used.)

In Texinfo manuals, Texinfo 4.0 features may be used, and should be
used where appropriate. URLs should be marked up with
@uref; email addresses with @email;
command-line options with @option; names of commands with
@command; environment variables with @env.
NULL should be written as @code{NULL}. Tables of
contents should come just after the title page; printed manuals will
be formatted (for example, by make dvi) using
texi2dvi which reruns TeX until cross-references
stabilize, so there is no need for a table of contents to go at the
end for it to have correct page numbers. The @refill
feature is obsolete and should not be used. All manuals should use
@dircategory and @direntry to provide Info
directory information for install-info.

It is useful to read the Texinfo manual.
Some general Texinfo style issues discussed in that manual should be
noted:

For proper formatting of the printed manual, TeX quotes (matched
` or `` and ' or
'') should be used; neutral ASCII double quotes
("...") should not be. Similarly, TeX dashes
(-- (two hyphens) for an en dash and ---
(three hyphens) for an em dash) should be used; normally these
dashes should not have whitespace on either side. Minus signs
should be written as @minus{}.

For an ellipsis, @dots{} should be used; for a
literal sequence of three dots in a programming language, the dots
should be written as such (...) rather than as
@dots{}.

English text in programming language comments in examples should
be enclosed in @r{} so that it is printed in a
non-fixed-width font.

Full stops that end sentences should be
followed by two spaces or by end of line. Full
stops that are preceded by a lowercase letter but do not end a
sentence should be followed by @: if they are not
followed by other punctuation such as a comma; full stops, question
marks and exclamation marks that end a sentence but are preceded by
an uppercase letter should be written as "@.",
"@?" and "@!", respectively. (This is not
required if the capital letter is within @code or
@samp.)

Upstream packages

Some files and packages in the GCC source tree are imported from
elsewhere, and we want to minimize divergence from their upstream sources.
The following files should be updated only according to the rules set
below:

config.guess, config.sub: The master copy of these files is at ftp://ftp.gnu.org/pub/gnu/config.
Proposed changes should be e-mailed to config-patches@gnu.org. Only
after the change makes it to the FTP site should the new files be
installed in the GCC source tree, so that their version numbers remain
meaningful and unique. Do not install the patch; install the whole
file.

ltmain.sh, libtool.m4, ltoptions.m4, ltsugar.m4, ltversion.m4,
lt~obsolete.m4, and formerly also ltconfig, ltcf-c.sh, ltcf-cxx.sh,
ltcf-gcj.sh: The master copy of these files is the source repository of
GNU
libtool. Patches should be posted to libtool-patches@gnu.org.
Only after the change makes it to the libtool source tree should the new
files be installed in the GCC source tree.
ltgcc.m4 is not imported from upstream.
ltconfig and ltmain.sh are generated files from ltconfig.in and
ltmain.in, respectively, and with libtool 2.1, the latter is generated
from ltmain.m4sh, so, when you post the patch, make sure you're
patching the source file, not the generated one. When you update
these generated files in the GCC repository, make sure they report the
same timestamp and version number, and note this version number in the
ChangeLog.

Top-level configure.ac, configure, Makefile.in, config-ml.in,
config.if and most other top-level shell scripts: Please try to keep
these files in sync with the corresponding files in the src repository
at sourceware.org. Some people hope to eventually merge these
trees into a single repository; keeping them in sync helps this goal.
When you check in a patch to one of these files, please check it in
the src tree too, or ask someone else with write access there to
do so.

libjava/classpath: The master sources come from
GNU Classpath.
New versions of Classpath are periodically imported into the GCC source
tree. In general local modifications are prohibited, but they can be
checked in for emergencies, such as fixing bootstrap.

zlib: The master sources come from
zlib.net. However, the
autoconf-based configury is a local GCC invention. Changes to zlib
outside the build system are discouraged, and should be sent upstream
first.

libstdc++-v3: In docs/doxygen, comments in *.cfg.in are
partially autogenerated from the
Doxygen tool. In docs/html, the ext/lwg-* files are copied from the C++ committee homepage,
the 27_io/binary_iostream_* files are copies of Usenet postings, and most
of the files in 17_intro are either copied from elsewhere in GCC or the
FSF website, or are autogenerated. These files should not be changed
without prior permission, if at all.

Code should use gcc_assert (EXPR) to check invariants.
Use gcc_unreachable () to mark places that should never be
reachable (such as an unreachable default case of a
switch). Do not use gcc_assert (0) for such purposes, as
gcc_unreachable gives the compiler more information. The
assertions are enabled unless explicitly configured off with
--enable-checking=none. Do not use abort.
User input should never be validated by either gcc_assert
or gcc_unreachable. If the checks are expensive or the
compiler can reasonably carry on after the error, they may be
conditioned on --enable-checking
by using gcc_checking_assert.
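
The sketch below shows the intended division of labour. So that it compiles outside the GCC tree, it defines simplified stand-ins for gcc_assert and gcc_unreachable that merely call abort; these stand-ins are assumptions made for the example and are not the definitions GCC uses. The enumeration and both functions are hypothetical.

```cpp
#include <cstdlib>

/* Simplified stand-ins, used only so this sketch is self-contained.
   Inside GCC the real macros come from the compiler's own headers.  */
#ifndef gcc_assert
#define gcc_assert(EXPR) ((void) ((EXPR) ? 0 : (std::abort (), 0)))
#endif
#ifndef gcc_unreachable
#define gcc_unreachable() (std::abort ())
#endif

enum shape_kind { SHAPE_SQUARE, SHAPE_TRIANGLE };  /* hypothetical */

int
side_count (shape_kind kind)
{
  switch (kind)
    {
    case SHAPE_SQUARE:
      return 4;
    case SHAPE_TRIANGLE:
      return 3;
    default:
      /* Callers guarantee a valid kind; use gcc_unreachable here
	 rather than gcc_assert (0).  */
      gcc_unreachable ();
    }
}

int
scaled_side_count (shape_kind kind, int factor)
{
  /* An internal invariant, not a check on user input.  */
  gcc_assert (factor > 0);
  return side_count (kind) * factor;
}
```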

Code testing properties of characters from user source code should
use macros such as ISALPHA from safe-ctype.h
instead of the standard functions such as isalpha from
<ctype.h> to avoid any locale-dependency of the
language accepted.
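
As an illustration of what locale independence means here, the sketch below uses a hand-written predicate in place of the safe-ctype.h macros so that it is self-contained. It assumes an ASCII-compatible host character set, and the names ascii_alpha_p and leading_alpha_length are hypothetical. Unlike isalpha, its answer cannot change with the current locale.

```cpp
#include <cstddef>

/* Hand-written stand-in for the kind of predicate safe-ctype.h
   provides; it never consults the current locale.  */
inline bool
ascii_alpha_p (unsigned char c)
{
  return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

/* Return the length of the run of ASCII letters at the start of S.
   The terminating NUL is not a letter, so the loop always stops.  */
std::size_t
leading_alpha_length (const char *s)
{
  std::size_t n = 0;
  while (ascii_alpha_p (static_cast<unsigned char> (s[n])))
    ++n;
  return n;
}
```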

Macro names should be in ALL_CAPS
when it's important to be aware that it's a macro
(e.g. accessors and simple predicates),
but in lowercase (e.g., size_int)
where the macro is a wrapper for efficiency
that should be considered as a function;
see messages 1 and 2.
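
A brief sketch of the two naming cases, using a hypothetical node structure. NODE_CODE is an accessor that expands to an lvalue, so the reader benefits from knowing it is a macro; node_has_next is meant to be read as if it were a function.

```cpp
/* Hypothetical data structure for this example.  */
struct node
{
  int code;
  node *next;
};

/* Accessor: ALL_CAPS, because it expands to an lvalue and can be
   assigned to, which a function call could not be.  */
#define NODE_CODE(N) ((N)->code)

/* Wrapper for efficiency: lowercase, because callers should think of
   it as an ordinary function.  */
#define node_has_next(N) ((N)->next != 0)
```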

C++ is a complex language,
and we strive to use it in a manner that is not surprising.
So, the primary rule is to be reasonable.
Use a language feature in known good ways.
If you need to use a feature in an unusual way,
or a way that violates the "should" rules below,
seek guidance, review and feedback from the wider community.

All use of C++ features
is subject to the decisions of the maintainers of the relevant components.
(This restates something that is always true for GCC,
which is that
component maintainers make the final decisions about those components.)

Variables should be defined at the point of first use,
rather than at the top of the function.
The existing code obviously does not follow that rule,
so variables may be defined at the top of the function,
as in C90.

Variables may be simultaneously defined and tested in control expressions.
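
A minimal sketch of a variable that is defined and tested in one control expression; option_value is a hypothetical function. The pointer eq is in scope only where it is known to be non-null.

```cpp
#include <cstring>
#include <string>

/* Return the text following the first '=' in ARG, or an empty string
   if ARG contains no '='.  */
std::string
option_value (const char *arg)
{
  /* EQ is defined at its point of first use and tested immediately.  */
  if (const char *eq = std::strchr (arg, '='))
    return std::string (eq + 1);
  return std::string ();
}
```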

A non-POD type will often (but not always)
have a declaration of a
special member function.
If any one of these is declared,
then all should be either declared
or have an explicit comment saying that the default is intended.
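
The rule can be sketched with two hypothetical classes. name_buffer owns memory, so it declares every special member function; line_span declares only a default constructor and documents that the remaining defaults are intended.

```cpp
#include <cstring>

/* Return a heap-allocated copy of TEXT.  */
static char *
dup_text (const char *text)
{
  char *copy = new char[std::strlen (text) + 1];
  std::strcpy (copy, text);
  return copy;
}

/* Owns its storage, so all special member functions are declared.  */
class name_buffer
{
public:
  explicit name_buffer (const char *text) : m_text (dup_text (text)) {}
  name_buffer (const name_buffer &other) : m_text (dup_text (other.m_text)) {}
  name_buffer &
  operator= (const name_buffer &other)
  {
    if (this != &other)
      {
	char *copy = dup_text (other.m_text);
	delete[] m_text;
	m_text = copy;
      }
    return *this;
  }
  ~name_buffer () { delete[] m_text; }

  const char *c_str () const { return m_text; }

private:
  char *m_text;
};

/* Declares one special member function and comments on the rest.  */
class line_span
{
public:
  line_span () : m_first (0), m_last (0) {}
  /* Default copy constructor, copy assignment operator and destructor
     are intended.  */

  int length () const { return m_last - m_first; }

private:
  int m_first;
  int m_last;
};
```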

Single inheritance is permitted.
Use public inheritance to describe interface inheritance,
i.e. 'is-a' relationships.
Use private and protected inheritance
to describe implementation inheritance.
Implementation inheritance can be expedient,
but think twice before using it in code
intended to last a long time.

Complex hierarchies are to be avoided.
Take special care with multiple inheritance.
On the rare occasion that using multiple inheritance is indeed useful,
prepare design rationales in advance,
and take special care to make documentation of the entire hierarchy clear.

Think carefully about the size and performance impact
of virtual functions and virtual bases
before using them.

Overloading operators is permitted,
but take care to ensure that overloads are not surprising.
Some unsurprising uses are
in the implementation of numeric types and
in following the C++ Standard Library's conventions.
In addition, overloaded operators, excepting the call operator,
should not be used for expensive implementations.
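
One unsurprising use is a small numeric type, sketched below with a hypothetical fixed_point class that stores a value in hundredths. Each operator is cheap and behaves like its built-in counterpart.

```cpp
/* Hypothetical numeric type: a value stored in hundredths.  */
class fixed_point
{
public:
  explicit fixed_point (long hundredths) : m_hundredths (hundredths) {}

  long hundredths () const { return m_hundredths; }

  /* Behaves like += on a built-in arithmetic type.  */
  fixed_point &
  operator+= (const fixed_point &rhs)
  {
    m_hundredths += rhs.m_hundredths;
    return *this;
  }

private:
  long m_hundredths;
};

/* Implemented in terms of +=, following the usual library idiom.  */
inline fixed_point
operator+ (fixed_point lhs, const fixed_point &rhs)
{
  lhs += rhs;
  return lhs;
}

inline bool
operator== (const fixed_point &a, const fixed_point &b)
{
  return a.hundredths () == b.hundredths ();
}
```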

Default arguments are another type of function overloading,
and the same rules apply.
Default arguments must always be POD values, i.e. may not run constructors.
Virtual functions should not have default arguments.
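
A sketch of both rules, with hypothetical names throughout. round_up has a default argument that is a plain integer constant. The virtual function columns takes its argument explicitly, and a non-virtual wrapper supplies the default, so the default cannot differ between a base class and a derived class.

```cpp
/* The default argument is a plain integer constant, so evaluating it
   runs no constructor.  */
int
round_up (int value, int align = 8)
{
  return (value + align - 1) / align * align;
}

class indenter
{
public:
  /* Default constructor and copy operations: defaults are intended.  */
  virtual ~indenter () {}

  /* No default argument on the virtual function itself.  */
  virtual int columns (int depth) const { return depth * 2; }

  /* The non-virtual wrapper fixes the default in one place.  */
  int columns_default () const { return columns (1); }
};

class wide_indenter : public indenter
{
public:
  virtual int columns (int depth) const { return depth * 8; }
};
```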

Constructors and destructors, even those with empty bodies,
are often much larger than programmers expect.
Prefer non-inline versions unless you have evidence
that the inline version is smaller or has a significant performance impact.

Namespaces are encouraged.
All separable libraries should have a unique global namespace.
All individual tools should have a unique global namespace.
Nested include directory names should map to nested namespaces when possible.

Header files should have neither using directives
nor namespace-scope using declarations.

Run-time type information (RTTI) is permitted
when certain non-default --enable-checking options are enabled,
so as to allow checkers to report dynamic types.
However, by default, RTTI is not permitted
and the compiler must build cleanly with -fno-rtti.

Open a namespace with the namespace name
followed by a left brace and a new line.

namespace gnutool {

Close a namespace
with a right brace, optional closing comment, and a new line.

} // namespace gnutool

Definitions within the body of a namespace are not indented.

For questions related to the use of GCC,
please consult these web pages and the
GCC manuals. If that fails,
the gcc-help@gcc.gnu.org
mailing list might help.
Comments on these web pages and the development of GCC are welcome on our
developer list at gcc@gcc.gnu.org.
All of our lists
have public archives.

Copyright (C)
Free Software Foundation, Inc.
Verbatim copying and distribution of this entire article is
permitted in any medium, provided this notice is preserved.