This document contains the release notes for the LLVM Compiler
Infrastructure, release 2.6. Here we describe the status of LLVM, including
major improvements from the previous release and significant known problems.
All LLVM releases may be downloaded from the LLVM releases web site.

Note that if you are reading this file from a Subversion checkout or the
main LLVM web page, this document applies to the next release, not the
current one. To see the release notes for a specific release, please see the
releases page.

The LLVM 2.6 distribution currently consists of code from the core LLVM
repository (which roughly includes the LLVM optimizers, code generators
and supporting tools), the Clang repository and the llvm-gcc repository. In
addition to this code, the LLVM Project includes other sub-projects that are in
development. Here we include updates on these subprojects.

The Clang project is an effort to build
a set of new 'LLVM native' front-end technologies for the C family of languages.
LLVM 2.6 is the first release to officially include Clang, and it provides a
production quality C and Objective-C compiler. If you are interested in fast compiles and
good diagnostics, we
encourage you to try it out. Clang currently compiles typical Objective-C code
3x faster than GCC and compiles C code about 30% faster than GCC at -O0 -g
(which is when the most pressure is on the frontend).

Beyond these languages, C++ support is well under way, and mainline
Clang is able to parse the libstdc++ 4.2 headers and even codegen simple apps.
If you are interested in Clang C++ support or any other Clang feature, we
strongly encourage you to get involved on the Clang front-end mailing
list.

In the LLVM 2.6 time-frame, the Clang team has made many improvements:

Previously announced in the 2.4 and 2.5 LLVM releases, the Clang project also
includes an early stage static source code analysis tool for automatically finding bugs
in C and Objective-C programs. The tool performs checks to find
bugs that occur on a specific path within a program.

In the LLVM 2.6 time-frame, the analyzer core has undergone several important
improvements and cleanups and now includes a new Checker interface that
is intended to eventually serve as a basis for domain-specific checks. Further,
in addition to generating HTML files for reporting analysis results, the
analyzer can now also emit bug reports in a structured XML format that is
intended to be easily readable by other programs.

The set of checks performed by the static analyzer continues to expand, and
future plans for the tool include full source-level inter-procedural analysis
and deeper checks such as buffer overrun detection. There are many opportunities
to extend and enhance the static analyzer, and anyone interested in working on
this project is encouraged to get involved!

The new LLVM compiler-rt project
is a simple library that provides an implementation of the low-level
target-specific hooks required by code generation and other runtime components.
For example, when compiling for a 32-bit target, converting a double to a 64-bit
unsigned integer is compiled into a runtime call to the "__fixunsdfdi"
function. The compiler-rt library provides highly optimized implementations of
this and other low-level routines (some are 3x faster than the equivalent
libgcc routines).
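
As a rough illustration of the kind of source construct involved, the sketch below shows a double-to-unsigned-64-bit conversion. The function name `to_u64` is hypothetical; whether a given compiler emits a call to `__fixunsdfdi` for it depends on the target and is not guaranteed by this example.

```cpp
#include <cassert>
#include <cstdint>

// Converts a double to a 64-bit unsigned integer. On a 32-bit target with no
// native instruction for this conversion, a compiler may lower the cast to a
// call to a runtime helper such as __fixunsdfdi (the routine named above).
// to_u64 is a hypothetical name used only for this sketch.
uint64_t to_u64(double value) {
    // Converting a negative double to an unsigned type is undefined behavior.
    assert(value >= 0.0);
    return static_cast<uint64_t>(value);
}
```

The conversion truncates toward zero, so `to_u64(3.9)` yields 3.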

All of the code in the compiler-rt project is available under the standard LLVM
License, a "BSD-style" license.

The new LLVM KLEE project is a symbolic
execution framework for programs in LLVM bitcode form. KLEE tries to
symbolically evaluate "all" paths through the application and records state
transitions that lead to fault states. This allows it to construct testcases
that lead to faults and can even be used to verify algorithms. For more
details, please see the OSDI 2008 paper about
KLEE.

The goal of DragonEgg is to make
gcc-4.5 act like llvm-gcc without requiring any gcc modifications whatsoever.
DragonEgg is a shared library (dragonegg.so)
that is loaded by gcc at runtime. It uses the new gcc plugin architecture to
disable the GCC optimizers and code generators, and schedule the LLVM optimizers
and code generators (or direct output of LLVM IR) instead. Currently only Linux
and Darwin are supported, and only on x86-32 and x86-64. It should be easy to
add additional unix-like architectures and other processor families. In theory
it should be possible to use DragonEgg
with any language supported by gcc; however, only C and Fortran work well for the
moment. Ada and C++ work to some extent, while Java, Obj-C and Obj-C++ are so
far entirely untested. Since gcc-4.5 has not yet been released, neither has
DragonEgg. To build
DragonEgg you will need to check out the
development versions of gcc,
llvm and
DragonEgg from their respective
subversion repositories, and follow the instructions in the
DragonEgg README.

The LLVM Machine Code (MC) Toolkit project is a (very early) effort to build
better tools for dealing with machine code, object file formats, etc. The idea
is to be able to generate most of the target specific details of assemblers and
disassemblers from existing LLVM target .td files (with suitable enhancements),
and to build infrastructure for reading and writing common object file formats.
One of the first deliverables is to build a full assembler and integrate it into
the compiler, which is predicted to substantially reduce compile time in some
scenarios.

In the LLVM 2.6 timeframe, the MC framework has grown to the point where it
can reliably parse and pretty print (with some encoding information) a
darwin/x86 .s file, and has the very early phases of a Mach-O
assembler in progress. Beyond the MC framework itself, major refactoring of the
LLVM code generator has started. The idea is to make the code generator reason
about the code it is producing in a much more semantic way, rather than a
textual way. For example, the code generator now uses MCSection objects to
represent section assignments, instead of text strings that print to .section
directives.

MC is an early and ongoing project that will hopefully continue to lead to
many improvements in the code generator and build infrastructure useful for many
other situations.

An exciting aspect of LLVM is that it is used as an enabling technology for
a lot of other language and tools projects. This section lists some of the
projects that have already been updated to work with LLVM 2.6.

Rubinius is an environment
for running Ruby code which strives to write as much of the core class
implementation in Ruby as possible. Combined with a bytecode interpreting VM, it
uses LLVM to optimize and compile Ruby code down to machine code. Techniques
such as type feedback, method inlining, and uncommon traps are all used to
remove dynamism from Ruby execution and increase performance.

Since LLVM 2.5, Rubinius has made several major leaps forward, implementing
a counter-based JIT, type feedback and speculative method inlining.

MacRuby is an implementation of Ruby on top of
core Mac OS X technologies, such as the Objective-C common runtime and garbage
collector and the CoreFoundation framework. It is principally developed by
Apple and aims at enabling the creation of full-fledged Mac OS X applications.

Pure
is an algebraic/functional programming language based on term rewriting.
Programs are collections of equations which are used to evaluate expressions in
a symbolic fashion. Pure offers dynamic typing, eager and lazy evaluation,
lexical closures, a hygienic macro system (also based on term rewriting),
built-in list and matrix support (including list and matrix comprehensions) and
an easy-to-use C interface. The interpreter uses LLVM as a backend to
JIT-compile Pure programs to fast native code.

Pure versions 0.31 and later have been tested and are known to work with
LLVM 2.6 (and continue to work with older LLVM releases >= 2.3 as well).

LDC is an implementation of
the D Programming Language using the LLVM optimizer and code generator.
The LDC project works great with the LLVM 2.6 release. General improvements in
this cycle have included new inline asm constraint handling, better debug info
support, general bug fixes and better x86-64 support. This has allowed
some major improvements in LDC, getting it much closer to being as
fully featured as the original DMD compiler from DigitalMars.

Roadsend PHP (rphp) is an open
source implementation of the PHP programming
language that uses LLVM for its optimizer, JIT and static compiler. This is a
reimplementation of an earlier project that is now based on LLVM.

IcedTea provides a
harness to build OpenJDK using only free software build tools and to provide
replacements for the not-yet free parts of OpenJDK. One of the extensions that
IcedTea provides is a new JIT compiler named Shark which uses LLVM
to provide native code generation without introducing processor-dependent
code.

LLVM now supports doing optimization and code generation on multiple
threads. Please see the LLVM
Programmer's Manual for more information.

LLVM now has experimental support for embedded
metadata in LLVM IR, though the implementation is not guaranteed to be
final and the .bc file format may change in future releases. Debug info
does not yet use this format in LLVM 2.6.

LLVM IR has several new features that better support new targets and
expose new optimization opportunities:

The add, sub and mul
instructions have been split into integer and floating point versions (like
divide and remainder), introducing new fadd, fsub,
and fmul instructions.

The add, sub and mul
instructions now support optional "nsw" and "nuw" bits which indicate that
the operation is guaranteed to not overflow (in the signed or
unsigned case, respectively). This gives the optimizer more information and
can be used for things like C signed integer values, which are undefined on
overflow.
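
The following sketch illustrates why the distinction matters at the source level. It assumes a front-end that marks signed additions as non-wrapping; the function names are hypothetical and the comments describe what an optimizer is permitted to do, not what any particular LLVM version does.

```cpp
#include <cassert>
#include <climits>

// Signed overflow is undefined in C and C++, so a front-end may mark this
// addition as non-wrapping ("nsw"). The optimizer is then free to fold the
// comparison to true.
bool next_is_greater(int x) {
    assert(x < INT_MAX);  // keep the example itself free of overflow
    return x + 1 > x;
}

// Unsigned arithmetic wraps by definition, so this addition cannot carry a
// no-wrap guarantee and the comparison really has to be evaluated.
bool next_is_greater_unsigned(unsigned x) {
    return x + 1u > x;
}
```

For the unsigned version the result is false when `x` is `UINT_MAX`, because the sum wraps to zero.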

The sdiv instruction now supports an
optional "exact" flag which indicates that the result of the division is
guaranteed to have a remainder of zero. This is useful for optimizing pointer
subtraction in C.
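
A minimal sketch of the pointer subtraction case follows. The helper names are hypothetical. The point is only that the byte distance between two elements of the same array is always a multiple of the element size, so the implied division has no remainder.

```cpp
#include <cassert>
#include <cstddef>

// Both pointers must point into the same array. The byte distance is then a
// multiple of sizeof(int), so dividing it by the element size leaves no
// remainder; this is the situation the "exact" flag describes.
std::ptrdiff_t element_distance(const int *first, const int *last) {
    assert(first != nullptr && last != nullptr);
    return last - first;
}

// Small usage example: the distance between elements 2 and 7 is 5.
std::ptrdiff_t demo_distance() {
    int values[8] = {};
    return element_distance(&values[2], &values[7]);
}
```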

The getelementptr instruction now
supports an "inbounds" optimization hint that tells the optimizer that the
pointer is guaranteed to be within its allocated object.

LLVM now supports a series of new linkage types for global values which allow
for better optimization and new capabilities:

linkonce_odr and
weak_odr have the same linkage
semantics as the non-"odr" linkage types. The difference is that these
linkage types indicate that all definitions of the specified function
are guaranteed to have the same semantics. This allows inlining
template functions in C++ but not inlining weak functions in C,
which previously both got the same linkage type.
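
The source-level difference between the two cases can be sketched as follows. This assumes a GCC-compatible compiler for the `weak` attribute; the function names are hypothetical, and the mapping to specific linkage types is left to the front-end.

```cpp
#include <cassert>

// An implicitly instantiated template: every translation unit that emits
// square<int> must emit an equivalent definition (the C++ one-definition
// rule), so inlining this body into callers is safe. This is the "_odr" case.
template <typename T>
T square(T x) { return x * x; }

// A weak function (GCC/Clang attribute): another object file may replace it
// with a different definition at link time, so callers should not be
// optimized on the assumption that this body is the one finally used.
__attribute__((weak)) int default_scale() { return 1; }

int scaled_square(int x) {
    return square(x) * default_scale();
}
```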

available_externally
is a new linkage type that gives the optimizer visibility into the
definition of a function (allowing inlining and side effect analysis)
but that does not cause code to be generated. This allows better
optimization of "GNU inline" functions, extern templates, etc.
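
An extern template is one source construct that fits this description. The sketch below uses hypothetical names and places the explicit instantiation definition in the same file only so that the example links on its own.

```cpp
#include <cassert>

template <typename T>
T twice(T x) { return x + x; }

// Explicit instantiation declaration: promises that twice<int> is emitted
// somewhere else, so this translation unit need not generate code for it.
// The body remains visible, which is what would let an optimizer inline it.
extern template int twice<int>(int);

int quadruple(int x) { return twice(twice(x)); }

// In a real project this definition would live in exactly one other source
// file. It is here only to keep the sketch self-contained.
template int twice<int>(int);
```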

linker_private is a
new linkage type (which is only useful on Mac OS X) that is used for
some metadata generation and other obscure things.

Finally, target-specific intrinsics can now return multiple values, which
is useful for modeling target operations with multiple results.

In addition to a large array of minor performance tweaks and bug fixes, this
release includes a few major enhancements and additions to the optimizers:

The Scalar Replacement of Aggregates
pass has many improvements that allow it to better promote vector unions,
variables which are memset, and much stranger code (such as code that
happens to do bitfield accesses) to register operations. An interesting change is that
it now produces "unusual" integer sizes (like i1704) in some cases and lets
other optimizers clean things up.

The Loop Strength Reduction pass now
promotes small integer induction variables to 64-bit on 64-bit targets,
which provides a major performance boost for much numerical code. It also
promotes shorts to int on 32-bit hosts, etc. LSR now also analyzes pointer
expressions (e.g. getelementptrs), as well as integers.
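
The kind of loop that benefits looks roughly like the sketch below. The function names are hypothetical, and whether a given compiler version performs the widening for this exact loop is not guaranteed.

```cpp
#include <cassert>

// The induction variable `i` is a 32-bit int, but on a 64-bit target each
// data[i] access needs a 64-bit address computation. Widening `i` once lets
// the loop avoid a sign extension on every iteration.
double sum(const double *data, int count) {
    assert(count >= 0);
    double total = 0.0;
    for (int i = 0; i < count; ++i)
        total += data[i];
    return total;
}

// Small usage example over four values.
double demo_sum() {
    const double values[4] = {1.0, 2.0, 3.0, 4.0};
    return sum(values, 4);
}
```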

The GVN pass now eliminates partial
redundancies of loads in simple cases.
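
To make the term concrete, here is a small sketch of a partially redundant load. The names are hypothetical, and the example only illustrates the pattern; it is not a claim about which exact cases the pass handles.

```cpp
#include <cassert>

// *p is loaded on the `flag` path and again after the branch, so the second
// load is redundant on one path only (a partial redundancy). A PRE-style
// transformation can insert a load on the other path and reuse the value.
int partially_redundant(const int *p, bool flag) {
    assert(p != nullptr);
    int extra = 0;
    if (flag)
        extra = *p;     // the load is available on this path only
    return *p + extra;  // becomes fully redundant once both paths load
}

// Small usage wrapper so the function can be called with a plain value.
int demo_load(int value, bool flag) {
    return partially_redundant(&value, flag);
}
```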

The Inliner now reuses stack space when
inlining similar arrays from multiple callees into one caller.
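
A sketch of the situation, with hypothetical function names: two callees each use a local scratch array of the same type and size, and the arrays are never live at the same time once both callees are inlined into `analyze`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Each helper owns a 256-byte scratch buffer. After inlining both into
// analyze(), the two buffers have non-overlapping lifetimes, so a single
// stack slot could serve both.
static std::size_t count_char(const char *text, char wanted) {
    char scratch[256];
    std::strncpy(scratch, text, sizeof scratch - 1);
    scratch[sizeof scratch - 1] = '\0';
    std::size_t n = 0;
    for (const char *p = scratch; *p; ++p)
        if (*p == wanted) ++n;
    return n;
}

static std::size_t copied_length(const char *text) {
    char scratch[256];
    std::strncpy(scratch, text, sizeof scratch - 1);
    scratch[sizeof scratch - 1] = '\0';
    return std::strlen(scratch);
}

std::size_t analyze(const char *text) {
    assert(text != nullptr);
    return count_char(text, 'l') + copied_length(text);
}
```

For the input "hello", `analyze` counts two occurrences of 'l' and a length of five.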

LLVM includes a new experimental Static Single Information (SSI)
construction pass.

We have put a significant amount of work into the code generator
infrastructure, which allows us to implement more aggressive algorithms and make
it run faster:

The llc -asm-verbose option (exposed from llvm-gcc as -dA
and clang as -fverbose-asm or -dA) now adds a lot of
useful information in comments to
the generated .s file. This information includes location information (if
built with -g) and loop nest information.

The code generator now supports a new MachineVerifier pass which is useful
for finding bugs in targets and codegen passes.

The Machine LICM is now enabled by default. It hoists instructions out of
loops (such as constant pool loads, loads from read-only stubs, vector
constant synthesization code, etc.) and is currently configured to only do
so when the hoisted operation can be rematerialized.

The Machine Sinking pass is now enabled by default. This pass moves
side-effect free operations down the CFG so that they are executed on fewer
paths through a function.
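
The effect of sinking can be pictured at the source level with the sketch below. The function name is hypothetical, and the real pass works on machine instructions rather than on C++ source.

```cpp
#include <cassert>

// `product` is side-effect free and used on only one path. Sinking moves the
// multiplication into the branch that needs it, so the other path no longer
// pays for it.
int sink_example(int a, int b, bool use_product) {
    int product = a * b;  // candidate for sinking into the `if` body
    if (use_product)
        return product + 1;
    return a;
}
```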

The code generator now performs "stack slot coloring" of register spills,
which allows spill slots to be reused. This leads to smaller stack frames
in cases where there are lots of register spills.

The register allocator has many improvements to take better advantage of
commutable operations, various spiller peephole optimizations, and can now
coalesce cross-register-class copies.

Tblgen now supports multiclass inheritance and a number of new string and
list operations like !(subst), !(foreach), !car,
!cdr, !null, !if, !cast.
These make the .td files more expressive and allow more aggressive factoring
of duplication across instruction patterns.

Target-specific intrinsics can now be added without having to hack VMCore to
add them. This makes it easier to maintain out-of-tree targets.

The instruction selector is better at propagating information about values
(such as whether they are sign/zero extended etc.) across basic block
boundaries.

The SelectionDAG data structure has new nodes for representing buildvector
and vector shuffle operations. This
makes operations and pattern matching more efficient and easier to get
right.

The Prolog/Epilog Insertion Pass now has experimental support for performing
the "shrink wrapping" optimization, which moves spills and reloads around in
the CFG to avoid doing saves on paths that don't need them.

LLVM includes new experimental support for writing ELF .o files directly
from the compiler. It works well for many simple C testcases, but doesn't
support exception handling, debug info, inline assembly, etc.

Targets can now specify register allocation hints through
MachineRegisterInfo::setRegAllocationHint. A regalloc hint consists
of a hint type and a physical register number. A hint type of zero specifies a
register allocation preference. Other hint type values are target-specific
and are resolved by TargetRegisterInfo::ResolveRegAllocHint. An
example is the ARM target which uses register hints to request that the
register allocator provide an even / odd register pair to two virtual
registers.

GCC-compatible soft float modes are now supported, which are typically used
by OS kernels.

X86-64 now models implicit zero extensions better, which allows the code
generator to remove a lot of redundant zexts. It also models the 8-bit "H"
registers as subregs, which allows them to be used in some tricky
situations.
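
The sketch below shows a source pattern where an implicit zero extension applies. The function name is hypothetical. On x86-64, writing a 32-bit register clears the upper 32 bits of the corresponding 64-bit register, so the widening needs no separate instruction once the backend models that.

```cpp
#include <cassert>
#include <cstdint>

// The 32-bit add already leaves the upper half of the 64-bit register zero
// on x86-64, so the conversion to uint64_t can be free.
uint64_t add_and_widen(uint32_t a, uint32_t b) {
    uint32_t sum = a + b;  // 32-bit add, wraps modulo 2^32
    return sum;            // zero extension to 64 bits
}
```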

X86-64 now supports the "local exec" and "initial exec" thread local storage
models.

The vector forms of the icmp and fcmp instructions now select to efficient
SSE operations.

Support for the win64 calling conventions has improved. The primary
missing feature is support for varargs function definitions. It seems to
work well for many win64 JIT purposes.

The X86 backend has preliminary support for mapping address spaces to segment
register references. This allows you to write GS or FS relative memory
accesses directly in LLVM IR for cases where you know exactly what you're
doing (such as in an OS kernel). There are some known problems with this
support, but it works in simple cases.

The X86 code generator has been refactored to move all global variable
reference logic to one place
(X86Subtarget::ClassifyGlobalReference) which
makes it easier to reason about.

The ARM backend now has preliminary support for processors, such as the
Cortex-A8 and Cortex-A9, that implement version v7-A of the ARM architecture.
It now supports both the Thumb2 and Advanced SIMD (Neon) instruction sets.

The AAPCS-VFP "hard float" calling conventions are also supported with the
-float-abi=hard flag.

The ARM calling convention code is now tblgen generated instead of resorting
to C++ code.

These features are still somewhat experimental
and subject to change. The Neon intrinsics, in particular, may change in future
releases of LLVM. ARMv7 support has progressed a lot on top of tree since 2.6
branched.

This release includes a number of new APIs that are used internally, which
may also be useful for external clients.

The new PrettyStackTrace class allows crashes of llvm tools (and applications
that integrate them) to provide more detailed indication of what the
compiler was doing at the time of the crash (e.g. running a pass).
At the top level for each LLVM tool, it includes the command line arguments.

The new StringRef
and Twine classes
make operations on character ranges and
string concatenation more efficient. StringRef is just a const
char* with a length; Twine is a light-weight rope.
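
To illustrate the "pointer plus length" idea, here is a minimal sketch of a non-owning string view. `StrView` is a hypothetical stand-in written for this example; it is not LLVM's StringRef and does not reproduce its interface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for the idea behind StringRef: a non-owning
// (pointer, length) pair. This is not LLVM's class.
struct StrView {
    const char *data;
    std::size_t length;

    StrView(const char *s) : data(s), length(std::strlen(s)) {}
    StrView(const char *s, std::size_t n) : data(s), length(n) {}

    // Comparison needs no null terminator because the length is known.
    bool equals(StrView other) const {
        return length == other.length &&
               std::memcmp(data, other.data, length) == 0;
    }

    // Taking a substring only adjusts the pointer and length; no characters
    // are copied. The count is clamped to what is available.
    StrView substr(std::size_t start, std::size_t n) const {
        assert(start <= length);
        std::size_t avail = length - start;
        return StrView(data + start, n < avail ? n : avail);
    }

    // Copies the characters out when an owning string is really needed.
    std::string str() const { return std::string(data, length); }
};
```

Because the view does not own its characters, it must not outlive the buffer it points into.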

LLVM has new WeakVH, AssertingVH and CallbackVH
classes, which make it easier to write LLVM IR transformations. WeakVH
automatically drops to null when the referenced Value is deleted,
and is updated across a replaceAllUsesWith operation.
AssertingVH aborts the program if the
referenced value is destroyed while it is being referenced. CallbackVH
is a customizable class for handling value references. See ValueHandle.h
for more information.

The new 'Triple' class centralizes a lot of logic that reasons about target
triples.

The new 'llvm_report_error()' set of APIs allows tools to embed the LLVM
optimizer and backend and recover from previously unrecoverable errors.

LLVM profile information support has been significantly improved to produce
correct use counts, and has support for edge profiling with reduced runtime
overhead. Combined, the generated profile information is both more correct and
imposes about half as much overhead (e.g. from 12% to 6% overhead on SPEC
CPU2000).

The C bindings (in the llvm/include/llvm-c directory) include many newly
supported APIs.

LLVM 2.6 includes brand new experimental LLVM bindings to the Ada2005
programming language.

If you're already an LLVM user or developer with out-of-tree changes based
on LLVM 2.5, this section lists some "gotchas" that you may run into upgrading
from the previous release.

The Itanium (IA64) backend has been removed. It was not actively supported
and had bitrotted.

The BigBlock register allocator has been removed; it had also bitrotted.

The C Backend (-march=c) is no longer considered part of the LLVM release
criteria. We still want it to work, but no one is maintaining it and it lacks
support for arbitrary precision integers and other important IR features.

All LLVM tools now default to overwriting their output file, behaving more
like standard unix tools. Previously, this only happened with the '-f'
option.

The LLVM build now produces all libraries as .a files instead of some libraries as
relinked .o files. This requires users to explicitly call functions like
those in
Target/TargetSelect.h
and to include files
like ExecutionEngine/JIT.h
to tell the linker to run the static initializers they need.

In addition, many APIs have changed in this release. Some of the major LLVM
API changes are:

All uses of hash_set and hash_map have been removed from
the LLVM tree and the wrapper headers have been removed.

The llvm/Streams.h and DOUT member of Debug.h have been removed. The
llvm::Ostream class has been completely removed and replaced with
uses of raw_ostream.

LLVM's global uniquing tables for Types and Constants have
been privatized into members of an LLVMContext. A number of APIs
now take an LLVMContext as a parameter. To smooth the transition
for clients that will only ever use a single context, the new
getGlobalContext() API can be used to access a default global
context which can be passed in any and all cases where a context is
required.

The getABITypeSize methods are now called getAllocSize.

The Add, Sub and Mul operators are no longer
overloaded for floating-point types. Floating-point addition, subtraction
and multiplication are now represented with new operators FAdd,
FSub and FMul. In the IRBuilder API,
CreateAdd, CreateSub, CreateMul and
CreateNeg should only be used for integer arithmetic now;
CreateFAdd, CreateFSub, CreateFMul and
CreateFNeg should now be used for floating-point arithmetic.

The DynamicLibrary class can no longer be constructed; its functionality has
moved to static member functions.

raw_fd_ostream's constructor for opening a given filename now
takes an extra Force argument. If Force is set to
false, an error will be reported if a file with the given name
already exists. If Force is set to true, the file will
be silently truncated (which is the behavior before this flag was
added).

SCEVHandle no longer exists, because reference counting is no
longer done for SCEV* objects; const SCEV*
should be used instead.

Many APIs, notably llvm::Value, now use the StringRef
and Twine classes instead of passing const char*
or std::string, as described in
the Programmer's Manual. Most
clients should be unaffected by this transition, unless they are used to
Value::getName() returning a string. Here are some tips on updating to
2.6:

getNameStr() is still available, and matches the old
behavior. Replacing getName() calls with this is a safe option,
although more efficient alternatives are now possible.

If you were just relying on getName() being able to be sent to
a std::ostream, consider migrating
to llvm::raw_ostream.

If you were using getName().c_str() to get a const
char* pointer to the name, you can use getName().data().
Note that this string (as before) may not be the entire name if the
name contains embedded null characters.

If you were using operator + on the result of getName() and
treating the result as an std::string, you can either
use Twine::str to get the result as an std::string, or
could move to a Twine based design.

isName() should be replaced with comparison
against getName() (this is now efficient).

The registration interfaces for backend Targets have changed (what was
previously TargetMachineRegistry). For backend authors, see the Writing An LLVM Backend
guide. For clients, the notable API changes are:

TargetMachineRegistry has been renamed
to TargetRegistry.

Clients should move to using the TargetRegistry::lookupTarget()
function to find targets.

Intel and AMD machines running on Win32 with the Cygwin libraries (limited
support is available for native builds with Visual C++).

Sun x86 and AMD64 machines running Solaris 10, OpenSolaris 0906.

Alpha-based machines running Debian GNU/Linux.

The core LLVM infrastructure uses GNU autoconf to adapt itself
to the machine and operating system on which it is built. However, minor
porting may be required to get LLVM to work on new platforms. We welcome your
portability patches and reports of successful builds or error messages.

This section contains significant known problems with the LLVM system,
listed by component. If you run into a problem, please check the LLVM bug database and submit a bug if
there isn't already one.

The llvm-gcc bootstrap will fail with some versions of binutils (e.g. 2.15)
with a message of "Error: can not do 8 byte pc-relative relocation" when
building C++ code. We intend to
fix this on mainline, but a workaround for 2.6 is to upgrade to binutils
2.17 or later.

LLVM will not compile correctly on Solaris and/or OpenSolaris
using the stock GCC 3.x.x series out of the box.
See: Broken versions of GCC and other tools.
However, a modern GCC build
for x86/x86-64 has been made available by the third-party AuroraUX Project
and has been meticulously tested for bootstrapping LLVM & Clang.

The following components of this LLVM release are either untested, known to
be broken or unreliable, or are in early development. These components should
not be relied on, and bugs should not be filed against them, but they may be
useful to some people. In particular, if you would like to work on one of these
components, please contact us on the LLVMdev list.

The X86 backend generates inefficient floating point code when configured
to generate code for systems that don't have SSE2.

Win64 code generation wasn't widely tested. Everything should work, but we
expect small issues to happen. Also, llvm-gcc cannot currently build the mingw64
runtime due to several bugs and due to lack of support for the
'u' inline assembly constraint and for X87 floating point inline assembly.

The X86-64 backend does not yet support the LLVM IR instruction
va_arg. Currently, llvm-gcc and other front-ends support variadic
argument constructs on X86-64 by lowering them manually.

The only major language feature of GCC not supported by llvm-gcc is
the __builtin_apply family of builtins. However, some extensions
are only supported on some targets. For example, trampolines are only
supported on some targets (these are used when you take the address of a
nested function).

If you run into GCC extensions which are not supported, please let us know.

The llvm-gcc 4.2 Ada compiler works fairly well; however, this is not a mature
technology, and problems should be expected.

The Ada front-end currently only builds on X86-32. This is mainly due
to lack of trampoline support (pointers to nested functions) on other platforms.
However, it also fails to build on X86-64
which does support trampolines.

The Ada front-end fails to bootstrap.
This is due to lack of LLVM support for setjmp/longjmp style
exception handling, which is used internally by the compiler.
Workaround: configure with --disable-bootstrap.

The c380004, c393010
and cxg2021 ACATS tests fail
(c380004 also fails with gcc-4.2 mainline).
If the compiler is built with checks disabled then c393010
causes the compiler to go into an infinite loop, using up all system memory.

Some GCC specific Ada tests continue to crash the compiler.

The -E binder option (exception backtraces)
does not work and will result in programs
crashing if an exception is raised. Workaround: do not use -E.

The Llvm.Linkage module is broken, and has incorrect values. Only
Llvm.Linkage.External, Llvm.Linkage.Available_externally, and
Llvm.Linkage.Link_once will be correct. If you need any of the other linkage
modes, you'll have to write an external C library in order to expose the
functionality. This has been fixed in the trunk.

A wide variety of additional information is available on the LLVM web page, in particular in the documentation section. The web page also
contains versions of the API documentation which are up-to-date with the
Subversion version of the source code.
You can access versions of these documents specific to this release by going
into the "llvm/doc/" directory in the LLVM tree.

If you have any questions or comments about LLVM, please feel free to contact
us via the mailing
lists.