This document contains the release notes for the LLVM Compiler
Infrastructure, release 2.7. Here we describe the status of LLVM, including
major improvements from the previous release and significant known problems.
All LLVM releases may be downloaded from the LLVM releases web site.

Note that if you are reading this file from a Subversion checkout or the
main LLVM web page, this document applies to the next release, not the
current one. To see the release notes for a specific release, please see the
releases page.

The LLVM 2.7 distribution currently consists of code from the core LLVM
repository (which roughly includes the LLVM optimizers, code generators
and supporting tools), the Clang repository and the llvm-gcc repository. In
addition to this code, the LLVM Project includes other sub-projects that are in
development. Here we include updates on these subprojects.

Clang is an LLVM front end for the C,
C++, and Objective-C languages. Clang aims to provide a better user experience
through expressive diagnostics, a high level of conformance to language
standards, fast compilation, and low memory use. Like LLVM, Clang provides a
modular, library-based architecture that makes it suitable for creating or
integrating with other development tools. Clang is considered a
production-quality compiler for C and Objective-C on x86 (32- and 64-bit).

In the LLVM 2.7 time-frame, the Clang team has made many improvements:

C++ Support: Clang is now capable of self-hosting! While still
alpha-quality, Clang's C++ support has matured enough to build LLVM and Clang,
and C++ is now enabled by default. See the Clang C++ compatibility
page for common C++ migration issues.

Objective-C: Clang now includes experimental support for an updated
Objective-C ABI on non-Darwin platforms. This includes support for non-fragile
instance variables and accelerated proxies, as well as greater potential for
future optimisations. The new ABI is used when compiling with the
-fobjc-nonfragile-abi and -fgnu-runtime options. Code compiled with these
options may be mixed with code compiled with GCC or clang using the old GNU ABI,
but requires the libobjc2 runtime from the GNUstep project.

New warnings: Clang contains a number of new warnings, including
control-flow warnings (unreachable code, missing return statements in a
non-void function, etc.), sign-comparison warnings, and improved
format-string warnings.

CIndex API and Python bindings: Clang now includes a C API as part of the
CIndex library. Although we may make some changes to the API in the future, it
is intended to be stable and has been designed for use by external projects. See
the Clang
doxygen CIndex
documentation for more details. The CIndex API also includes a preliminary
set of Python bindings.

ARM Support: Clang now has ABI support for both the Darwin and Linux ARM
ABIs. Coupled with many improvements to the LLVM ARM backend, Clang is now
suitable for use as a beta quality ARM compiler.

The Clang Static Analyzer
project is an effort to use static source code analysis techniques to
automatically find bugs in C and Objective-C programs (and hopefully C++ in the
future!). The tool is very good at finding bugs that occur on specific
paths through code, such as on error conditions.

In the LLVM 2.7 time-frame, the analyzer core has made several major and
minor improvements, including better support for tracking the fields of
structures, initial support (not enabled by default yet) for doing
interprocedural (cross-function) analysis, and new checks.

The VMKit project is an implementation of
a JVM and a CLI Virtual Machine (Microsoft .NET is an
implementation of the CLI) using LLVM for static and just-in-time
compilation.

With the release of LLVM 2.7, VMKit has shifted to being a great framework for writing
virtual machines. VMKit now offers precise and efficient garbage collection with
multi-threading support, thanks to the MMTk memory management toolkit, as well
as just in time and ahead of time compilation with LLVM. The major changes in
VMKit 0.27 are:

Garbage collection: VMKit now uses the MMTk toolkit for garbage collectors.
The first collector to be ported is the MarkSweep collector, which is precise,
and drastically improves the performance of VMKit.

Line number information in the JVM: by using the debug metadata of LLVM, the
JVM now supports precise line number information, useful when printing a stack
trace.

Interface calls in the JVM: we implemented a variant of the Interface Method
Table technique for interface calls in the JVM.

The new LLVM compiler-rt project
is a simple library that provides an implementation of the low-level
target-specific hooks required by code generation and other runtime components.
For example, when compiling for a 32-bit target, converting a double to a 64-bit
unsigned integer is compiled into a runtime call to the "__fixunsdfdi"
function. The compiler-rt library provides highly optimized implementations of
this and other low-level routines (some are 3x faster than the equivalent
libgcc routines).
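As a minimal sketch of the kind of source that can lead to such a call, the C function below performs the double to 64-bit unsigned conversion described above. The function name double_to_u64 is made up for illustration, and whether a particular compiler emits a call to "__fixunsdfdi" or expands the conversion inline depends on the target and options, so the comment states an assumption rather than a guarantee.

```c
#include <stdint.h>

/* Hypothetical helper: converts a double to a 64-bit unsigned integer.
 * On a 32-bit target with no native instruction for this conversion,
 * the compiler may lower the cast to a runtime library call such as
 * __fixunsdfdi (provided by compiler-rt or libgcc). */
uint64_t double_to_u64(double value) {
    return (uint64_t)value;
}
```

Compiling this for a 32-bit target and inspecting the generated assembly is one way to check whether the runtime call is actually used.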

All of the code in the compiler-rt project is available under the standard LLVM
License, a "BSD-style" license. New in LLVM 2.7: compiler-rt now
supports ARM targets.

DragonEgg is a port of llvm-gcc to
gcc-4.5. Unlike llvm-gcc, which makes many intrusive changes to the underlying
gcc-4.2 code, dragonegg in theory does not require any gcc-4.5 modifications
whatsoever (currently one small patch is needed). This is thanks to the new
gcc plugin architecture, which
makes it possible to modify the behaviour of gcc at runtime by loading a plugin,
which is nothing more than a dynamic library which conforms to the gcc plugin
interface. DragonEgg is a gcc plugin that causes the LLVM optimizers to be run
instead of the gcc optimizers, and the LLVM code generators instead of the gcc
code generators, just like llvm-gcc. To use it, you add
"-fplugin=path/dragonegg.so" to the gcc-4.5 command line, and gcc-4.5 magically
becomes llvm-gcc-4.5!

DragonEgg is still a work in progress. Currently C works very well, while C++,
Ada and Fortran work fairly well. All other languages either don't work at all,
or only work poorly. For the moment only the x86-32 and x86-64 targets are
supported, and only on linux and darwin (darwin needs an additional gcc patch).

DragonEgg is a new project which is seeing its first release with llvm-2.7.

The LLVM Machine Code (aka MC) sub-project of LLVM was created to solve a number
of problems in the realm of assembly, disassembly, object file format handling,
and a number of other related areas that CPU instruction-set level tools work
in. It is a sub-project of LLVM which provides it with a number of advantages
over other compilers that do not have tightly integrated assembly-level tools.
For a gentle introduction, please see the Intro to the
LLVM MC Project Blog Post.

2.7 includes major parts of the work required by the new MC Project. A few
targets have been refactored to support it, and work is underway to support a
native assembler in LLVM. This work is not complete in LLVM 2.7, but it has
made substantially more progress on LLVM mainline.

One minor example of what MC can do is to transcode an AT&T syntax
X86 .s file into Intel syntax. You can do this with something like
"llvm-mc foo.s -output-asm-variant=1 -o foo-intel.s".

An exciting aspect of LLVM is that it is used as an enabling technology for
a lot of other language and tools projects. This section lists some of the
projects that have already been updated to work with LLVM 2.7.

Pure
is an algebraic/functional programming language based on term rewriting.
Programs are collections of equations which are used to evaluate expressions in
a symbolic fashion. Pure offers dynamic typing, eager and lazy evaluation,
lexical closures, a hygienic macro system (also based on term rewriting),
built-in list and matrix support (including list and matrix comprehensions) and
an easy-to-use C interface. The interpreter uses LLVM as a backend to
JIT-compile Pure programs to fast native code.

Pure versions 0.43 and later have been tested and are known to work with
LLVM 2.7 (and continue to work with older LLVM releases >= 2.5).

Roadsend PHP (rphp) is an open
source implementation of the PHP programming
language that uses LLVM for its optimizer, JIT and static compiler. This is a
reimplementation of an earlier project that is now based on LLVM.

TCE is a toolset for designing
application-specific processors (ASP) based on the Transport triggered
architecture (TTA). The toolset provides a complete co-design flow from C/C++
programs down to synthesizable VHDL and parallel program binaries. Processor
customization points include the register files, function units, supported
operations, and the interconnection network.

TCE uses llvm-gcc/Clang and LLVM for C/C++ language support, target
independent optimizations and also for parts of code generation. It generates
new LLVM-based code generators "on the fly" for the designed TTA processors and
loads them into the compiler backend as runtime libraries to avoid per-target
recompilation of larger parts of the compiler chain.

SAFECode is a memory safe C
compiler built using LLVM. It takes standard, unannotated C code, analyzes the
code to ensure that memory accesses and array indexing operations are safe, and
instruments the code with run-time checks when safety cannot be proven
statically.

IcedTea provides a
harness to build OpenJDK using only free software build tools and to provide
replacements for the not-yet free parts of OpenJDK. One of the extensions that
IcedTea provides is a new JIT compiler named Shark which uses LLVM
to provide native code generation without introducing processor-dependent
code.

Icedtea6 1.8 and later have been tested and are known to work with
LLVM 2.7 (and continue to work with older LLVM releases >= 2.6 as well).

MacRuby is an implementation of Ruby based on
core Mac OS technologies, sponsored by Apple Inc. It uses LLVM at runtime for
optimization passes, JIT compilation and exception handling. It also allows
static (ahead-of-time) compilation of Ruby code straight to machine code.

GHC is an open source,
state-of-the-art programming suite for Haskell, a standard lazy
functional programming language. It includes an optimizing static
compiler generating good code for a variety of platforms, together
with an interactive system for convenient, quick development.

Ted Kremenek and Doug Gregor have stepped forward as Code Owners of the
Clang static analyzer and the Clang frontend, respectively.

LLVM now has an official Blog at
http://blog.llvm.org. This is a great way
to learn about new LLVM-related features as they are implemented. Several
features in this release are already explained on the blog.

The LLVM web pages are now checked into the SVN server, in the "www",
"www-pubs" and "www-releases" SVN modules. Previously they were hidden in a
largely inaccessible old CVS server.

llvm.org is now hosted on a new (and much
faster) server. It is still graciously hosted at the University of Illinois
at Urbana-Champaign.

2.7 includes initial support for the MicroBlaze target.
MicroBlaze is a soft processor core designed for Xilinx FPGAs.

2.7 includes a new LLVM IR "extensible metadata" feature. This feature
supports many different use cases, including allowing front-end authors to
encode source level information into LLVM IR, which is consumed by later
language-specific passes. This is a great way to do high-level optimizations
like devirtualization, type-based alias analysis, etc. See the
Extensible Metadata Blog Post for more information.

2.7 encodes debug information
in a completely new way, built on extensible metadata. The new implementation
is much more memory efficient and paves the way for improvements to optimized
code debugging experience.

2.7 now directly supports taking the address of a label and doing an
indirect branch through a pointer. This is particularly useful for
interpreter loops, and is used to implement the GCC "address of label"
extension. For more information, see the
Address of Label and Indirect Branches in LLVM IR Blog Post.
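To make the interpreter-loop use case concrete, here is a small sketch that uses the GCC "address of label" extension from C. The three-opcode bytecode and the function name run_bytecode are invented for this example; it only illustrates the source-level construct and makes no claim about the IR a particular front end produces.

```c
#include <stddef.h>

/* Hypothetical three-instruction bytecode, for illustration only. */
enum { OP_INC = 0, OP_DOUBLE = 1, OP_HALT = 2 };

/* Runs the bytecode against an accumulator that starts at `start`.
 * Uses the GCC "labels as values" extension: &&label takes the address
 * of a label, and `goto *ptr` branches indirectly through a pointer. */
int run_bytecode(const unsigned char *code, int start) {
    /* Dispatch table indexed by opcode. */
    static void *const dispatch[] = { &&do_inc, &&do_double, &&do_halt };
    int acc = start;
    size_t pc = 0;

    goto *dispatch[code[pc]];

do_inc:
    acc += 1;
    pc += 1;
    goto *dispatch[code[pc]];

do_double:
    acc *= 2;
    pc += 1;
    goto *dispatch[code[pc]];

do_halt:
    return acc;
}
```

Each handler jumps straight to the next handler instead of returning to a central switch, which is the usual motivation for this style of dispatch. The sketch does no bounds or opcode validation, so the input must end with OP_HALT.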

2.7 is the first release to start supporting APIs for assembling and
disassembling target machine code. These APIs are useful for a variety of
low level clients, and are surfaced in the new "enhanced disassembly" API.
For more information, see The X86
Disassembler Blog Post.

2.7 includes major parts of the work required by the new MC Project,
see the MC update above for more information.

LLVM IR has several new features that better support new targets and
expose new optimization opportunities:

LLVM IR now supports a 16-bit "half float" data type through two new intrinsics and APFloat support.

LLVM IR supports two new function
attributes: inlinehint and alignstack(n). The former is a hint to the
optimizer that a function was declared 'inline' and thus the inliner should
weight it higher when considering inlining it. The latter
indicates to the code generator that the function diverges from the platform
ABI on stack alignment.

The new llvm.objectsize intrinsic
allows the optimizer to infer the sizes of memory objects in some cases.
This intrinsic is used to implement the GCC __builtin_object_size
extension.
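For readers unfamiliar with the GCC extension mentioned above, the following sketch queries the size of a local buffer with __builtin_object_size. The helper name local_buffer_size is hypothetical, and the example shows only the source-level builtin, not how any front end maps it onto llvm.objectsize.

```c
#include <stddef.h>

/* Hypothetical helper: reports how many bytes the compiler believes are
 * available in a 16-byte local buffer. __builtin_object_size is a GCC
 * extension (also accepted by Clang); the second argument selects the
 * query mode, and 0 asks for the maximum size of the enclosing object. */
size_t local_buffer_size(void) {
    char buffer[16];
    buffer[0] = '\0';  /* touch the buffer so it is not reported as unused */
    return __builtin_object_size(buffer, 0);
}
```

When the compiler cannot determine the object, for instance for an arbitrary pointer parameter, the builtin in mode 0 yields (size_t)-1 instead of a real size.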

LLVM IR now supports marking load and store instructions with "non-temporal" hints (building on the new
metadata feature). This hint encourages the code
generator to generate non-temporal accesses when possible, which are useful
for code that is carefully managing cache behavior. Currently, only the
X86 backend provides target support for this feature.

LLVM 2.7 has pre-alpha support for unions in LLVM IR.
Unfortunately, this support is not really usable in 2.7, so if you're
interested in pushing it forward, please help contribute to LLVM mainline.

In addition to a large array of minor performance tweaks and bug fixes, this
release includes a few major enhancements and additions to the optimizers:

The inliner now merges array stack objects in different callees when
inlining multiple call sites into one function. This reduces the stack size
of the resultant function.

The -basicaa alias analysis pass (which is the default) has been improved to
be less dependent on "type safe" pointers. It can now look through bitcasts
and other constructs more aggressively, allowing better load/store
optimization.

The module target data string now
includes a notion of 'native' integer data types for the target. This
helps mid-level optimizations avoid promoting complex sequences of
operations to data types that are not natively supported (e.g. converting
i32 operations to i64 on 32-bit chips).

The mid-level optimizer is now conservative when operating on a module with
no target data. Previously, it would default to SparcV9 settings, which is
not what most people expected.

Jump threading is now much more aggressive at simplifying correlated
conditionals and threading blocks with otherwise complex logic. It has
subsumed the old "Conditional Propagation" pass, and -condprop has been
removed from LLVM 2.7.
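As an illustration of what "correlated conditionals" means, the hypothetical C function below contains a second test whose outcome is fully determined by the first. Code of this shape is the kind of candidate jump threading looks for; whether a given compiler version actually threads it is not something this sketch guarantees.

```c
/* Hypothetical example: the second branch is correlated with the first,
 * because `positive` is 1 exactly when `x > 0` was true. */
int classify(int x) {
    int positive = 0;
    if (x > 0)
        positive = 1;

    /* An optimizer that threads the jump can branch directly from the
     * first test to the matching arm here, skipping the re-test. */
    if (positive)
        return x * 2;
    return -x;
}
```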

The -instcombine pass has been refactored from being one huge file to being
a library of its own. Internally, it uses a customized IRBuilder to clean
it up and simplify it.

The optimal edge profiling pass is reliable and much more complete than in
2.6. It can be used with the llvm-prof tool but isn't wired up to the
llvm-gcc and clang command line options yet.

A new experimental alias analysis implementation, -scev-aa, has been added.
It uses LLVM's Scalar Evolution implementation to do symbolic analysis of
pointer offset expressions to disambiguate pointers. It can catch a few
cases that basicaa cannot, particularly in complex loop nests.

The default pass ordering has been tweaked for improved optimization
effectiveness.

We have put a significant amount of work into the code generator
infrastructure, which allows us to implement more aggressive algorithms and make
it run faster:

The 'llc -asm-verbose' option (which is now the default) has been enhanced
to emit many useful comments to .s files indicating information about spill
slots and loop nest structure. This should make it much easier to read and
understand assembly files. This is wired up in llvm-gcc and clang to
the -fverbose-asm option.

Loop Strength Reduction (LSR) has a new "full strength reduction" mode, which can reduce address
register pressure in loops where address generation is important.

A new codegen level Common Subexpression Elimination pass (MachineCSE)
is available and enabled by default. It catches redundancies exposed by
lowering.

A new pre-register-allocation tail duplication pass is available and enabled
by default. It can substantially improve branch prediction quality in some
cases.

A new sign and zero extension optimization pass (OptimizeExtsPass)
is available and enabled by default. This pass takes advantage of
architecture features like x86-64 implicit zero extension behavior and
sub-registers.

The code generator now supports a mode where it attempts to preserve the
order of instructions in the input code. This is important for source that
is hand scheduled and extremely sensitive to scheduling. It is compatible
with the GCC -fno-schedule-insns option.

The target-independent code generator now supports generating code with
arbitrary numbers of result values. Returning more values than was
previously supported is handled by returning through a hidden pointer. In
2.7, only the X86 and XCore targets have adopted support for this
though.
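A rough source-level analogy for "returning through a hidden pointer" is sketched below in C. The struct Quad and both function names are invented for illustration; LLVM IR multiple return values are not the same thing as C struct returns, so this only conveys the general idea of the caller supplying storage for the results.

```c
#include <stdint.h>

/* Hypothetical aggregate carrying four result values. */
struct Quad {
    int64_t a, b, c, d;
};

/* Source-level view: the function simply returns four values. */
struct Quad make_quad(int64_t base) {
    struct Quad q = { base, base + 1, base + 2, base + 3 };
    return q;
}

/* Roughly the same operation with the hidden pointer written out by
 * hand: the caller passes storage and the callee fills it in. */
void make_quad_via_pointer(struct Quad *out, int64_t base) {
    out->a = base;
    out->b = base + 1;
    out->c = base + 2;
    out->d = base + 3;
}
```

Both functions produce the same four values; the second form just makes the extra pointer argument visible.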

The "DAG instruction
selection" phase of the code generator has been largely rewritten for
2.7. Previously, tblgen spit out tons of C++ code which was compiled and
linked into the target to do the pattern matching; now it emits a much
smaller table which is read by the target-independent code. The primary
advantage of this approach is that the size and compile time of various
targets are much improved. The X86 code generator shrank by 1.5MB of code,
for example.

Almost the entire code generator has switched to emitting code through the
MC interfaces instead of printing textually to the .s file. This led to a
number of cleanups and speedups. In 2.7, debug and exception handling
information does not go through MC yet.

The ARM and Thumb code generators now use register scavenging for stack
object address materialization. This allows the use of R3 as a general
purpose register in Thumb1 code, as it was previously reserved for use in
stack address materialization. Secondly, sequential uses of the same
value will now re-use the materialized constant.

The ARM backend now has good support for ARMv4 targets and has been tested
on StrongARM hardware. Previously, LLVM only supported ARMv4T and
newer chips.

Atomic builtins are now supported for ARMv6 and ARMv7 (__sync_synchronize,
__sync_fetch_and_add, etc.).
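The builtins named above are used from C roughly as in the sketch below. The counter variable and the two function names are hypothetical; the code shows only the source-level interface and is not specific to ARM.

```c
/* Hypothetical shared counter, for illustration only. */
static long hit_count = 0;

/* Atomically adds `amount` to the counter and returns the value the
 * counter held before the addition. */
long record_hits(long amount) {
    return __sync_fetch_and_add(&hit_count, amount);
}

/* Issues a full memory barrier, then reads the counter. */
long read_hits(void) {
    __sync_synchronize();
    return hit_count;
}
```

The same source is expected to compile for any target whose backend supports these builtins; the instructions actually emitted vary by architecture.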

This release includes a number of new APIs that are used internally, which
may also be useful for external clients.

The optimizer uses the new CodeMetrics class to measure the size of code.
Various passes (like the inliner, loop unswitcher, etc) all use this to make
more accurate estimates of the code size impact of various
optimizations.

A new
llvm/Analysis/InstructionSimplify.h interface is available for doing
symbolic simplification of instructions (e.g. a+0 -> a)
without requiring the instruction to exist. This centralizes a lot of
ad-hoc symbolic manipulation code scattered in various passes.

The optimizer now uses a new SSAUpdater
class which efficiently supports
doing unstructured SSA update operations. This centralized a bunch of code
scattered throughout various passes (e.g. jump threading, lcssa,
loop rotate, etc) for doing this sort of thing. The code generator has a
similar
MachineSSAUpdater class.

The
llvm/Support/Regex.h header exposes a platform independent regular
expression API. Building on this, the FileCheck utility now supports
regular expressions.

raw_ostream now supports a circular "debug stream" accessed with "dbgs()".
By default, this stream works the same way as "errs()", but if you pass
-debug-buffer-size=1000 to opt, the debug stream is capped to a
fixed sized circular buffer and the output is printed at the end of the
program's execution. This is helpful if you have a long lived compiler
process and you're interested in seeing snapshots in time.
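The idea of a fixed-size circular debug buffer can be sketched in a few lines of C. This is a toy model under stated assumptions, not the raw_ostream implementation: the DebugRing type, the DEBUG_CAPACITY constant and both functions are invented here purely to show how old output is overwritten and only the most recent bytes survive until the final flush.

```c
#include <stddef.h>

#define DEBUG_CAPACITY 8  /* hypothetical, deliberately tiny buffer size */

struct DebugRing {
    char data[DEBUG_CAPACITY];
    size_t next;    /* index where the next byte will be written */
    size_t stored;  /* number of valid bytes, at most DEBUG_CAPACITY */
};

/* Appends text, overwriting the oldest bytes once the buffer is full. */
void debug_ring_write(struct DebugRing *ring, const char *text) {
    for (size_t i = 0; text[i] != '\0'; ++i) {
        ring->data[ring->next] = text[i];
        ring->next = (ring->next + 1) % DEBUG_CAPACITY;
        if (ring->stored < DEBUG_CAPACITY)
            ring->stored += 1;
    }
}

/* Copies the retained bytes, oldest first, into `out` as a C string.
 * `out` must have room for DEBUG_CAPACITY + 1 bytes. */
void debug_ring_flush(const struct DebugRing *ring, char *out) {
    size_t start =
        (ring->next + DEBUG_CAPACITY - ring->stored) % DEBUG_CAPACITY;
    for (size_t i = 0; i < ring->stored; ++i)
        out[i] = ring->data[(start + i) % DEBUG_CAPACITY];
    out[ring->stored] = '\0';
}
```

A zero-initialized DebugRing is an empty buffer. Writing more than DEBUG_CAPACITY bytes keeps only the tail of the output, which mirrors the "snapshot in time" behavior described above.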

If you're already an LLVM user or developer with out-of-tree changes based
on LLVM 2.6, this section lists some "gotchas" that you may run into upgrading
from the previous release.

The Andersen's alias analysis ("anders-aa") pass, the Predicate Simplifier
("predsimplify") pass, the LoopVR pass, the GVNPRE pass, and the random sampling
profiling ("rsprofiling") passes have all been removed. They were not being
actively maintained and had substantial problems. If you are interested in
these components, you are welcome to resurrect them from SVN, fix the
correctness problems, and resubmit them to mainline.

LLVM now defaults to building most libraries with RTTI turned off, providing
a code size reduction. Packagers who are interested in building LLVM to support
plugins that require RTTI information should build with "make REQUIRE_RTTI=1"
and should read the new Advice on Packaging LLVM
document.

The LLVM interpreter now defaults to not using libffi even
if you have it installed. This makes it more likely that an LLVM built on one
system will work when copied to a similar system. To use libffi,
configure with --enable-libffi.

Debug information uses a completely different representation. An LLVM 2.6
.bc file should work with LLVM 2.7, but debug info won't come forward.

The LLVM 2.6 (and earlier) "malloc" and "free" instructions have been removed,
along with the LowerAllocations pass. Now you should just use a call to the
malloc and free functions in libc. These calls are optimized as well as
the old instructions were.

In addition, many APIs have changed in this release. Some of the major LLVM
API changes are:

Just about everything has been converted to use raw_ostream instead of
std::ostream.

ModuleProvider has been removed
and its methods moved to Module and GlobalValue.
Most clients can remove uses of ExistingModuleProvider,
replace getBitcodeModuleProvider with
getLazyBitcodeModule, and pass their Module to
functions that used to accept ModuleProvider. Clients who
wrote their own ModuleProviders will need to derive from
GVMaterializer instead and use
Module::setMaterializer to attach it to a
Module.

GhostLinkage has given up the ghost.
GlobalValues that have not yet been read from their backing
storage have the same linkage they will have after being read in.
Clients must replace calls to
GlobalValue::hasNotBeenReadFromBitcode with
GlobalValue::isMaterializable.

Intel and AMD machines running on Win32 with the Cygwin libraries (limited
support is available for native builds with Visual C++).

Sun x86 and AMD64 machines running Solaris 10, OpenSolaris 0906.

Alpha-based machines running Debian GNU/Linux.

The core LLVM infrastructure uses GNU autoconf to adapt itself
to the machine and operating system on which it is built. However, minor
porting may be required to get LLVM to work on new platforms. We welcome your
portability patches and reports of successful builds or error messages.

This section contains significant known problems with the LLVM system,
listed by component. If you run into a problem, please check the LLVM bug database and submit a bug if
there isn't already one.

LLVM will not correctly compile on Solaris and/or OpenSolaris
using the stock GCC 3.x.x series 'out of the box'.
See: Broken versions of GCC and other tools.
However, a modern GCC build
for x86/x86-64 has been made available from the third party AuroraUX Project
that has been meticulously tested for bootstrapping LLVM & Clang.

The following components of this LLVM release are either untested, known to
be broken or unreliable, or are in early development. These components should
not be relied on, and bugs should not be filed against them, but they may be
useful to some people. In particular, if you would like to work on one of these
components, please contact us on the LLVMdev list.

The X86 backend generates inefficient floating point code when configured
to generate code for systems that don't have SSE2.

Win64 code generation wasn't widely tested. Everything should work, but we
expect small issues to happen. Also, llvm-gcc cannot build the mingw64
runtime currently due to lack of support for the 'u' inline assembly
constraint and for X87 floating point inline assembly.

The X86-64 backend does not yet support the LLVM IR instruction
va_arg. Currently, front-ends support variadic
argument constructs on X86-64 by lowering them manually.

The only major language feature of GCC not supported by llvm-gcc is
the __builtin_apply family of builtins. However, some extensions
are only supported on some targets. For example, trampolines are only
supported on some targets (these are used when you take the address of a
nested function).

The llvm-gcc 4.2 Ada compiler works fairly well; however, this is not a mature
technology, and problems should be expected.

The Ada front-end currently only builds on X86-32. This is mainly due
to lack of trampoline support (pointers to nested functions) on other platforms.
However, it also fails to build on X86-64
which does support trampolines.

The Ada front-end fails to bootstrap.
This is due to lack of LLVM support for setjmp/longjmp style
exception handling, which is used internally by the compiler.
Workaround: configure with --disable-bootstrap.

The c380004, c393010
and cxg2021 ACATS tests fail
(c380004 also fails with gcc-4.2 mainline).
If the compiler is built with checks disabled then c393010
causes the compiler to go into an infinite loop, using up all system memory.

Some GCC specific Ada tests continue to crash the compiler.

The -E binder option (exception backtraces)
does not work and will result in programs
crashing if an exception is raised. Workaround: do not use -E.

A wide variety of additional information is available on the LLVM web page, in particular in the documentation section. The web page also
contains versions of the API documentation which are up-to-date with the
Subversion version of the source code.
You can access versions of these documents specific to this release by going
into the "llvm/doc/" directory in the LLVM tree.

If you have any questions or comments about LLVM, please feel free to contact
us via the mailing
lists.