The Sad State of Symbol Aliases

This post continues my quest to condense and write down some of the folklore surrounding assemblers & linkers. In this case, I recently came across a situation where it would be useful to generate an object file that contains an alias for a symbol defined elsewhere. For example, I want an object file to export a symbol foo that aliases bar, such that when any use site of foo is linked against the object file, that use site behaves exactly as if it had referenced bar instead.

This could be done straightforwardly (just export foo with the same value as bar) except for the wrinkle that, in general, bar is not defined in the object file exporting foo, so we don’t know its value yet.

This article picks apart support for this feature on a platform-by-platform basis. Long story short: this is supported by the object file format on OS X and Windows, but you can’t get to it from the assembly code level. Linux has no support at all.

OS X

Buried deep within the Mach-O specification is a mention of the symbol table entry type N_INDR. Quoth the standard: “The symbol is defined to be the same as another symbol. The n_value field is an index into the string table specifying the name of the other symbol. When that symbol is linked, both this and the other symbol have the same defined type and value”.
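To make that description concrete, here is a rough C sketch of what such a symbol table entry might look like. The struct mirrors the 64-bit nlist layout from Apple's <mach-o/nlist.h>, but the sketch_ names and the helper functions are my own invention so that it compiles without Apple's headers. It only builds the entry and its string table in memory; it does not write a valid object file.

```c
#include <stdint.h>
#include <string.h>

/* Values as defined in <mach-o/nlist.h>, redefined under hypothetical
   SKETCH_ names so the example does not depend on Apple's headers. */
#define SKETCH_N_EXT  0x01u  /* external (exported) symbol bit */
#define SKETCH_N_INDR 0x0au  /* "indirect" symbol type */

/* Hypothetical mirror of the 64-bit Mach-O symbol table entry. */
struct sketch_nlist64 {
    uint32_t n_strx;   /* string table offset of this symbol's name */
    uint8_t  n_type;   /* type bits, here N_INDR | N_EXT */
    uint8_t  n_sect;   /* section number, 0 means "no section" */
    uint16_t n_desc;
    uint64_t n_value;  /* for N_INDR: string table offset of the target's name */
};

/* Append a NUL-terminated name to the string table and return its offset.
   The caller must make sure the buffer is large enough. */
uint32_t sketch_add_string(char *strtab, uint32_t *used, const char *name)
{
    uint32_t offset = *used;
    size_t len = strlen(name) + 1;
    memcpy(strtab + offset, name, len);
    *used += (uint32_t)len;
    return offset;
}

/* Build an exported entry saying "alias is the same symbol as target". */
struct sketch_nlist64 sketch_make_indirect(char *strtab, uint32_t *used,
                                           const char *alias,
                                           const char *target)
{
    struct sketch_nlist64 sym;
    memset(&sym, 0, sizeof sym);
    sym.n_strx  = sketch_add_string(strtab, used, alias);
    sym.n_type  = SKETCH_N_INDR | SKETCH_N_EXT;
    sym.n_sect  = 0;
    sym.n_value = sketch_add_string(strtab, used, target);
    return sym;
}
```

A caller would start with a zeroed string table and used set to 1 (offset 0 is conventionally reserved for the empty name), then call sketch_make_indirect(strtab, &used, "_foo", "_bar"). The leading underscores are the usual Mach-O decoration for C symbol names.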

This is great stuff, and exactly what we want! The fly in the ointment is that the latest version of Apple’s assembler has no support for actually generating such indirections. The source tree does contain a tool called indr which is capable of generating these indirections in a limited capacity, but it is not distributed with OS X and is in any case not general enough for our needs. Happily, Apple’s linker does seem to include support for N_INDR, so everything should work OK if you manage to generate an object file making use of that type.

Windows

Interestingly, Windows DLLs support something called “forwarders” which give us the behaviour we want for dynamically exported symbols. You can create such DLLs with special syntax in your .def file EXPORTS section. This is not relevant to our problem though, because there is no equivalent at the object file level.

Page 44 of the PE/COFF specification talks about symbol tables. Reading carefully, we find a mention of “Weak Externals” on page 51:

“Weak externals” are a mechanism for object files that allows flexibility at link time. A module can contain an unresolved external symbol (sym1), but it can also include an auxiliary record that indicates that if sym1 is not present at link time, another external symbol (sym2) is used to resolve references instead. If a definition of sym1 is linked, then an external reference to the symbol is resolved normally. If a definition of sym1 is not linked, then all references to the weak external for sym1 refer to sym2 instead. The external symbol, sym2, must always be linked; typically, it is defined in the module that contains the weak reference to sym1.

This is not exactly what we had in mind, but it can be abused for the same effect. Nothing will go wrong unless someone else defines a symbol with the same name as our alias in another object file.
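As a sketch of what the abuse would involve, the following C code serialises the two 18-byte COFF symbol table records for a weak external: the symbol record for sym1 (our alias) followed by the auxiliary record pointing at sym2. The function name and the SKETCH_ constants are hypothetical; the numeric values are the ones the PE/COFF specification assigns to IMAGE_SYM_CLASS_WEAK_EXTERNAL and IMAGE_WEAK_EXTERN_SEARCH_ALIAS. I am assuming the dedicated storage class from the specification's storage class table is the right one to use, and that the alias name fits in the 8-byte inline name field.

```c
#include <stdint.h>
#include <string.h>

enum {
    SKETCH_SYM_CLASS_WEAK_EXTERNAL  = 105, /* IMAGE_SYM_CLASS_WEAK_EXTERNAL */
    SKETCH_WEAK_EXTERN_SEARCH_ALIAS = 3    /* IMAGE_WEAK_EXTERN_SEARCH_ALIAS */
};

/* COFF fields are little-endian and unaligned, so write them bytewise. */
static void sketch_put_u32le(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)((v >> 8) & 0xff);
    p[2] = (uint8_t)((v >> 16) & 0xff);
    p[3] = (uint8_t)((v >> 24) & 0xff);
}

/* Fill out[0..17] with the symbol record for the alias (sym1) and
   out[18..35] with its auxiliary record. target_index is the symbol
   table index of sym2. Returns -1 if the name needs the string table,
   which this sketch does not handle. */
int sketch_weak_external(uint8_t out[36], const char *alias,
                         uint32_t target_index)
{
    size_t len = strlen(alias);
    if (len > 8)
        return -1;
    memset(out, 0, 36);

    /* Symbol record: Name (8 bytes, NUL padded), Value = 0,
       SectionNumber = 0 (undefined), Type = 0. */
    memcpy(out, alias, len);
    out[16] = SKETCH_SYM_CLASS_WEAK_EXTERNAL; /* StorageClass */
    out[17] = 1;                              /* NumberOfAuxSymbols */

    /* Auxiliary record: TagIndex, then Characteristics, rest unused. */
    sketch_put_u32le(out + 18, target_index);
    sketch_put_u32le(out + 22, SKETCH_WEAK_EXTERN_SEARCH_ALIAS);
    return 0;
}
```

A real object file would also need a symbol record for sym2 at target_index, plus the file and section headers, none of which is shown here.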

As far as I can see, the GNU assembler can’t be persuaded to generate this. The assembler does have rudimentary support for generating weak externals, but only uses it in the limited capacity of supporting the .weak directive (with ELF-style semantics) on Windows. And as we shall shortly see, ELF semantics are not what we want at all…

Linux

Turning to page 1-16 of the ELF specification we find the definition of the ELF symbol table. As far as I can tell, there is no support whatsoever for this use case. Bah.

We might be tempted to search for some equivalent to the weak externals feature on Windows. Unfortunately, ELF weak symbols have rather different semantics:

An undefined weak symbol will not cause the linker to error out if a definition is not found. Instead, the symbol will be filled in with a default value of 0.

A defined weak symbol has a lower link precedence than a strong symbol of the same name, and will not cause the linker to generate an error about duplicate symbol definitions in the case of such a conflict.

The difference from the Windows situation is that Windows basically lets us replace the default value the linker fills in when no definition is found with an arbitrary symbol, whereas ELF always fills in 0.
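The two ELF behaviours described above can be seen from C using GCC's weak attribute. This is a minimal sketch assuming GCC or Clang on an ELF platform such as Linux; all of the sketch_ names are made up for illustration, and nothing else in the program is assumed to define them.

```c
#include <stddef.h>

/* Undefined weak reference: no definition of sketch_optional_hook
   exists anywhere, yet the link succeeds and the symbol's address
   is filled in as 0. */
extern int sketch_optional_hook(void) __attribute__((weak));

/* Defined weak symbol: a strong definition of sketch_default_level in
   another object file would take precedence without a duplicate
   symbol error. */
__attribute__((weak)) int sketch_default_level(void)
{
    return 1;
}

/* Call the hook only if the linker found a definition for it. */
int sketch_call_hook(void)
{
    if (sketch_optional_hook != NULL)
        return sketch_optional_hook();
    return -1; /* hook absent */
}
```

Note that there is nowhere to say "if the hook is missing, use this other symbol instead"; the only fallback the linker offers is the zero address, and any redirection has to happen in code like sketch_call_hook.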

GCC

GCC supports an alias attribute that does exactly what I want. Unfortunately, despite a few people trying to use it for exactly this purpose, the GCC developers have elected to reject the construct:

This is because it’s meaningless to define an alias to an undefined symbol. On Solaris, the native assembler would have caught this error, but GNU as does not.

This comment refers to the fact that assembly like this:

.globl reexport
.globl export
.equiv export, reexport

does not fail to assemble with the GNU assembler, but generates an object file that does not define any symbols despite referencing the reexport symbol.
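For contrast, the form of the alias attribute that GCC does accept is one where the target is defined in the same translation unit. A minimal sketch, assuming GCC or Clang on an ELF target (the attribute is not supported on every platform) and using hypothetical function names:

```c
/* The alias target, defined in this translation unit. */
int sketch_bar(void)
{
    return 42;
}

/* Accepted: sketch_foo becomes a second name for sketch_bar's code.
   If sketch_bar were only declared here and defined in another
   object file, GCC would reject this declaration. */
int sketch_foo(void) __attribute__((alias("sketch_bar")));
```

Callers of sketch_foo in other object files end up running sketch_bar, which is the behaviour I am after, but only because the compiler already knows where sketch_bar lives.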

Conclusion

A sufficiently motivated hacker could support a (weak) aliasing feature along the lines described above in the GNU assembler on Windows and OS X without problems. However, there seems to be no way to support it on Linux within the bounds of the ELF specification.

Unusually, Linux is the platform that lags behind the others in linker features! I usually find that quite the opposite is true.