I was recently looking for a way to ship an image with an executable
without referring to the image as an external file. One can argue whether
that is a good idea in general, but that's another story.

Since I found very little information on the subject and had to piece it together from various sources, I am concluding the task with this comprehensive write-up. The mechanisms described here apply to including any resource, not just images, in an executable C program.

There are various ways to embed generic fixed data within an application. Let's start with a basic example that should be familiar:

const char* mystring = "hello world";

Every application binary is just a collection of segments stored in a file.
On modern PC architectures the two that matter here are Code and Data; the Stack is only set up at runtime.
Without going into details, fixed data such as constant text strings resides in the read-only part
of the Data segment of the application binary, and the C compiler automatically creates a memory region for it.

When you compile the above line with gcc -c -o mytext.o mytext.c and inspect the object file with objdump -t mytext.o, you'll find the mystring symbol listed in a data section of the object file.

To encode the data of a given file as C source, the xxd tool that comes with the vim text editor can be used: xxd -i binary_file dumps the file in C include-file style, writing a complete static array definition named after the input file.

Using this variant produces portable C code and just works(TM). On the downside, it requires at least five times the size of the original data (" 0xNN," for every original data byte) and the source code needs to be updated every time the original data changes.

While this approach is useful in some cases, we can do better than that.

The GNU linker can create object files with a custom .data section directly from any input file. The /magic/ flags are -r to make the object file relocatable and -b binary to tell the linker that the input is raw binary data rather than a regular object format.

# ld -r -b binary -o example.o example.jpg

The resulting object file example.o can be linked into any application with gcc itself, e.g. gcc -o myapp myapp.c example.o. The last missing link is accessing the data in example.o from the C code in myapp.c. Have a look at the output of objdump -t example.o and compare it with the above output for the mystring object. On GNU/Linux it lists the symbols _binary_example_jpg_start, _binary_example_jpg_end and _binary_example_jpg_size.

Now for the tricky part: the GNU linker behaves differently depending on platform and architecture. The platforms of interest to me are GNU/Linux, OSX and mingw (cross-compiling Windows binaries on a GNU/Linux host).

The mingw cross-compiler behaves almost exactly like gnu-ld, with one minor difference: the symbol name used in the C code does not include the leading underscore, i.e. _binary_example_jpg_start (GNU/Linux) vs binary_example_jpg_start (mingw). That costs the solution some of its elegance, but the case is easily handled with an #ifdef.

However, Mac/OSX is different. The ld shipped with Xcode comes from llvm version 2.7svn and does not support the -b input-format feature. Furthermore, universal executables on OSX may comprise binary formats for various architectures, with the .data section format being different for each architecture. The alignment of the data may differ between 32-bit and 64-bit architectures, and the endianness may differ as well. Thus the data section needs to be created during compilation instead of at the linking stage.

On OSX, ld's binary linking feature has been moved into their customized gcc and is available via the -sectcreate option:

gcc -sectcreate __DATA __example_jpg example.jpg -o myapp myapp.c

To create a universal build for Intel architectures, add -arch i386 -arch x86_64 to the above command line. objdump is a GNU tool and is not available on OSX; instead, you can inspect the data section using otool -s __DATA __example_jpg /path/to/executable. See man otool for details.

Due to the nature of OSX binaries, referencing the data section in the C code is not possible with a simple extern unsigned char. The linker does not know which architecture will be used and cannot provide an address. The Mach binary format used by OSX needs to be inspected at runtime, when the architecture is known, and the relevant data mapped after the application has started. Apple provides an API for doing that, which is declared in the mach-o/getsect.h header file. If you have Xcode installed you can read the documentation with man getsectbyname.

Resolving the section can only be done at runtime, after the data section has been relocated, by calling getsectbyname(). However, there is a trick that makes this implicit: the meta-variable _section$ is recognized by the gcc compiler on OSX. It produces the same result as calling getsectbyname()→addr. Short of reading the actual code, information about OSX linker internals is not easy to come by. getsectbyname() actually opens the executable file and searches for the relevant data section while the application is running. _section$ may or may not already be resolved at link-time for a given architecture 1).

Update (Oct 2016 - Thanks to Eugene Gershnik): On newer versions of OSX/macOS that run executables with ASLR, the call to `getsectbyname` needs to be replaced with `getsectiondata` 2). However, this API is only available from OS X 10.7 onwards.

As a final note, some care must be taken when choosing the variable identifier.

ld will use the filename to generate the symbol name. If the filename includes characters that are not valid in C identifiers, they will be transformed to underscores.
E.g. ld -r -b binary -o example.o ../images/example.jpg will create a region _binary____images_example_jpg. The two dots of ../ as well as the slashes and the dot before the extension are transformed to underscores.

This is not an issue on OSX, where the identifier is specified explicitly with the -sectcreate option. However, section names on OSX are limited to 16 characters.

So in order to use the above approach cross-platform, the path to the file passed to ld must be shorter than 16 characters, and the same identifier needs to be specified on the OSX compile command.

A complete project that uses this approach to include a JPEG image file and a JavaScript text file is harvid. It also shows how to use a cross-platform Makefile for creating the object files and adding the relevant flags to the OSX gcc command.