Recently I have been reading about ELF PIE and PIC, and I realized there is some confusion around them, so I have tried to summarize my understanding here.

a) How to tell if an ELF file is PIC or not?
Strictly speaking, PIC only applies to shared libraries, so first we need to make sure the file is a shared library.

That seems easy to answer, but the addition of PIE has made it more difficult because, as we shall see soon, a PIE executable is nothing but a shared object and thus shares the same e_type (ET_DYN) in the ELF header.

So if we run the “file” command on both a shared library and a PIE executable, both show up as a shared object.
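To make the e_type point concrete, here is a minimal sketch of reading that field straight out of an ELF header. The function name elf_type is my own, and the sketch assumes a Linux box with <elf.h> and a file whose byte order matches the host.

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Returns the e_type field of an ELF image held in memory, or -1 if the
   buffer is too small or does not start with the ELF magic.
   Assumption: the file's byte order matches the host's. */
int elf_type(const unsigned char *image, size_t size)
{
    Elf32_Ehdr hdr; /* e_ident and e_type sit at the same offsets in ELF32 and ELF64 */

    if (size < sizeof hdr || memcmp(image, ELFMAG, SELFMAG) != 0)
        return -1;
    memcpy(&hdr, image, sizeof hdr);
    return hdr.e_type;
}
```

Fed the first bytes of a shared library and of a PIE executable, this should return ET_DYN in both cases, which is exactly why “file” cannot tell them apart.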

So the “file” command shows no difference, but readelf can show the presence of “TEXTREL” in the dynamic section of non-PIC libs.

Let’s first take a look at what “TEXTREL” means. Here is the definition taken from the ELF spec:

DT_TEXTREL This member’s absence signifies that no relocation entry should cause a modification to
a non-writable segment, as specified by the segment permissions in the program header
table. If this member is present, one or more relocation entries might request
modifications to a non-writable segment, and the dynamic linker can prepare accordingly.

So the key here is that a non-writable (that is, read-only) segment needs to be modified. Since ordinary code cannot write to such a segment at run time, the dynamic linker has to prepare for it: it makes the affected pages writable, applies the relocations, and then restores the original protection.
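The check readelf performs can be sketched in a few lines of C. This is only an illustration: has_text_relocations is a name I made up, and it assumes the dynamic section has already been located and is terminated by DT_NULL. It also looks at the DF_TEXTREL bit of DT_FLAGS, which is the other way an object can announce text relocations.

```c
#include <elf.h>
#include <stdbool.h>

/* Walks a DT_NULL-terminated dynamic array and reports whether it
   announces text relocations, either through a DT_TEXTREL entry or
   through the DF_TEXTREL bit of DT_FLAGS. */
bool has_text_relocations(const Elf32_Dyn *dyn)
{
    for (; dyn->d_tag != DT_NULL; ++dyn) {
        if (dyn->d_tag == DT_TEXTREL)
            return true;
        if (dyn->d_tag == DT_FLAGS && (dyn->d_un.d_val & DF_TEXTREL))
            return true;
    }
    return false;
}
```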

What makes it necessary to modify a non-writable segment in the first place? Let’s move on to see the relocation part.

So 0x4f4 falls in the first LOAD segment, which covers the offset range 0x0 to 0x598 and has several sections mapped into it.
And yes, the first LOAD segment has flags “R E”, which means it is exactly non-writable (with E meaning executable).

So the code at address 0x4f3 tries to call the “puts” function, but the disassembly shows it thinks “puts” is at address 0x4f4, which is obviously not correct: as we know, “puts” must come from libc.

If we run “ldd” on libhello.so, we see that it needs libc.so to provide the “puts” function, and when it is compiled as a shared library it has absolutely no idea where libc.so is going to be loaded or where the “puts” function will be located.

That’s why we need a fix-up at address 0x4f4 to give it the right address.

Let’s write a simple client program to verify our idea.

/* hello() is defined in libhello.so */
void hello(void);

int main(void)
{
    hello();
    return 0;
}

Building the client program is trivial; we only need to bear in mind that we have to pass the “-m32” switch to stay compatible with the 32-bit library.

As can be seen from our gdb session, the memory at offset 0x4f4, which maps to virtual address 0xf7fd74f4, has been patched so that the call reaches the place where the “puts” function is really loaded (and hence where libc.so is loaded).

Also notice that the “call” instruction uses an offset rather than an absolute address in the 4 bytes following the instruction byte “e8”.
So the value “0xffea0c98” is not the address of “puts” but rather the relative offset from the next instruction to the real address of “puts”.

This also explains why the relocation type is “R_386_PC32”, which is the PC-relative (“relative offset”) relocation type.
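The arithmetic is easy to check by hand. Below is a small sketch (the function names are mine) that takes the two values quoted above, the operand address 0xf7fd74f4 and the patched value 0xffea0c98, and recovers where the call lands. It also shows the S + A - P formula that R_386_PC32 uses, with the addend A being the -4 that was sitting in the operand before relocation (which is why the unrelocated call appeared to target 0x4f4).

```c
#include <stdint.h>

/* Address a 32-bit x86 near call (opcode e8) transfers to, given the
   address of its 4-byte operand and the value stored there. The offset
   is relative to the next instruction, which starts right after the operand. */
uint32_t call_target(uint32_t operand_addr, uint32_t rel32)
{
    return operand_addr + 4u + rel32; /* wraps modulo 2^32 */
}

/* Value an R_386_PC32 relocation stores: S + A - P, where S is the
   symbol address, A the addend and P the address being patched. */
uint32_t reloc_pc32(uint32_t sym, uint32_t addend, uint32_t place)
{
    return sym + addend - place;
}
```

With the numbers from the gdb session, call_target(0xf7fd74f4, 0xffea0c98) gives 0xf7e78190, and reloc_pc32(0xf7e78190, 0xfffffffc, 0xf7fd74f4) gives back 0xffea0c98.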

What about the PIC version? How does it solve the issue?
There are lots of articles on this, and we know it uses the PLT (Procedure Linkage Table): the code part always reads the address of the target function from the data part of the module.

While the code part of the module is always read-only, the data part does not have this limitation. As you can easily guess, there is a second LOAD segment with “RW” as its flags, which maps those data sections.

Without going into too much detail, let’s quickly check the relocations of the PIC version.

Just like we said, “puts” is handled in a special section, “.rel.plt”, with relocation type “R_386_JUMP_SLOT”, and if you run readelf to list segments and sections you can easily see that it falls in the second LOAD segment, which is perfectly writable.
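As a rough analogy only (this is not what the compiler actually emits), the idea behind a jump slot can be modelled in plain C as a function pointer living in writable data. The names puts_slot, bind_slot and call_via_slot are hypothetical; bind_slot plays the role of the dynamic linker.

```c
#include <stdio.h>

/* Stand-in for a jump slot: a pointer stored in writable data. */
static int (*puts_slot)(const char *);

/* Plays the role of the dynamic linker: only the data slot is patched. */
void bind_slot(int (*target)(const char *))
{
    puts_slot = target;
}

/* Plays the role of the code: it never changes, wherever the target lives.
   Must not be called before bind_slot(). */
int call_via_slot(const char *s)
{
    return puts_slot(s);
}
```

After bind_slot(puts), a call such as call_via_slot("hello") reaches the real puts without a single byte of code having been modified, which is the property that keeps the text segment shareable.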

b) Why can’t I create a non-PIC shared library on an x64 box?

In the previous section I intentionally added the “-m32” switch to force a 32-bit binary.
If you try without it, you will instantly get an error.

This again has to do with a relocation type similar to the “R_386_PC32” we talked about in the previous section; this time the name makes it explicit that it is for the X86_64 platform: “R_X86_64_32”, with the trailing “32” meaning the relocated field is only 32 bits wide.

Since we are talking about a 64-bit program, 32 bits may not be enough to hold the correct target address.
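The size limit can be put into a couple of lines of C. This is just a sketch with names of my own: R_X86_64_32 stores its value zero-extended into 32 bits, while its sibling R_X86_64_32S stores it sign-extended, so each has its own range check.

```c
#include <stdbool.h>
#include <stdint.h>

/* Can this value be stored by a zero-extended 32-bit relocation
   such as R_X86_64_32? */
bool fits_u32(uint64_t value)
{
    return value <= UINT32_MAX;
}

/* Can this value be stored by a sign-extended 32-bit relocation
   such as R_X86_64_32S? */
bool fits_s32(int64_t value)
{
    return value >= INT32_MIN && value <= INT32_MAX;
}
```

A low address such as 0x400000 passes fits_u32, but an address in the range where 64-bit shared libraries usually end up (0x7ffff7dd1000 is an illustrative value, not one taken from the session above) does not.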

To fix this we simply ask for 64-bit relocations, and as you can guess the relocation type is then “R_X86_64_64”.

We can enable that with another switch, “-mcmodel=large”, which selects the large code model, explained in the gcc man page:

-mcmodel=large
Generate code for the large code model. This makes no assumptions about addresses and sizes of sections. Pointers are 64 bits. Programs can be statically linked only.

Compare this with the default model:

-mcmodel=small
Generate code for the small code model. The program and its statically defined symbols must be within 4GB of each other. Pointers are 64 bits. Programs can be statically or dynamically linked. This is the default code model.

Articles on the internet seem to talk about the “-fPIE” switch a lot, but that only deals with the compile stage. To create a PIE properly you also need to pass the instruction on to the linker, hence another switch, “-pie”, is needed as well.

The gcc man page says the same, and mentions that model suboptions may be used in place of -fPIE:

-pie
Produce a position independent executable on targets that support it. For predictable results, you must also specify the same set of options used for compilation (-fpie, -fPIE, or model suboptions) when you specify this linker option.

e) Does it matter if a PIE executable links with a non-PIC shared library?

PIE is really about the executable itself, so whether it links with a PIC or non-PIC shared library does not really matter.

Linking with a non-PIC shared library simply means the library has to be patched when it is loaded at a different address, which may prevent the same copy from being shared across multiple processes loading the same lib.

f) Where is the magic that gets a PIE executable loaded at a different address?

The magic happens in the dynamic linker, in the way it calls mmap. Let’s take a look at the code in glibc/glibc-2.16.0/elf/dl-load.c:

struct link_map *
_dl_map_object_from_fd (const char *name, int fd, struct filebuf *fbp,
                        char *realname, struct link_map *loader, int l_type,
                        int mode, void **stack_endp, Lmid_t nsid)
{
  ...
  if (__builtin_expect (type, ET_DYN) == ET_DYN)
    {
      /* This is a position-independent shared object. We can let the
         kernel map it anywhere it likes, but we must have space for all
         the segments in their specified positions relative to the first.
         So we map the first segment without MAP_FIXED, but with its
         extent increased to cover all the segments. Then we remove
         access from excess portion, and there is known sufficient space
         there to remap from the later segments.
         As a refinement, sometimes we have an address that we would
         prefer to map such objects at; but this is only a preference,
         the OS can do whatever it likes. */
      ElfW(Addr) mappref;
      mappref = (ELF_PREFERRED_ADDRESS (loader, maplength,
                                        c->mapstart & GLRO(dl_use_load_bias))
                 - MAP_BASE_ADDR (l));
      /* Remember which part of the address space this object uses. */
      l->l_map_start = (ElfW(Addr)) __mmap ((void *) mappref, maplength,
                                            c->prot,
                                            MAP_COPY|MAP_FILE,
                                            fd, c->mapoff);
      ...
    }
  /* This object is loaded at a fixed address. This must never
     happen for objects loaded with dlopen(). */
  if (__builtin_expect ((mode & __RTLD_OPENEXEC) == 0, 0))
    {
      errstring = N_("cannot dynamically load executable");
      goto call_lose;
    }
  /* Notify ELF_PREFERRED_ADDRESS that we have to load this one
     fixed. */
  ELF_FIXED_ADDRESS (loader, c->mapstart);
  /* Remember which part of the address space this object uses. */
  l->l_map_start = c->mapstart + l->l_addr;
  l->l_map_end = l->l_map_start + maplength;
  l->l_contiguous = !has_holes;
  while (c < &loadcmds[nloadcmds])
    {
      if (c->mapend > c->mapstart
          /* Map the segment contents from the file. */
          && (__mmap ((void *) (l->l_addr + c->mapstart),
                      c->mapend - c->mapstart, c->prot,
                      MAP_FIXED|MAP_COPY|MAP_FILE,
                      fd, c->mapoff)
              == MAP_FAILED))

For any type other than ET_DYN (the shared object type), dl-load calls mmap with MAP_FIXED, which forces loading at a fixed address, while for a shared object it lets the kernel pick an address at will.

The same logic applies to both shared libraries and PIE executables, since under the covers they are the same kind of object.
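The difference is easy to reproduce outside of glibc. Here is a minimal sketch, assuming Linux; map_with_hint is a name I made up, and an anonymous mapping stands in for the file mapping that dl-load really performs. Without MAP_FIXED the first argument is only a preference, so the address that comes back is whatever the kernel decided on.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS under strict -std= modes */
#include <stddef.h>
#include <sys/mman.h>

/* Asks the kernel for len bytes without MAP_FIXED: hint is only a
   preference (it may be NULL), and the returned address is wherever
   the kernel chose to place the mapping. Returns MAP_FAILED on error. */
void *map_with_hint(void *hint, size_t len)
{
    return mmap(hint, len, PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
}
```

Printing the result of map_with_hint(NULL, 4096) across several runs on a system with address space randomization enabled should show the address moving around, which is the same freedom the ET_DYN branch above gives the kernel.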

“So 0x602 seems to fall on the first LOAD segment which has the offset range of 0x0 to 0x598 and in it there are several sections that needs to map to it.
And yes, the first LOAD segment is with flags “R E” which is exactly non-writable. (wit E meaning executable)”
Didn’t you mean “So 0x4f4 seems to fall on the first LOAD segment…”?