These functions return the physical placement at the moment they are
called. If a Pthread moves around, they will still return the correct,
current value.

You should not cache the output of these functions. A cached value can
go stale if the thread moves, and caching does not save anything: each
call takes ~105 cycles (I just measured this over 1M calls, with
315-318M cycles required for the test), which is cheaper than loading a
stored value that is not in cache.
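
If you want to repeat this kind of measurement yourself, here is a
rough sketch of one way to average the cost over many calls. It is not
the test I ran. The names placement_fn, dummy_placement and
avg_ns_per_call are made up for illustration, and dummy_placement is
only a placeholder for whichever placement function you want to time.
It also uses the standard C clock(), so it reports nanoseconds of CPU
time rather than cycles; you would have to multiply by your clock rate
in GHz to compare against the cycle counts above.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical type for a placement query. The real functions are not
   named here, so any int-returning call can be plugged in. */
typedef int (*placement_fn)(void);

/* Placeholder query, present only so the sketch compiles on its own. */
static int dummy_placement(void)
{
    return 0;
}

/* Average cost of one call, in nanoseconds of CPU time, measured over
   `calls` back-to-back invocations. Loop overhead is included. */
static double avg_ns_per_call(placement_fn query, size_t calls)
{
    if (calls == 0)
        return 0.0;                 /* nothing to measure */

    volatile int sink = 0;          /* keeps the calls from being optimized away */
    clock_t start = clock();
    for (size_t i = 0; i < calls; i++)
        sink = query();
    clock_t end = clock();
    (void)sink;

    double total_ns = (double)(end - start) / CLOCKS_PER_SEC * 1e9;
    return total_ns / (double)calls;
}
```

Calling avg_ns_per_call(dummy_placement, 1000000) mirrors the 1M-call
run; swap dummy_placement for the real function. clock() is coarse, so
only large call counts give a meaningful average.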