If the value of self ANDed with the bitmask is zero, then self is
a conventional object. Any other value indicates a tagged pointer.
Classes for TPS are stored in the runtime. And here we have the first problem.
In mulle-objc the runtime is usually accessed via the class, but we don’t have
the class yet.
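
The bitmask test can be sketched in a few lines of C. This is only an illustration: the mask value (the low three bits) and all sketch_ names are assumptions made for this example, not the actual mulle-objc definitions.

```c
#include <stdint.h>

/* ASSUMPTION: the low three bits of a pointer serve as the tag. Objects
   aligned to 8 bytes always have these bits clear. The mask used by the
   real runtime may differ. */
#define SKETCH_TPS_MASK   ((uintptr_t) 0x7)

/* Returns 0 for a conventional object, otherwise the tag bits (1..7). */
static inline unsigned int   sketch_tps_index( void *self)
{
   return( (unsigned int) ((uintptr_t) self & SKETCH_TPS_MASK));
}

/* Any nonzero value under the mask indicates a tagged pointer. */
static inline int   sketch_is_tagged_pointer( void *self)
{
   return( sketch_tps_index( self) != 0);
}
```

The nonzero tag bits could then serve as an index to look up the TPS class, which is where the runtime comes in.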

Getting the TPS class

There are two possible configurations for the runtime, global and
thread-local. global is the default. In this case the runtime is stored in
a global variable. Access to it is assumed to be reasonably fast, but still
it’s another overhead incurred on every method call.
In the thread-local case though, the runtime is retrieved via
mulle_thread_tss_get, which does a pthread_getspecific on many platforms.
Calling a shared library function on every method call could put even more
of a damper on the proceedings. But none of this has really been benchmarked so far.

pthreads really should provide an inline function for pthread_getspecific.
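
To make the difference between the two configurations concrete, here is a rough sketch using plain pthreads. The struct and all sketch_ names are hypothetical. The real runtime goes through mulle_thread_tss_get, which is not shown here.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the runtime structure. */
struct sketch_runtime
{
   int   dummy;   /* placeholder for class tables and the like */
};

/* global configuration: fetching the runtime is a load from a variable */
static struct sketch_runtime   *sketch_global_runtime;

static inline struct sketch_runtime   *sketch_get_runtime_global( void)
{
   return( sketch_global_runtime);
}

/* thread-local configuration: fetching the runtime is a function call
   into the threads library */
static pthread_key_t   sketch_runtime_key;

/* Must be called once before the key is used. Returns 0 on success. */
static int   sketch_runtime_key_init( void)
{
   return( pthread_key_create( &sketch_runtime_key, NULL));
}

static inline struct sketch_runtime   *sketch_get_runtime_thread_local( void)
{
   return( pthread_getspecific( sketch_runtime_key));
}
```

The global variant can be inlined down to a single load. The thread-local variant cannot be inlined past the call to pthread_getspecific, which is the cost the text worries about.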

Pros and Cons of TPS

What can we fit into a TPS? Small strings, like
mulle_char5_t for example.

A standard object in mulle-objc has a guaranteed footprint of at least
2 * sizeof( uintptr_t), which translates to 16 bytes on 64-bit. This memory
is used for the retain count and isa.
Now add the data required for the characters. In an app that holds 16 M
unique strings of 7 ASCII characters each, that is 256 MB of overhead for a
payload of about half the size.
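
The arithmetic behind these numbers can be checked with a small sketch. The function names are made up for this example, and "16 M" is taken to mean 16 * 1024 * 1024.

```c
#include <stdint.h>

/* Header overhead: two machine words (retain count and isa) per object. */
static inline uint64_t   sketch_header_overhead( uint64_t n_objects,
                                                 uint64_t word_size)
{
   return( n_objects * 2 * word_size);
}

/* Payload: the characters themselves, one byte per ASCII character. */
static inline uint64_t   sketch_payload( uint64_t n_objects,
                                         uint64_t chars_per_string)
{
   return( n_objects * chars_per_string);
}
```

With 16 M objects and 8 byte words, sketch_header_overhead yields 268435456 bytes (256 MB), while sketch_payload for 7 characters yields 117440512 bytes (112 MB).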

With tagged pointers you can eliminate this overhead, if the strings fit the
TPS encoding. Creating a TPS object is also cheaper than creating a
conventional object, since you don’t call malloc. Retain and release are
also very cheap, as they are NOPs.
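
As an illustration of why creation needs no malloc, here is a sketch that packs up to 7 ASCII characters directly into a 64 bit word. The layout (three tag bits, 7 bits per character), the tag value and all sketch_ names are assumptions for this example. This is not the mulle_char5_t encoding and not the actual mulle-objc layout.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_TAG_BITS     3
#define SKETCH_STRING_TAG   ((uint64_t) 0x1)   /* hypothetical tag for strings */
#define SKETCH_MAX_CHARS    7

/* Packs the string into the bits above the tag. No memory is allocated.
   Returns 0 if the string is too long or contains non-ASCII characters.
   A valid result is never 0, because the tag is nonzero. */
static uint64_t   sketch_tps_string_create( const char *s)
{
   uint64_t   value;
   size_t     i;
   size_t     len;

   len = strlen( s);
   if( len > SKETCH_MAX_CHARS)
      return( 0);

   value = 0;
   for( i = 0; i < len; i++)
   {
      if( (unsigned char) s[ i] > 0x7F)
         return( 0);
      value |= (uint64_t) s[ i] << (i * 7);
   }
   return( (value << SKETCH_TAG_BITS) | SKETCH_STRING_TAG);
}

/* Unpacks into buf, which must hold at least 8 bytes.
   Returns the length of the string. */
static size_t   sketch_tps_string_get( uint64_t tagged, char *buf)
{
   uint64_t   value;
   size_t     i;

   value = tagged >> SKETCH_TAG_BITS;
   for( i = 0; i < SKETCH_MAX_CHARS; i++)
   {
      buf[ i] = (char) (value & 0x7F);
      if( ! buf[ i])
         return( i);
      value >>= 7;
   }
   buf[ i] = 0;
   return( i);
}
```

Since the "object" is just the value itself, there is nothing to free, so retain and release have nothing to do.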

The big downside of TPS is that it slows down the method calls of all other,
non-TPS objects. The carefully crafted inlinable code section of
mulle_objc_object_call now suddenly enlarges by quite a bit. This might make
first stage inlining prohibitive. But if we remove this inlining, we will
slow down other objects even more.

So Boon or Bane ?

I don’t know!

My gut feeling is that TPS will pay off in most programs. Currently
the compiler compiles with TPS by default. This defines the
__MULLE_OBJC_TPS__ macro (in “future” version 3.9.1.1). The runtime checks
this and adds the TPS related code.

You can turn off the generation of tagged pointers with -fno-objc-tps.
Since you cannot mix TPS with non-TPS code, the runtime checks that you
don’t load classes with mixed settings.