In search of a thread-safe Phoenix Singleton

I'm working on a project where I need a C++ object which has the following constraints:

Only a limited number of instances of the object may exist at a time.

Since creating an instance locks down system resources that other applications might need to use, instances need to be created and destroyed on demand.

The implementation needs to be thread-safe. Further, it would be nice if the number of instances could be limited on a per-thread basis.

It seems best to have the object be stack-based, but I'm not sure if this is an absolute requirement. It would make for cleaner code in the long run, but since my object will likely be called from some older, legacy code, it may not be possible to enforce this limitation until we've had time to clean up that old code more thoroughly.
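To make the per-thread variant of these constraints concrete, here is a minimal sketch, not a finished design. It assumes C++11 or later, and the class name ScopedResource, the limit kMaxPerThread, and the helper live() are all hypothetical names invented for illustration. Each thread gets its own counter through thread_local, so no locking is needed as long as every instance is destroyed on the thread that created it, which a stack-based object guarantees.

```cpp
#include <cstddef>
#include <stdexcept>

// Hypothetical stand-in for the real object; all names here are illustrative.
class ScopedResource {
public:
    // Assumed per-thread limit, chosen arbitrarily for the sketch.
    static constexpr std::size_t kMaxPerThread = 2;

    ScopedResource() {
        if (liveCount() >= kMaxPerThread) {
            throw std::runtime_error("too many ScopedResource instances on this thread");
        }
        ++liveCount();
        // ... acquire the scarce system resource here ...
    }

    ~ScopedResource() {
        // ... release the system resource here ...
        --liveCount();
    }

    // Copying would duplicate ownership of the resource, so forbid it.
    ScopedResource(const ScopedResource&) = delete;
    ScopedResource& operator=(const ScopedResource&) = delete;

    // Number of instances currently alive on the calling thread.
    static std::size_t live() { return liveCount(); }

private:
    // One counter per thread; initialized to zero on first use in each thread.
    static std::size_t& liveCount() {
        thread_local std::size_t count = 0;
        return count;
    }
};
```

Used on the stack, the resource is acquired when the object comes into scope and released when it leaves, so a third instance on the same thread throws while other threads keep their own budget. The sketch breaks down if legacy code creates an instance on one thread and destroys it on another, because the decrement would then hit the wrong counter.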

The main reason most existing singleton examples don't fit is that their instances cannot be created and destroyed on demand. They assume a persistent object that is destroyed only near program termination. The Phoenix Singleton described in Modern C++ Design comes close, but the code is baroque, and I'm not convinced it meets my needs. The Object Counting Base Class, Counted<class>, looks like the best bet: it meets nearly all of my needs except thread-safety, which I may be able to add using Double-Checked Locking. The investigation continues.
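As a rough illustration of what a thread-safe counting base class might look like, here is a sketch under stated assumptions. It is not the published Counted code: the limit is passed as a template parameter (an assumption made to keep the example short), the client class Device is hypothetical, and the counter is guarded with a std::atomic compare-and-swap loop rather than Double-Checked Locking, since there is no lazily created single instance to protect, only a count.

```cpp
#include <atomic>
#include <cstddef>
#include <stdexcept>

// Sketch of a counting base class with a process-wide instance limit.
// MaxObjects as a template parameter is an assumption of this sketch.
template <class BeingCounted, std::size_t MaxObjects>
class Counted {
public:
    struct TooManyObjects : std::runtime_error {
        TooManyObjects() : std::runtime_error("instance limit reached") {}
    };

    // Snapshot of the number of live objects across all threads.
    static std::size_t objectCount() { return numObjects.load(); }

protected:
    Counted() { acquireSlot(); }
    Counted(const Counted&) { acquireSlot(); }  // a copy is a new instance
    Counted& operator=(const Counted&) = default;
    ~Counted() { numObjects.fetch_sub(1); }

private:
    // Atomically claim a slot, or throw if the limit is already reached.
    static void acquireSlot() {
        std::size_t current = numObjects.load();
        do {
            if (current >= MaxObjects) {
                throw TooManyObjects();
            }
            // On failure, compare_exchange_weak reloads `current`,
            // so the limit is re-checked before the next attempt.
        } while (!numObjects.compare_exchange_weak(current, current + 1));
    }

    static std::atomic<std::size_t> numObjects;
};

template <class BeingCounted, std::size_t MaxObjects>
std::atomic<std::size_t> Counted<BeingCounted, MaxObjects>::numObjects{0};

// Hypothetical client class: at most two Device objects may exist at once.
class Device : private Counted<Device, 2> {
public:
    using Counted<Device, 2>::objectCount;
    using Counted<Device, 2>::TooManyObjects;
    // ... acquire the system resource in the constructor,
    //     release it in the destructor ...
};
```

Because the check and the increment happen in one atomic step, two threads racing for the last slot cannot both succeed. If construction throws, the base destructor never runs, so the count stays accurate, and destroying an instance frees its slot for a later one, which is the create-and-destroy-on-demand behavior described above.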