You can free the memory backing the HSTRING
after you destroy the HSTRING,
and since this is a fast-pass string,
you destroy the HSTRING by simply abandoning it.
Therefore, you can free the memory when you know that nobody
should have a copy of the fast-pass HSTRING handle any more.

(For the purpose of terminology,
I'm going to say that you have a "copy" of an HSTRING handle
if you merely copied the HSTRING handle.
E.g.,
HSTRING copy = hstr;
On the other hand, I'm going to say that you have a
"duplicate" of an HSTRING if you passed it to
Windows­Duplicate­String.)

Okay, so how do you know that
nobody
has a copy of the fast-pass HSTRING handle any more?

Recall the rules for HSTRINGs:
If a function is passed
an HSTRING and it wants to save the
HSTRING for future use,
it must use
Windows­Duplicate­String to increment
the reference count on the string (and possibly convert it from
fast-pass to standard).
Therefore,
if you pass the
HSTRING to another function,
you know that there are no copies of that HSTRING handle
when the function returns,
because creating a copy is not allowed.
The only place where a literal copy of the HSTRING handle
is allowed is in the function that created it,
and therefore you know when there are no more copies of the
HSTRING handle because all of the copies belong to you.
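
Here is a minimal, portable sketch of that rule. Everything in it is a hypothetical stand-in: `ToyString`, `ToyDuplicate`, `ToyDelete`, and `Widget` are invented for illustration and do not reflect the real HSTRING layout or the real Windows Runtime functions. It assumes single-threaded use, so the reference count is a plain integer.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>

// Toy stand-in for an HSTRING. NOT the real Windows Runtime layout.
struct ToyString {
    const wchar_t* text;   // borrowed if fast-pass, owned if heap-allocated
    std::uint32_t length;  // number of characters, excluding the terminator
    bool fastPass;         // true: the creator owns both header and buffer
    long refCount;         // meaningful only for heap strings
};
using ToyHandle = ToyString*;

// Hypothetical analog of duplicating a string: a fast-pass string is
// converted to a heap string; a heap string just gains a reference.
ToyHandle ToyDuplicate(ToyHandle h) {
    if (!h->fastPass) { ++h->refCount; return h; }
    wchar_t* owned = new wchar_t[h->length + 1];
    std::memcpy(owned, h->text, h->length * sizeof(wchar_t));
    owned[h->length] = L'\0';
    return new ToyString{owned, h->length, false, 1};
}

// Hypothetical analog of deleting a string: a no-op for fast-pass strings.
void ToyDelete(ToyHandle h) {
    if (!h || h->fastPass) return;
    if (--h->refCount == 0) { delete[] h->text; delete h; }
}

// A callee that wants to keep the string must duplicate it, not copy it.
class Widget {
    ToyHandle name_ = nullptr;
public:
    void SetName(ToyHandle h) { ToyHandle dup = ToyDuplicate(h); ToyDelete(name_); name_ = dup; }
    std::wstring Name() const { return std::wstring(name_->text, name_->length); }
    ~Widget() { ToyDelete(name_); }
};

bool demo_retain() {
    Widget w;
    {
        wchar_t buffer[] = L"hello";
        ToyString header{buffer, 5, true, 0};  // fast-pass: lives on the stack
        ToyHandle hstr = &header;
        ToyHandle copy = hstr;                 // a "copy": only the creator may do this
        w.SetName(copy);                       // the callee takes a "duplicate"
    }   // header and buffer are gone; the fast-pass string was simply abandoned
    return w.Name() == L"hello";               // the duplicate is still valid
}
```

Once `SetName` returns, the only copies of the fast-pass handle are `hstr` and `copy`, both of which belong to the creator, so the creator knows exactly when the buffer can go away.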

The question sort of acknowledges this rule,
but notes,
"All it takes is one bug somewhere in all of
WinRT where someone forgets to
duplicate a input string if they need said string later
after the function has returned."

That's true.
But it's true of C-style string pointers, too!
If you pass a C-style string to another function,
and that other function wants to retain the string,
it's going to need to call strdup or some
other string duplication function so it can have its own
private copy of the string.
The value received as a function parameter is not
valid once the function returns;
if you need to use it after the function returns,
you need to duplicate the string.
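
For comparison, here is the same pattern with C-style strings, as a small sketch. The names `duplicate_c_string` and `Logger` are hypothetical; `duplicate_c_string` is written out by hand as a portable stand-in for `strdup`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hand-written stand-in for strdup, so the sketch does not depend on POSIX.
char* duplicate_c_string(const char* s) {
    std::size_t n = std::strlen(s) + 1;            // include the terminator
    char* p = static_cast<char*>(std::malloc(n));
    if (p) std::memcpy(p, s, n);
    return p;
}

// A callee that retains a string parameter keeps its own private copy.
struct Logger {
    char* prefix = nullptr;
    void set_prefix(const char* s) {
        char* dup = duplicate_c_string(s);         // the pointer s is only valid during the call
        std::free(prefix);
        prefix = dup;
    }
    ~Logger() { std::free(prefix); }
};

bool demo_c_string() {
    Logger log;
    {
        char local[] = "net: ";
        log.set_prefix(local);
        std::memset(local, 'x', sizeof(local) - 1); // the caller reuses its buffer
    }                                               // and then the buffer disappears
    return log.prefix && std::strcmp(log.prefix, "net: ") == 0;
}
```

If `set_prefix` had saved the raw pointer instead, `log.prefix` would be left pointing at a dead stack buffer, which is exactly the bug the question is worried about.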

Similarly, if you receive a COM interface pointer,
and you want to continue using it after the function returns,
you need to call IUnknown::AddRef to increase
the reference count on the interface, corresponding to the
copy of the pointer you retained.
When you're done with the pointer, you call
IUnknown::Release.
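
The same discipline can be sketched with a toy reference-counted interface. `ToyUnknown`, `ToyObject`, and `Holder` are hypothetical and only imitate the shape of `IUnknown`; in particular, this toy uses a plain counter and is not thread-safe.

```cpp
#include <cassert>

// Toy imitation of the AddRef/Release half of IUnknown. Not real COM.
struct ToyUnknown {
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
protected:
    virtual ~ToyUnknown() = default;   // destroyed only through Release
};

class ToyObject final : public ToyUnknown {
    unsigned long refs_ = 1;           // the creator starts with one reference
    bool* destroyed_;                  // lets the demo observe destruction
public:
    explicit ToyObject(bool* destroyed) : destroyed_(destroyed) {}
    unsigned long AddRef() override { return ++refs_; }
    unsigned long Release() override {
        unsigned long remaining = --refs_;
        if (remaining == 0) { *destroyed_ = true; delete this; }
        return remaining;
    }
};

// A callee that keeps the pointer takes a reference for the copy it retains.
class Holder {
    ToyUnknown* kept_ = nullptr;
public:
    void Keep(ToyUnknown* p) {
        p->AddRef();                   // one reference per retained copy
        if (kept_) kept_->Release();
        kept_ = p;
    }
    ~Holder() { if (kept_) kept_->Release(); }
};

bool demo_refcount() {
    bool destroyed = false;
    ToyUnknown* obj = new ToyObject(&destroyed);
    {
        Holder h;
        h.Keep(obj);
        obj->Release();                // the creator is done with it
        if (destroyed) return false;   // the holder's reference keeps it alive
    }                                  // the holder releases the last reference
    return destroyed;
}
```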

In both of these cases,
you are relying on people writing code to respect these rules.
All it takes is one bug somewhere in all of
C where someone forgets to
duplicate a input string if they need said string later
after the function has returned.

Somehow, we've managed to survive working with C-style strings
and with COM interface pointers with these rules.
Maybe it's with the help of things like smart pointers,
or maybe it's just through good discipline.
Whatever the reason, keep up the good work.

Bonus chatter:
One of the rules for fast-pass strings is that you cannot
change the contents of the string as long as the HSTRING
is still in use.
One commenter interpreted this to mean that
string references aren't thread-safe.
Not true.
Rather, the statement is a
direct reflection of the fact that an HSTRING
is immutable.
If you changed the contents of the buffer that backs the
HSTRING,
then you break the immutability rule.
Thread safety is not at issue here.
You can use a fast-pass string from any thread you like,
as long as you stop using it before your function returns.
(This means that your function cannot return until
the other thread has definitely finished with the fast-pass
string.
In practice, this is not commonly done; instead, the function
uses
Windows­Duplicate­String to create a
standard HSTRING and passes that HSTRING
to the other thread,
which can then
Windows­Delete­String the HSTRING
when it is done.)
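
As a rough illustration of that hand-off pattern, here is a portable sketch that uses `std::wstring` as a stand-in for the duplicated standard HSTRING. `BorrowedText` and `start_worker` are hypothetical names; the real code would call the Windows Runtime string functions instead.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <thread>
#include <utility>

// Hypothetical borrowed view, standing in for a fast-pass string.
struct BorrowedText {
    const wchar_t* data;
    std::size_t length;
};

// Duplicate the text before handing it to another thread, so that this
// function can return without waiting for the thread to finish.
std::thread start_worker(BorrowedText text, std::size_t* result) {
    std::wstring owned(text.data, text.length);   // the "duplicate"
    return std::thread([owned = std::move(owned), result] {
        *result = owned.size();                   // uses only the thread's own copy
    });                                           // the copy is freed with the lambda
}

bool demo_handoff() {
    std::size_t counted = 0;
    std::thread worker;
    {
        wchar_t buffer[] = L"fast-pass";
        worker = start_worker(BorrowedText{buffer, 9}, &counted);
    }   // the borrowed buffer is gone; the worker never needed it
    worker.join();
    return counted == 9;
}
```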

But why is HSTRING_HEADER so big then? It seems to me like it should be: struct { UINT Length; PCWSTR Source; HSTRING nonNullAfterSomebodyDuped; }. It has a lot more space than this and that makes it a bit more mysterious than a plain C string or a BSTR which is why I asked the question in the first place.

…and perhaps you could add a fake ref count field that is set to 0 so it has the same layout as a heap hstring (whatever that might be) and a 0 ref count tells you that you are looking at a fast pass string.

I suppose you’d have the first Windows­Duplicate­String(fastpass) call detect that HSTRING_HEADER::nonNullAfterSomebodyDuped == nullptr, make the copy, and save the duplicate HSTRING there. But if someone then calls Windows­Delete­String(duplicate), what should happen to nonNullAfterSomebodyDuped?

– If Windows­Delete­String(duplicate) takes the reference count to zero and destroys the duplicate but does not change nonNullAfterSomebodyDuped, then the next Windows­Duplicate­String(fastpass) call will use a dangling pointer. No good.

– If Windows­Delete­String(duplicate) takes the reference count to zero, destroys the duplicate, and resets nonNullAfterSomebodyDuped, then it risks modifying an HSTRING_HEADER that has already been deallocated. No good.

– If Windows­Delete­String(duplicate) does not destroy the duplicate, then it will leak because the application isn’t going to call Windows­Delete­String(fastpass). In WRL, the destructor of HStringReference does not do it.

I don’t know how many internal details Raymond wants posted here, and this could of course change at any time, but in Win8 WindowsDeleteString simply does “if (!input || ((*(BYTE*)input) & 1)) return 0; …”, and a heap HSTRING has a 0 there, with the refcount stored further out. WindowsDuplicateString also checks this byte and will, as expected, just InterlockedIncrement a heap HSTRING. The surprising thing for me is that it does not modify HSTRING_HEADER; it allocates new memory every time you duplicate a fastpass string. Perhaps it is just not worth it to try to cache the first duplicated string.

I don’t know where you would cache it, but I assumed there was some type of clever magic going on, since an HSTRING_HEADER is 16+sizeof(void*) bytes, which is a lot more than you need for a “duplicate every time” implementation. WindowsCreateStringReference does seem to store a stack address and an exception handler pointer in there, so I assume that is some type of debugging aid…

The documentation of HSTRING says, “Every call to WindowsCreateString and WindowsCreateStringReference must be matched with a corresponding call to WindowsDeleteString.”

The documentation of WindowsCreateStringReference says, “You don’t need to call the WindowsDeleteString function to de-allocate a fast-pass HSTRING created by the WindowsCreateStringReference function.”

Raymond says, “since this is a fast-pass string, you destroy the HSTRING by simply abandoning it.”

One against two; the documentation of HSTRING must be wrong then. Doc feedback sent.

Is the thread-safety of HSTRING reference counters documented somewhere?

They are documented to be immutable (“Use HSTRING to represent immutable strings in the Windows Runtime.”). Something that never* changes is always thread safe.
*) as Windows­Delete­String certainly changes the state of the HSTRING (and therefore breaks immutability), you have to ensure that the string is not destroyed while it is still in use

Reference counts are usually updated with atomic operations, which keeps them thread safe.
If the reference count wasn’t as simple as just incrementing/decrementing a number, then something like a critical section could be used.

In [a C++ implementation from 1998], std::string used reference counters that were not thread-safe at all. If you had an std::string object being used by one thread, and wanted to give another thread a copy that is safe to access without further synchronization, then the copy constructor was not enough because the objects would share the reference counter. The std::string(const char *, size_t) constructor was safer. I don’t know whether [the vendor] ever categorized that as a bug. Nowadays though, the standard does not even allow reference counting in std::string.

This is why I would prefer explicit documentation on thread safety. OTOH, if HSTRING is thread-safe now, then it seems pretty unlikely that Microsoft would dare make it thread-unsafe later, even if the safety is undocumented.

We are living in a completely different world compared to 1998. In 1998, it was quite rare to get a dual processor system, let alone a multi processor system (>2 processors), so multi threaded programs and thread safety really didn’t occur that much.
It was in 2005 that the first dual-core processors were released, and that is when the real need for thread safety kicked in. What’s more, since the introduction of these processors, multithreading has become more prominent in programming languages. So it would actually be a lot more surprising if HSTRING were not thread safe. Another thing to note: the Windows Runtime environment runs the main function in an MTA thread by default.
So in short, these days, we have the capability to run multiple threads in parallel more readily available and the software does run multiple threads in parallel more often. So I would seriously be surprised if HSTRING, and probably a lot of the WinRT environment, wasn’t thread safe by default.