Paul Burba wrote:
>
> Here is my first stab at this.
>
> While it isn't strictly necessary, I changed all callers to
> svn_utf_cstring_*_utf8_ex() which passed a non-NULL convset_key to now
> pass NULL. This makes it clearer to someone working on the calling code
> that this argument is no longer relevant.

That looks like it will work, but I was hoping we could do something more
efficient, along the lines of:

* Use the object "struct { const char *frompage, *topage; }" as the hash key, not
caring whether the two pointers are real pointers or type-cast integers.

That would be done in a single place, in get_xlate_handle_node().
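
A minimal sketch of such a key, in plain C with no APR dependency. The names xlate_key_t, xlate_key_init() and xlate_key_equal() are made up for illustration and are not in the current code. The key is treated as a fixed-size block of bytes, which is the form a hash taking a key pointer plus a key length would need. Note that this compares pointer values only, so two separately allocated copies of the same page name would count as different keys.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical key type: the two page identifiers, compared by pointer
   value only.  Either member may be a real string pointer or a small
   integer cast to a pointer. */
typedef struct xlate_key_t {
  const char *frompage;
  const char *topage;
} xlate_key_t;

/* Fill in KEY.  The memset clears any padding bytes so that the key can
   be hashed and compared as a plain block of sizeof(*key) bytes. */
void
xlate_key_init(xlate_key_t *key, const char *frompage, const char *topage)
{
  assert(key != NULL);
  memset(key, 0, sizeof(*key));
  key->frompage = frompage;
  key->topage = topage;
}

/* Return non-zero if A and B hold the same pointer pair. */
int
xlate_key_equal(const xlate_key_t *a, const xlate_key_t *b)
{
  return memcmp(a, b, sizeof(*a)) == 0;
}
```

No string is built or copied per lookup, which is where I would expect the saving to come from.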

Presently, the code uses a dedicated, global-ish hash "xlate_handle_hash" if
our UTF-8 sub-system has been initialised (svn_utf_initialize()), and
otherwise the "user data" hash of whatever pool the caller supplies.

When using the pool user data as the hash, there are two additional
requirements on the key: it must be a null-terminated cstring (because the user
data access API requires this), and it must be a unique string within the
program - hence the "svn-utf-" prefix and "-xlate-handle" suffix.
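
For illustration, a string key of that shape could be built roughly as follows. This is only a sketch: make_userdata_key() is a hypothetical helper, the "to" separator between the two page names is my assumption, and it assumes both pages are real strings rather than type-cast integers.

```c
#include <assert.h>
#include <stdio.h>

/* Write a pool-userdata style key for the page pair into BUF.
   The "svn-utf-" prefix and "-xlate-handle" suffix are the ones
   mentioned above; the "to" separator is an assumption of this sketch.
   Returns 0 on success, -1 if BUF is too small. */
int
make_userdata_key(char *buf, size_t bufsize,
                  const char *frompage, const char *topage)
{
  int len;

  assert(buf != NULL && frompage != NULL && topage != NULL);
  len = snprintf(buf, bufsize, "svn-utf-%sto%s-xlate-handle",
                 frompage, topage);
  return (len >= 0 && (size_t)len < bufsize) ? 0 : -1;
}
```

Building a string like this on every lookup is the cost that a pointer-pair key would avoid.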

In order to use a bare pointer pair as a hash key, we must always use our own
hash: we can't go looking up this key directly in the pool userdata hash.
Therefore we must do one of:

* Require that svn_utf_initialize() be used, which means we always use the
dedicated "xlate_handle_hash". We can remove the other code path except in the
backwards-compatibility API. I think this requires "revving" the API, but I
don't see that as a problem (indeed, it would be good because we could
eliminate the redundant arguments).

or

* Store a translation handle hash (which uses our pointer pair as its key)
within the pool userdata (which can then use a single, constant string as its key).
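
The second option amounts to a two-level lookup, which can be sketched like this. It is a toy in plain C, not a patch: toy_pool_t stands in for a pool's user data, handle_table_t stands in for a hash keyed by the pointer pair, and XLATE_TABLE_KEY and all the function names are invented for the example. A real version would allocate the table in the pool instead of with calloc().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical constant key under which the handle table is stored. */
#define XLATE_TABLE_KEY "svn-utf-xlate-handle-table"

/* Toy stand-in for a pool's user data: a few string-keyed slots. */
typedef struct toy_pool_t {
  const char *keys[4];
  void *values[4];
  size_t used;
} toy_pool_t;

/* Toy stand-in for a hash keyed by the pointer pair. */
typedef struct handle_table_t {
  const char *frompage[8];
  const char *topage[8];
  void *handle[8];
  size_t used;
} handle_table_t;

/* First level: find the table by its constant string key, creating and
   registering it on first use.  Returns NULL if that fails. */
handle_table_t *
get_handle_table(toy_pool_t *pool)
{
  size_t i;
  handle_table_t *table;

  assert(pool != NULL);
  for (i = 0; i < pool->used; i++)
    if (strcmp(pool->keys[i], XLATE_TABLE_KEY) == 0)
      return pool->values[i];

  if (pool->used == sizeof(pool->keys) / sizeof(pool->keys[0]))
    return NULL;
  table = calloc(1, sizeof(*table));
  if (table == NULL)
    return NULL;
  pool->keys[pool->used] = XLATE_TABLE_KEY;
  pool->values[pool->used] = table;
  pool->used++;
  return table;
}

/* Second level: look up a handle by pointer identity of the page pair.
   Returns NULL if no handle has been stored for that pair. */
void *
lookup_handle(toy_pool_t *pool, const char *frompage, const char *topage)
{
  handle_table_t *table = get_handle_table(pool);
  size_t i;

  if (table == NULL)
    return NULL;
  for (i = 0; i < table->used; i++)
    if (table->frompage[i] == frompage && table->topage[i] == topage)
      return table->handle[i];
  return NULL;
}

/* Remember HANDLE for the page pair.  Returns 0 on success, -1 if the
   table could not be obtained or is full. */
int
store_handle(toy_pool_t *pool, const char *frompage, const char *topage,
             void *handle)
{
  handle_table_t *table = get_handle_table(pool);

  if (table == NULL
      || table->used == sizeof(table->handle) / sizeof(table->handle[0]))
    return -1;
  table->frompage[table->used] = frompage;
  table->topage[table->used] = topage;
  table->handle[table->used] = handle;
  table->used++;
  return 0;
}
```

Only the outer lookup needs a string key, and that string is a compile-time constant, so nothing has to be concatenated per call.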

> /**
> * Initialize the UTF-8 encoding/decoding routines.
> * Allocate cached translation handles in a subpool of @a pool.
> *
> * @note It is optional to call this function, but if it is used, no other
> * svn function may be in use in other threads during the call of this
> * function or when @a pool is cleared or destroyed.
> * Initializing the UTF-8 routines will improve performance.
> *
> * @since New in 1.1.
> */
> void svn_utf_initialize(apr_pool_t *pool);

Isn't it about time that we required the use of library initialisation
functions (for every library) anyway, or is that topic too complex to tackle now?