GDB: How to dump Unicode Strings (i.e. provide custom data display)

Problem

The general problem is that you have strings of 16-bit UCS-2/UTF-16 Unicode
characters and you'd like to print them inside gdb as easily as you can print
ordinary 8-bit C-style strings. The same problem applies to any custom data
type that gdb knows nothing about, but strings are where it causes the most
trouble.

Solution

The hook is that gdb can directly call functions in your executable. You can
take advantage of this by providing functions in your executable that return a
view of the data in a type for which gdb has built-in support, e.g. a C-style
string, or alternatively functions that pretty-print the data directly to the
screen. Either way, you can then just ask gdb to call these functions when you
want to inspect the data.

When your language is C++ this has the huge advantage that gdb can emulate the
C++ resolution of overloaded function names. If you provide a set of C++
functions that share the same name but are overloaded for each type you want a
custom debug view of, gdb can pick the correct overload for you based on the
type of the argument. So regardless of the type, you have just one "custom
dump" function name to remember.

Example

For example, OpenOffice.org has two non-Unicode string classes, ByteString and
rtl::OString, and two Unicode string classes, UniString and rtl::OUString. We
can provide an overloaded dbg_dump function for each type in the appropriate
OpenOffice.org libraries.

If sWhatever is a Unicode type, calling dbg_dump(sWhatever) gives us a UTF-8
string; otherwise we get a straightforward dump of a copy of the non-Unicode
data. Here I preferred getting my data returned as a char pointer, so I return
a pointer to a static buffer; simply printing to stderr/stdout is an obvious
alternative.
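
As a rough illustration of the approach, here is a minimal sketch of such an
overload pair. It is not the actual OpenOffice.org code: NarrowStr and WideStr
are hypothetical stand-ins for the real non-Unicode and Unicode string classes,
the helper names and the 4096-byte buffer size are my own assumptions, and only
the dbg_dump name comes from the text above.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-ins for a non-Unicode and a Unicode string class.
struct NarrowStr { std::string data; };
struct WideStr   { std::u16string data; };

// Append one Unicode code point to out, encoded as UTF-8.
static void appendUtf8(std::string& out, char32_t cp)
{
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Copy s into a static buffer so the returned pointer stays valid after the
// call returns. Long strings are truncated (possibly mid-character), and the
// buffer is overwritten by the next call.
static const char* toStaticBuffer(const std::string& s)
{
    static char buf[4096];
    std::size_t n = s.size() < sizeof(buf) - 1 ? s.size() : sizeof(buf) - 1;
    std::memcpy(buf, s.data(), n);
    buf[n] = '\0';
    return buf;
}

// Non-Unicode overload: hand back a plain copy of the bytes.
const char* dbg_dump(const NarrowStr& s)
{
    return toStaticBuffer(s.data);
}

// Unicode overload: convert UTF-16 to UTF-8 first.
const char* dbg_dump(const WideStr& s)
{
    std::string utf8;
    const std::u16string& d = s.data;
    for (std::size_t i = 0; i < d.size(); ++i) {
        char32_t cp = d[i];
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < d.size()
            && d[i + 1] >= 0xDC00 && d[i + 1] <= 0xDFFF) {
            // Valid surrogate pair: combine into one code point.
            cp = 0x10000 + ((cp - 0xD800) << 10) + (d[i + 1] - 0xDC00);
            ++i;
        } else if (cp >= 0xD800 && cp <= 0xDFFF) {
            cp = 0xFFFD; // unpaired surrogate: emit the replacement character
        }
        appendUtf8(utf8, cp);
    }
    return toStaticBuffer(utf8);
}
```

With something like this linked into the executable, the call from gdb should
look the same whichever string type sWhatever happens to be:

(gdb) print dbg_dump(sWhatever)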

In principle it should be possible to extend the overloading for an arbitrary
number of types for which custom printing would be useful.

dbx

This should also work from Solaris dbx with (I think):

(dbx) print ``dbg_dump(sWhatever)

If not, check the dbx documentation for the correct syntax and let me know.

Quirks

In practice the pseudo-overloading seems to work consistently in dbx and gdb
only when the overloaded functions are compiled with debugging symbols enabled;
otherwise things get a little flaky. For OpenOffice.org, due to size and time
constraints, it is common to enable debugging symbols only for the subset of
code you are debugging, rather than globally as is the norm elsewhere. So for
OpenOffice.org I place the debugging functions inside source files which are
forced to always be built with debugging enabled.