Do symbols make sense for C++?

I'm currently working on the Policy modification to document (and
recommend) use of symbols instead of shlibs, but I'd only personally used
symbols with C libraries. Today I decided that I should try adding a
symbols file to a C++ library, particularly if I'm going to recommend
everyone do it. I tried this exercise with xml-security-c, which is, I
think, a reasonably typical C++ library. Not the sort of core C++ library
that would sit at the center of the distribution, but a random software
package that's in Debian because other things use it.

The experience was rather interesting, and I ended up uploading the new
version without a symbols file and continuing to just use shlibs. That's
for the following reasons:

1. The generated symbols file was HUGE. Hundreds of lines. This is a
marked difference from the typical C symbols file, which is of quite
manageable size. Some of that is that the library provides a lot of
different classes, but some of it is that C++ just generates a lot of
exported symbols. There's no way that I could do what I would do with
a C library and understand those symbols, why they're there, and
whether they are likely to have changed between revisions.

2. Generating a reasonable symbols file was a pain. Generating an
unreasonable symbols file that just contains all of the mangled symbols
is largely mechanical and uninteresting, but that symbols file doesn't
seem to me to convey useful information. So I did some scripting to
translate the symbols back with c++filt and add (c++) tags. Then I had
to try to understand what I was looking at and figure out whether I
should re-sort the symbols list, since the default sort is by mangled
name, which is meaningless. This is a rather unappealing process. It's not
particularly difficult, but it's very awkward and feels like it's
missing vital tools.

3. The resulting symbols file is incomprehensible to someone without
strong knowledge of C++. It's full of opaque entries that don't make
sense to a non-C++ programmer, which I suspect describes a substantial
number of the people who package C++ libraries for Debian. I know enough
C++ from school that I can evaluate security fixes, make simple
patches, and review upstream changes, and I think that's all that
should be needed to package things for Debian. But I'm deeply
uncomfortable producing a symbols file on my own that contains entries
for things that I know nothing about and cannot evaluate when they've
last changed, like "non-virtual thunk to FooClass::~FooClass@Base".

4. Once I had a symbols file that resulted in a successful build and that
I could have uploaded, I started thinking about how I was going to
maintain it. With a C library, I would change the symbols file
versions when the underlying function implementation changes in a way
that may not offer equivalence, similar to bumping shlibs. I realized
that I was going to have no idea when that happened, and the only way
that I would maintain the symbols file would be to either trust
upstream to maintain ABI equivalence and therefore only change the
symbols file when upstream changes the SONAME, or not trust upstream to
maintain ABI equivalence and therefore change all the versions with
each new upstream release. That gives me exactly the same semantics as
a shlibs file, so what's the point in having a symbols file?

5. The exported symbols of the library contained many symbols that
obviously weren't really from that library, but instead were artifacts
of the C++ compilation process, things like instantiations of
std::vector. Do those go into the symbols file? Do they change from
architecture to architecture? If they disappear again, is that
actually an ABI break? How do I know? It's all very mysterious, and
while shlibs gives the same semantics as just ignoring all of this, at
least then I'm not shipping package data, generated by me, that lists
things I'm just blindly ignoring.
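
As an illustration of the kind of scripting described in point 2, here is
a rough Python sketch. It is not the script actually used for
xml-security-c. The function names (demangle_with_cxxfilt,
tag_cxx_symbols) are made up for this example, and it assumes that c++filt
is on the PATH and that the input is a simple generated symbols file in
which each symbol line starts with a single space and looks like
" mangled@Version minver". Anything else (the library header line,
metadata lines, already-tagged entries) is passed through untouched.

```python
import subprocess


def demangle_with_cxxfilt(names):
    """Demangle names by piping them, one per line, through c++filt."""
    result = subprocess.run(
        ["c++filt"],
        input="\n".join(names) + "\n",
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()


def tag_cxx_symbols(lines, demangle=demangle_with_cxxfilt):
    """Rewrite mangled entries as (c++) entries, sorted by readable name.

    `lines` are symbols-file lines without trailing newlines. The
    `demangle` parameter is a hook added for this sketch so that the
    function can be exercised without c++filt installed.
    """
    passthrough, symbols = [], []
    for line in lines:
        # Untagged symbol lines start with one space followed by the symbol.
        if line.startswith(" ") and not line.startswith(" ("):
            symbol, rest = line.strip().split(None, 1)
            name, _, version = symbol.rpartition("@")
            symbols.append((name, version, rest))
        else:
            passthrough.append(line)

    # Only names using the Itanium C++ ABI "_Z" prefix are demangled;
    # plain C symbols are kept as they are.
    mangled = [name for name, _, _ in symbols if name.startswith("_Z")]
    readable = dict(zip(mangled, demangle(mangled))) if mangled else {}

    entries = []
    for name, version, rest in symbols:
        if name in readable:
            shown = readable[name]
            entries.append((shown, f' (c++)"{shown}@{version}" {rest}'))
        else:
            entries.append((name, f" {name}@{version} {rest}"))
    entries.sort()  # order by readable name instead of mangled name
    return passthrough + [text for _, text in entries]
```

Even with something like this, the output still needs review by hand.
Constructors and destructors, for example, usually have several mangled
variants that demangle to the same text, so duplicate (c++) lines can
appear and would have to be merged. The script does nothing to answer the
questions in points 3 through 5.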

I came away from this experience thinking that I should revise the Policy
amendment to say that symbols files are really for C libraries and for C++
libraries that either have a tightly maintained symbol export list or are
maintained by a C++ expert, and that most C++ library maintainers should
just not bother with this and use shlibs, bumping the shlibs version or
not based on their impression of how good upstream is at maintaining ABI
equivalence.

But that feels like a result contrary to what I had previously thought was
the intended direction, so I wanted to ask the Debian development
community as a whole: am I missing something? Are these symbols files
actually useful? Am I missing some trick to make them useful?

--
Russ Allbery (rra@debian.org) <http://www.eyrie.org/~eagle/>