Vtbl layout under MI

Executive Summary: If you know how vtbls are laid out under MI for
particular compilers, please tell me, because published sources seem to
differ on how it's done.
Long version: Unbelievably, people pay me to prattle on about C++, so
that's what I was doing recently, specifically, talking about the
implementation of virtual functions under (nonvirtual) multiple
inheritance. As I spoke, somebody had the audacity to whip up some code
and check to see if what I said was true, and, well, let's just say I'm
making this posting because things did not go as I'd expected.
Consider the following:
class B1 {
public:
  virtual void mf1() {}
};

class B2 {
public:
  virtual void mf2() {}
  virtual void mf3() {}
};

class D: public B1, public B2 {
public:
  virtual void mf3() {}
  virtual void mf4() {}
};

int main()
{
  D *pd = new D;
  pd->mf3();
}
My expectation is that each D object will contain two vptrs, each pointing
to different vtbls. But what goes in those vtbls?
One vtbl is shared between B1 and D objects, and I expected it to contain
entries for the following virtual function implementations. (In all that
follows, please ignore thunking. Thunks ultimately lead to virtual
function implementations, and I care only about which vtbls have which
entries pointing (possibly via thunks) to which virtual function
implementations.)
&B1::mf1 // virtual declared in B1
&D::mf3 // virtual declared in D
&D::mf4 // virtual declared in D
I believe this is consistent with the vtbl layout described by Lippman in
"Inside the C++ Object Model," pp. 132-136.
I further expected the call pd->mf3() to go through this vtbl, but one of
the people to whom I was prattling looked at the assembly code generated
for the call and saw that it was not using this vtbl. Instead, it was
using the other vtbl, the one for B2. Ouch.
So I looked up vtbl layouts under (nonvirtual) MI. On pp. 229-231, the ARM
-- wow, that takes me back -- shows the B1/D vtbl having only these
entries, assuming I'm understanding the discussion correctly:
&B1::mf1 // virtual declared in B1
&D::mf4 // virtual declared in D
There is no entry for mf3, because that entry is stored in the B2 vtbl.
This is consistent with what my student observed, and it's also consistent
with Jan Gray's article, "C++ Under the Hood"
(http://msdn.microsoft.com/archive/default.asp?url=/archive/en-us/dnarvc/html/jangrayhood.asp,
but without the diagrams that accompany the printed version in the 1994
book, "Black Belt C++: The Masters Collection", a title I've always liked,
and not just because I have an article in the book). The Jan Gray article
is relevant, because it describes how Microsoft implements things (circa
1994), and the student in question was using VC++ 7.1.
So the Microsoft implementation as described by Jan Gray seems to match the
implementation described in the ARM, though not the one described by
Lippman.
These are old sources, however, and surely there's more recent stuff on the
web, right? Maybe. Perhaps the most interesting thing I found was
http://www.cse.wustl.edu/~mdeters/seminar/spring2002/natarajan-mi.pdf,
which shows the same example as given in the ARM and has a cover page
saying the slides are by Bjarne, but which shows a different vtbl layout
than in the ARM. (Compare pg. 230 of the ARM with slide 13 of the talk.
Perhaps the difference has to do with the vptr locations, which seem to be
different in the two examples?)
I also found a chapter in an online compiler textbook of indeterminate
authorship and vintage (
http://topaz.cs.byu.edu/text/html/Textbook/Chapter10/) which seems to say
that a possible implementation is a B1/D vtbl with repeated entries for
overridden functions. Under such an implementation, these two calls
((B1*)pd)->mf3(); // call mf3 on D object via B1* pointer
pd->mf3(); // call mf3 on D object via D* pointer
resolve to the same function call implementation, but go through different
vtbl slots. This seems contrary to the implementations described by the
ARM, Lippman, and Gray, but possibly consistent with the slides in the PDF
file referenced above.
As things stand now, I *think* I understand how vtbls are laid out under MI
in the MS compiler (same as described in the ARM and by Gray), but is this
how all contemporary implementations lay them out? If you know how one or
more compilers lay things out, please tell the world -- or at least me.
Thanks,
Scott
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

In article <MPG.1d005115f2a59da99897d9@news.hevanet.com>, Scott Meyers
<Usenet@aristeia.com> writes
>As things stand now, I *think* I understand how vtbls are laid out under MI
>in the MS compiler (same as described in the ARM and by Gray), but is this
>how all contemporary implementations lay them out? If you know how one or
>more compilers lay things out, please tell the world -- or at least me.
I seem to remember that Borland uses a different layout. As the C++
standard has nothing to say on the subject, implementers are free to
implement virtual functions any way they like (yes, I know you know that,
Scott). And they certainly have tackled MI in several different ways.
Even when no virtual inheritance is involved, they still have to design
implementations that can cope with it.
--
Francis Glassborow ACCU
Author of 'You Can Do It!' see http://www.spellen.org/youcandoit
For project ideas and contributions: http://www.spellen.org/youcandoit/projects

Francis

5/27/2005 6:28:02 PM

Hi, I can tell you a bit about GCC 3.
Scott Meyers wrote:
> Executive Summary: If you know how vtbls are laid out under MI for
> particular compilers, please tell me, because published sources seem to
> differ on how it's done.
>
> My expectation is that each D object will contain two vptrs, each pointing
> to different vtbls. But what goes in those vtbls?
>
> One vtbl is shared between B1 and D objects, and I expected it to contain
> entries for the following virtual function implementations.
>
> &B1::mf1 // virtual declared in B1
> &D::mf3 // virtual declared in D
> &D::mf4 // virtual declared in D
>
Your expectation is correct for GCC. The only difference, though not an
important one in my opinion, is that D has only one vtable, produced by
sequentially stacking the 2 vtables you expected to see. Of these 2
vtables (or parts of the single vtable), the second contains an offset
(the -4u entry in the dump below) from the B2 subobject back to the start
of the complete D object. This offset is used within downcasts, and I
wonder if it is used for other purposes (please tell me if it is!).
Anyway, D contains 2 vptrs, and it is possible to treat these 2 parts of
D's vtable as 2 different vtables. Let me attach a class hierarchy dump
for your convenience:
(produced by gcc -fdump-class-hierarchy)
=== dump ===
Vtable for B1
B1::vtable for B1: 3u entries
0 0u
4 (int (*)(...))(&typeinfo for B1)
8 B1::mf1
Class B1
size=4 align=4
base size=4 base align=4
B1 (0x4046e6c0) 0 nearly-empty
vptr=((&B1::vtable for B1) + 8u)
Vtable for B2
B2::vtable for B2: 4u entries
0 0u
4 (int (*)(...))(&typeinfo for B2)
8 B2::mf2
12 B2::mf3
Class B2
size=4 align=4
base size=4 base align=4
B2 (0x4046ea40) 0 nearly-empty
vptr=((&B2::vtable for B2) + 8u)
Vtable for D
D::vtable for D: 9u entries
0 0u
4 (int (*)(...))(&typeinfo for D)
8 B1::mf1
12 D::mf3
16 D::mf4
20 -4u
24 (int (*)(...))(&typeinfo for D)
28 B2::mf2
32 D::non-virtual thunk to D::mf3()
Class D
size=8 align=4
base size=8 base align=4
D (0x4046ed80) 0
vptr=((&D::vtable for D) + 8u)
B1 (0x4046edc0) 0 nearly-empty
primary-for D (0x4046ed80)
B2 (0x4046ee00) 4 nearly-empty
vptr=((&D::vtable for D) + 28u)
=== end of dump ===
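The sizes and offsets in the dump can be cross-checked from inside a program. What follows is only a sketch: the helper name b2_offset_in_d is made up for illustration, and the values it yields are implementation-specific (the dump above comes from a 32-bit target, where the B2 subobject sits at offset 4 and D has size 8).

```cpp
#include <cstddef>

class B1 {
public:
  virtual void mf1() {}
};

class B2 {
public:
  virtual void mf2() {}
  virtual void mf3() {}
};

class D: public B1, public B2 {
public:
  virtual void mf3() {}
  virtual void mf4() {}
};

// Hypothetical helper: byte offset of the B2 subobject within a D object.
// The derived-to-base conversion adjusts the address, so the difference
// between the two pointers is the offset of the second base.
std::ptrdiff_t b2_offset_in_d()
{
  D d;
  B2 *pb2 = &d;
  return reinterpret_cast<char *>(pb2) - reinterpret_cast<char *>(&d);
}
```

If D really holds two vptrs and nothing else, one would expect sizeof(D) to be twice the size of a pointer and b2_offset_in_d() to equal the size of one pointer, which matches the "B2 ... 4" and "size=8" lines in the dump.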
> I also found a chapter in an online compiler textbook of indeterminate
> authorship and vintage (
> http://topaz.cs.byu.edu/text/html/Textbook/Chapter10/) which seems to say
> that a possible implementation is a B1/D vtbl with repeated entries for
> overridden functions. Under such an implementation, these two calls
>
> ((B1*)pd)->mf3(); // call mf3 on D object via B1* pointer
> pd->mf3(); // call mf3 on D object via D* pointer
>
> resolve to the same function call implementation, but go through different
> vtbl slots.
Did you mean ((B2*)pd)->mf3() in the code above?
If you did, then with GCC this call would go via the thunk
that is in the second part of D's vtable.
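Whatever slot or thunk is involved, both forms of the call should end up in D::mf3. A minimal sketch of that check follows. The mf3_calls counter and the function count_d_mf3_calls are hypothetical additions for illustration only; the extra data member changes the size of D but not which functions the vtable entries lead to.

```cpp
class B1 {
public:
  virtual void mf1() {}
};

class B2 {
public:
  virtual void mf2() {}
  virtual void mf3() {}
};

class D: public B1, public B2 {
public:
  D(): mf3_calls(0) {}
  virtual void mf3() { ++mf3_calls; }  // overrides B2::mf3
  virtual void mf4() {}
  int mf3_calls;                       // hypothetical counter for this sketch
};

// Calls mf3 once through a D* and once through a B2*, and returns how
// many of those calls reached D::mf3.
int count_d_mf3_calls()
{
  D d;
  D *pd = &d;
  B2 *pb2 = pd;   // points at the B2 subobject inside d
  pd->mf3();      // call via D*
  pb2->mf3();     // call via B2* (through the thunk under GCC)
  return d.mf3_calls;
}
```

This only confirms where the calls land. Which vptr and which slot each call uses still has to be read from the generated assembly.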

kerzum

5/28/2005 11:03:45 AM

Scott Meyers wrote:
> As things stand now, I *think* I understand how vtbls are laid out under MI
> in the MS compiler (same as described in the ARM and by Gray), but is this
> how all contemporary implementations lay them out? If you know how one or
> more compilers lay things out, please tell the world -- or at least me.
Here's the documentation for how GNU C++ does it.
<http://www.codesourcery.com/cxx-abi/abi.html#vtable>

Hyman

5/28/2005 11:08:41 AM

On 28 May 2005 07:03:45 -0400, wrote:
> (produced by gcc -fdump-class-hierarchy)
I didn't know about this switch, thanks for pointing it out :-)
> > I also found a chapter in an online compiler textbook of indeterminate
> > authorship and vintage (
> > http://topaz.cs.byu.edu/text/html/Textbook/Chapter10/) which seems to say
> > that a possible implementation is a B1/D vtbl with repeated entries for
> > overridden functions. Under such an implementation, these two calls
> >
> > ((B1*)pd)->mf3(); // call mf3 on D object via B1* pointer
> > pd->mf3(); // call mf3 on D object via D* pointer
> >
> > resolve to the same function call implementation, but go through different
> > vtbl slots.
>
> Did you mean ((B2*)pd)->mf3() in the code above?
No, I meant what I wrote. Check out "Figure {MEMVRST}". (That's what they
call it, honest.)
Also, let me refine my question a bit. In this call,
pd->mf3();
which vptr is used, the B1/D one (which is what I expected) or the B2 one
(which is what MSVC seems to use)? My understanding is that g++ uses the
B1/D one. Is there an easy way to spit out the assembly for just this
statement so I can compare it with the corresponding assembly generated by
MSVC?
Thanks,
Scott
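On the assembly question, one possible approach (a sketch, not something verified against VC++ 7.1) is to put the statement alone in a small function in its own source file, so that the function body in the listing is essentially just the call. The wrapper name call_mf3 is made up for illustration. With g++, the -S option writes an assembly file instead of an object file; with the Microsoft compiler, the /FA family of options (for example /FAs together with /c) writes an assembly listing.

```cpp
class B1 {
public:
  virtual void mf1() {}
};

class B2 {
public:
  virtual void mf2() {}
  virtual void mf3() {}
};

class D: public B1, public B2 {
public:
  virtual void mf3() {}
  virtual void mf4() {}
};

// Hypothetical wrapper: its only statement is the call of interest.
// Because pd is a parameter, the compiler cannot see the dynamic type
// here and has to emit a real virtual call.
void call_mf3(D *pd)
{
  pd->mf3();
}
```

In the listing for call_mf3, the thing to look for is whether pd is adjusted by the offset of the B2 subobject before the vptr is loaded (the B2 vptr is being used) or whether the vptr is loaded from the address in pd itself (the B1/D vptr is being used).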