Describing the MSVC ABI for Structure Return Types

An ABI is an “application binary interface”, which is basically a contract between pieces of executable code on how to behave. The ABI dictates things like how parameters are passed, where return values go, how to create and destroy stack frames, etc. As a programmer, you oftentimes don’t have to worry about this sort of thing because the compiler takes care of it for you. However, if you want code from one compiler to talk to code from another compiler, the ABI is extremely important because if the two compilers don’t agree, the two pieces of code won’t be able to work together.

I don’t want to go into the entire MSVC ABI (that could likely fill a book!), but instead would like to focus on the under-documented portion having to do with the way structures are returned from functions. There is some documentation on the subject on MSDN, the latest of which can be found here.
If you read the above link, you will see the documentation pertaining to return values:

Return values are also widened to 32 bits and returned in the EAX register, except for 8-byte structures, which are returned in the EDX:EAX register pair. Larger structures are returned in the EAX register as pointers to hidden return structures.

This seems quite definitive; however, it’s also quite inaccurate in practice. I went through all of the calling conventions except __clrcall, tried various interesting structures coupled with different structure packings, and want to share what I found.

Before I get to my findings, I should describe what I tested and why. All of my tests were performed with MSVC 10. I tested functions using four different calling conventions: __cdecl, __stdcall, __fastcall and __thiscall. Each function had six variants, each returning a structure of a different size: 3, 4, 7, 8, 15 and 16 bytes. I tested all six packing modes: 1, 2, 4, 8, 16 and natural. I only tested on x86, so there’s room for further research on x64 and ARM. All told, there is a lot of raw data involved (4 × 6 × 6 = 144 distinct data points)!
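To make the setup above concrete, here is a minimal sketch of what one test function per size might look like. It is not the harness I used: Blob, make_blob, check_blobs and the TEST_CALL macro are hypothetical names, and wrapping a byte array is just one way (an assumption on my part) to keep sizeof fixed under every packing mode. The static_asserts guard against a structure silently growing padding.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical macro: picks the convention under test on 32-bit MSVC and
// expands to nothing elsewhere so the sketch still compiles.
#if defined(_MSC_VER) && defined(_M_IX86)
#define TEST_CALL __stdcall  // swap in __cdecl or __fastcall as needed
#else
#define TEST_CALL
#endif

// A byte array has alignment 1, so sizeof(Blob<N>) stays N under any packing.
template <std::size_t N>
struct Blob {
    unsigned char bytes[N];
};

static_assert(sizeof(Blob<3>) == 3, "unexpected padding");
static_assert(sizeof(Blob<4>) == 4, "unexpected padding");
static_assert(sizeof(Blob<7>) == 7, "unexpected padding");
static_assert(sizeof(Blob<8>) == 8, "unexpected padding");
static_assert(sizeof(Blob<15>) == 15, "unexpected padding");
static_assert(sizeof(Blob<16>) == 16, "unexpected padding");

// One instantiation per size; the return mechanism is read off the
// generated assembly, not observed from C++.
template <std::size_t N>
Blob<N> TEST_CALL make_blob(unsigned char fill) {
    Blob<N> b;
    std::memset(b.bytes, fill, N);
    return b;
}

// Sanity check that the values survive the round trip.
void check_blobs() {
    Blob<3> small = make_blob<3>(0x11);
    Blob<16> large = make_blob<16>(0x22);
    assert(small.bytes[2] == 0x11);
    assert(large.bytes[15] == 0x22);
}
```

On MSVC, the /FA compiler option produces an assembly listing in which the return sequence of each instantiation can be inspected.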

__cdecl

With the cdecl calling convention (which is the default for C/C++ programs in MSVC), the stack is cleaned up by the caller instead of by the callee. This allows for it to use variable argument lists, at the expense of larger executables.
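The link between caller cleanup and variable argument lists is easiest to see with a variadic function: only the call site knows how many arguments were pushed, so only the caller can remove them. A small sketch (sum_ints and check_sum_ints are illustrative names, not part of my tests):

```cpp
#include <cassert>
#include <cstdarg>

// Sums `count` int arguments. The callee cannot know at compile time how
// many bytes of arguments each call site pushed, so it cannot pop them.
int sum_ints(int count, ...) {
    va_list args;
    va_start(args, count);
    int total = 0;
    for (int i = 0; i < count; ++i) {
        total += va_arg(args, int);
    }
    va_end(args);
    return total;
}

// Two call sites pushing different amounts of argument data.
void check_sum_ints() {
    assert(sum_ints(2, 10, 20) == 30);
    assert(sum_ints(4, 1, 2, 3, 4) == 10);
}
```

Each call site emits its own stack adjustment after the call, which is where the extra code size comes from.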

For packing sizes 2, 4, 8, 16 and natural, the cdecl calling convention behaves as documented. 3 and 4 byte structures were returned in EAX, 7 and 8 byte structures were returned in EAX/EDX, and 15 and 16 byte structures were returned via a caller-allocated pointer stored in EAX, with the caller responsible for cleaning that pointer up.
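As a rough model of the register-pair case, the first four bytes of an 8-byte structure travel in EAX and the last four in EDX. The sketch below imitates that split in portable C++ on a little-endian host (which x86 is); Pair, RegPair, as_registers and check_registers are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

struct Pair {
    std::uint32_t first;
    std::uint32_t second;
};
static_assert(sizeof(Pair) == 8, "Pair must be exactly 8 bytes");

// Stand-in for the two registers; not a real register interface.
struct RegPair {
    std::uint32_t eax;
    std::uint32_t edx;
};

// Treats the structure as one 64-bit value: low half is "EAX", high half
// is "EDX". Assumes a little-endian host.
RegPair as_registers(const Pair& p) {
    std::uint64_t bits = 0;
    std::memcpy(&bits, &p, sizeof bits);
    RegPair r;
    r.eax = static_cast<std::uint32_t>(bits & 0xFFFFFFFFu);
    r.edx = static_cast<std::uint32_t>(bits >> 32);
    return r;
}

void check_registers() {
    Pair p = {0x11111111u, 0x22222222u};
    RegPair r = as_registers(p);
    assert(r.eax == p.first);
    assert(r.edx == p.second);
}
```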

However, for packing size 1, the calling convention does not behave as documented in all cases. Structure sizes 4, 8, 15 and 16 all behave the same as in the other packing modes. But structure sizes 3 and 7 use the same hidden parameter mechanism as 15 and 16 byte structures, instead of using EAX or EAX/EDX.
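The hidden parameter mechanism can be pictured as the compiler rewriting the function to take an extra pointer argument and hand that same pointer back. The sketch below writes both forms by hand; Big, make_big, make_big_lowered and check_lowering are hypothetical names, and the lowered form only approximates what the compiler generates.

```cpp
#include <cassert>

struct Big {
    int a, b, c, d;  // 16 bytes with 4-byte ints
};

// What the programmer writes.
Big make_big(int seed) {
    Big result = {seed, seed + 1, seed + 2, seed + 3};
    return result;
}

// Roughly what the hidden-parameter form amounts to: the caller supplies
// the storage, and the same address is handed back (in EAX on x86).
Big* make_big_lowered(Big* hidden_result, int seed) {
    hidden_result->a = seed;
    hidden_result->b = seed + 1;
    hidden_result->c = seed + 2;
    hidden_result->d = seed + 3;
    return hidden_result;
}

void check_lowering() {
    Big direct = make_big(10);
    Big storage;
    Big* returned = make_big_lowered(&storage, 10);
    assert(returned == &storage);
    assert(direct.a == storage.a && direct.d == storage.d);
}
```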

__stdcall

With the stdcall calling convention (which is the default for Win32 APIs), the stack is cleaned up by the callee instead of the caller. The executable code is therefore typically smaller, but functions cannot use variable argument lists.

For packing sizes 2, 4, 8, 16 and natural, the stdcall calling convention behaves as documented. 3 and 4 byte structures were returned in EAX, 7 and 8 byte structures were returned in EAX/EDX, and 15 and 16 byte structures were returned via a callee-allocated pointer stored in EAX, with the callee responsible for cleaning that pointer up. Basically, the only difference between stdcall and cdecl was exactly what you would expect: callee cleaned instead of caller cleaned.

However, for packing size 1, the calling convention behaved the same as it did for cdecl with packing size 1. Structure sizes 4, 8, 15 and 16 all behaved as in the other stdcall packing modes. But structure sizes 3 and 7 used the same hidden parameter mechanism as 15 and 16 byte structures, instead of using EAX and EAX/EDX.

__fastcall

The fastcall calling convention is similar to stdcall in that the callee is responsible for stack maintenance. It differs in that the first two DWORD or smaller parameters are passed in the ECX and EDX registers. This isn’t a common calling convention on Windows for x86, although you can use the /Gr compiler option to cause all functions to be compiled with __fastcall by default. It is, however, awfully close to the calling convention used by default on x64.

The behavior of fastcall with returning structures is identical to the behavior seen with stdcall.

For packing sizes 2, 4, 8, 16 and natural, the fastcall calling convention behaves as documented. 3 and 4 byte structures were returned in EAX, 7 and 8 byte structures were returned in EAX/EDX, and 15 and 16 byte structures were returned via a callee-allocated pointer stored in EAX, with the callee responsible for cleaning that pointer up.

However, for packing size 1, the calling convention behaved the same as it did for cdecl and stdcall with packing size 1. Structure sizes 4, 8, 15 and 16 all behaved as in the other fastcall packing modes. But structure sizes 3 and 7 used the same hidden parameter mechanism as 15 and 16 byte structures, instead of using EAX and EAX/EDX.

__thiscall

The thiscall calling convention is almost like stdcall, and almost like fastcall, but not quite the same as either. All parameters are passed on the stack with the exception of the “this” pointer, which is passed via ECX. It is the default calling convention for C++ class member functions. It was also the odd-man-out in terms of behavior. Regardless of structure size or packing, structures were returned via a callee-allocated pointer stored in EAX, and the callee was responsible for cleaning that pointer up.
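To picture what this means for a member function, here is a hand-written approximation in which both the “this” pointer and the hidden result pointer appear as explicit parameters, even though the structure is only 4 bytes. Small, Counter, Counter_current_lowered and check_thiscall_lowering are hypothetical names; the real lowering happens inside the compiler and passes “this” in ECX.

```cpp
#include <cassert>

struct Small {
    int value;  // 4 bytes: small enough for EAX under the other conventions
};

class Counter {
public:
    explicit Counter(int start) : count_(start) {}

    // What the programmer writes: a member function returning a structure.
    Small current() const {
        Small s = {count_};
        return s;
    }

private:
    int count_;
};

// Hand-written approximation of the observed behavior: `self` stands in
// for the ECX-passed "this", `hidden_result` for the hidden return pointer.
Small* Counter_current_lowered(const Counter* self, Small* hidden_result) {
    *hidden_result = self->current();
    return hidden_result;
}

void check_thiscall_lowering() {
    Counter c(5);
    Small storage;
    Small* returned = Counter_current_lowered(&c, &storage);
    assert(returned == &storage);
    assert(storage.value == c.current().value);
}
```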

Raw Data

Here is the raw data that I collected for this information. If you run your own experiment and have findings different from mine, please contact me so we can research the issue further. For a link to the Excel spreadsheet with this data, click here.

I am getting different results on the cdecl calling convention. My results are that the calling convention solely depends on the size of the structure. I think your structure sizes are not what you think they are. In particular, if I define

However, if you define the packing alignment to be 2, or define no packing alignment at all, and define this struct:

struct b_s { char x; short y; };

then sizeof(struct b_s) is 4, not 3, because there’s a padding byte between x and y. Thus the structure gets returned in EAX, with no hidden return pointer parameter.
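The size difference described here is easy to check directly. The sketch below declares the same two members under 1-byte packing, 2-byte packing and default packing; the names b_pack1 and b_pack2 and the function check_packing are made up for illustration, and the default-packing result assumes a typical target where short is 2 bytes with 2-byte alignment.

```cpp
#include <cassert>
#include <cstddef>

#pragma pack(push, 1)
struct b_pack1 { char x; short y; };  // no padding: y sits at offset 1
#pragma pack(pop)

#pragma pack(push, 2)
struct b_pack2 { char x; short y; };  // one padding byte after x
#pragma pack(pop)

struct b_s { char x; short y; };      // default packing

static_assert(sizeof(b_pack1) == 3, "packed to 3 bytes");
static_assert(sizeof(b_pack2) == 4, "padded to 4 bytes");
static_assert(sizeof(b_s) == 4, "padded to 4 bytes");

// Confirms where the padding byte goes.
void check_packing() {
    assert(offsetof(b_pack1, y) == 1);
    assert(offsetof(b_pack2, y) == 2);
    assert(offsetof(b_s, y) == 2);
}
```

A test labeled “3-byte structure” is therefore only a 3-byte test under packing size 1, unless the members are chosen so that no padding can appear.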

So I think your results are wrong. So far I have seen nothing inconsistent with the theory that, when the return type is a structure, its size is all that determines the behavior. I’ve observed that structures of size 1, 2, 4, and 8 are returned in AL, AX, EAX, and EDX:EAX respectively, while all others are returned using a hidden parameter.
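Written out as code, the size-only theory looks like the sketch below. It encodes only the observations stated in this comment for cdecl and is not a confirmed description of the ABI; ReturnVia, classify_by_size and check_classification are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

enum class ReturnVia { AL, AX, EAX, EDX_EAX, HiddenPointer };

// The commenter's hypothesis: only sizeof(the structure) matters, and only
// sizes 1, 2, 4 and 8 are returned in registers.
ReturnVia classify_by_size(std::size_t size) {
    switch (size) {
        case 1: return ReturnVia::AL;
        case 2: return ReturnVia::AX;
        case 4: return ReturnVia::EAX;
        case 8: return ReturnVia::EDX_EAX;
        default: return ReturnVia::HiddenPointer;
    }
}

void check_classification() {
    // A genuinely 3-byte structure would use the hidden pointer, while the
    // same members padded to 4 bytes would come back in EAX.
    assert(classify_by_size(3) == ReturnVia::HiddenPointer);
    assert(classify_by_size(4) == ReturnVia::EAX);
}
```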

I did not check the behavior of the other calling conventions, so you might want to double-check those.

Who

Aaron Ballman is a software engineer for GrammaTech. He has almost two decades of experience writing cross-platform frameworks in C/C++, compiler & language design, and software engineering best practices and is currently a voting member of the C (WG14) and C++ (WG21) standards committees.

In case you can't figure it out easily enough, the views expressed here are my personal views and not the views of my employer, my past employers, my future employers, or some random person on the street. Please yell only at me if you disagree with what you read.