Thursday, December 20, 2012

Easy Binary Compatible Interfaces Across Compilers in C++ - Part 0 of n: Introduction and a Sneak Preview

Using a C++ library compiled with Compiler A from a program compiled with Compiler B has been a problem for a while. This is especially true on Windows, where Visual C++ generally breaks binary compatibility from release to release. Shipping a library for Windows involves shipping several versions for Visual C++ and, now, often for MinGW gcc as well.
Some of the problems C++ has with binary compatibility across different compilers are: name mangling, object layout, and exception support.
There are several ways to get around this.

There are whole books written on COM, so I won’t try to go into too many details. A brief overview in regards to the binary interface is here.
The basic idea is that you define an interface like this

Interface Definition

struct Interface;

struct InterfaceVtable{
    int (*Function1)(struct Interface*);
    int (*Function2)(struct Interface*, int);
};

struct Interface{
    struct InterfaceVtable* pTable;
};

It can be used like this

Using an Interface

struct Interface* pInterface = GetInterfaceSomehow();

int a = pInterface->pTable->Function1(pInterface);

Implementing an interface like this is painful and will be left as an exercise to the reader.
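For readers who want a starting point, one possible shape of that exercise is sketched below in C++. The names Adder, AdderFunction1, AdderFunction2, adderTable, and MakeAdder are hypothetical and exist only for this illustration. The sketch assumes the Interface member is placed first in the implementing struct, so that a pointer to it can be converted back to the implementation.

```cpp
// Interface definition, repeated from the snippet above.
struct Interface;

struct InterfaceVtable {
    int (*Function1)(Interface*);
    int (*Function2)(Interface*, int);
};

struct Interface {
    InterfaceVtable* pTable;
};

// Hypothetical implementation. The Interface is the first member, so a
// pointer to it can be cast back to a pointer to the whole Adder.
struct Adder {
    Interface base;
    int offset;
};

// Each function recovers the implementation object from the interface pointer.
int AdderFunction1(Interface* self) {
    return reinterpret_cast<Adder*>(self)->offset;
}

int AdderFunction2(Interface* self, int i) {
    return reinterpret_cast<Adder*>(self)->offset + i;
}

// The hand-written vtable: one shared table for every Adder.
InterfaceVtable adderTable = { AdderFunction1, AdderFunction2 };

// Fills in caller-provided storage and hands back the interface pointer.
Interface* MakeAdder(Adder& storage, int offset) {
    storage.base.pTable = &adderTable;
    storage.offset = offset;
    return &storage.base;
}
```

With an Adder built by MakeAdder(storage, 5), the call pInterface->pTable->Function2(pInterface, 5) adds the argument to the stored offset. Every piece here (the table, the casts, the wiring) is bookkeeping that has to be written and kept in sync by hand.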
Fortunately (and by design), Microsoft Visual C++ and most Windows C++ compilers will generate something compatible with the above from an abstract base class using pure virtual functions.

Interface using C++ (MSVC)

struct InterfaceCpp{
    virtual int Function1() = 0;
    virtual int Function2(int) = 0;
};

You can implement and use like this

Code Snippet

struct InterfaceImplementation : public InterfaceCpp{
    virtual int Function1(){return 5;}
    virtual int Function2(int i){return 5 + i;}
};

InterfaceImplementation imp;
InterfaceCpp* pInterfaceCpp = &imp;
std::cout << pInterfaceCpp->Function2(5) << std::endl;

The reason this works is that the version with function pointers builds a vtable and a vptr by hand, while this version lets the compiler do it. For more information about vtables and vptrs, see the excellent article by Dan Saks in Dr. Dobb's.
While the above solution generally works on Windows, it is not guaranteed to always work. A more general cross-platform solution is presented in Matthew Wilson’s Imperfect C++ in chapters 7 and 8. He basically provides a technique and macros that allow you to define the above structure manually (i.e. define your own vtables).
By using either COM style interfaces with compilers that have a compatible vtable layout or rolling your own, you can have cross-compiler binary compatible interfaces. However, you do not have:

Exceptions

Due to not having exceptions, you often have to use error codes and thus do not have real return values.

Standard C++ types such as vector and string (use arrays and const char*)
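To make these two limitations concrete, here is a minimal sketch of what a function returning a greeting tends to look like at such a boundary. The function name SayHelloRaw and the ErrorCode values are hypothetical, chosen only for this illustration. The result travels through a caller-supplied buffer, and the return value is used up by the error code.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical error codes for this sketch.
enum ErrorCode { kOk = 0, kBufferTooSmall = 1, kInvalidArgument = 2 };

// C-style boundary function: the greeting is written into 'out', and the
// return value only reports success or failure.
extern "C" int SayHelloRaw(const char* name, char* out, std::size_t outSize) {
    if (name == nullptr || out == nullptr) return kInvalidArgument;

    static const char prefix[] = "Hello ";
    const std::size_t prefixLen = sizeof(prefix) - 1;  // without the '\0'
    const std::size_t nameLen = std::strlen(name);

    // Room is needed for the prefix, the name, and the terminating '\0'.
    if (prefixLen + nameLen + 1 > outSize) return kBufferTooSmall;

    std::memcpy(out, prefix, prefixLen);
    std::memcpy(out + prefixLen, name, nameLen + 1);  // copies the '\0' too
    return kOk;
}
```

The caller has to guess a buffer size, check the code after every call, and copy the bytes into a std::string afterwards if it wants one.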

In fact, in an article explaining why Microsoft created C++/CX, Jim Springfield stated that one of the problems with COM, even with libraries such as ATL, was
“There is no way to automatically map interfaces from low-level to a higher level (modern) form that throws exceptions and has real return values.”
During this series of posts, I will discuss the development of a C++11 library that has the following benefits

Able to use std::string and std::vector as function parameters and return values

Use exceptions for error handling

Compatible across compilers – able to use MSVC to create the .exe and g++ to create the .dll on Windows, and g++ for the executable and clang++ to create the .so on Linux

Works on Linux and Windows

Written in Standard C++11

No Macro magic

Header only library

As we progress, we will talk about some of the disadvantages, areas for improvement, and possible alternatives.
Here is how we would define an interface DemoInterface. Note jrb_interface is the namespace of the library.

Code Snippet

using namespace jrb_interface;

template<bool b>
struct DemoInterface
    : public define_interface<b,4>
{
    cross_function<DemoInterface,0,int(int)> plus_5;
    cross_function<DemoInterface,1,int(std::string)> count_characters;
    cross_function<DemoInterface,2,std::string(std::string)> say_hello;
    cross_function<DemoInterface,3,std::vector<std::string>(std::string)>
        split_into_words;

    template<class T>
    DemoInterface(T t):DemoInterface<b>::base_t(t),
        plus_5(t), count_characters(t),say_hello(t),split_into_words(t){}
};

In this library, all interfaces are actually templates that take a bool parameter. The reason for this will become clear as we discuss the implementation in later posts.
All interfaces inherit from define_interface, which takes a bool parameter (just use the bool passed in to the template) and an int parameter specifying how many functions are in the interface. If you pass in a number that is too small, you will get a static_assert telling you so.
To define a function in the interface, use the cross_function template.
The first parameter is the interface, in this case DemoInterface. The second parameter is the 0-based position of the function: the first function is 0, the second is 1, the third is 2, etc. The third and final parameter of cross_function is the signature of the function, in the same style as std::function.
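For readers unfamiliar with the std::function style of signature, the sketch below shows how a template can accept a type such as int(int) and split it into a return type and argument types through partial specialization. The slot template is a hypothetical stand-in written for this illustration. It is not the library's cross_function and says nothing about how cross_function is implemented.

```cpp
#include <functional>
#include <string>
#include <type_traits>

// Hypothetical stand-in for a function slot. NOT the library's
// cross_function, only an illustration of the signature syntax.
template <int Position, class Signature>
struct slot;  // primary template intentionally left undefined

// Partial specialization that splits R(Args...) into its parts.
template <int Position, class R, class... Args>
struct slot<Position, R(Args...)> {
    static const int position = Position;  // 0-based position in the interface
    typedef R return_type;

    std::function<R(Args...)> target;

    // Lets a lambda with a matching signature be assigned to the slot.
    template <class F>
    slot& operator=(F f) {
        target = f;
        return *this;
    }

    // Forwards the call to whatever was assigned.
    R operator()(Args... args) const { return target(args...); }
};
```

A slot<0, int(int)> can then be assigned a lambda taking an int and returning an int, and called like an ordinary function.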
Finally, all interfaces need a templated constructor that takes a value t and passes it on to the base class as well as to each function. For convenience, the define_interface template defines a typedef base_t that you can use in your constructor initializer.
To implement an interface, you would do this:

Code Snippet

struct DemoInterfaceImplemention :
    public implement_interface<DemoInterface>{

    DemoInterfaceImplemention(){

        plus_5 = [](int i){
            return i+5;
        };

        say_hello = [](std::string name)->std::string{
            return "Hello " + name;
        };

        count_characters = [](std::string s)->int{
            return s.length();
        };

        split_into_words =
            [](std::string s)->std::vector<std::string>{
                std::vector<std::string> ret;
                // Skip leading spaces first; otherwise the loop below never
                // advances when the string starts with a space.
                auto wbegin = std::find_if(s.begin(),s.end(),
                    [](char c){return c != ' ';});
                auto wend = wbegin;
                for(;wbegin!= s.end();wend = std::find(wend,s.end(),' ')){
                    if(wbegin==wend)continue;
                    ret.push_back(std::string(wbegin,wend));
                    wbegin = std::find_if(wend,s.end(),
                        [](char c){return c != ' ';});
                    wend = wbegin;
                }
                return ret;
            };
    }
};

To implement an interface, you derive from implement_interface specifying your Interface as the template parameter. Then in your constructor you assign a lambda with the same signature you specified in the definition of the interface to each of the cross_function variables.
To use an interface, you construct use_interface providing the Interface as the template parameter.

You then call the functions just as you would with any class object. Note the use of . instead of ->.
Thank you for taking the time to read this post. I hope this has piqued your interest. In future posts we will explore how we create this library, and how we can extend this library to do more. I hope you will join me.
You can find compilable code at https://github.com/jbandela/cross_compiler_call
The code has been tested on

Windows, compiling the executable with MSVC 2012 Milan (Nov CTP) and the DLL with mingw g++ 4.7.2

Ubuntu 12.10, compiling the executable with g++ 4.7.2 and the .so file with clang++ 3.1

Instructions on how to compile are included in the README.txt file.
Please let me know what you think in the comments section.
- John Bandela

Very interesting project, and I'm eagerly waiting for the next post(s) :) What are the limitations with this technique? I'm thinking of a larger project with multiple inter-dependent shared libraries... If I could migrate selected shared dll's to another compiler, such as Visual Studio 2012 or mingw (currently using Visual Studio 2010), it would be fantastic! However, I'm uncertain if or how that would work...?

Thanks for your kind words. Since this post, I have added a lot more features to the library. They are available at the github link above. In terms of limitations, the biggest is that the compiler has to have C++11 support, including support for variadic templates. Visual Studio 2012 has it with the November (codename Milan) CTP. Visual Studio 2010 does not. If you have Intel C++ 13, I believe you could use that with Visual Studio 2010, since it supports variadic templates. Also, there is some overhead involved, since we are converting types back and forth at the boundaries.

In terms of next post, I am going to give a talk at C++Now (formerly BoostCon) in May of this year about this. I have been busy improving the code and working on my presentation. If you are interested in where this code is, take a look at the demo code.

Thanks for the quick answer! I was thinking of limitations with regards to compatibility: what types is one required to convert at the boundaries? The std namespace? Boost or external libraries?

How can I (if possible) share custom data structures? What if I keep my (custom) class hierarchy in library A compiled with compiler A, and use library A in both library B (compiled with compiler B) and library C (compiled with compiler C)?

The library does not depend on any external library except standard C++ with 2 exceptions: the boost unit test framework for the unit test code, and Windows and linux system calls to load dynamic libraries and look up functions.

In terms of conversions, the library supports char, std::int8_t/uint8_t through std::int64_t/uint64_t, and float and double, as well as pointers and references to the above.

It also supports std::string, std::vector of anything supported, and std::pair of anything supported.

In addition, you can define your own conversions. The rule is that what gets passed can't have anything a C struct couldn't. Take a look at cross_conversion and cross_conversion_return templates in cross_compiler_conversions.hpp (look in cross_compiler_interface/implementation)

If you want I could try to see if I could help you make your custom data types be able to be used with this library. It would be good feedback to see someone else use this library.

Am I correct to assume that vector and string are pretty easy to support, because the standard requires a certain layout? I'm guessing it might not be that computationally expensive either, because only two pointers are passed.

Something like std::set or std::map is perhaps trickier?

Thanks for helping me out :)

I have a simplified example of my data model here : https://gist.github.com/meastp/5116333

The example is simple, but if it is possible to adopt that model without too much performance penalty and work, I would be very happy to not depend on old compilers and have a modern, backwards-compatible solution. :)

For your example take a look at https://github.com/jbandela/cross_compiler_examples

Look at example_1 - I used your data model (and added a few things).

example_1_exe.cpp is the exe that uses the interface.
example_1_dll.cpp is the dll that implements the interface.
example_1_interface.h defines the interfaces that the exe and dll use.

There is an MSVC solution as well if you want to play around.

Make sure you get the latest from cross_compiler_call.
Make sure you use the MSVC November CTP that has variadic template support.

In regards to your first question: with string, we pass 2 pointers for parameters; for returning, it gets trickier. With vector, we end up passing some function pointers and a pointer to void, which we use to reconstruct the vector on the other side.
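The string-parameter case can be sketched roughly as follows. This is only an illustration of the idea of passing two pointers. The names string_raw, to_raw, from_raw, and count_characters_raw are hypothetical and are not taken from the library, whose actual conversion code lives in cross_compiler_conversions.hpp.

```cpp
#include <string>

// Hypothetical C-compatible description of a string: just two pointers.
struct string_raw {
    const char* begin;
    const char* end;
};

// Caller side: describe an existing std::string without copying it.
inline string_raw to_raw(const std::string& s) {
    string_raw r = { s.data(), s.data() + s.size() };
    return r;
}

// Callee side: rebuild a std::string using the callee's own standard library.
inline std::string from_raw(string_raw r) {
    return std::string(r.begin, r.end);
}

// A boundary function only ever sees the C-compatible struct.
extern "C" int count_characters_raw(string_raw r) {
    return static_cast<int>(from_raw(r).size());
}
```

Neither side ever touches the other compiler's std::string object, only the character data it points to. The copy made in from_raw is part of the conversion overhead at the boundary.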

Is it possible to use a non-intrusive adaption as well, i.e. adding support without modifying the data model classes (if I have common header/source files with the data model that I can not modify, but need to be compatible with)?

Sorry so late to get back, take a look at my leveldb repository on github. I adapted leveldb to cross_compiler_interface. You build leveldb using visual c++. Then build the dll with visual c++. Then you can use visual c++ or g++ to build the exe. This is a rough first pass, so it probably still has some bugs

Hi, no worries! I've been following your github activity in the cross_compiler repositories. :)

I'm guessing there are *a lot* of developers held back because they have to be compatible with old compilers. So this is very useful.

Do you think it is possible to gain VC10 compatibility by mimicking variadic templates with macros (a lot of work, obviously, but so is staying on an outdated compiler, not being able to use the new features ;) )? Once your library is complete, if we can use it without too much performance penalty, I think I would like to attempt that to be able to move to a more modern compiler, while still being VC10 compatible through the cross_compiler interface :)

One question about the implementation: Does the interface support creating objects in both the exe and the dll, and passing them back and forth across boundaries? In your example with the leveldb:

1. create a (native) leveldb object in the cpp/exe.
2. manipulate that leveldb.
3. pass it to the library/dll through the interface
4. manipulate the same leveldb object, that was received through the interface

Obviously, I would have to compile the leveldb twice (once for the dll, and once for the exe), but is this possible?

The reason I ask is because that feature would make it possible to use a subset of a large codebase with the cross_compiler_interface.

Perhaps this wasn't a very good explanation? I can try to make an illustration and/or code if you want.. :)

An initial attempt at a vc10 backport. Unit tests pass, but because it is now all faux variadics, it needs more coverage for functions of different arity. It has the same interface as the variadic template version. It is at https://bitbucket.org/jbandela/cross_compiler_call_vc10

Take a look at it. Developing it was no fun at all - just tediously cranking out code (trying to debug macros in VC++ would have been a nightmare).