Introduction

This article describes a useful technique for using a closed source C++ DLL, loaded at run-time, to access an API for a popular consumer peripheral. It’s assumed the developer does not have access to import libraries or source code.

Background

This is a technical follow-up to my Continuum blog post describing how I automated a simple software task to enable an automatic pan-and-zoom feature of a consumer webcam that didn't provide an API. We needed to do this without disturbing the user experience, so standard program automation tools like AutoIt were off the table.

Along the way, I ran into several pitfalls in calling into C++ DLLs and found workarounds for them. The inspiration for this article was a blog post from Recx Ltd, Working with C++ DLL Exports without Source or Headers. An example project is provided, but our specific case relied on a proprietary DLL that could not be included due to licensing.

API Monitor is an invaluable tool for monitoring the activity of API calls. It has more comprehensive features for Windows API calls,
but it can also be used to view calls to external DLLs. You can spawn processes
from within the tool or attach to a running process. The state of the call
stack can be seen before and after API calls are made, return values are
visible, and it even provides breakpoint functionality.

With API Monitor, I spawned an instance of the webcam controller application, which allowed me to see which DLLs were being loaded. Monitoring from process start (as opposed to attaching to a running process) can be important when you want to see initialization behavior.

The module dependency view revealed a large list of DLLs.
Most of them were Windows system DLLs or related to the Qt framework. One in
particular stood out for what we needed: CameraControls_Core.dll. I set up the API Monitor to log all calls to this DLL and this was the relevant output:

Monitoring the API activity live, I checked and unchecked the facial recognition box and noticed that calls to SetFaceTracking() were being made. A quick look at the call stack revealed Boolean values being passed as parameters to the method. I then ran the Microsoft dumpbin utility on CameraControls_Core.dll to view its list of exported methods. The decorated (mangled) export names made it evident that I was dealing with a C++ DLL.
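As a rough illustration of what name decoration looks like, the sketch below splits a simple MSVC-decorated export into its method and class parts. The export string used in the usage note and the class name CameraControls are hypothetical, since the real export list is not reproduced here. The helper is deliberately naive: it does not decode constructors, operators, nested namespaces, or parameter types, which is a job for a real undecorator.

```cpp
#include <string>

// Result of splitting a decorated export name (hypothetical helper type).
struct DecoratedName {
    bool isCpp = false;   // true when the export starts with '?', i.e. MSVC C++ decoration
    std::string method;   // text between the leading '?' and the first '@'
    std::string scope;    // text between the first '@' and the "@@" terminator
};

// Naive splitter for simple names of the form ?Method@Class@@<encoded signature>.
// Plain C exports (no leading '?') are reported as not decorated.
DecoratedName splitDecoratedName(const std::string& name) {
    DecoratedName out;
    if (name.empty() || name[0] != '?') {
        return out;                       // looks like an undecorated C export
    }
    out.isCpp = true;
    const std::size_t firstAt = name.find('@', 1);
    if (firstAt == std::string::npos) {
        return out;                       // malformed or unsupported form
    }
    out.method = name.substr(1, firstAt - 1);
    const std::size_t scopeEnd = name.find("@@", firstAt);
    if (scopeEnd != std::string::npos && scopeEnd > firstAt) {
        out.scope = name.substr(firstAt + 1, scopeEnd - firstAt - 1);
    }
    return out;
}
```

For a hypothetical export such as ?SetFaceTracking@CameraControls@@QAEX_N@Z, this yields the method SetFaceTracking in the scope CameraControls. In that encoding, QAE marks a public __thiscall member, X a void return type, and _N a bool parameter.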

Around this point I came across the above-mentioned Recx Ltd article. I realized that our particular DLL would present a few challenges that made a straightforward application of their technique impossible. I moved a copy of the DLL into my application's working directory and attempted to write some code to load it dynamically.
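A minimal sketch of that loading step is shown here, assuming the DLL sits in the working directory. The helper names resolveExport and explainLoadError are mine, not part of any API. The Windows-only part is guarded so the error-explaining helper can be compiled anywhere. The numeric values are the documented Win32 codes ERROR_MOD_NOT_FOUND, ERROR_PROC_NOT_FOUND, and ERROR_BAD_EXE_FORMAT.

```cpp
#include <string>

#ifdef _WIN32
#include <windows.h>
#endif

// Map the Win32 error codes most often seen while loading a DLL to a hint.
std::string explainLoadError(unsigned long code) {
    switch (code) {
        case 126: return "ERROR_MOD_NOT_FOUND: the DLL or one of its dependencies is missing";
        case 127: return "ERROR_PROC_NOT_FOUND: the export name was not found in the DLL";
        case 193: return "ERROR_BAD_EXE_FORMAT: 32-bit/64-bit mismatch between process and DLL";
        default:  return "unexpected error code " + std::to_string(code);
    }
}

#ifdef _WIN32
// Load a DLL by name and resolve one export by its decorated name.
// Returns nullptr on failure and stores a hint in `hint`.
// The module handle is intentionally kept loaded for the life of the process.
FARPROC resolveExport(const char* dllName, const char* decoratedName, std::string& hint) {
    HMODULE module = LoadLibraryA(dllName);
    if (module == nullptr) {
        hint = explainLoadError(GetLastError());
        return nullptr;
    }
    FARPROC proc = GetProcAddress(module, decoratedName);
    if (proc == nullptr) {
        hint = explainLoadError(GetLastError());
    }
    return proc;
}
#endif
```

A missing dependency surfaces as error 126 from LoadLibraryA even when the target DLL itself is present, which is why dependencies tend to be discovered one at a time.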

Stepping through the code revealed the first dependency library of many: Qt4Gui4. As the errors presented themselves, I copied the dependencies into the working directory. It turns out ten additional DLLs
were required. I ran through the code one last time, and got an error about MSVCR90.DLL being missing.

Placing this DLL in the directory resulted in a different error, this time about the C runtime being loaded improperly.

MSVCR90.DLL is the C runtime library for Visual C++ 2008. I tried rebuilding my project against this runtime, so that it would use the same runtime as the DLL, but that didn't fix the problem. As it turns out, Microsoft introduced side-by-side assemblies, a form of system-wide DLL management that allows conflicting versions of a DLL to exist simultaneously in memory (Wikipedia DLL Hell). The idea dates back to the Windows 98 era, and the manifest-based form used by the Visual C++ 2008 runtime arrived with Windows XP. DLLs complying with this standard have a manifest embedded in them that tells the loader which dependencies need to be loaded.

To overcome the C run-time dependency, I needed to create a manifest file pointing to the run-time library and embed it into my
CameraControls_Core.dll. This MSDN article outlines how to use Microsoft’s
MT.EXE utility to embed a manifest file into an already built executable or
DLL.
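To give an idea of what such a manifest contains, here is a small sketch that generates one as text. The function name buildCrtManifest is hypothetical. The version string you pass in is an assumption that must match the runtime actually installed, and the publicKeyToken should be checked against a manifest produced by Visual C++ 2008 itself.

```cpp
#include <string>

// Build the text of a manifest declaring a dependency on the Visual C++ 2008
// C runtime assembly. `crtVersion` and `arch` are supplied by the caller,
// for example "9.0.21022.8" and "x86" (assumed values, verify for your system).
std::string buildCrtManifest(const std::string& crtVersion, const std::string& arch) {
    return
        "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>\n"
        "<assembly xmlns=\"urn:schemas-microsoft-com:asm.v1\" manifestVersion=\"1.0\">\n"
        "  <dependency>\n"
        "    <dependentAssembly>\n"
        "      <assemblyIdentity type=\"win32\" name=\"Microsoft.VC90.CRT\"\n"
        "        version=\"" + crtVersion + "\" processorArchitecture=\"" + arch + "\"\n"
        "        publicKeyToken=\"1fc8b3b9a1e18e3b\"/>\n"
        "    </dependentAssembly>\n"
        "  </dependency>\n"
        "</assembly>\n";
}

// Once the text is saved as CameraControls_Core.dll.manifest, it can be embedded
// as resource 2 (the resource ID used for DLLs) with:
//   mt.exe -manifest CameraControls_Core.dll.manifest -outputresource:CameraControls_Core.dll;2
```

Embedding modifies the DLL file in place, so it is worth working on a copy.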

After this step, the library loaded into memory without exceptions! Now I just needed to verify that method calls would work. Early attempts at calling essentially any of the methods threw exceptions. I realized this was probably because some critical init function was not being called. Readers should work through Working with C++ DLL Exports without Source or Headers before continuing, since the rest of this article builds on it.

A function's calling convention determines how arguments are passed from the caller to the callee, how the result is returned, and who cleans up the stack. The calling convention our DLL used differed from the one presented in the above article. Their example used __cdecl. I also learned that 64-bit Windows has a single standard calling convention for all functions, so keywords such as __cdecl are accepted but ignored there (this was done to eliminate the problem of so many complicated calling conventions). The distinction still matters on 32-bit x86, which is what our DLL targeted.

In __cdecl, all the parameters are passed on the call stack, pushed from right to left. For an instance method, the this pointer is pushed last, so it ends up on top of the stack as a hidden first argument. Our DLL used the __thiscall convention instead. __thiscall, it turns out, tells the callee that the this pointer arrives in the ECX register (not on the call stack), while the remaining arguments still go on the stack. Applied to a function pointer type, the __thiscall keyword tells the compiler to place the first parameter in ECX, so that first parameter should be a pointer to the object instance.
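The function pointer declaration this implies might look like the sketch below. The void return type of SetFaceTracking is an assumption, and the names DLL_THISCALL, RawProc, and callSetFaceTracking are mine. The macro expands to __thiscall only on 32-bit MSVC, so the same source also compiles on toolchains that lack the keyword.

```cpp
#if defined(_MSC_VER) && defined(_M_IX86)
#define DLL_THISCALL __thiscall   // 32-bit MSVC: the first parameter travels in ECX
#else
#define DLL_THISCALL              // elsewhere the keyword is unavailable or ignored
#endif

// Generic function pointer, standing in for the FARPROC returned by GetProcAddress.
using RawProc = void (*)();

// Assumed shape of the exported instance method: the hidden this pointer is made
// explicit as the first parameter, followed by the Boolean seen in API Monitor.
using SetFaceTrackingFn = void (DLL_THISCALL*)(void* self, bool enable);

// Cast the raw export address to the typed pointer and make the call.
inline void callSetFaceTracking(RawProc rawAddress, void* self, bool enable) {
    auto fn = reinterpret_cast<SetFaceTrackingFn>(rawAddress);
    fn(self, enable);
}
```

If the declared signature does not match the real one, the call will most likely corrupt the stack or crash, so it is worth confirming the parameter list against the decorated name first.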

Our DLL's Init() function requires a QString as a parameter, so I expected to have to determine the version of Qt used by the target DLL, build Qt, and statically link it to my DLL. Fortunately, passing an empty QString to Init() completed initialization without errors and allowed the other methods to be called. The dwFakeObject array is required because C++ instance methods expect to be passed a pointer to the object they operate on. We reserve an area in memory and treat it as a dummy object whose address is passed to the methods within the DLL.
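One way to reserve such a block is sketched below. The wrapper type FakeObject and the default size of 1024 DWORDs are assumptions. The real class size is unknown without headers, so the buffer is simply made generously large and zero-filled.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Zero-filled, aligned storage that stands in for the object instance.
// No constructor of the real class ever runs on this memory, so methods that
// rely on state set up by a constructor (a vtable pointer, for example) may still fail.
template <std::size_t Dwords = 1024>
struct FakeObject {
    alignas(16) std::array<std::uint32_t, Dwords> dwFakeObject{};  // all zeros

    // Address to pass as the this pointer of the exported methods.
    void* self() { return dwFakeObject.data(); }

    static constexpr std::size_t sizeInBytes() {
        return Dwords * sizeof(std::uint32_t);
    }
};
```

The buffer must outlive every call made through it, so keeping it as a static or as a member of a long-lived wrapper is safer than declaring it on the stack of a short function.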

Points of Interest

Hopefully this will prove insightful to someone going
through a similar situation with attempting to use third party DLLs. Without
access to source for the definitions of the underlying interface and data
types, the problem can be more or less complex than this example. However, it
serves to illustrate that there isn’t always a general purpose solution for accessing DLL functions and some cases will require an ad-hoc approach.

About the Author

I am an embedded software engineer with almost 10 years of experience in the industry. I currently work at Continuum Advanced Systems, a global design innovation consultancy specializing in consumer and medical products. My specializations and interests cover embedded systems, mobile application development, and various web software technologies.

Comments and Discussions

Hi @rbermani, I am very interested in your topic. In many cases, clients have provided me only with pre-built libraries, and I have had to develop high-level programs that use functions from those libraries.
I suspected there must be some mechanism for exposing APIs from a binary, but I had no success until I saw your post.
I am looking forward to more posts from you. In the meantime, could you please point me to some resources for studying this topic further? Many thanks.

For anyone who is interested, the size of the class cannot be determined from either the .lib file or the DLL itself. It has to be calculated from the header files for accuracy, but for a simple scenario you may be able to just pick a large number to avoid calculating the correct size of the class.

You could also use an ordinal number instead of the mangled name when you call GetProcAddress. Just defining a couple of constant integer values makes it cleaner.

The article is helpful. But a few thoughts... This way you could actually get access to private methods of classes too. While this can be seen as an advantage, on the other hand it breaks the whole purpose of OOP. Moreover, this method ties the application to the particular version of the dll and the compiler it is compiled with. The moment the dll changes or gets compiled by another compiler whose name mangling conventions are different, the application will break. Please share your thoughts on this.

Thanks for the feedback- you brought up some great points! I agree with you entirely that encapsulation is one of the cornerstones of OOP.

As for the DLL functions being accessible (either private or public), I believe this can be the case with exported functions from any DLL, regardless of whether you have the source available, although it probably depends on the compiler. The compiler relies on the contents of the header file to enforce access specifiers, as they are front-end features, not security mechanisms.

As a developer, you could possibly write an intermediate layer in between the end-user and the DLL itself, to expose only the intended functionality.

The method described was only intended as a hack. If the name mangling scheme of the compiler changes between revisions and the DLL is rebuilt, it would no longer work without modification. Since the DLL is loaded at run-time, one way to mitigate the name mangling issue between DLL revisions would be to create an application-specific definition file that maps the mangled names to their respective functions. The map could be released with the DLL, and the application would be written to parse and process it before loading begins. The other option is to ship the specific DLL with the release.

In our particular use case, the application we used this technique for was merely for a proof of concept. Unfortunately it is difficult to turn this into a robust, general purpose solution.

Although you are able to control the camera that way, the author of the DLL probably doesn't want you to do that. Most commercial licenses explicitly disallow reverse-engineering.

If it is for personal use, they might not bother, but if it is used in commercial applications that you develop, you should make an agreement with the original vendor. In that case, why not have them provide the required header and library?

Hi Philippe,
Thank you for the quick response to my article. I understand your concern on this matter. However, my objective with the article was to present a general purpose method of making calls on any C++ DLL, along with describing some pitfalls and obstacles that one might encounter along the way. I tried to be careful not to reveal which vendor this was used for internally, in the article as well as in the source code.

This article and source code do not disclose any information about copyrighted or confidential material, nor do they provide any proprietary libraries. I just thought someone might appreciate knowing the original context.

We did make contact with the hardware vendor and we were told that no such SDK or API was available, either commercially or for our own experimental use for the proof of concept we needed to demonstrate to our client.