Avoid C Runtime, C Library, and System Calls in Portable Code

In a previous post, I mentioned that you should avoid using system or C runtime calls in portable code. I also said that an entire blog post would be needed to explain why. Well, here's that post.

In the C programming language there are C runtime calls. These are subroutines you can call to perform various common functions. They include memory allocation and movement (malloc, memcpy, memset), string operations (strlen, atoi), and many more.

There are also C library calls. These are essentially the same thing as C runtime calls, except that they aren't "officially" C runtime calls. For example, there are math functions such as sin, exp, and so on, or time functions.

System calls are things the kernel or operating system does, such as creating new processes, threading, or file I/O. Generally there are libraries containing routines that perform these functions so you don't have to call the kernel directly, which means they can look just like C library calls (fopen, for example).

Of course, when you get right down to it, many C runtime and C library calls (malloc and fopen, for example) end up making system calls behind a C library interface, so the line between the categories is blurry. Somehow the programming literature still makes a distinction between them.

These calls are vital to almost any program. Almost every C program written will use at least one C runtime call, and a very large percentage of programs will make a C library or system call. Many programs would be impossible to write without them. In some programs it might be possible to get around the calls, but the performance hit might be too great, or the time spent duplicating the functionality would be a waste of company money.

So why do I say avoid them if they are vital?

First of all, I say that only for portable code. Code that will run only on Windows or only on HP-UX, for example, can be written to that platform's specification. But if your code will need to compile on Windows and Solaris and OpenBSD and Palm OS and who knows what else, then you need to write portable code, and C runtime, C library, and system calls are the biggest enemy of portable code.

Secondly, I say that if you need to use them, then you should use them through a wrapper.

One of the first questions I get when I talk about this subject is, "But aren't all these calls standardized now?" The answer is yes, but there are still some variations and other issues. For an example, let's look at malloc.

The function malloc is standardized, so why can't I use that? First, there have been platforms in the past, and there might still be some, that have no malloc. In the old Palm OS, to allocate memory, you called MemPtrNew (I think that's still the case, or if there is now a malloc, the recommendation is to use MemPtrNew). Some embedded or small-device platforms might not have a heap at all; they have only stack space. If something as basic as malloc is not portable, can you see why any other function might not be?

Another issue might be that malloc exists, but you want to use something else. For example, on Windows, there are cases where GlobalAlloc is preferred over malloc. Or maybe in your app, you want to guarantee that every allocation automatically zeroes out the memory or before every free you want to overwrite the buffer (maybe it contained passwords or other sensitive data).

Finally, the malloc function has a particular signature. It has one argument, the size, and returns the allocated buffer or NULL if the allocation fails. What if you want to allocate memory and keep track of each allocation (this makes FIPS certification easier, by the way)? Each time you allocate memory you want to pass in some sort of structure or context that does the bookkeeping. You want your memory allocation function signature to be one that allows this extra context.

This means when your code needs to allocate memory, it should call a wrapper. For example, let's say the wrapper is

int Z3Malloc (unsigned int size, Pointer ctx, Pointer *buffer);

In your program, every call to "malloc" will be a call to Z3Malloc. If you need to change the underlying malloc, you do it once in your code.
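Here is a minimal sketch of what the inside of such a wrapper might look like. Only the Z3Malloc signature comes from above; everything else is my assumption for illustration: that Pointer is a typedef for void *, the Z3MemCtx bookkeeping structure, the Z3_OK and Z3_ERROR_* return codes, and the choice to zero every new buffer.

```c
#include <stdlib.h>
#include <string.h>

typedef void *Pointer;              /* assumption: Pointer is a generic pointer */

/* Hypothetical return codes, not from any real library. */
#define Z3_OK            0
#define Z3_ERROR_INPUT   1
#define Z3_ERROR_MEMORY  2

/* Hypothetical bookkeeping context passed in as ctx. */
typedef struct {
    unsigned long allocCount;       /* number of successful allocations */
    unsigned long bytesRequested;   /* total bytes handed out */
} Z3MemCtx;

int Z3Malloc(unsigned int size, Pointer ctx, Pointer *buffer)
{
    Z3MemCtx *memCtx = (Z3MemCtx *)ctx;
    Pointer newBuf;

    if (buffer == NULL)
        return Z3_ERROR_INPUT;
    *buffer = NULL;
    if (size == 0)
        return Z3_ERROR_INPUT;

    /* The only place in the code base that names the platform allocator.
     * On another platform this line could call something else instead. */
    newBuf = malloc(size);
    if (newBuf == NULL)
        return Z3_ERROR_MEMORY;

    /* Policy decision made once: every allocation starts out zeroed. */
    memset(newBuf, 0, size);

    /* The context is optional; with one, each allocation is recorded. */
    if (memCtx != NULL) {
        memCtx->allocCount++;
        memCtx->bytesRequested += size;
    }

    *buffer = newBuf;
    return Z3_OK;
}
```

A matching free wrapper would take the same context, which is also where you could overwrite the buffer before releasing it.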

That was malloc. How about threading? Your app might want to do some synchronization. On Windows you use Windows threading and on Linux you use Pthreads, so here the wrapper makes even more sense. How about file I/O and getting the current time? There are some standard functions, but there is still some variability in the real world.
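For the synchronization case, a wrapper might look something like the following sketch. The Z3Mutex type and the Z3Mutex* function names are hypothetical; underneath, it uses a CRITICAL_SECTION on Windows and a pthread_mutex_t everywhere else, which is an assumption about which platforms you need to cover.

```c
#include <stddef.h>

#ifdef _WIN32
#include <windows.h>
typedef CRITICAL_SECTION Z3Mutex;   /* Windows threading */
#else
#include <pthread.h>
typedef pthread_mutex_t Z3Mutex;    /* Pthreads */
#endif

/* Each function returns 0 on success and nonzero on failure. */

int Z3MutexInit(Z3Mutex *mutex)
{
    if (mutex == NULL)
        return 1;
#ifdef _WIN32
    InitializeCriticalSection(mutex);
    return 0;
#else
    return (pthread_mutex_init(mutex, NULL) == 0) ? 0 : 1;
#endif
}

int Z3MutexLock(Z3Mutex *mutex)
{
    if (mutex == NULL)
        return 1;
#ifdef _WIN32
    EnterCriticalSection(mutex);
    return 0;
#else
    return (pthread_mutex_lock(mutex) == 0) ? 0 : 1;
#endif
}

int Z3MutexUnlock(Z3Mutex *mutex)
{
    if (mutex == NULL)
        return 1;
#ifdef _WIN32
    LeaveCriticalSection(mutex);
    return 0;
#else
    return (pthread_mutex_unlock(mutex) == 0) ? 0 : 1;
#endif
}

void Z3MutexDestroy(Z3Mutex *mutex)
{
    if (mutex == NULL)
        return;
#ifdef _WIN32
    DeleteCriticalSection(mutex);
#else
    pthread_mutex_destroy(mutex);
#endif
}
```

The rest of the program only ever sees Z3Mutex, so the #ifdef lives in one file instead of at every lock.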

Or take snprintf. On Windows you have to use _snprintf, and even that function is deprecated; you're supposed to use _snprintf_s. If you can get away without using snprintf, great, you don't have that problem. If you must use it, put a wrapper around it so you can change it to _snprintf on Windows, and then if you want to change it to use _snprintf_s, you do it in one place. If you don't change it, then your customers might see this:

warning C4996: '_snprintf': This function or variable may be unsafe.

Do you want your customers to see that message when compiling or using your code?
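A wrapper along the following lines is one way to keep that decision in one place. This is only a sketch: the name Z3Snprintf, its return convention (-1 on bad input or truncation), and the compiler-version check are my assumptions for illustration. It uses _vsnprintf on Microsoft compilers older than Visual Studio 2015 and the standard vsnprintf everywhere else.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Formats into out, always null-terminating the result.
 * Returns the number of characters written (not counting the null),
 * or -1 on bad input or if the output did not fit. */
int Z3Snprintf(char *out, size_t outSize, const char *format, ...)
{
    va_list args;
    int count;

    if (out == NULL || outSize == 0 || format == NULL)
        return -1;

    va_start(args, format);
#if defined(_MSC_VER) && (_MSC_VER < 1900)
    /* Older Microsoft runtimes: _vsnprintf may leave the buffer
     * without a terminating null, so that is handled below. */
    count = _vsnprintf(out, outSize, format, args);
#else
    count = vsnprintf(out, outSize, format, args);
#endif
    va_end(args);

    if (count < 0 || (size_t)count >= outSize) {
        out[outSize - 1] = '\0';    /* truncated or failed: terminate anyway */
        return -1;
    }
    return count;
}
```

If you later decide to switch the Windows branch to _vsnprintf_s, none of the callers change.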

What about character sizes? In the US you might deal with only ASCII characters, where each character is a single byte. What if you have a customer in Japan, or Russia, or even Germany, where characters might be 2 bytes long? How does your snprintf work there?

It would be better if you did not use snprintf at all. If you figure out how to do the task without using snprintf (maybe you use memcpy to complete part of the task), you will save yourself some headaches and the code might be smaller and faster. The next few paragraphs discuss that.

Some routines you might call do work that other functions can do just as well. For example, look at the following line of code.

sprintf (header, "Title information");

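That call is functionally equivalent to a memcpy. Your app will probably have a wrapper for memcpy (it is such an important and often-used function). So rather than a sprintf, define the string and its length (including the terminating null) in one place and copy it with your memcpy wrapper.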
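The snippet that originally appeared here is not reproduced, so what follows is only a sketch of what such a replacement might look like. The names Z3Memcpy, Z3WriteHeader, Z3_HEADER_INFO, and Z3_HEADER_INFO_LEN are hypothetical, and Z3Memcpy simply forwards to memcpy.

```c
#include <stddef.h>
#include <string.h>

/* The header text and its length are defined once, in one place. */
#define Z3_HEADER_INFO      "Title information"
/* sizeof on a string literal includes the terminating null. */
#define Z3_HEADER_INFO_LEN  sizeof(Z3_HEADER_INFO)

/* Hypothetical memcpy wrapper; a real one might do more. */
void Z3Memcpy(void *dest, const void *source, size_t length)
{
    memcpy(dest, source, length);
}

/* Copies the header text into header.
 * Returns 0 on success, 1 if header is NULL or too small. */
int Z3WriteHeader(char *header, size_t headerSize)
{
    if (header == NULL || headerSize < Z3_HEADER_INFO_LEN)
        return 1;
    Z3Memcpy(header, Z3_HEADER_INFO, Z3_HEADER_INFO_LEN);
    return 0;
}
```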

You might look at it and think it made something simple into something overly complicated. But here are the advantages. (1) You don't have to worry about sprintf: no wrapper, no issues with compatibility, deprecation, or safety. (2) Memcpy is almost certainly faster and smaller than sprintf. (3) You have made it easier to deal with ASCII/EBCDIC issues if they come up. (4) It's easier to change or add header info.

Actually, you can get numbers 3 and 4 working for you using sprintf as well, but I thought I'd throw them in to illustrate a little bit of how to write robust code.

There are many other examples of using wrappers. Suppose you want to do any networking. What is the socket API? Or how do you download something using HTTP? The list could be very long.

If you can write your code in such a way that it avoids a system call you will make your life easier when it comes to porting. If you must use a system call, use it through a wrapper. But the fewer wrappers the better. After 15 years of porting code, I have learned that you just never know what is going to pop up at you.