The first thread changes the length of the string rapidly between
255 and 511, between a string that passes validation and a string that
doesn't, and more specifically between a string that passes validation
and a string that, if it snuck through validation, would pwn the machine.

The second thread keeps handing this string to
Some­Function.
Eventually, the following race condition will be hit:

Thread 1 changes the length to 255.

Thread 2 checks the length and when it reaches attack[256],
it reads zero and concludes that the string length is
less than 256.

Thread 1 changes the length to 511.

Thread 2 copies the string and when it reaches attack[256],
it reads nonzero and keeps copying, thereby overflowing its buffer.

Oops, you just fell victim to the
Time-of-check-to-time-of-use attack
(commonly abbreviated TOCTTOU).

Now, the code above as-written is not technically a vulnerability
because
you haven't crossed a security boundary.
The attack code and the vulnerable code are running under the same
security context.
To make this a true vulnerability, you need to have the attack code
running in a lower security context from the vulnerable code.
For example, if the threads were running user-mode code and
Some­Function is a kernel-mode function,
then you have a real vulnerability.
Of course, if Some­Function were at the boundary
between user-mode and kernel-mode, then it has other things
it needs to do, like verify that the memory is in fact readable
by the process.

A more interesting case where you cross a security boundary
is if the two threads are running code driven from an untrusted
source; for example, they might be threads in a script interpreter,
and the toggling of attack[256] is being done by
a function on a Web page.

When the user clicks on the button, the script interpret
creates a background thread and starts toggling the
length of the string under the instructions of the script.
Meanwhile, the main thread calls foo
repeatedly.
And suppose the interpreter's implementation of foo
goes like this:

The script interpreter has kindly converted the script
code into the equivalent native code, and now you have a problem.
Assuming the user doesn't get impatient and click "Stop script",
the script will eventually hit the race condition and cause
a buffer overflow in Some­Function.

My knowledge of threads in C is a bit fuzzy. Can you mount the same attack against strlcpy by a second thread toggling the length value between allocation and strlcpy call, or is the length value guaranteed to stay in a register (assuming non-volatile) in the strlcpy()ing thread?

@Adam Rosenfield No, the GIL wouldn't help at all. The GIL would only ensure that the call of strlen(s) would return the correct result at the time you call strlen (assuming strlen to be a native function). After the call to strlen, the GIL could still be acquired by the malicious thread before strcpy. What would help is a serialization of the program (i.e. disable multi-threading). That's obviously not going to happen.

@Philipp: If interpret_foo() holds the GIL, then SomeFunction() runs without interruption since the attacker's thread cannot run without the GIL. If SomeFunction() were itself implemented in script instead of natively, then yes, it could lose the GIL in between strlen() and strcpy(), but why the heck would you be calling strcpy() from a scripting language?

But… If someone can execute code in your process already then how do they need a stack overflow to do anything useful. Surely they could just run the code directly. I understand that if the function is an operating system call or something that runs in kernel mode that it matters, but I can't see how this helps secure an ordinary process much. I'm not arguing that it doesn't I just still don't understand the additional risk here.

Most embedded scripting languages expose a very limited set of what the machine can do. Think Lua scripting in WoW's client. If the script itself can trigger an overflow and inject its own machine code on the stack, that script can now do a lot more than Blizzard, or the player, ever intended.

Congratulations. You just justified a nonsense practice with a chunk of code that violates the ground rules of programming.

Since there's obviously no sense in the matter (else you would have come up with a better example), I'm turning that warning off and leaving it off.

[News flash: The bad guys don't care about the ground rules of programming. And even if the other code is not malicious, it may have violated the ground rules by mistake. You should take reasonable steps not to amplify a minor error into a security vulnerability. -Raymond]

Even such apps draw the line somewhere. You'd be hard-pressed to create a keylogger in VBA, for example. Even a simple one using a hook. Or make an arbitrary network connection or start listening on a port for command&control signals.

The issue isn't with the strlen. The version of the function written with the secure functions would look like

int SomeFunction(const char *s)

{

char buffer[256];

if (strcpy_s(buffer, _countof(buffer), s) != 0) { return ERR; }

…

}

(side note for nitpickers, deliberately calling the C version of the function instead of the C++ to illustrate the point).

The secure version ensures that buffer has a valid, null-terminated string that never writes out of buffers bounds. Raymond's sample version attempts to guarantee that but fails.

Now, given the docs for strcpy_s, I am honestly not sure if it really will behave correctly where the source string is being modified during the copy operation. If the source string is too large it will return ERANGE, but the destination buffer is not modified. This implies an initial size check, exactly like Raymond's original (vulnerable) example. The MSDN docs say nothing about what happens if the source string gets to large during the copy itself. I would hope that a sanity check in the copy would abort the copy if it is about to overflow the destination buffer, but I don't see that guarantee made in a multithread environment.

@SimonRev "This implies an initial size check, exactly like Raymond's original (vulnerable) example." Yes, this is what I was thinking — how does letting the "safe" function pre-check the size for you accomplish anything in a multithreaded environment where the string could be changing mid-copy?

It may turn out that strcpy_s can handle this situation too, but as far as I can see, neither the docs nor Raymond have explicitly said this, which makes this multi-threaded TOCTTOU argument weak IMO (whatever other good reasons there are to use "safe" functions.)

It seems to me we're basically arguing: "Hey instead of using a function that already does the checks I can just reimplement the functionality wherever I need it". Which I guess is true enough in most cases, but that applies to almost everything.

char buffer[256];

if (strlen(s) ≥ 256) return ERR;

strcpy(buffer, s);

vs.

char buffer[256];

if (strcpy_s(buffer, _countof(buffer), s) != 0) return ERR;

Seems more error prone, because it's easy to forget, longer and maybe even less efficient.

If the string passed into SomeFunction is being used, or even merely has the possibility to be used in multiple threads, then a wait lock on it should be taken immediately before reading, and released as soon as the reading is finished, and other threads should likewise respect that policy as well. Threads that take such locks have a dutiful responsibility to release them immediately. If you are wanting to code particicularly defensively, you could try a timed wait lock instead, and if after a certain amount of time the lock is still not available, then the function should simply fail, and return a status that indicates it did not succeed.

I disagree with the premise of the article. The reason you should be using the safe functions is because people who try and do the checks themselves tend to get it wrong:

I've seen lots of programmers smugly try "if(strlen(X) > COUNT(Y)) return ERR; else strcpy(X, Y)", only to end up with a one-byte heap/stack overwrite.

The idea that failing to use it is a TOCTOU bug isn't valid unless the function EXPECTS the parameter to be modified by another thread, for example when kernel-mode code is getting a pointer from user-mode. In this case, the bug isn't failing to call the safe function – it's failing to capture the buffer first.

Functions are entitled to expect their parameters to be thread-local. If you're building a multithreaded scripting language then you need to either capture the parameter first or somehow lock the object against writes before doing the checks.

The alternative is chaos, since if you have to expect other threads are attacking you then you suddenly need to mark all variables as volatile, to disable compiler pre-fetches, and do interlocked writes to make sure other threads don't get confused by your modifications. You COULD write your apps like that, but then you'd go insane, and you'd write pretty slow apps.

As a programmer you should use strcpy_s (actually StringCchCopy) because that way you can afford to suck at math every now and then. If you choose strcpy then you have to be sure that you don't suck at math, will never suck at math, and have never sucked at math before. Otherwise you're going to be dragged into work at 4am on a Sunday to patch your way of a hole when all your customers' credit cards start disappearing.

Your program that tries to prove wrongness is wrong: comparing any (signed) char with 0xcc is always false. The char you filled with 0xcc is implicitely casted to int, giving 0xffffffcc on the left and 0xcc on the right to be compared to equality.

Well, I've looked into the implementation of strcpy_s and it can't overflow from the timed attack. What it does is to keep the length you passed in and for every character you copy, it subtracts one from that. Now if this gets to 0 and you haven't read a null character, then it will blank the entire string and return an error.

So strcpy_s isn't a moral equivalent of strlen followed by strcpy, it is more involved than that.

Aside from security/correctness, there's another reason that someone might want to use the 'secure' version of strcpy() instead of checking the length then calling strcpy(): the safe version only makes a single pass over the original input (and may not even need to touch all of the input buffer).

So even if you're a programmer who never makes a mistake and injects bugs, you still might like to have it or something similar as an option.

To those who think they're safe because their script language only supports a single thread: you're wrong. The guy who adds threads to your scripting language isn't going to audit all of your code to make sure that introducing threads doesn't create any race conditions. And if you rely on a third-party script engine, you might not even know that it ever became multithreaded!

Are you going to just do the right thing, or pretend you know better and allow every PDF to become a potential attack vector because somebody added threading to the JavaScript engine you embed in your PDF viewer?

But really, how is it in any way better to litter your code with string validation followed by unsafe calls when it is easier to just make safe calls in the first place?

Also, I'm surprised by the use of ≥ rather than >= for the >= operator.

@Crescens2k If this (and others' analyses) is true, I still agree with Joshua about relying on undocumented behavior. If everyone has to look at the source of these functions (since docs don't seem to answer the question) to see if there's a TOCTTOU, we can't be sure these functions will always be thread-safe. But from Raymond saying, "I'll see what I can do to make this clearer in the docs", I gather that the existing precautions we've discovered will become documented.

@Joshua Great, so you only have to do 3 things including calling a function to get the same behavior instead of calling the function to begin with. I assume you also don't use printf in C, because hey it's only a simple write to a known file descriptor and a bit of parsing involved?

@Joshua A premature optimization – why didn't you mention that at first? But you're still going to call strcpy anyhow, so you only safe the additional guarantees made by strcpy_s which are pretty negligible..

@Cesar: Good point about strcpy_s not guaranteed to be thread-safe. Hopefully C11 adoption will be more swift than C99 adoption, since it made optional certain hard-to-implement features (such as VLAs) which were required in C99.

> If the source string is too large it will return ERANGE, but the destination buffer is not modified.

I can't find where the docs say this. In fact, they seem to say the opposite. From MSDN docs:

– strcpy_s(): "The debug versions of these functions first fill the buffer with 0xFD."

– StringCchCopy(): "STRSAFE_E_INSUFFICIENT_BUFFER – The copy operation failed due to insufficient buffer space. The destination buffer contains a truncated, null-terminated version of the intended result."

Somehow, a scripting engine (hard), multithreaded (hard) and with a security requirement (very hard) is a bit of a corner case to explain the existence of a function in a *standard* library – surely someone with those three skills can write their own safe strcpy.

The real reason is that progammers often forget to apply good practices and if they already got into the habit of using secure functions the damage if they forget some sanity check would be limited. Because if programmers were following best practices 100% right in the first place, we wouldn't ahve the problem to start with: surely you have unit test for all boundary conditions, stress tests, security tests etc in place for every single line of code don't you ? :)

===

@Avi : I would change most with "many". Many scripting scenarios are quite unrestricted and treat script basically as plugins. I don't know which way are most scripting engines, but I would not take security for granted. WoW of course is probably a secure case. Most LoB applications with scripting aren't secured on the scripting side instead and for a good reason: if a script can do operations on the LoB app, usually formatting the hard drive is the least damage it can do compared to intentionally corrupting business data, etc.

The original commenter was claiming that if you do your own check you don't need to use the secure version of the function. Well, that *might* be true sometimes, or it might not. It depends on how your code is going to be used. Raymond's example shows that there are cases where you really might want to worry about it.

So, at a minimum, you need to do some additional analysis to figure out if your check is sufficient. In some cases you won't know the answer, because you don't completely control the context in which your code is used (say because you are a library or other component which gets used in many different contexts). Plus, you might get the analysis wrong simply because you haven't considered certain lines of attack (would *you* have thought of Raymond's TOCTTOU example on your own?). Are you *sure* there are no other issues to worry about?

Compare the cost of doing that analysis (if you can), and the risk that you might have accidentally got it wrong, with the cost of just having a cast-iron rule that says "always use the secure string functions". The cast-iron rule has very little downside, and prevents you from making a mistake.

As far as Microsoft's implementation of strdup()/_strdup() is concerned, it uses strcpy_s() to perform the copy into the allocated buffer (at least since VS2005 SP1 Update for Windows Vista, anyway) using the size value that was passed in to malloc(). However, that says nothing about older versions or non-Microsoft implementations.

[You should take reasonable steps not to amplify a minor error into a security vulnerability. -Raymond]

Don't you go over not depending on undocumented behavior frequently?

There is no indication in the documentation that this is not a valid implementation of strcpy_s:

errno_t strcpy_s(char *s, size_t nemb, char *t)

{

if (!s || !t) return EINVAL;

size_t tl = strlen(t) + 1;

if (nemb < tl + 1) return ERANGE;

strcpy(s, t);

memset(s + tl, 0, nemb – tl);

return 0;

}

I would have actually implemented like so (which just has a different problem with that race):

errno_t strcpy_s(char *s, size_t nemb, char *t)

{

if (!s || !t) return EINVAL;

size_t tl = strlen(t) + 1;

if (nemb < tl + 1) return ERANGE;

memcpy(s, t, tl);

memset(s + tl, 0, nemb – tl);

return 0;

}

[I don't know what strcpy_s does, but StringCchCopy will not overflow the specified output buffer no matter how crazy the source string is. That's sort of its whole point. I'll see what I can do to make this clearer in the docs. -Raymond]

‘is not technically a vulnerability because you haven't crossed a security boundary.’ – I strongly disagree. The attack string will probably come from outside rather than be a string constant. Maybe it's pasted by the user who has no idea that there's a chance that the strange accented letters mean ‘root my box’ or maybe it's from the *gasp* internet.

In practice though, I feel the example to be contrived, and buggy even if the safe string function is used. A better answer to our C64 fan would have been that the safe function forces you to do the check, whereas even the best programmers sometimes forget the check if they use the standard functions. Furthermore they sometimes get the check wrong in subtle ways (this article is an example). As someone who has seen various (non-Windows) kernels and rtls I feel this point cannot be stressed enough. I thus feel a certain amount of irritation that the safe functions still aren't available on most Posix systems. The developers of said systems are usually well aware of the safe string functions but tend to refuse to implement them (or deprecate the old ones) because of exactly that kind of arrogance.

That said, I tend to use a C++ string class library in which the string class really is whatever is most convenient (on Windows it's a BSTR).

@SimonRev: There is an assembler opcode which when correctly used in conjunction with other opcodes and initial conditions prevents the fault condition you describe. I haven't checked but it seems likely that the rtl would use that (or something that compiles to it) possibly with some optimisations thrown in (maybe copy full words first and then the left-over bytes).