My last post mentioned the ‘standard’ risks of undefined behavior such as having your hard drive formatted or having nethack launched. I even added my own alliterative risk – singing sea shanties in Spanish.
The list of consequences bothered some people who said that any compiler that would intentionally punish its users in such manners should never be used.
That’s true, but it misses the point. Undefined behavior can genuinely cause these risks and I don’t know of any C/C++ compiler that can save you. If you follow Apple’s buggy security guidance then it can lead to your customers’ hard drives being formatted.
As of May 19th, one month after my report, I see that Apple’s security guidance has not been fixed.
With the exception of launching nethack the problem is not that the compiler will mischievously wreak havoc. The problem is that undefined behavior means that you and the compiler have lost control of your program and, due to bad luck or malicious hackers, arbitrary code execution may ensue.
See this article for a great explanation of undefined behavior in C/C++.
Arbitrary code execution is arbitrarily bad
If you read security notices then the phrase ‘arbitrary code execution’ comes up a lot. This means that there is a way to exploit a bug such that the attacker can take control of the target machine. It may be because of a design flaw in the software, but in many cases it is undefined behavior that opens the door to chaos.
Here are some ways that undefined behavior can format your hard drive:
The ‘classic’ use of undefined behavior to execute arbitrary code is a buffer overrun. If you can cause a program to overrun a stack-based buffer with data that you supply then, in some cases, you can overwrite the return address so that the function returns to your buffer and executes your payload. The only reason these payloads don’t normally format hard drives is that there’s no money in that – encrypting hard drives for ransom or mining bitcoins are more profitable tasks.
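To make the pattern concrete, here is a minimal sketch of a copy that rejects input that does not fit, which is the check whose absence makes the classic overrun possible. The name `copy_name` and its signature are assumptions made for illustration, not taken from any real code base.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies src into dst only if it fits, including the
   terminating zero. An unchecked strcpy(dst, src) into a stack buffer is
   the classic overrun described above. */
bool copy_name(char *dst, size_t dst_size, const char *src)
{
    size_t len = strlen(src);
    if (len >= dst_size)
        return false;          /* would overrun: reject instead of writing */
    memcpy(dst, src, len + 1); /* len + 1 also copies the terminating zero */
    return true;
}
```

A caller with `char buf[8]` can store a seven-character string this way, while anything longer is refused and `buf` is left untouched.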
Microsoft has several compiler/linker/OS features that help to make buffer overruns more difficult to exploit. These include:
- /GS (add code to check for stack buffer overruns)
- /NXCOMPAT (make the stack and heap non-executable)
- /DYNAMICBASE (move stacks, modules, and heaps to randomized addresses, aka ASLR)
- /SAFESEH (don’t use unregistered exception handlers)
Using these switches (gcc has equivalents) is crucial to reducing your risk, but even with all of these defenses a determined attacker may be able to use a buffer overrun to execute arbitrary code. Return Oriented Programming defeats /NXCOMPAT and there are various ways to defeat /DYNAMICBASE. Even with significant assistance from the compiler and OS a buffer overrun in C/C++ is fundamentally undefined.
Note that only /GS has any runtime cost, so use these switches!
Buffer overruns in the heap or data segment can also be exploitable. Although it seems improbable, it has been shown that in some cases a buffer overrun of a single byte with a zero can be exploitable.
Use after free
Integer overflow and underflow
Integer overflow can cause some compilers – notably gcc and clang – to remove checks that are shown to be unreachable. This can then expose code to otherwise impossible bugs. Similar optimizations can happen because of misplaced NULL checks.
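A misplaced NULL check might look like the first function in this sketch, where the pointer is dereferenced before it is tested. The `struct packet` type and both function names are hypothetical, chosen only to illustrate the ordering problem.

```c
#include <stddef.h>

struct packet { int length; };

/* Misplaced check: p is dereferenced before it is tested, so a compiler
   may assume p is non-NULL and delete the test that follows. */
int length_checked_too_late(const struct packet *p)
{
    int length = p->length;   /* undefined behavior if p is NULL */
    if (p == NULL)
        return -1;            /* may be optimized away */
    return length;
}

/* The check comes first, so it cannot be treated as redundant. */
int length_checked_first(const struct packet *p)
{
    if (p == NULL)
        return -1;
    return p->length;
}
```

Both functions agree for valid pointers; only the second has a defined result when it is handed NULL.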
I think of these optimizations like this: If I write code that assigns a constant to a variable and then checks to see if that variable has that value then I expect the compiler to optimize away the check. As an example, here is some tautological code:
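#include <stdio.h>

void OptimizeAway(int parameter)
{
    bool flag = true;       // always true, so the first test is redundant
    int* p = &parameter;    // address of a parameter, so never null
    if (flag && p)
        printf("It's %d\n", *p);
}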
A good compiler should optimize away both of the checks because they are provably unnecessary, and this is the same thing that happens when compilers use undefined behavior to prune their state tree. The compiler ‘knows’ that signed integer overflow cannot happen, and generating code to handle that possibility is equivalent to generating code to handle false being equal to true.
In this simple example the compiler could warn about the tautology, but in more complex cases it may not be practical. Or it may not be desirable, if the tautology only happens in some variations of a template function.
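For signed integer overflow, the difference between a check that can be pruned and one that cannot might be sketched as follows. Both function names are hypothetical, and the first function is only meant as an overflow test for non-negative `b`.

```c
#include <limits.h>
#include <stdbool.h>

/* Tests after the fact (intended for b >= 0). For signed int the addition
   itself is undefined on overflow, so an optimizer may reason that the
   sum can never be smaller than a and fold this to false. */
bool overflow_check_after(int a, int b)
{
    return a + b < a;   /* only meaningful if overflow wraps */
}

/* Tests before adding, using only operations that are always defined. */
bool add_would_overflow(int a, int b)
{
    if (b > 0)
        return a > INT_MAX - b;
    return a < INT_MIN - b;
}
```

The second version never performs the overflowing addition, so there is nothing for the compiler to treat as impossible.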
Apple’s current secure coding guidance can trigger undefined behavior, but in a way such that the undefined behavior is unlikely to cause problems. But still, undefined behavior in a security guidance document? Sloppy.
The larger problem with Apple’s ‘fixed’ code is that it is completely broken on 64-bit platforms, even if signed integer math is defined and even if they used unsigned math. If you compile for a 64-bit platform that has 32-bit integers (i.e., any modern platform) then Apple’s overflow checks will not catch anything. All pairs of positive numbers will pass the checks, regardless of whether they overflow a 32-bit integer. This allows for massive buffer overruns, execution of arbitrary code, and other things generally regarded as ‘bad’.
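One way a check can end up with this property is sketched below. This is a hypothetical reconstruction of the general failure mode, not a quotation of Apple’s code: the limit is taken from `size_t` while the product is computed in `int`. Both function names are assumptions made for illustration.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical check: the limit comes from size_t even though the product
   n * m will be computed in int. With a 64-bit size_t and a 32-bit int,
   every pair of positive ints passes. */
bool fits_checked_against_size_max(int n, int m)
{
    return n > 0 && m > 0 && SIZE_MAX / (size_t)n >= (size_t)m;
}

/* The limit matches the type that will actually hold the product. */
bool fits_checked_against_int_max(int n, int m)
{
    return n > 0 && m > 0 && INT_MAX / n >= m;
}
```

With `n` and `m` both 65536 the product does not fit in a 32-bit `int`. The second function rejects the pair, while the first accepts it whenever `size_t` is wider than `int`.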
Is undefined behavior a good idea?
A full discussion of the benefits and costs of undefined behavior is above my pay grade, but I’m going to briefly opine anyway.
There are some types of undefined behavior that must stay that way. C/C++ cannot guarantee that out-of-bounds array accesses will always be caught, so their potential to corrupt memory or reveal secrets is unavoidable. Similar reasoning applies to use-after-free – there is no practical way to define this, so it must remain undefined. This isn’t a matter of compilers being malicious; it is just the C/C++ object model and performance expectations running headlong into the halting problem. The articles here and here discuss confusion about whether illegal memory accesses will cause a crash, and the short answer is ‘maybe’. C and C++ are not safe languages, and never will be.
Some constructs could be defined but are so ugly that they should not be. A classic variant that makes my eyes bleed and that should not be defined is:
i = ++i + p[++i];
Other types of undefined behavior could be defined, and many of them should be.
NULL pointer dereferences should be defined to crash the program. This requires ensuring that the pages at and before zero never get mapped. Unfortunately both Windows and Linux have historically done an imperfect job of guaranteeing that address zero is not mapped, but this is manageable.
Integer overflow should be defined as two’s-complement arithmetic (making -fwrapv the default in gcc/clang) or should be implementation defined. This does prevent some loop optimizations (see this article) but at some point we may need to decide that getting the right answer is more important than getting it fast. There should be ways to enable those optimizations without holding millions of lines of code hostage.
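Until then, code that needs wrapping behavior can ask for it explicitly. A minimal sketch, assuming a hypothetical helper named `wrapping_add`:

```c
#include <limits.h>

/* Adds with two's-complement wraparound. Unsigned arithmetic is defined to
   wrap. Converting the out-of-range result back to int is implementation
   defined in C rather than undefined; this sketch assumes the usual
   modulo 2^N conversion that mainstream compilers provide. */
int wrapping_add(int a, int b)
{
    return (int)((unsigned)a + (unsigned)b);
}
```

Under that assumption `wrapping_add(INT_MAX, 1)` yields `INT_MIN`, with no undefined behavior along the way.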
Excessive shifting of integral types could also easily be made implementation defined.
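In the meantime a shift can be given a defined result for every count by checking the count first. The helper below is a sketch; its name and the decision to return zero for oversized counts are assumptions of mine (masking the count would be another reasonable definition).

```c
#include <stdint.h>

/* Left shift with a defined result for every count. Counts of 32 or more
   are treated as shifting every bit out, so the result is 0. Assumes int
   is no wider than 32 bits, so value is not promoted to a signed type. */
uint32_t shift_left_32(uint32_t value, unsigned count)
{
    if (count >= 32)
        return 0;
    return value << count;
}
```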
Conversion of out-of-range floats to int is currently undefined and could be made implementation defined.
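One possible definition is saturation, which code can already implement by checking the range before converting. This sketch assumes a 32-bit `int`, so that both bounds are exactly representable in a `double`; the function name and the choice of 0 for NaN are hypothetical.

```c
#include <limits.h>

/* Converts to int, clamping out-of-range values instead of invoking
   undefined behavior. Assumes a 32-bit int. */
int to_int_saturating(double d)
{
    if (d != d)
        return 0;                       /* NaN: arbitrary choice for this sketch */
    if (d >= (double)INT_MAX + 1.0)
        return INT_MAX;                 /* too large even after truncation */
    if (d <= (double)INT_MIN - 1.0)
        return INT_MIN;                 /* too small even after truncation */
    return (int)d;                      /* truncation toward zero stays in range */
}
```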
In general I think that the C/C++ standards should reserve the undefined hammer for those cases that cannot be defined, instead of using it as a dumping ground for ambiguity. Visual C++ already behaves this way, but portable code cannot depend on this.
The standards committee for C++ is considering defining the behavior of left shifting a one into the sign bit, which seems like a move in the right direction.
Defining more behavior is no panacea, however. In this article Larry Osterman discusses how the shift operator behaved differently for 32-bit and 64-bit processes. If the behavior of shift was implementation defined then it might still be defined differently for 32-bit and 64-bit processes, and his issue would probably still have occurred. However, the bugs caused by implementation defined behavior tend to be far more understandable than those caused by undefined behavior.
Know the standard
My uninformed opinions about undefined behavior don’t change the reality on the ground so I would recommend that all C/C++ developers read and understand this classic article on undefined behavior:
then, read the thirteen pages of undefined behavior listed in section J.2 in the C standard located here, and then, until Apple secures their security guide, consider this alternative guidance:
or this scary read:
Although, I’m not sure I agree with page 53 of Dangerous Optimizations. It says that the compiler is allowed to implement “a < b” by subtracting and checking if the result is negative. However, checking for a negative result is not sufficient because the subtraction can overflow and, despite the document’s claims, the compiler is not allowed to ignore that on the grounds of undefined behavior, because ‘<’ is defined for all values.
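A small sketch shows why the sign of the difference is not enough. The subtraction here is done in unsigned arithmetic so that the sketch itself stays free of undefined behavior, and it assumes the usual wrapping conversion back to `int`. The function name is hypothetical.

```c
#include <limits.h>
#include <stdbool.h>

/* Compares by subtracting and testing the sign of the wrapped difference.
   This gives the wrong answer whenever the true difference does not fit
   in an int. */
bool less_by_subtraction(int a, int b)
{
    int difference = (int)((unsigned)a - (unsigned)b);
    return difference < 0;
}
```

For `a = INT_MIN` and `b = 1` the wrapped difference is `INT_MAX`, so this function reports that `a` is not less than `b`, even though `INT_MIN < 1` is plainly true.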