It was a fairly straightforward bug. A wide-character string function was called with a byte count instead of a character count, leading to a buffer overrun. After finding the problem, the fix was as simple as changing sizeof to _countof. Easy.
But bugs like this waste time. A playtest was cancelled because of the crashes, and because the buffer overrun had trashed the stack, it was not trivial to find the bad code. I knew that this type of bug was avoidable, and I knew that there was a lot of work to be done.
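A minimal sketch of the sizeof versus element-count distinction, not the code from the actual bug. The function name CopyName and the buffer size are hypothetical, chosen only for illustration. It uses std::size (C++17) as a portable stand-in for the MSVC-specific _countof.

```cpp
#include <cstddef>
#include <cwchar>
#include <iterator>

// Hypothetical buffer size, in characters, chosen for illustration.
constexpr std::size_t kBufferChars = 16;

// Copies src into dest, truncating if needed, and returns the resulting length.
std::size_t CopyName(wchar_t (&dest)[kBufferChars], const wchar_t* src) {
    // Bug pattern: sizeof(dest) is a byte count (kBufferChars * sizeof(wchar_t)),
    // so passing it where a character count is expected lets the call write
    // past the end of the buffer.
    // Fix: pass the element count instead. std::size(dest) is kBufferChars.
    std::wcsncpy(dest, src, std::size(dest) - 1);
    dest[std::size(dest) - 1] = L'\0';  // wcsncpy does not terminate on truncation
    return std::wcslen(dest);
}
```

Taking the destination as a reference to an array keeps the element count available to the callee, so the count cannot silently decay into the size of a pointer.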
I just finished creating the third in a series of training videos that cover Event Tracing for Windows, also known as xperf or the Windows Performance Toolkit. This set of videos, available on WintellectNow, should be enough to teach any experienced programmer how to use this amazing set of tools to investigate tricky performance problems on Microsoft Windows. You can get two weeks of access to all of the videos on WintellectNow by using promo code BDAWSON-14 – no credit card required.
I’m currently watching John Robbins’ excellent WinDBG training video (slightly condensed from Tolstoy’s original version).
“Please write a C++ function that takes a circle’s diameter as a float and returns the circumference as a float.”
It sounds like the sort of question you might get in the first week of a C++ programming class. And yet. This question is filled with subtlety if you dig into it. Let’s try some solutions.
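As a starting point, here is one candidate answer, a sketch rather than a recommended solution. The names Circumference and kPiFloat are assumptions made for this example. One of the subtleties is already visible: the constant is written as a float, so the multiplication is a float multiplication. Writing pi as a double literal instead would make the multiply happen in double precision before the result is rounded back to float.

```cpp
// Pi written with more digits than a float can hold; the f suffix makes the
// compiler round it to the nearest float. The name is hypothetical.
constexpr float kPiFloat = 3.14159265358979323846f;

// Takes a circle's diameter as a float and returns the circumference as a float.
float Circumference(float diameter) {
    return diameter * kPiFloat;
}
```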
My last post mentioned the ‘standard’ risks of undefined behavior such as having your hard drive formatted or having nethack launched. I even added my own alliterative risk – singing sea shanties in Spanish.
The list of consequences bothered some people who said that any compiler that would intentionally punish its users in such a manner should never be used.
That’s true, but it misses the point. Undefined behavior can genuinely lead to these outcomes and I don’t know of any C/C++ compiler that can save you. If you follow Apple’s buggy security guidance then it can lead to your customers’ hard drives being formatted.
As of May 19th, one month after my report, I see that Apple’s security guidance has not been fixed.
In February 2014 Apple published their Secure Coding Guide. I glanced through it and noticed that their sample code for detecting integer overflow was buggy – it triggered undefined behavior, could be optimized away, and was thus unsafe to use.
I tweeted this, then Jon Kalb blogged it, and then Apple quietly fixed their guide.
But their code is still broken, still triggers undefined behavior, and fails completely on 64-bit builds.
Update, June 17, 2014: no change. Apple’s security guidance is still entirely broken for 64-bit builds.
Update, August 6, 2014: No change. Apple’s security guidance is still entirely broken for 64-bit builds. The document was updated 7/22/2014 but the revision history gives no indication of what changed, and page 28 is still completely broken for 64-bit builds. And yes, this does matter.
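For contrast, here is a minimal sketch of an overflow check that does not rely on undefined behavior. This is not Apple’s code and not a proposed replacement for it; the helper name CheckedMultiply and its signature are hypothetical. Signed integer overflow is undefined in C and C++, so a check that performs the multiplication first and inspects the result afterwards can legally be optimized away. The sketch tests with a division before multiplying, and it works in size_t, the type that allocation sizes use.

```cpp
#include <cstddef>
#include <limits>

// Hypothetical helper: stores count * size in *bytes and returns true, or
// returns false (leaving *bytes untouched) if the product would not fit.
bool CheckedMultiply(std::size_t count, std::size_t size, std::size_t* bytes) {
    // Divide first, so the overflowing multiplication is never executed and
    // there is no undefined behavior for an optimizer to exploit.
    if (size != 0 && count > std::numeric_limits<std::size_t>::max() / size) {
        return false;
    }
    *bytes = count * size;
    return true;
}
```

A caller would use it as `if (!CheckedMultiply(n, m, &bytes)) { /* reject the request */ }` before allocating.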
After upgrading to Visual Studio 2013 I noticed that find-in-files had a problem when searching directories. The VS IDE would repeatedly hang, rendering it completely useless for the duration of the search. I filed a bug, complete with ETW traces and a detailed analysis of what was going on.
This is a success story, of sorts. Microsoft has now released Visual Studio 2013 Update 2, which ameliorates the problem. As far as I can tell, they changed the behavior so that the worker threads don’t start searching until the directory scan has finished, so VS should now avoid a self-inflicted denial of service. However, the VS main thread can still hang if anybody else (another copy of Visual Studio perhaps?) is hitting the disk. So the hangs still happen, at best slightly less frequently.
I recently discovered that Microsoft’s VC++ compiler loads mshtml.dll – also known as Internet Explorer. The compiler does this whenever the /analyze option (requesting static code analysis) is used. I’m no compiler architecture expert, but a compiler that loads Internet Explorer seems peculiar.
This isn’t just a theoretical concern either. I discovered this while investigating why too much static-analysis parallelism causes my machine to become unresponsive for many minutes at a time, and the mshtml window appears to be part of the cause.