Thread Naming in Windows: Time for Something Better

Windows lets you give names to the threads in your process which can then be displayed by debuggers. This is a great idea, but the implementation is showing its age – it’s time for some fixes, or something better.

Update, Dec 2015: the race condition has been fixed and the doc writers have been asked to fix the sample code. Bugs work!

imageNaming of threads is definitely helpful. It gives additional context when debugging, where the more information that is available the better. The screenshot to the right shows the name column from the threads window in Visual Studio when debugging UIforETW.

However, thread naming suffers from a number of flaws, mostly related to the fact that it is merely a debugger convention, rather than an OS feature.

The main flaws occur because thread naming on Windows works by raising an exception. There is a convention that says that debuggers should watch for the exception code 0x406D1388 – yes, that is just an arbitrary magic number with no intrinsic meaning – and look for magic values in the associated exception record. The debugger has to do something like this:

Call WaitForDebugEvent. If there is an exception, and if the exception code is 0x406D1388, and if the number of parameters is the right value, then reinterpret_cast the exception information. Look in that struct and if dwType is equal to 4096, and dwFlags is zero, then use ReadProcessMemory to grab the thread name from the debuggee’s memory.

Oh, where to start…

The ad-hoc nature of this code is evident. The debuggee sets a bunch of parameters and then uses RaiseException to signal the debugger. If the debugger supports thread naming then it handles this specific exception (matching values of ExceptionCode, NumberOfParameters, dwType, and dwFlags) and then reads the thread name from the debuggee’s memory. Whether the debugger support the exception or not the debuggee then handles the exception and continues. It’s a crude method of IPC.

The good thing about this technique is that it exists. If this feature had been suggested as an OS feature then it could have been tied up in design review or in planning for decades. Using this hack technique meant that the Visual Studio debugger team (the MS_VC_EXCEPTION code betrays their hand) could just implement the debugger side, document how to invoke it on the client side, have it immediately work on all versions of Windows, and then get back to work. Windbg then easily implemented the same feature and all was good.

But, this expediency has left us with some problems.

Flaws that aren’t

When I first tweeted about a few of these problems I got a reply claiming that the structure definition was wrong for 64-bit builds. The mechanism is indeed fragile, but it isn’t broken. One constraint of the Win32 debug APIs is that the bitness of the debugger and debuggee must match – a 32-bit process cannot attach to a 64-bit process. And, as long as the debugger and debuggee have the same structure layout it really doesn’t matter what it is. Since the debugger and debuggee will be using the same compiler, and the same structure definition, they will have the same layout, which is all that is needed.

The #pragma pack directives are not needed in order to enforce any particular packing of the structure, but they are needed to ensure that both sides have the same layout.

Visual Studio is 32-bit and it can debug 64-bit processes but it does this by using IPC to communicate with a 64-bit debugger proxy.

The little problems

I want to get these out of the way first, even though they aren’t the serious issues.

  1. The sample code has not been updated in a long time. Because its threadName parameter is declared as char* it can’t be called with a const char array, and if you build with /Zc:strictStrings (very handy) it can’t even be called with a string constant. This minor error has now been pasted into thousands of code bases. I’d recommend fixing this in your code, but the use of cut-and-paste coding means that this poor example will live forever.
  2. The sample code triggers two /analyze warnings. One complains that the filter expression is a constant, and the other complains that the __except block is empty. The /analyze team should treat the SetThreadName documentation as proof that these constructs are often not bugs, and the documentation team should add the necessary warning suppression pragmas. Sample code that is still relevant should compile cleanly at high warning levels.

Update: comments on the connect bug suggest that the documentation bugs should be fixed.

The big problems

The big problems all stem from one consequence of the design: if no debugger is listening when the exception is raised then the thread name is lost, forever.

If a debugger attaches after thread naming is done then the debugger will not know that it has missed these exceptions. There is no way to ask the debuggee to raise the exceptions again. I suppose a debuggee could rename its threads every few seconds, but this is at best a harm-reduction strategy that will still fail entirely if a debugger attaches when a process crashes. The current strategy means that most Windows Error Reporting crashes have no thread names in them.

It turns out that debuggers are not the only tools that could benefit from knowing thread names. Profilers, such as Windows Performance Toolkit (xperf) would be greatly enhanced by having a thread name column – grouping by thread name would be wonderful. But, attaching as a debugger to processes being profiled is a terrible idea, so Windows Performance Analyzer (WPA) has no way to get this information.

In addition, the two main Windows debuggers both suffer from an avoidable race condition that causes SetThreadName to silently fail quite frequently. Both debuggers appear to only name threads if the thread-creation event arrives before the thread naming exception. If you create a thread and then immediately name from the creator thread it then it is quite easy (especially on multi-core processors) to raise the exception before the thread has had a chance to start running! The fix shouldn’t be hard – just fix the debugger so that it can handle having the two events showing up in either order. Easy. Until these debuggers fix their race conditions every application that names its threads has to do it very carefully (see later for details).

See this connect bug for a repro project and more details. Note that the bug was resolved as fixed in December 2015. It looks like VS 2015 Update 2 should have the race condition fixed.

Solving it

Coming up with a solution to all of these problems is an interesting exercise. I could easily create a DLL that exported a SetThreadName function. This DLL would then communicate with another process that would maintain an in-memory database that mapped thread IDs to names. In order to avoid slowdowns on consumer machines this would be an opt-in process, probably implemented by having programs do the LoadLibrary/GetProcAddress dance to see if the DLL/process pair were installed. This would be simple enough, but the idea suffers from two problems that are serious enough to discourage me from bothering.

  1. My naive approach would not be able to tell when threads died, which means that the thread database would quickly grow to thousands of entries. Most of these would be unused, and many would represent threads that were reusing the IDs of named-but-now-dead threads. Adding an UnsetThreadname function would reduce the clutter but could never solve it, especially in the face of threads exiting abruptly. Solving this issue in user mode without introducing race conditions would be challenging.
  2. The bigger problem is that a thread naming API needs broad support. It’s possible that I could convince thousands of developers to adopt a new thread naming API, but without tool support this would be pointless. I want this thread name database to be queried by windbg, the Visual Studio debugger, and ETW tracing. I don’t have the source code to any of these tools and I can’t find where to create a pull request.

Microsoft could implement my naive solution and add support to the tools that I care about – they own them all. More importantly, Microsoft could also do it at the kernel level, which would could make it handle thread termination.

There is precedent for adding this sort of instrumentation to the OS. The gflags tool allows developers to enable tracking of heap allocations, object creator type tracking, and much more. It is time for Microsoft to add thread names as a gflags option. The kernel could catch the existing exceptions, the debuggers and profilers could be updated to query the kernel’s thread-name database, and the world would be a better place.

I hope Microsoft also fixes the const correctness, /analyze warnings, and race conditions in their current setup.

Until then…

If you are writing a debugger then please consider adding a more robust thread naming mechanism. Maybe it will catch on. You should also support the existing standard, and fix its race condition in your debugger.

If you are writing a program where you want to name your threads then all you can do is slightly improve the situation. You can avoid the thread naming race conditions in one of two ways.

  1. The simplest solution is to have your threads name themselves. This guarantees that the thread creation event arrives before the exception event because the child thread can’t call SetThreadName until it has already started running.
  2. If you want to create threads and then name them from the creator thread then you have to wait until the threads have definitely started. This can be done by waiting for the threads to signal an event, or if that isn’t convenient then just wait “a while” and cross your fingers.

In UIforETW I went with the first solution. The only thread naming function it uses is SetCurrentThreadName, which doesn’t have a thread ID parameter. This saves me from having to pass in the thread ID and it avoids the race conditions. Chromium does the same thing in it’s SetNameInternal function in platform_thread_win.cc.

You should also make the thread name parameter a const char*, and use “#pragma warning(disable: 6320 6322)” to suppress the bogus /analyze warnings.

About brucedawson

I'm a programmer, working for Google, focusing on optimization and reliability. Nothing's more fun than making code run 10x faster. Unless it's eliminating large numbers of bugs. I also unicycle. And play (ice) hockey. And juggle.
This entry was posted in Debugging, Programming, Visual Studio and tagged , , , . Bookmark the permalink.

7 Responses to Thread Naming in Windows: Time for Something Better

  1. For your thread database issue: could you not simply rely on the DllMain DLL_THREAD_DETACH event to detect thread termination?

    • brucedawson says:

      It would be more code to have in every application, running under the loader lock, and it wouldn’t help for the crash case (which would leave zombie database entries).

  2. One way I think about (persistent) thread naming is via its entry point. Option 1) start each thread using different function and rely on function name retrieved from call stack and debugging informations (disadvantage: might not work for release build). Option 2) when starting thread, save pointer to its name somewhere near beginning of its call stack (possible in volatile variable, not to be optimized away by compiler), in thread local storage / variable and retrieve it by walking the stack and chasing this pointer.

    Marek

    • brucedawson says:

      The VS threads window does show the entry point function, and that can be helpful, until you start dealing with thread pools that have identical entry points.

      Putting the thread name in the stack is okay, but it doesn’t make it sufficiently visible. That is, I can generally figure out what a thread is supposed to be doing, but I want to be able to scan the list of threads and see all of their names. Any spelunking at all breaks this usage pattern. A debugger add-in that found the thread names and applied them would help, but it seems unfortunate that we have to do that.

      And, I want this information in WPA traces. It would be extremely nice to be able to see that the readying thread was ‘Main Game Thread’, or some-such, instead of having to interpret cryptic PIDs.

  3. Allen Bauer says:

    The good thing about the exception technique an its relative simplicity is that other tools can support it. There are other commercial tools which target Windows that also rely on this same mechanism. Not that these other tools cannot adopt (and they should) other conventions and/or APIs, the current system works… mostly.

    I do certainly agree that unless the debugger is listening when the thread is created/started, it’s name is lost forever. Parking the thread name into the actual OS thread data would clearly solve this.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s