Windows Timer Resolution: The Great Rule Change

The behavior of the Windows scheduler changed significantly in Windows 10 2004, in a way that will break a few applications. There appears to have been no announcement, and the documentation has not been updated. This isn’t the first time this has happened, but this change seems bigger than last time.

The short version is that calls to timeBeginPeriod from one process now affect other processes less than they used to, but there is still an effect, and thread delays from Sleep and other functions may be less consistent than they used to be (see [updated] section below).

I think the new behavior is an improvement, but it’s weird, and it deserves to be documented. Fair warning – all I have are the results of experiments I have run, so I can only speculate about the quirks and goals of this change. If any of my conclusions are wrong then please let me know and I will update this.

Timer interrupts and their raison d’être

First, a bit of operating-system design context. It is desirable for a program to be able to go to sleep and then wake up a little while later. This actually shouldn’t be done very often – threads should normally be waiting on events rather than timers – but it is sometimes necessary. And so we have the Windows Sleep function – pass it the desired length of your nap in milliseconds and it wakes you up later, like this:

Sleep(1);

It’s worth pausing for a moment to think about how this is implemented. Ideally the CPU goes to sleep when Sleep(1) is called, in order to save power, so how does the operating system (OS) wake your thread if the CPU is sleeping? The answer is hardware interrupts. The OS programs a timer chip that then triggers an interrupt that wakes up the CPU and the OS can then schedule your thread.

The WaitForSingleObject and WaitForMultipleObjects functions also have timeout values and those timeouts are implemented using the same mechanism.

If there are many threads all waiting on timers then the OS could program the timer chip with individual wakeup times for each thread, but this tends to result in threads waking up at random times and the CPU never getting to have a long nap. CPU power efficiency is strongly tied to how long the CPU can stay asleep (8+ ms is apparently a good number), and random wakeups work against that. If multiple threads can synchronize or coalesce their timer waits then the system becomes more power efficient.

There are lots of ways to coalesce wakeups but the main mechanism used by Windows is to have a global timer interrupt that ticks at a steady rate. When a thread calls Sleep(n) then the OS will schedule the thread to run when the first timer interrupt fires after the time has elapsed. This means that the thread may end up waking up a bit late, but Windows is not a real-time OS and it actually cannot guarantee a specific wakeup time (there may not be a CPU core available at that time anyway) so waking up a bit late should be fine.

The interval between timer interrupts depends on the Windows version and on your hardware but on every machine I have used recently the default interval has been 15.625 ms (1,000 ms divided by 64). That means that if you call Sleep(1) at some random time then you will probably be woken sometime between 1.0 ms and 16.625 ms in the future, whenever the next interrupt fires (or the one after that if the next interrupt is too soon).
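To make the arithmetic concrete, here is a minimal sketch of that rule in C++. It is a toy model, not how Windows is implemented: it assumes interrupts fire at exact multiples of the interval, it works in integer microseconds to avoid rounding trouble, and the function name sleep_delay_us is made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Toy model (an assumption, not Windows internals): timer interrupts fire at
// exact multiples of tick_us, and a sleeping thread is woken by the first
// interrupt at or after its requested wake time.
// Returns how long the thread actually sleeps, in microseconds.
std::int64_t sleep_delay_us(std::int64_t call_time_us,
                            std::int64_t requested_us,
                            std::int64_t tick_us) {
    assert(call_time_us >= 0 && requested_us >= 0 && tick_us > 0);
    const std::int64_t deadline = call_time_us + requested_us;
    // Round the deadline up to the next multiple of tick_us.
    const std::int64_t wake_time = (deadline + tick_us - 1) / tick_us * tick_us;
    return wake_time - call_time_us;
}
```

With the default 15,625 µs interval, sleep_delay_us(14625, 1000, 15625) gives 1,000 (the best case, where the next interrupt lands exactly 1.0 ms later), while calling one microsecond later, sleep_delay_us(14626, 1000, 15625), gives 16,624, just under the 16.625 ms worst case.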

In short, it is the nature of timer delays that (unless a busy wait is used, and please don’t busy wait) the OS can only wake up threads at a specific time by using timer interrupts, and a regular timer interrupt is what Windows uses.

Some programs (WPF, SQL Server, Quartz, PowerDirector, Chrome, the Go Runtime, many games, etc.) find this much variance in wait delays hard to deal with, but luckily there is a function that lets them control this. timeBeginPeriod lets a program request a shorter timer interrupt interval by passing in the desired interval in milliseconds. There is also NtSetTimerResolution, which allows setting the interval with sub-millisecond precision, but that is rarely used and never needed so I won’t mention it again.

Decades of madness

Here’s the crazy thing: timeBeginPeriod can be called by any program and it changes the timer interrupt interval, and the timer interrupt is a global resource.

Let’s imagine that Process A is sitting in a loop calling Sleep(1). It shouldn’t be doing this, but it is, and by default it is waking up every 15.625 ms, or 64 times a second. Then Process B comes along and calls timeBeginPeriod(2). This makes the timer interrupt fire more frequently and suddenly Process A is waking up 500 times a second instead of 64 times a second. That’s crazy! But that’s how Windows has always worked.

At this point if Process C came along and called timeBeginPeriod(4) this wouldn’t change anything – Process A would continue to wake up 500 times a second. It’s not last-call-sets-the-rules, it’s lowest-request-sets-the-rules.

To be more specific, whichever still-running program has specified the smallest timer interrupt interval in an outstanding call to timeBeginPeriod gets to set the global timer interrupt interval. If that program exits or calls timeEndPeriod then the new minimum takes over. If a single program called timeBeginPeriod(1) then that is the timer interrupt interval for the entire system. If one program called timeBeginPeriod(1) and another program then called timeBeginPeriod(4) then the 1 ms timer interrupt interval would be the law of the land.
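The lowest-request-sets-the-rules behavior can be sketched as a small bookkeeping class. This is only a model of the rule as described above, not Windows code: the class name TimerRequests and its methods are hypothetical, and the 15.625 ms default is the value from this article rather than a guaranteed constant.

```cpp
#include <cassert>
#include <set>

// Default interval from the text: 15.625 ms, expressed in microseconds.
constexpr unsigned kDefaultIntervalUs = 15625;

// Toy model of the classic (pre-2004) rule: the smallest outstanding
// request, across all processes, decides the global interval.
class TimerRequests {
public:
    // Models a process calling timeBeginPeriod(period_ms).
    void begin(unsigned period_ms) {
        assert(period_ms >= 1);
        outstanding_.insert(period_ms);
    }

    // Models timeEndPeriod(period_ms), or the requesting process exiting.
    void end(unsigned period_ms) {
        const auto it = outstanding_.find(period_ms);
        assert(it != outstanding_.end());
        outstanding_.erase(it);  // erase one matching request only
    }

    // The interval the whole system would run at under this model.
    unsigned effective_interval_us() const {
        if (outstanding_.empty()) return kDefaultIntervalUs;
        const unsigned smallest_us = *outstanding_.begin() * 1000u;
        return smallest_us < kDefaultIntervalUs ? smallest_us : kDefaultIntervalUs;
    }

private:
    std::multiset<unsigned> outstanding_;  // sorted, so begin() is the minimum
};
```

Replaying the Process B and Process C example: after begin(2) and begin(4) the model reports 2,000 µs, after end(2) it reports 4,000 µs, and after end(4) it falls back to 15,625 µs.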

[Image: powercfg /energy /duration 5]

This matters because a high timer interrupt frequency – and the associated high frequency of thread scheduling – can waste significant power, as discussed here.

One case where timer-based scheduling is needed is when implementing a web browser. The JavaScript standard has a function called setTimeout which asks the browser to call a JavaScript function some number of milliseconds later. Chromium uses timers (mostly WaitForSingleObject with timeouts rather than Sleep) to implement this and other functionality. This often requires raising the timer interrupt frequency. In order to reduce the battery-life implications of this Chromium has been modified recently so that it doesn’t raise the timer interrupt frequency above 125 Hz (8 ms interval) when running on battery.

timeGetTime

timeGetTime (not to be confused with GetTickCount) is a function that returns the current time, as updated by the timer interrupt. CPUs have historically not been good at keeping accurate time (their clocks intentionally fluctuate to avoid being FM transmitters, and for other reasons) so they often rely on separate clock chips to keep accurate time. Reading from these clock chips is expensive so Windows maintains a 64-bit counter of the time, in milliseconds, as updated by the timer interrupt. This timer is stored in shared memory so any process can cheaply read the current time from there, without having to talk to the timer chip. timeGetTime calls ReadInterruptTick which at its core just reads this 64-bit counter. Simple!

Since this counter is updated by the timer interrupt we can monitor it and find the timer interrupt frequency.
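A minimal sketch of that monitoring idea: spin on a millisecond clock until the value changes, and report how far it jumped. The helper name measure_clock_step is hypothetical, and the clock is passed in as a callable so that the same code works with timeGetTime on Windows or with a stand-in clock elsewhere.

```cpp
#include <cassert>
#include <cstdint>

// Spins until the value returned by now() changes, then returns the size of
// the jump. For a clock that is only updated by the timer interrupt, the
// jump approximates the interrupt interval in the clock's units.
template <typename ClockFn>
std::uint32_t measure_clock_step(ClockFn now) {
    const std::uint32_t start = now();
    std::uint32_t current = start;
    while (current == start) {
        current = now();  // busy loop: acceptable for a short measurement only
    }
    return current - start;  // unsigned subtraction also copes with wraparound
}
```

On Windows this could be called as measure_clock_step(timeGetTime) after including windows.h and linking winmm.lib. A single reading can be off by a millisecond when the interval is not a whole number of milliseconds, so taking several readings is the safer approach.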

The new undocumented reality

With Windows 10 2004 (the April 2020 release) some of this quietly changed, but in a very confusing way. I first heard about this through reports that timeBeginPeriod didn’t work anymore. The reality was more complicated than this.

A bit of experimentation gave confusing results. When I ran a program that called timeBeginPeriod(2) then clockres showed that the timer interval was 2.0 ms, but a separate test program with a Sleep(1) loop was only waking up about 64 times a second instead of the 500 times a second that it would have woken up under previous versions of Windows.

It’s time to do science

I then wrote a pair of programs which revealed what was going on. One program (change_interval.cpp) just sits in a loop calling timeBeginPeriod with intervals ranging from 1 to 15 ms. It holds each timer interval request for four seconds, and then goes to the next one, wrapping around when it is done. It’s fifteen lines of code. Easy.
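For readers who want to try something similar, here is a rough sketch of a program along those lines. It is not the actual change_interval.cpp: the function names are made up, and the Windows-only part is guarded so that the file still compiles on other platforms.

```cpp
#include <cassert>
#ifdef _WIN32
#include <windows.h>
#include <mmsystem.h>
#pragma comment(lib, "winmm.lib")  // MSVC-specific way to link winmm
#endif

// Next request in the 1..15 ms cycle, wrapping back to 1 after 15.
unsigned next_interval_ms(unsigned current_ms) {
    assert(current_ms >= 1 && current_ms <= 15);
    return current_ms == 15 ? 1 : current_ms + 1;
}

#ifdef _WIN32
// Holds each timer interval request for four seconds, then moves on to the
// next one. Never returns; stop the process to end the experiment.
void cycle_timer_requests() {
    for (unsigned ms = 1;; ms = next_interval_ms(ms)) {
        timeBeginPeriod(ms);
        Sleep(4000);
        timeEndPeriod(ms);  // must be passed the same value as timeBeginPeriod
    }
}
#endif
```

Calling cycle_timer_requests() from main is all that is needed. The matching timeEndPeriod call matters because each request stays outstanding until it is released or the process exits.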

The other program (measure_interval.cpp) runs some tests to see how much its behavior is altered by the behavior of change_interval.cpp. It does this by gathering three pieces of information.

  1. It asks the OS what the current global timer resolution is, using NtQueryTimerResolution.
  2. It measures the precision of timeGetTime by calling it in a loop until its return value changes. When it changes then the amount it changed by is its precision.
  3. It measures the delay of Sleep(1) by calling it in a loop for a second and counting how many calls it can make. The average delay is just the reciprocal of the number of iterations.
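The third measurement can be sketched as follows. This is an illustration rather than the code from measure_interval.cpp: the names sleep_one_ms and average_sleep_ms are hypothetical, and on non-Windows platforms std::this_thread::sleep_for is used as a stand-in for Sleep(1) so that the sketch still builds.

```cpp
#include <cassert>
#include <chrono>
#include <thread>
#ifdef _WIN32
#include <windows.h>
#endif

// One nominal 1 ms sleep: Sleep(1) on Windows, a portable stand-in elsewhere.
inline void sleep_one_ms() {
#ifdef _WIN32
    Sleep(1);
#else
    std::this_thread::sleep_for(std::chrono::milliseconds(1));
#endif
}

// Calls sleep_one_ms() in a loop for roughly `window`, then returns the
// average time per call in milliseconds (elapsed time / iteration count).
double average_sleep_ms(std::chrono::milliseconds window) {
    assert(window.count() > 0);
    using clock = std::chrono::steady_clock;
    const auto start = clock::now();
    auto now = start;
    long iterations = 0;
    do {
        sleep_one_ms();
        ++iterations;
        now = clock::now();
    } while (now - start < window);
    const std::chrono::duration<double, std::milli> elapsed = now - start;
    return elapsed.count() / iterations;
}
```

Calling average_sleep_ms(std::chrono::seconds(1)) matches the one-second window described above. A result near 15.6 would suggest the default interrupt interval is in effect for the calling process.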

@FelixPetriconi ran the tests for me on Windows 10 1909 and I ran the tests on Windows 10 2004. The results (cleaned up to remove randomness) are shown here:

[Image: Table of timeGetTime precision and Sleep(1) delays]

What this means is that timeBeginPeriod still sets the global timer interrupt interval, on all versions of Windows. We can tell from the results of timeGetTime() that the interrupt fires on at least one CPU core at that rate, and the time is updated. Note also that the 2.0 on row one for 1909 was 2.0 on Windows XP, then 1.0 on Windows 7/8, and is apparently back to 2.0? I guess?

However the scheduler behavior changes dramatically in Windows 10 2004. Previously the delay for Sleep(1) in any process was simply the same as the timer interrupt interval (with an exception for timeBeginPeriod(1)), giving a graph like this:

[Image: Sleep(1) delays on Windows 10 1909 vs. Global interrupt interval]

In Windows 10 2004 the mapping between timeBeginPeriod and the sleep delay in another process (one that didn’t call timeBeginPeriod) is bizarre:

[Image: Sleep(1) delays on Windows 10 2004 vs. Global interrupt interval]

Why?

Implications

[Updated] The section below was added after publishing and then updated several times.

As was pointed out in the reddit discussion, the left half of the graph seems to be an attempt to simulate the “normal” 15.625 ms delay as closely as possible given the available precision of the global timer interrupt. That is, with a 6 millisecond interrupt interval they delay for ~12 ms (two cycles) and with a 7 millisecond interrupt interval they delay for ~14 ms (two cycles) – that matches the data fairly well. However what about with an 8 millisecond interrupt interval? They could sleep for two cycles but that would give an average delay of 16 ms, and the measured value is more like 14.5 ms.

Closer analysis shows that when another process has called timeBeginPeriod(8), Sleep(1) returns after one interval about 20% of the time and after two intervals the rest of the time. That mix results in an average delay of about 14.5 ms. This variation in the handling of Sleep(1) sometimes happens at other timer interrupt intervals, but it is most consistent when the interval is set to 8 ms.
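As a rough check on that figure, the expected delay for a mix of one-interval and two-interval wakeups is a simple weighted average. The helper below is a hypothetical illustration that takes the approximate 20% figure at face value.

```cpp
#include <cassert>

// Expected Sleep(1) delay when a fraction p_one of the calls return after one
// interrupt interval and the remaining calls return after two intervals.
double mean_delay_ms(double interval_ms, double p_one) {
    assert(interval_ms > 0.0 && p_one >= 0.0 && p_one <= 1.0);
    return p_one * interval_ms + (1.0 - p_one) * 2.0 * interval_ms;
}
```

For an 8 ms interval, mean_delay_ms(8.0, 0.2) works out to 0.2 × 8 + 0.8 × 16 = 14.4 ms, which is close to the measured value of about 14.5 ms given that the 20% is itself approximate.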

This is all very weird, and I don’t understand the rationale, or the implementation. The intentional inconsistency in the Sleep(1) delays is particularly worrisome. Maybe it is a bug, but I doubt it. I think that there is complex backwards compatibility logic behind this. But the most powerful way to avoid compatibility problems is to document your changes, preferably in advance, and this seems to have been slipped in without anyone being notified.

This behavior also seems to apply to CreateWaitableTimerEx and its so-far-undocumented CREATE_WAITABLE_TIMER_HIGH_RESOLUTION flag, based on the quick-and-dirty waitable timer tests that you can find here (requires Windows 10 1803 or higher).

Most programs will be unaffected. If a process wants a faster timer interrupt then it should be calling timeBeginPeriod itself. That said, here are the problems that this could cause:

  • A program might accidentally assume that Sleep(1) and timeGetTime have similar resolutions, and that assumption is broken now. But, such an assumption seems unlikely.
  • A program might depend on a fast timer resolution and fail to request it. There have been multiple claims that some games have this problem and there is a tool called Windows System Timer Tool and another called TimerResolution 1.2 that “fix” these games by raising the timer interrupt frequency. Those fixes presumably won’t work anymore, or at least not as well. Maybe this will force those games to do a proper fix, but until then this change is a backwards compatibility problem.
  • A multi-process program might have its master control program raise the timer interrupt frequency and then expect that this would affect the scheduling of its child processes. This used to be a reasonable design choice, and now it doesn’t work. This is how I was alerted to this problem. The product in question now calls timeBeginPeriod in all of their processes so they are fine, thanks for asking, but their software was misbehaving for several months with no explanation.
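For a process that needs a faster timer interrupt, one way to make the request explicit is a small scope guard around timeBeginPeriod and timeEndPeriod. This is a sketch under stated assumptions: the class name ScopedTimerPeriod is hypothetical, and on non-Windows builds the guard deliberately does nothing so that the code still compiles.

```cpp
#include <cassert>
#ifdef _WIN32
#include <windows.h>
#include <mmsystem.h>
#pragma comment(lib, "winmm.lib")  // MSVC-specific way to link winmm
#endif

// Requests a timer interrupt interval for as long as the object lives and
// releases the request with a matching timeEndPeriod call on destruction.
class ScopedTimerPeriod {
public:
    explicit ScopedTimerPeriod(unsigned period_ms) : period_ms_(period_ms) {
        assert(period_ms >= 1);
#ifdef _WIN32
        active_ = (timeBeginPeriod(period_ms_) == TIMERR_NOERROR);
#endif
    }

    ~ScopedTimerPeriod() {
#ifdef _WIN32
        if (active_) timeEndPeriod(period_ms_);
#endif
    }

    // Not copyable: each request must be released exactly once.
    ScopedTimerPeriod(const ScopedTimerPeriod&) = delete;
    ScopedTimerPeriod& operator=(const ScopedTimerPeriod&) = delete;

    unsigned period_ms() const { return period_ms_; }
    bool active() const { return active_; }  // false if the request failed

private:
    unsigned period_ms_;
    bool active_ = false;
};
```

Each process that depends on short waits would create its own guard, for example ScopedTimerPeriod period(1); near the start of the code that needs it. Keeping the guard's lifetime short limits the power cost described earlier.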

Sacrifice

The change_interval.cpp test program only works if nothing has requested a higher timer interrupt frequency. Since both Chrome and Visual Studio have a habit of doing this I had to do most of my experimentation with no access to the web while writing code in notepad. Somebody suggested Emacs but wading into that debate is more than I’m willing to do.

I’d love to hear more about this from Microsoft, including any corrections to my analysis.

About brucedawson

I'm a programmer, working for Google, focusing on optimization and reliability. Nothing's more fun than making code run 10x as fast. Unless it's eliminating large numbers of bugs. I also unicycle. And play (ice) hockey. And sled hockey. And juggle. And worry about whether this blog should have been called randomutf-8. 2010s in review tells more: https://twitter.com/BruceDawson0xB/status/1212101533015298048

20 Responses to Windows Timer Resolution: The Great Rule Change

  1. Lucian Bargaoanu says:

    I think some graphics are missing. Above “Implications”.

  5. Morten Ofstad says:

    I guess it’s simply skipping on waking up your thread until its (local) timeBeginPeriod setting is going to be greater than the next multiple of the global timeBeginPeriod setting. So if one process has set 150Hz, and another has the default 64Hz it will only wake up every second time (since you can fit two periods of the global wakeup in the 64Hz). This is probably a good way to save power, I doubt it was done to reduce the effect one process has on another.

  6. Anton Kovalenko says:

    Missing images between “a graph like this:” and “In Windows 10 2004”,
    and “is bizarre:” and “The exact shape”

  10. garaetjjte says:

    You might want to disable WordPress “pingback” feature, as it seems abused. WTF is that, it seems bots are copying content, swapping random words and reposting on some generic looking sites..? What’s even the purpose of this?

    • brucedawson says:

      That sounds like good advice. I globally unchecked these two settings in Settings->Discussions:
      – Attempt to notify any blogs linked to from the article
      – Allow link notifications from other blogs (pingbacks and trackbacks)
      I don’t know if the second one is needed, but it seemed like a good idea.

  11. Adrian says:

    On a modern machine, could the OS apply different interrupt frequencies on different cores? If one process requests a faster timer, could the OS set one core to that interval and tweak the process’s affinity to prefer that core? That could affect other processes if they happen to get scheduled on that core while the greedy process is sleeping. Thus the apparent effect on another process could vary based on the myriad factors considered by the scheduler, but in general you’d expect them to observe something closer to the default interval.

    • Anton Kovalenko says:

      Actually that sounds like this is an adaptation for ARM processors where there are low-performance cores in addition to high-performance cores.

    • brucedawson says:

      Whether that helps depends a lot on the CPU design (how isolated the power domains of different processors are) and other factors too complex for me to want to analyze. I guess the short answer is “yes”, but with lots of disclaimers and provisos – thread scheduling is hard.
