Windows Timer Resolution: The Great Rule Change

The behavior of the Windows scheduler changed significantly in Windows 10 2004, in a way that will break a few applications. There appears to have been no announcement, and the documentation has not been updated. This isn’t the first time this has happened, but this change seems bigger than last time.

The short version is that calls to timeBeginPeriod from one process now affect other processes less than they used to, but there is still an effect, and thread delays from Sleep and other functions may be less consistent than they used to be (see [updated] section below).

I think the new behavior is an improvement, but it’s weird, and it deserves to be documented. Fair warning – all I have are the results of experiments I have run, so I can only speculate about the quirks and goals of this change. If any of my conclusions are wrong then please let me know and I will update this.

Timer interrupts and their raison d’être

First, a bit of operating-system design context. It is desirable for a program to be able to go to sleep and then wake up a little while later. This actually shouldn’t be done very often – threads should normally be waiting on events rather than timers – but it is sometimes necessary. And so we have the Windows Sleep function – pass it the desired length of your nap in milliseconds and it wakes you up later, like this:

Sleep(1);

It’s worth pausing for a moment to think about how this is implemented. Ideally the CPU goes to sleep when Sleep(1) is called, in order to save power, so how does the operating system (OS) wake your thread if the CPU is sleeping? The answer is hardware interrupts. The OS programs a timer chip that then triggers an interrupt that wakes up the CPU and the OS can then schedule your thread.

The WaitForSingleObject and WaitForMultipleObjects functions also have timeout values and those timeouts are implemented using the same mechanism.

If there are many threads all waiting on timers then the OS could program the timer chip with individual wakeup times for each thread, but this tends to result in threads waking up at random times and the CPU never getting to have a long nap. CPU power efficiency is strongly tied to how long the CPU can stay asleep (8+ ms is apparently a good number), and random wakeups work against that. If multiple threads can synchronize or coalesce their timer waits then the system becomes more power efficient.

There are lots of ways to coalesce wakeups but the main mechanism used by Windows is to have a global timer interrupt that ticks at a steady rate. When a thread calls Sleep(n) then the OS will schedule the thread to run when the first timer interrupt fires after the time has elapsed. This means that the thread may end up waking up a bit late, but Windows is not a real-time OS and it actually cannot guarantee a specific wakeup time (there may not be a CPU core available at that time anyway) so waking up a bit late should be fine.

The interval between timer interrupts depends on the Windows version and on your hardware but on every machine I have used recently the default interval has been 15.625 ms (1,000 ms divided by 64). That means that if you call Sleep(1) at some random time then you will probably be woken sometime between 1.0 ms and 16.625 ms in the future, whenever the next interrupt fires (or the one after that if the next interrupt is too soon).
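To make that arithmetic concrete, here is a minimal sketch of the rule as a toy model. It is not Windows code: it assumes interrupts fire at exact multiples of the interval and ignores scheduling latency, and the function names are made up for illustration.

```cpp
#include <cmath>

// Toy model, not a Windows API: assumes timer interrupts fire at
// t = 0, interval, 2 * interval, ... and that a sleeping thread is
// woken by the first interrupt at or after its requested wake time.
double model_wake_time_ms(double call_ms, double sleep_ms, double interval_ms) {
    const double earliest_ms = call_ms + sleep_ms;
    return std::ceil(earliest_ms / interval_ms) * interval_ms;
}

// Delay seen by the caller: wake time minus the time Sleep was called.
double model_sleep_delay_ms(double call_ms, double sleep_ms, double interval_ms) {
    return model_wake_time_ms(call_ms, sleep_ms, interval_ms) - call_ms;
}
```

With a 15.625 ms interval, a Sleep(1) issued at 14.625 ms wakes after exactly 1.0 ms because the next interrupt is exactly 1 ms away. One issued at 15.0 ms misses the interrupt at 15.625 ms and waits 16.25 ms for the one at 31.25 ms.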

In short, it is the nature of timer delays that (unless a busy wait is used, and please don’t busy wait) the OS can only wake up threads at a specific time by using timer interrupts, and a regular timer interrupt is what Windows uses.

Some programs (WPF, SQL Server, Quartz, PowerDirector, Chrome, the Go Runtime, many games, etc.) find this much variance in wait delays hard to deal with but luckily there is a function that lets them control this. timeBeginPeriod lets a program request a smaller timer interrupt interval by passing in a requested timer interrupt interval. There is also NtSetTimerResolution which allows setting the interval with sub-millisecond precision but that is rarely used and never needed so I won’t mention it again.

Decades of madness

Here’s the crazy thing: timeBeginPeriod can be called by any program and it changes the timer interrupt interval, and the timer interrupt is a global resource.

Let’s imagine that Process A is sitting in a loop calling Sleep(1). It shouldn’t be doing this, but it is, and by default it is waking up every 15.625 ms, or 64 times a second. Then Process B comes along and calls timeBeginPeriod(2). This makes the timer interrupt fire more frequently and suddenly Process A is waking up 500 times a second instead of 64 times a second. That’s crazy! But that’s how Windows has always worked.

At this point if Process C came along and called timeBeginPeriod(4) this wouldn’t change anything – Process A would continue to wake up 500 times a second. It’s not last-call-sets-the-rules, it’s lowest-request-sets-the-rules.

To be more specific, whichever still-running program has specified the smallest timer interrupt interval in an outstanding call to timeBeginPeriod gets to set the global timer interrupt interval. If that program exits or calls timeEndPeriod then the new minimum takes over. If a single program called timeBeginPeriod(1) then that is the timer interrupt interval for the entire system. If one program called timeBeginPeriod(1) and another program then called timeBeginPeriod(4) then the one ms timer interrupt interval would be the law of the land.
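The lowest-request-sets-the-rules bookkeeping can be sketched as a small model. This is only an illustration of the rule as described above, not how Windows implements it; the class, its method names, and the process ids are all hypothetical.

```cpp
#include <algorithm>
#include <map>

// Toy model of the bookkeeping described above; the class and method
// names are hypothetical, not part of any Windows API.
class TimerRequestModel {
public:
    static constexpr double kDefaultIntervalMs = 15.625;

    // Models a process calling timeBeginPeriod(period_ms).
    void begin_period(int process_id, unsigned period_ms) {
        requests_.emplace(process_id, period_ms);
    }

    // Models the matching timeEndPeriod(period_ms) call.
    void end_period(int process_id, unsigned period_ms) {
        auto range = requests_.equal_range(process_id);
        for (auto it = range.first; it != range.second; ++it) {
            if (it->second == period_ms) {
                requests_.erase(it);
                return;
            }
        }
    }

    // Models process exit: all of its outstanding requests vanish.
    void process_exit(int process_id) { requests_.erase(process_id); }

    // The smallest outstanding request wins; with none, the default applies.
    double global_interval_ms() const {
        double interval = kDefaultIntervalMs;
        for (const auto& request : requests_) {
            interval = std::min(interval, static_cast<double>(request.second));
        }
        return interval;
    }

private:
    std::multimap<int, unsigned> requests_;  // process id -> requested ms
};
```

If Process B requests 2 ms and Process C then requests 4 ms, the model reports 2 ms. When B exits it reports 4 ms, and once C ends its request it falls back to 15.625 ms.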

This matters because a high timer interrupt frequency – and the associated high frequency of thread scheduling – can waste significant power, as discussed here.

One case where timer-based scheduling is needed is when implementing a web browser. The JavaScript standard has a function called setTimeout which asks the browser to call a JavaScript function some number of milliseconds later. Chromium uses timers (mostly WaitForSingleObject with timeouts rather than Sleep) to implement this and other functionality. This often requires raising the timer interrupt frequency. In order to reduce the battery-life implications of this Chromium has been modified recently so that it doesn’t raise the timer interrupt frequency above 125 Hz (8 ms interval) when running on battery.

timeGetTime

timeGetTime (not to be confused with GetTickCount) is a function that returns the current time, as updated by the timer interrupt. CPUs have historically not been good at keeping accurate time (their clocks intentionally fluctuate to avoid being FM transmitters, and for other reasons) so they often rely on separate clock chips to keep accurate time. Reading from these clock chips is expensive so Windows maintains a 64-bit counter of the time, in milliseconds, as updated by the timer interrupt. This timer is stored in shared memory so any process can cheaply read the current time from there, without having to talk to the timer chip. timeGetTime calls ReadInterruptTick which at its core just reads this 64-bit counter. Simple!

Since this counter is updated by the timer interrupt we can monitor it and find the timer interrupt frequency.
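Monitoring the counter can be sketched roughly like this. The clock is a template parameter so the sketch builds anywhere; the function name and the fake clock are made up for illustration, and on Windows the clock would be timeGetTime.

```cpp
#include <cstdint>

// Calls now_ms() in a tight loop until the returned value changes, then
// returns the size of the jump. On Windows the clock would be timeGetTime;
// it is a template parameter here so the sketch builds anywhere.
template <typename ClockFn>
std::uint32_t measure_clock_step_ms(ClockFn now_ms) {
    const std::uint32_t before = now_ms();
    std::uint32_t after = now_ms();
    while (after == before) {
        after = now_ms();
    }
    return after - before;  // unsigned subtraction survives wraparound
}

// Fake clock for demonstration only: advances by step_ms once every
// reads_per_step reads, like a counter bumped by a periodic interrupt.
struct SteppingFakeClock {
    std::uint32_t reads_per_step;
    std::uint32_t step_ms;
    std::uint32_t reads;
    std::uint32_t operator()() { return (reads++ / reads_per_step) * step_ms; }
};
```

With the fake clock, measure_clock_step_ms(SteppingFakeClock{5, 16, 0}) returns 16. On Windows the call would look something like measure_clock_step_ms([] { return timeGetTime(); }), with windows.h included and winmm.lib linked.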

The new undocumented reality

With Windows 10 2004 (April 2020 release) some of this quietly changed, but in a very confusing way. I first heard about this through reports that timeBeginPeriod didn’t work anymore. The reality was more complicated than this.

A bit of experimentation gave confusing results. When I ran a program that called timeBeginPeriod(2) then clockres showed that the timer interval was 2.0 ms, but a separate test program with a Sleep(1) loop was only waking up about 64 times a second instead of the 500 times a second that it would have woken up under previous versions of Windows.

It’s time to do science

I then wrote a pair of programs which revealed what was going on. One program (change_interval.cpp) just sits in a loop calling timeBeginPeriod with intervals ranging from 1 to 15 ms. It holds each timer interval request for four seconds, and then goes to the next one, wrapping around when it is done. It’s fifteen lines of code. Easy.

The other program (measure_interval.cpp) runs some tests to see how much its behavior is altered by the behavior of change_interval.cpp. It does this by gathering three pieces of information.

  1. It asks the OS what the current global timer resolution is, using NtQueryTimerResolution.
  2. It measures the precision of timeGetTime by calling it in a loop until its return value changes. When it changes then the amount it changed by is its precision.
  3. It measures the delay of Sleep(1) by calling it in a loop for a second and counting how many calls it can make. The average delay is just the reciprocal of the number of iterations.

@FelixPetriconi ran the tests for me on Windows 10 1909 and I ran the tests on Windows 10 2004. The results (cleaned up to remove randomness) are shown here:

[Table: timeGetTime precision and Sleep(1) delays]

What this means is that timeBeginPeriod still sets the global timer interrupt interval, on all versions of Windows. We can tell from the results of timeGetTime() that the interrupt fires on at least one CPU core at that rate, and the time is updated. Note also that the 2.0 on row one for 1909 was 2.0 on Windows XP, then 1.0 on Windows 7/8, and is apparently back to 2.0? I guess?

However the scheduler behavior changes dramatically in Windows 10 2004. Previously the delay for Sleep(1) in any process was simply the same as the timer interrupt interval (with an exception for timeBeginPeriod(1)), giving a graph like this:

[Figure: Sleep(1) delays on Windows 10 1909 vs. global interrupt interval]
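The old straight-line relationship, including the timeBeginPeriod(1) exception, falls out of a simple tick model. What follows is a toy simulation, not Windows code: it assumes interrupts fire at exact multiples of the interval, and the per-call overhead is an assumed value chosen only for illustration.

```cpp
#include <cstdint>

// Toy model, not Windows code: interrupts fire at exact multiples of
// interval_us, and a sleeper wakes on the first interrupt at or after
// its requested wake time. All times are in microseconds.
std::int64_t first_tick_at_or_after_us(std::int64_t time_us, std::int64_t interval_us) {
    return ((time_us + interval_us - 1) / interval_us) * interval_us;
}

// Simulates a loop of back-to-back Sleep(1) calls and returns the average
// time per iteration in milliseconds. call_overhead_us is an assumed cost
// for getting from one wakeup to the next Sleep call.
double simulate_sleep1_loop_ms(std::int64_t interval_us,
                               std::int64_t call_overhead_us,
                               int iterations) {
    std::int64_t now_us = 0;  // the loop starts right on an interrupt
    for (int i = 0; i < iterations; ++i) {
        now_us += call_overhead_us;  // time to get back into Sleep(1)
        now_us = first_tick_at_or_after_us(now_us + 1000, interval_us);
    }
    return static_cast<double>(now_us) / 1000.0 / iterations;
}
```

With an assumed 10 µs overhead the model gives 2.0 ms per call for a 2 ms interval and 15.625 ms for the default interval. For a 1 ms interval it gives 2.0 ms rather than 1.0 ms, because the thread is still inside Sleep(1) when the very next interrupt fires and has to wait for the one after it.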

In Windows 10 2004 the mapping between timeBeginPeriod and the sleep delay in another process (one that didn’t call timeBeginPeriod) is bizarre:

[Figure: Sleep(1) delays on Windows 10 2004 vs. global interrupt interval]

Why?

Implications

[Updated] The section below was added after publishing and then updated several times.

As was pointed out in the reddit discussion, the left half of the graph seems to be an attempt to simulate the “normal” 15.625 ms delay as closely as possible given the available precision of the global timer interrupt. That is, with a 6 millisecond interrupt interval they delay for ~12 ms (two cycles) and with a 7 millisecond interrupt interval they delay for ~14 ms (two cycles) – that matches the data fairly well. However what about with an 8 millisecond interrupt interval? They could sleep for two cycles but that would give an average delay of 16 ms, and the measured value is more like 14.5 ms.

Closer analysis shows that when another process has called timeBeginPeriod(8), Sleep(1) returns after one interval about 20% of the time and after two intervals the rest of the time. That mix gives an average delay of about 14.4 ms (0.2 × 8 ms + 0.8 × 16 ms), in line with the measured ~14.5 ms. This variation in the handling of Sleep(1) sometimes happens at other timer interrupt intervals but is most consistent when the interval is set to 8 ms.

This is all very weird, and I don’t understand the rationale, or the implementation. The intentional inconsistency in the Sleep(1) delays is particularly worrisome. Maybe it is a bug, but I doubt it. I think that there is complex backwards compatibility logic behind this. But, the most powerful way to avoid compatibility problems is to document your changes, preferably in advance, and this seems to have been slipped in without anyone being notified.

This behavior also seems to apply to CreateWaitableTimerEx and its so-far-undocumented CREATE_WAITABLE_TIMER_HIGH_RESOLUTION flag, based on the quick-and-dirty waitable timer tests that you can find here (requires Windows 10 1803 or higher).

Most programs will be unaffected. If a process wants a faster timer interrupt then it should be calling timeBeginPeriod itself. That said, here are the problems that this could cause:

  • A program might accidentally assume that Sleep(1) and timeGetTime have similar resolutions, and that assumption is broken now. But, such an assumption seems unlikely.
  • A program might depend on a fast timer resolution and fail to request it. There have been multiple claims that some games have this problem and there is a tool called Windows System Timer Tool and another called TimerResolution 1.2 that “fix” these games by raising the timer interrupt frequency. Those fixes presumably won’t work anymore, or at least not as well. Maybe this will force those games to do a proper fix, but until then this change is a backwards compatibility problem.
  • A multi-process program might have its master control program raise the timer interrupt frequency and then expect that this would affect the scheduling of its child processes. This used to be a reasonable design choice, and now it doesn’t work. This is how I was alerted to this problem. The product in question now calls timeBeginPeriod in all of their processes so they are fine, thanks for asking, but their software was misbehaving for several months with no explanation.

Sacrifice

The change_interval.cpp test program only works if nothing has requested a higher timer interrupt frequency. Since both Chrome and Visual Studio have a habit of doing this I had to do most of my experimentation with no access to the web while writing code in notepad. Somebody suggested Emacs but wading into that debate is more than I’m willing to do.

I’d love to hear more about this from Microsoft, including any corrections to my analysis.

About brucedawson

I'm a programmer, working for Google, focusing on optimization and reliability. Nothing's more fun than making code run 10x as fast. Unless it's eliminating large numbers of bugs. I also unicycle. And play (ice) hockey. And sled hockey. And juggle. And worry about whether this blog should have been called randomutf-8. 2010s in review tells more: https://twitter.com/BruceDawson0xB/status/1212101533015298048

24 Responses to Windows Timer Resolution: The Great Rule Change

  1. Lucian Bargaoanu says:

    I think some graphics are missing. Above “Implications”.

  5. Morten Ofstad says:

    I guess it’s simply skipping on waking up your thread until its (local) timeBeginPeriod setting is going to be greater than the next multiple of the global timeBeginPeriod setting. So if one process has set 150Hz, and another has the default 64Hz it will only wake up every second time (since you can fit two periods of the global wakeup in the 64Hz). This is probably a good way to save power, I doubt it was done to reduce the effect one process has on another.

  6. Anton Kovalenko says:

    Missing images between “a graph like this:” and “In Windows 10 2004”,
    and “is bizarre:” and “The exact shape”

  10. garaetjjte says:

    You might want to disable WordPress “pingback” feature, as it seems abused. WTF is that, it seems bots are copying content, swapping random words and reposting on some generic looking sites..? What’s even the purpose of this?

    • brucedawson says:

      That sounds like good advice. I globally unchecked these two settings in Settings->Discussions:
      – Attempt to notify any blogs linked to from the article
      – Allow link notifications from other blogs (pingbacks and trackbacks)
      I don’t know if the second one is needed, but it seemed like a good idea.

  11. Adrian says:

    On a modern machine, could the OS apply different interrupt frequencies on different cores? If one process requests a faster timer, could the OS set one core to that interval and tweak the process’s affinity to prefer that core? That could affect other processes if they happen to get scheduled on that core while the greedy process is sleeping. Thus the apparent effect on another process could vary based on the myriad factors considered by the scheduler, but in general you’d expect them to observe something closer to the default interval.

    • Anton Kovalenko says:

      Actually that sounds like this is an adaptation for ARM processors where there are low performance cores in addition to high-performance cores

    • brucedawson says:

      Whether that helps depends a lot on the CPU design (how isolated the power domains of different processors are) and other factors too complex for me to want to analyze. I guess the short answer is “yes”, but with lots of disclaimers and provisos – thread scheduling is hard.

  12. Clearly you’ve shown some new behavior. Since the KPROCESS and similar structures for the 2004 edition have additions specifically for time management, I have no trouble believing that what you’ve found is specifically new for 2004. I must add this to my ever-growing list of things to look into, but where is the Great Rule Change?

    The rule has always been as you say: “If a process wants a faster timer interrupt then it should be calling timeBeginPeriod itself”, though I would add that busy programmers do in practice need to be reminded to call timeEndPeriod when they no longer need the finer resolution. When I say “always”, I mean all the way back to version 3.10, though the rule was then only in principle since the interrupt period is fixed at startup. But the rule applied for real as early as version 3.50.

    The rule had to be established early (even if the implementation was for many years rudimentary) because as much as an operating system for programs in general wants to give each program the illusion of owning the computer, delivering this ideal for access to an interval timer is all but impossible. Even if each processor has its own timer, you can’t expect that Windows will reprogram the timer each time it switches the processor to a thread from a different process. A timer’s interrupts are anyway how the system itself learns the passing of time without having to keep asking. These interrupts are almost necessarily a shared resource. They need to be frequent enough to meet the most demanding of realistic expectations from programs, but there’s a balance since interrupts that are too frequent have their own deleterious effects on performance all round.

    Perhaps I’ve been looking at this for too long, but from this perspective it hardly seems like “decades of madness” that programs are each given the means to tell Windows how fine a resolution they desire and Windows sets the interrupt period to meet the finest requirement. If you need that your wait for 10ms be 10, not 11, then you tell Windows you want 1ms resolution. If you can tolerate that your 10ms may be 15, then you tell Windows you’re OK with 5ms resolution. You may get finer resolution but you’ve indicated that you don’t require finer resolution.

    What is mad is to depend on the precise implementation. Yes, Microsoft has written that the one process’s request for the finest resolution sets it globally, but that’s just Microsoft presenting an implementation detail as helpful background. They make a rod for their own backs by skimping on the documentation, making it inevitable that programmers end up grasping at every implementation detail they can find. Against this is that Microsoft’s technical writers, such as they are, perhaps assume that programmers read the documentation judiciously and won’t think a design is reasonable if it depends on a detail that the programmer otherwise describes as crazy.

    As you note, the new behaviour you see looks to be an attempt at improvement. A presumption in the implementation so far is that programs won’t be troubled if the timer resolution is finer than they’ve asked for. Put aside whether they can be if well written. There is plausibly some waste in making a thread ready earlier than its process has indicated is tolerable. A push to give each thread an average delay that more closely aligns with its process’s expressed expectation should therefore not surprise. I expect, though, that compatibility considerations apply. A programmer that plays the game of calling timeBeginPeriod must know that other programs may play too. Such programs may have coordinated. They may better be left alone. So I should not be surprised if waste elimination is sought only (or first) for processes that do not play the game. Well, we’ll know once someone does the research.

    This brings me to your graphs. Your graph of old behaviour can be explained very well – indeed, by the simple model you present in which the caller of Sleep(1) becomes ready for return at the first timer interrupt that occurs 1ms or more after making the call. Assume the interrupt period is constant, having been set by the other program and not changed by any other. The first Sleep(1) is random with respect to these interrupts. All the remainder are called soon after an interrupt. Over a long enough run, you measure the average time in calls to Sleep(1) that are synchronised with interrupts. Mostly then, you just measure the interrupt period. When the interrupt period is 1ms, the time you spend just to call Sleep(1) means you’ll still be asleep for the first interrupt after the one that woke you but you’ll catch the second, and your experiment therefore measures 2ms.

    Explaining your graph of new behaviour will have to wait for research, but there can’t be any surprise that you don’t get a neat result. Microsoft may be aiming that the process that has not asked for a finer resolution should find that its average time spent in a random Sleep(1) is unaffected by the system’s use of a finer resolution than the default. But your calls to Sleep(1) are not random!

    • brucedawson says:

      That is a _very_ long comment.

      > where is the Great Rule Change?

      The great rule change is that the effect of timeBeginPeriod used to work one way, and now it works another, and this difference has not been documented. This broke at least one program, and almost certainly more.

      I am hopeful that Microsoft will eventually document this, but I don’t think there is much more for them to say. It would just be good for the official documentation to say what I discovered through experimentation here.

      • It was a long comment because I wanted to present my case carefully that there is no rule change. What you present as the new rule is what the rule really has been all along. All that you’ve yet shown has changed is an implementation detail that Microsoft will have documented so that programmers should understand that setting the timer resolution has consequences for others and is better done considerately. They won’t have documented it so that programmers start depending on the particulars of those consequences.

        True, their documentation is terrible, and so misunderstanding is inevitable. But do you not think programmers bear some responsibility too to think through the consequences of what they depend on? Here, they were being told that in this, of all things, you don’t own the processor. Please be a good citizen. Use no more of this feature than you really do need. How that turns into an invitation to depend on details of the feature’s implementation, I don’t know, and they themselves ought to have been able to see how easily they might tie themselves in knots – as when you yourself describe the old implementation as decades of madness, yet you say that depending on it in some particular way for coordinating processes is a reasonable design.

        I’ll add it to my list of things I’ve seen Googlers call reasonable or even elegant but which give me the shivers. Less flippantly, I do agree with you that since Microsoft did document the implementation detail without spelling out that it was not to be depended on and since programmers inevitably will have depended on it, Microsoft ought to bear some cost and write some better documentation. But that’s enough dreaming!

        • brucedawson says:

          We must be arguing semantics because it is quite clear that the behavior has changed. It hasn’t changed for processes that call timeBeginPeriod(1), but it has changed for any process that doesn’t call timeBeginPeriod(1) running on a system where some other process has.

          That’s all I claimed, and that is definitely true. Since the reality of this change is indisputable you must be arguing about whether that counts as a rule change, which seems like an uninteresting argument.

          You seem to be arguing that since the old behavior was not documented there was no official “rule” and therefore the change cannot be a “rule” change. I disagree with that analysis, and I also think you could present that argument much more clearly if you condensed it down to a single sentence, instead of a mini blog post. See the first sentence of this paragraph for an example.

          I know of one program that was broken and had to be modified because of this change. By definition this means that it was a breaking change, and I’ll leave it up to the Talmudic scholars to decide whether “the great rule change” is an accurate description or not.
