Graph All the Things (Using WPT 10)

Event Tracing for Windows (ETW) has always recorded a rich set of data and allowed graphing it all on the same timeline. With the creation of UIforETW (which records more data) and the new* ETW trace viewer (which can graph custom data) the ability to visualize important patterns is better than ever before.

And what better way to visualize how to create graphs than a video demonstrating what is discussed in this post?

Unlike Microsoft’s wprui, which is a passive recorder of trace information, UIforETW is also a source of ETW trace data. UIforETW uses the ETWProviders.dll that it ships with to emit events containing information about CPU temperature, frequency, and power draw**, along with battery discharge information, process working sets***, the Windows timer frequency, and user input. This serves as a good demonstration of how to emit custom events using ETWProviders.dll, but most importantly this extra data helps with trace analysis.

For more information on UIforETW see this blog post. If you need more features, that’s what GitHub forks and pull requests are for.

With earlier versions of Windows Performance Analyzer (WPA) this extra data existed only as numbers. If you wanted graphs you’d have to paste the data into Google Sheets (or Excel – is that still a thing?). With WPT 10 there is now the ability to graph this data. The WPA Analysis Assistant does a good job of describing this new feature:

image

The WPA 10 screenshot below shows the sort of information that we can now graph. At the top is the normal CPU Usage (Precise) graph. At six seconds in it shows that the Idle thread (purple) stops running and FractalX64 (blue, and a great fractal explorer) starts consuming all CPU time. At the same time the CPU temperature (TempStatus) starts rising, from 90 degrees up to 120. This makes sense. Then the CPU temperature anomalously drops to 96 degrees before going back up to 116. Curious…

The PowerStatus (third) graph shows that the processor power draw jumps from 4.9 W to 53 W, before dropping to 9.9 W and then rising back up to 41 W. This is consistent with the temperature data, but unexpected given the still 100% CPU usage (the slight drops in FractalX64 CPU usage are all from other processes stealing some cycles).

image

This peculiar behavior makes somewhat more sense when the CPU FrequencyStatus (fourth) graph from UIforETW is shown since it shows the CPU frequency going from 800 MHz to 3 GHz and then back down to 800 MHz before recovering to 2.6-2.7 GHz.

Meanwhile the ‘official’ CPU frequency graph just shows the CPU going from 800 MHz to 2.2 GHz – missing all of the variations after the six second mark.

The data that finally makes sense of all of this is the Battery Status information in the table at the bottom. This shows that my laptop went from On AC power to Discharging at around 10.18 seconds. Apparently I unplugged my laptop and it took two seconds to adjust to the new state of affairs before it continued on in a slightly less Turbo Boosted state. The last On AC power sample is selected and its location in the timeline shows up on all of the graphs.
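As a minimal sketch of that last step, finding the moment the battery state changes can be expressed in a few lines of Python. The sample tuples below are hypothetical illustrations rather than values exported from the trace, and the function name is an assumption made for this example.

```python
def first_transition(samples, from_state, to_state):
    """Return the timestamp of the first sample whose state is to_state
    when the previous sample's state was from_state, or None if the
    transition never happens."""
    for (_, prev_state), (t, state) in zip(samples, samples[1:]):
        if prev_state == from_state and state == to_state:
            return t
    return None


# Hypothetical (time in seconds, battery state) samples.
battery_samples = [
    (8.18, "On AC power"),
    (9.18, "On AC power"),
    (10.18, "Discharging"),
    (11.18, "Discharging"),
]

unplugged_at = first_transition(battery_samples, "On AC power", "Discharging")
print(unplugged_at)
```

The returned timestamp is the point to line up against the temperature, power, and frequency graphs.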

Mystery solved.

Frequency measurement frailty

The Frequency by Cpu graph is supplied by Windows, through the POWER provider, but it has some problems. Apparently there are some types of CPU frequency changes that the CPU knows about but the OS doesn’t. Either that or the POWER provider is buggy on Windows 7. Either way it cannot be completely trusted.

Another problem with the Windows POWER provider is that it fails completely on my work machine. It’s probably my fault for having more than thirty-two logical CPUs, but I still want this first-world problem fixed.

The final problem with the Windows POWER provider is that it knows nothing about thermal throttling. Unfortunately, I don’t think that the Intel Power Gadget frequency data reflects that either, so thermal throttling must still be inferred from CPU temperature and other factors (at least until I rewrite my actual-CPU-frequency-monitor™). Thermal throttling appears to be an increasingly serious problem. I think it is shocking that some OEMs are selling machines with cooling solutions that are insufficient to let their CPUs run at their rated speeds. If you buy one of these defective machines you should return it – try RealTemp or one of the other programs I suggested in my original thermal throttling blog post. Or try Intel Extreme Tuning Utility which has a graph dedicated to thermal throttling (thanks to jaimemmoreno for the tip).
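One rough way to do that inference is to flag samples where the CPU is running below its rated frequency while the temperature is at or above some "hot" threshold. The Python sketch below assumes the samples have already been exported as (time, MHz, temperature) tuples. The data, the thresholds, and the function name are all hypothetical, and a real check would also need to confirm that the CPU was busy at the time.

```python
def suspected_throttling(samples, rated_mhz, hot_temp):
    """Return timestamps where the frequency is below rated_mhz while the
    temperature is at or above hot_temp. This is a heuristic, not proof."""
    return [t for t, mhz, temp in samples if mhz < rated_mhz and temp >= hot_temp]


# Hypothetical (time in seconds, frequency in MHz, temperature) samples.
samples = [
    (0.0, 800, 55),    # low frequency but cool, so probably just idle
    (1.0, 3000, 80),   # busy and boosting
    (2.0, 3000, 98),   # hot but still at full speed
    (3.0, 1800, 99),   # hot and below rated speed: suspicious
    (4.0, 1600, 100),  # hot and below rated speed: suspicious
]

flagged = suspected_throttling(samples, rated_mhz=2400, hot_temp=95)
print(flagged)
```

A low frequency on its own is not suspicious, since an idle CPU clocks down too, which is why the temperature condition is part of the test.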

Graphing glitches

There are a few limits on the graphing. There doesn’t appear to be a way to adjust the range of the graphs, so the CPU temperature graphs are hard to read, with most of the dynamic range squished into the top of the graph.

The graphing also can’t handle negative numbers, which is why I didn’t graph the battery discharge data.

The default chart type of ‘Line’ works poorly for me so I always change it to Stacked Bars. The Select Chart Type option appears after you put the data column to be graphed on the far right (and remove the Time column).

WPA’s default aggregation mode for the generic events data fields is None, which won’t graph. The Generic Events presets that ship with WPA 10 have been changed to have Average as the aggregation type for easier graphing. You can change this to Maximum or some other setting as needed.
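To illustrate what an aggregation type does, here is a small Python sketch that groups samples into fixed time buckets and combines each bucket with either an average or a maximum. This is not how WPA implements aggregation; the bucketing scheme, the sample values, and the function name are assumptions chosen only to show how Average and Maximum can produce different graphs from the same data.

```python
from collections import defaultdict
from statistics import mean


def aggregate(samples, bucket_seconds, combine):
    """Group (time, value) samples into fixed-width time buckets and reduce
    each bucket to one value with combine (for example mean or max)."""
    buckets = defaultdict(list)
    for t, value in samples:
        buckets[int(t // bucket_seconds)].append(value)
    return {
        index * bucket_seconds: combine(values)
        for index, values in sorted(buckets.items())
    }


# Hypothetical (time in seconds, power in watts) samples.
power_samples = [(0.2, 5.0), (0.7, 7.0), (1.1, 50.0), (1.6, 54.0), (2.4, 10.0)]

averaged = aggregate(power_samples, 1.0, mean)
peaks = aggregate(power_samples, 1.0, max)
print(averaged)
print(peaks)
```

Average smooths out short spikes while Maximum preserves them, so the better choice depends on whether the spikes are what you are looking for.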

With the v1.12 release of UIforETW there are new presets to make this much simpler – just filter to the desired data and then select Randomascii Graph Field 2 to graph field 2 (presets for graphs 1 to 6 are available). To see this in action watch this video.

Disclaimers and addendums

The trace that I recorded can be found in the bigfiles repo in the ETWTraces directory – look for “2015-07-30_08-29-33 FX battery and power test.*”.

* The new ETW trace viewer is the Windows Performance Toolkit (WPT) 10 version of WPA, sub-version 10*2^10. It shipped in its final version on July 29, 2015 as part of the Windows 10 SDK. There are also redistributable installers that will install just WPT.

** CPU temperature, frequency, and power draw all require having Intel Power Gadget 3.0 installed. A reboot may be required to ensure that UIforETW sees the environment variable that indicates where to find the Intel Power Gadget binaries. See PowerStatus.cpp for details. I’d be happy to add support for AMD processor monitoring – call me @amd.

*** Working set monitoring only occurs for a specified set of processes – see the UIforETW settings dialog for details. Note that ‘*’ will monitor all processes but this is not recommended when expensive working set monitoring (PSS and private working set) is enabled. See WorkingSet.cpp for details.

Happy graphing!

For instruction on how to do ETW trace analysis take a look at the series of training videos I created.

About brucedawson

I'm a programmer, working for Google, focusing on optimization and reliability. Nothing's more fun than making code run 10x as fast. Unless it's eliminating large numbers of bugs. I also unicycle. And play (ice) hockey. And sled hockey. And juggle. And worry about whether this blog should have been called randomutf-8. 2010s in review tells more: https://twitter.com/BruceDawson0xB/status/1212101533015298048

15 Responses to Graph All the Things (Using WPT 10)

  1. Harri says:

    About your rant on thermal throttling: I agree that in a desktop PC I would expect the OEM to implement cooling that can cope with the maximum thermal load. But in mobile devices this is not always feasible or even desirable. Especially in passively cooled devices the laws of thermodynamics put very strict limits for the amount of power that can be used in long running use cases.
    Still, there are very good reasons to design chipsets and products with higher peak performance that might not be thermally sustainable for longer than a few seconds (to make e.g. application startup, web page rendering, camera snapshot processing, etc. responsive).
    Thermal throttling is the unfortunate consequence that is needed to protect the device and users from damage. I think it is reasonable to expect that when using a well behaving application it should (almost) never happen. But I don’t believe OEMs can solve this problem alone, applications and operating systems need to be aware of the reality and collaborate in maintaining the thermal load at sustainable level. For that, we need good tools, so thanks for your contribution on that front!

    • brucedawson says:

      I agree that thermal throttling is better than damaging the hardware. Beyond that I disagree, at least for laptops (phones I’m probably willing to give a pass to).

      1) There is already a way to have brief bursts of higher CPU frequency. Intel’s version is called Turbo Boost and it raises the frequency above the advertised level.
      2) It is not feasible for applications to maintain the thermal load at a sustainable level. Sometimes an application (particularly games, but also web browsers on some pages) needs to max out several cores for more than a few seconds. Using all cores isn’t a software defect, it is a design goal in order to unlock the full power of the CPU.

      If a machine can briefly go above its rated frequency (Turbo Boost) but then may be forced to run below its rated frequency then its rated frequency has *no*meaning*. In this brave new world a machine that can Turbo Boost to 3.6 GHz could reasonably be advertised as running at any frequency from 0 GHz to 3.6 GHz.

      If an OEM wants to sell a machine that can’t maintain its rated frequency that’s fine, but they need to disclose that fact so that I know what I’m getting.

  2. Pingback: New Xperf and new WPA in the new WPT | Random ASCII

  3. jaimemmoreno says:

    I used Intel’s powergadget on my Mac laptop.
    https://software.intel.com/en-us/blogs/2012/12/13/using-the-intel-power-gadget-api-on-mac-os-x

    Pretty much showed me how easily it gets thermal throttled. Easy to see how the cpu frequency reaches max turbo speed then slows down drastically when temp gets close to max cpu temp. Running video encoding or any 3D program that uses up all cores at 100% for any length of time more than a couple of minutes will do it!
    Anyways, I sent a bug report to Apple a while back about this and they replied that it was working as designed 😦
    Should’ve known they would sacrifice cpu performance for battery life and to save space, they are after all well known for their thin and light laptops!

    • brucedawson says:

      It’s good to see that Power Gadget is showing the slowdown. How drastic is it?

      I think you have two choices. You can see if the behavior contradicts advertising claims and file a consumer complaint if it does. Or you can return the laptop and buy one that is properly cooled. Good luck.

      • jaimemmoreno says:

        Don’t recall the exact numbers but it was pretty significant.
        Anyways, I should point out that in this case they didn’t actually blame hot temps in the Macbook Pro, which it did get pretty warm, for the throttling but inadequate power being supplied from said laptop.
        Here’s their response:
        “After reviewing your submission engineering has determined that the behavior you reported is currently functioning as designed.

        The performance was being limited because the machine was using almost as much power as could be supplied to it. Disconnecting some of the USB devices, iPad and reducing the screen brightness may help squeeze out a little more performance, but this is expected behavior for the hardware.

        “Due to varying power characteristics, some parts with Intel Turbo Boost Technology 2.0 may not achieve maximum turbo frequencies when running heavy workloads and using multiple cores concurrently.”

        http://www.intel.com/content/www/us/en/architecture-and-technology/turbo-boost/turbo-boost-technology.html

        • brucedawson says:

          As a consumer I’m not sure I much care whether the throttling is due to insufficient power or insufficient cooling. I’d rather that they both be sufficient. Although, at least in the power case you *might* be able to make it work by unplugging excess devices. But lowering the screen brightness in order to get better compile times? That’s just lame.

          Thanks for the Tuning Utility link.

    • jaimemmoreno says:

      Oh yeah forgot to mention if you are on Windows it’s even easier to tell if your cpu is being throttled and the percentages even since Intel® Extreme Tuning Utility, which is meant for overclockers, but can be installed and used by anyone, actually has a graph entirely dedicated to thermal throttle!
      http://www.intel.com/content/www/us/en/motherboards/desktop-motherboards/desktop-boards-software-extreme-tuning-utility.html

  4. Pingback: UIforETW – Windows Performance Made Easier | Random ASCII

  5. Joel says:

    Hi, Thank you for your posts and the UIforETW tool. I was wondering if you could advise me on whether it is possible to get additional graphs. I’ve been trying to profile a program using XPERF/WPR and even when I use seemingly appropriate selections, there are no graphs to view specifically memory usage per process and network utilization per process. WPA for 8.1, 10 or XPERFVIEW – none of them show me the data I am looking for which is per process cpu, memory, disk and network utilisation. I’ve trawled pages and pages looking for this scenario and/or additional analysis profiles for WPA and I can’t find anything?

    Many people online recommend PERFMON, but imho it is a very poor tool for analysing system performance. It has granular counts for components like .NET framework and CPU usage, but in among all those counts there seem to be few useful for resource/process performance. The necessity to set up data collection sets for each scenario is a pain, as is the lack of per process counts and in my opinion it has pretty poor reporting tools. Plus it is not possible to dynamically start/stop collections at will once they are created.

    Basically RESMON does exactly what I want, but it doesn’t record!!!
    I was almost certain that when I ran boot traces in the past I had graphs for memory usage per process? Would appreciate if you know of anything that would help. Thanks again.

  6. Pingback: ETW Central | Random ASCII

  7. rohit says:

    Hi bruce thanks for your post.

    I am looking for a tool which shows me the electric power consumed by a particular process (e.g. SQL.exe or chrome.exe) even if the desktop/laptop is on battery. I have used both tools, WPA and Intel’s Power Gadget 3.0. Not helpful, though.

    Thanks
    Wazoomba

    • brucedawson says:

      I hear that OSX displays this information in their task manager, but I’m not aware of a way to get it on Windows. You need to sample the CPU power data on each context switch, and then do some guesswork to allocate the power consumed to the running processes.

      It’s necessarily an inexact science because perfect accounting for a process that wakes up the CPU frequently and stops it from going into a low power state is not even well defined.
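A minimal Python sketch of the accounting described in the reply above, assuming that power readings have already been matched to the intervals between context switches. The interval data and the function name are hypothetical, and the sketch deliberately ignores the harder question of who to charge for waking the CPU out of a low power state.

```python
from collections import defaultdict


def attribute_energy(intervals):
    """Sum energy in joules per process from intervals of the form
    (process name, start seconds, end seconds, average watts)."""
    joules = defaultdict(float)
    for process, start, end, watts in intervals:
        joules[process] += watts * (end - start)
    return dict(joules)


# Hypothetical intervals between context switches on one CPU.
intervals = [
    ("chrome.exe", 0.0, 0.5, 20.0),
    ("Idle", 0.5, 1.0, 4.0),
    ("chrome.exe", 1.0, 2.0, 30.0),
]

energy = attribute_energy(intervals)
print(energy)
```

Charging all of the power measured during an interval to whichever process was running is exactly the kind of guesswork mentioned above, so the totals are estimates at best.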
