ETW Heap Tracing–Every Allocation Recorded

Event Tracing for Windows (ETW, aka xperf) is usually used to monitor CPU usage, through its sampling profiler and its ability to record detailed information about context switches. Well, ETW is also used to monitor file I/O, and disk I/O, and sometimes registry accesses, and of course GPU activity, window-in-focus, UI Delays, process lifetimes, and a few other things. Okay, so ETW gets used for a lot of different things, normally configured in the same way and recorded across the entire system.

Heap profiling is different. Even to those who are used to the odd incantations needed to record an ETW trace it can be daunting trying to figure out how to do heap profiling. I’ve postponed writing about it because it was too much work to explain how to record heap traces and analyze them. But now that UIforETW has been released it is almost trivial to record ETW heap traces, and even analyzing them is made easier.

And, as a bonus, system memory-lists and VirtualAlloc calls for all processes are recorded in all UIforETW traces and lightly documented at the bottom of this post, with optional VirtualAlloc call stacks for lightweight memory profiling.

Continue reading

Posted in memory, xperf | Tagged , , | 22 Comments

UIforETW – Windows Performance Made Easier

Event Tracing for Windows (ETW) aka xperf is an amazing tool for investigating the performance of Windows machines – I’ve blogged about it many times and it’s helped me find some amazing issues. But recording ETW traces has always been tricky. Microsoft’s wprui.exe showed some potential, but is ultimately missing some features and often gets tripped up by ETW performance bugs.

The fallback plan is to use carefully crafted batch files to record ETW traces. These offer flexibility but… uggh. Batch files? Really? Are we savages?

I finally got annoyed enough by this situation to create UIforETW. This is a tool that records ETW traces, works around ETW performance bugs, allows configuration of trace recording options, works as a trace management UI, and more.

Continue reading

Posted in xperf | Tagged , , | 22 Comments

Profiling the profiler: working around a six minute xperf hang

Anytime my computer is a little bit slow I’m likely to record a trace and take a quick look. If I’m lucky I’ll find an easy workaround, in which case I’ve got a potential blog post and a smoother running computer. And, recording a trace to my uber-fast SSD only takes a few seconds, so the cost is minimal.

But, a few months ago I noticed that it was taking six minutes to record a simple trace, whether I used wprui or my batch files. This sort of delay really takes all the fun and spontaneity out of random performance investigations.

Spoiler alert: I profiled xperf and found a workaround, and the workaround is built-in to UIforETW.

Continue reading

Posted in Investigative Reporting, Performance, Programming, xperf | Tagged , | 13 Comments

ETW Trace Compression (and xperf syntax refresher)

Despite extolling the virtues of wprui for recording ETW traces (here, and here) I’ve actually returned to using xperf.exe in batch files to do most of my trace recording. It gives me more precise control over what is recorded, and where, and with Windows 8+ it has another advantage: trace compression!

As usual the trace compression feature is lightly documented so I’m going to explain it here, and while I’m at it I’ll explain a bit more about recording traces with xperf.

Update: I eventually gave in and wrote a UI for recording ETW traces. You can read more about it at the UIforETW announcement.

xperf syntax

Continue reading

Posted in Performance, Programming, xperf | Tagged , | 14 Comments

Knowing Where to Type ‘Zero’

Some code optimizations requires complex data structures and thousands lines of code. But, in a surprising number of cases, significant improvements can be made by simple changes – sometimes as simple as typing a single zero. It’s like the old story of the boilermaker who knows the right place to tap with his hammer – he sends an itemized bill for $0.50 for tapping the valve, and $999.50 for knowing where to tap.

Continue reading

Posted in Performance | Tagged , , , | 17 Comments

Home Network Printer Setup That Works

Whenever I add a network printer to one of my Windows computers at home I end up with a reference to a hard-coded IP address. That means that the next time my home router reboots and assigns a different IP address, I lose the ability to print. Having the printer configured to a hard-coded IP address is like browsing to instead of

In order to ensure reliable printing for my family I have had to do some printer configuration jujitsu and I want to share my steps here, if only so that I’ll remember them next time.

Continue reading

Posted in Computers and Internet, Rants | Tagged | 16 Comments

Hidden Costs of Memory Allocation

IMG_8410 croppedIt’s important to understand the cost of memory allocations, but this cost can be surprisingly tricky to measure. It seems reasonable to measure this cost by wrapping calls to new[] and delete[] with timers. However, for large buffers these timers may miss over 99% of the true cost of these operations, and these hidden costs are larger than I had expected.

Further complicating these measurements, it turns out that some of the cost may be charged to another process and will therefore not show up in any timings that you might plausibly make.

Continue reading

Posted in Investigative Reporting, Performance, xperf | Tagged , | 25 Comments