Exporting Arbitrary Data from xperf ETL files

The 8.1 and above versions of xperf/WPA/WPT comes with a tool called wpaexporter. This tool works as promised and lets you export arbitrary summary tables to .csv files, thus allowing for automated analysis of xperf traces.

Update, May 2020: Exporting to .csv files can be handy, but is also fussy to work with. For most purposes I now recommend using the Trace Processor library to write C# tools to do automated trace analysis. I have written a tutorial on how to use the Trace Processor package. Try it.

(for an overview of WPT 8.1 see New Version of Xperf–Upgrade Now, for recording traces see ETW Central)

The documentation for wpaexporter is pretty light – it just explains the command line options. It doesn’t bother to explain how the profile file that you provide is translated into a .csv file. In particular, it doesn’t say how grouping affects the results. I had to experiment for a little while to figure out how it worked.

(for an example of using wpaexporter see Process Tree from an Xperf Trace) and UIforETW’s labscripts.

It turns out that the design is quite simple. Set up your summary table and look at it in WPA without expanding any of the grouping columns. This is what your .csv file will look like. It’s exactly the same as doing a copy/paste of the entire unexpanded summary table. This means that if you have more than one column to the left of the orange grouping bar then those extra columns are not used for any sort of grouping. It also means that if you have any grouping columns then any data columns which cannot be summarized will not be displayed.

In short, you can group by one column and get summarized data for each row, or you can do no grouping and get raw data. Those are your two options.

Update: see Kyle Sabo’s comment below for how to expand particular columns in order to gain more data display options.

And filtering is good.

Getting a data summary

One use for wpaexporter would be to get a summary of the CPU time consumed by every process on the system. The most accurate CPU usage data comes from the context switch information that is shown in the CPU Usage (Precise) table.

(for documentation of the columns in this graph see The Lost Xperf Documentation–CPU Scheduling)

You can get per-process data by adjusting the summary table to have New Process as the only column to the left of the orange grouping bar. Or you could have New Process Name instead if you want multiple processes with the same name collected together. Then, to the right of the grouping bar put your data columns. Columns like New Thread Id don’t make sense because they can’t be summarized – the sum or average of thread IDs isn’t meaningful. Here’s one possible layout:

image

In this layout the first data column records the number of context switch events per process, and the last column records how many ms of CPU time were consumed. Note that I have two TimeSinceLast (us) columns – one is the sum of all of the switched out time, and the other is the maximum switched out time. When using a column which has multiple options for how they will be summarized you need to choose the one that makes most sense for your scenario.

Now you can export this profile using the Profiles menu, Export… command. I saved it as:

CPUUsageByProcess.wpaProfile

Then you run wpaexporter like this:

wpaexporter test.etl -profile CPUUsageByProcess.wpaProfile

The name of the file created depends on the name of the summary table and the name of the view you created. To set the name of your view open the View Editor (Ctrl+E), click on Manage…, and type in a name for your preset. In my case I named the view ByProcess so this created a file called CPU_Usage_(Precise)_ByProcess.csv which I could then load in to Excel. The data is identical to the unexpanded data that was shown in WPA – except for formatting differences and the loss of the sum and max subscripts on some columns.

image

Not bad. It’s not hard to imagine writing some tools to parse this and issue warnings about excessive background CPU usage.

Getting all the data

Sometimes you don’t want a summary, you want all of the data of a particular type. That might mean all context switches, all CPU samples, all file I/O, all disk I/O, etc. Let’s take CPU sampling data as an example.

(for documentation of the columns in this graph see The Lost Xperf Documentation–CPU sampling)

Normally you would group by process, thread, and stack and then drill down in the UI, but with wpaexporter that doesn’t work. Those grouping columns will all be collapsed and you’ll just get a summary. Instead you have to drag the orange grouping bar to the left and you end up with something like this:

image

This is quite useless in the UI because without grouping the volume of data is too great – and most of the samples are from the idle thread! You could export this data but it would probably contain more information than you really need. So, you need to filter. Start by opening the View Editor (Ctrl+E), then click Advanced. You’ll be on the Filter tab and you can arbitrarily add to the filter by clicking the green plus sign. You could filter out all rows where the Process Name equals Idle:

image

Note that for this to work you have to manually type in “OR” in front of the new expression that WPA adds.

Or, you could filter out all rows where the Process Name does not equal WPRUI.exe:

image

Again, don’t forget to manually type in “OR” or “AND” or whatever – I’ve circled where you need to put it in the picture above.

With the filter in place the raw data looks like this:

image

I removed the Count column because with this layout it will always be one. The weight column is still useful because if you change the sampling frequency then the weight (length in ms) of each sample will change.

In this case I named my view WPRUI_Stacks. After saving the profile you just have to run wpaexporter again – this time adding “-symbols” to the end so that the stacks are exported with symbols.

wpaexporter test.etl -profile CPUSamplesWPRUI.wpaProfile -symbols

Again, just load the resulting .csv file into Excel and you can see the raw data:

image

Loading the data into Excel isn’t how you will normally use the data – the whole point is to write scripts to process it and look for common patterns – however loading it into Excel or a text editor is useful for confirming that you have exported the data that you are interested in.

When parsing the data be aware that some symbols, especially for template functions, contain commas. These names will be properly quoted in the .csv file, just be aware that you have to handle this case.

Caveats

As of the 10.0.14393 version (Windows 10 Anniversary edition, summer 2016) the command-line parser in wpaexporter is a bit fussy. If you want to use the very useful -range command then the order of arguments suddenly matters. There are other illegal orderings, with no clear pattern. This works to export all of the data from 3 seconds to 13 seconds in the trace:

wpaexporter -profile profile.wpaProfile -i trace.etl -range 3s 13s  2>nul

The “2>nul” suffix is because wpaexporter spews a lot of unavoidable and uninteresting warning messages.

Multi-export

Launching wpaexporter, especially on a big trace, is sllooowww, so dumping multiple reports can take a long time. However you can tell wpaexporter to load a single trace and then do multiple reports. This is done using the -exporterconfig option and a .json file. Examples of creating such a config file can be found in one of the UIforETW labscripts.

That’s all folks

That’s really all there is to it. The feature works and it means that you can now export any data that you want from an ETL file. I can think of a couple of uses for this right now. I should probably update my flamegraph scripts – let me know if you do that.

I’m pleased that this feature exists now. However I’m always greedy so here is how I think this feature could be improved.

  • There should be an option to send the results to stdout. This would allow the output to be processed and filtered before going to disk and it would simplify the command line options. Learn from Unix.
  • wpaexporter should print the name of the file that it saves the data to. It’s a little thing but it would get rid of another possible source of confusion for new users.
  • wpaeporter should explicitly handle the case where it’s output file is locked – right now it displays the raw .Net call stack when the CreateFile operation fails. Sloppy.
  • The rules for how a .profile file is mapped to a .csv file should be documented. Maybe I’m too dim to instantly figure it out, or maybe I’m too smart to discard the many other ways that the translation could have been done, but either way it would be nice to have it explained (for those who don’t find this blog post).

But hey, cool feature. It handily beats the previously available alternatives.

The Old New Thing has an alternate summary of wpaexporter.

About brucedawson

I'm a programmer, working for Google, focusing on optimization and reliability. Nothing's more fun than making code run 10x as fast. Unless it's eliminating large numbers of bugs. I also unicycle. And play (ice) hockey. And sled hockey. And juggle. And worry about whether this blog should have been called randomutf-8. 2010s in review tells more: https://twitter.com/BruceDawson0xB/status/1212101533015298048
This entry was posted in Documentation, Performance, xperf and tagged , , , . Bookmark the permalink.

16 Responses to Exporting Arbitrary Data from xperf ETL files

  1. Kyle Sabo says:

    You mention groups being collapsed as an issue… were you aware that you can use the Expansion tab (to the right of the Filter tab) to selectively expand certain columns or groups? Expanding groups seems to be hit-or-miss, but you can reliably expand columns by using something along the lines of “[Series Name]:=Column Name”. With this you can get both all the raw data you are interested in while still getting aggregated values without computing them yourself.

  2. tama says:

    You seem really good with xperf. How would you troubleshoot slow boot/logins with a typical enterprise windows 7 setup with a lot of 3rd party software (probably too much disk seeking between them)?

    • brucedawson says:

      Recording an xperf trace of the boot process is the way to investigate its performance. Search on “xperf boot trace” or start by reading here:

      http://blogs.msdn.com/b/canberrapfe/archive/2013/05/22/xperf-boot-traces.aspx

      Slowdowns could be caused by excessive disk I/O, busy loops, timeouts, use of heavyweight runtimes, or many other causes. Analysis of boot traces can be tricky — xperf analysis often is — but the information you need to definitively identify the cause of the slow boot/login is definitely there. Xperf is the tool that Microsoft uses to identify slow boots.

  3. Pingback: Process Tree from an Xperf Trace | Random ASCII

  4. Pingback: You Got Your Web Browser in my Compiler! | Random ASCII

  5. Pingback: ETW Central | Random ASCII

  6. Pingback: Summarizing Xperf CPU Usage with Flame Graphs | Random ASCII

  7. Alexey says:

    thank you man,
    needed a way to parse etl and get all raw data (file io events)
    saved me a lot of time

  8. Chris says:

    Just wanted to leave my many thanks for this tutorial. Not even some of the ones working at the ETL mother company know this. I am using this verbatim to extract essential data for optimizing power and performance and would’t be possible without your contribution. Thank you again.

  9. Rodrigo says:

    Hey, thanks for your posts about xperf and WPA…

    DO you know or have some reference to explain how WPA calculates the value plotted on the graph at a specific time? I noted that this value is not simple. When we zoom, it changes the values. I would like to understand to better interpret the data I collect.

    • brucedawson says:

      What graph are you looking at? If we take CPU Usage (Precise) as an example then if you zoom in enough then the CPU Usage at a point in time is always some multiple of 100%/n where n is the number of logical processors. With 4 logical processors it will be 0%, 25%, 50%, 75%, or 100%, because at a point in time there will be threads running on 0, 1, 2, 3, or 4 logical processors. As you zoom out it averages out the variations in some unknown way and you can get any percentage.

      For other graphs similar logic applies – zoom in really far to see the details, and zoom out to see how it averages it.

  10. murtadha90 says:

    Thank you for sharing, Sir,
    How to extract one process’s size per time (time series data) or just two columns (time + size )
    how that could be.
    Thanks

    • brucedawson says:

      Take a look at Graph Explorer, Memory, Total Commit. It has a table that shows memory over time on a per-process basis. You can rearrange the columns as described in this article and then use wpaexporter.

Leave a comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.