In the previous episode of “Simple Changes to Shrink Chrome” I discussed how deleting ‘const’ from a few key locations could lead to dramatic size savings, due to a VC++ compiler quirk. In this episode I’ll show how deleting an inline function definition can lead to similar savings.
The savings this time are less important as they are mostly in the .BSS segment, but there are also some modest code-size savings, and some interesting lessons. It’s worth confessing up front that this time the problem being solved was not caused by a compiler quirk – it was entirely self inflicted.
Doing this investigation has reminded me that the behavior of linkers is best described by chaos theory – the details of their behavior defy prediction by simple heuristics, and the results can be changed dramatically by tiny
butterflies code changes.
In August of 2016 Chrome switched meta-build systems, from gyp to gn. Both gyp and gn are systems for creating ninja files and these are the files that define how Chrome is built. Changing from gyp to gn was a huge project because of the thousands of assumptions that were embedded in Chrome’s many .gyp files, but the change was necessary – gyp was showing its age.
Part of my contribution was to help investigate binary size regressions and one technique I used was to export symbol lists and see which ones appeared in gn builds but not gyp builds. Here are the six largest globals in chrome.dll from a gyp build near the transition:
And here are the six largest globals from a similar gn build:
The ff_sin* and ff_cos* globals (which go from 65536 entries down to 16) are from ffmpeg, and since they weren’t in the gyp build they clearly aren’t supposed to be in chrome.dll. A bit of investigation using linker_verbose_tracking.py led to this change which switched some source_set targets (lists of object files to be linked) to static_library targets, thus telling the linker to be less aggressive about pulling in their code. This worked, I closed the bug, and got on with my life.
Are you sure?
Until I noticed that the arrays were still there.
I had made the amateur mistake of failing to verify my results on Chrome’s official build. The fix should have worked on those builds as well, but Murphy’s Law doesn’t care about ‘should’.
So, I repeated my investigation using the same tools, but now using Chrome’s official build. This was a painful process. Chrome’s official build uses Link Time Code Generation (also known as Link Time Optimization) which means that every change, no matter how small, regenerates all of the code for Chrome’s enormous DLLs. My iteration times went from a couple of minutes to about an hour. Uggh.
Chrome has a “component” build configuration which can give incremental times as short as a few seconds, but this configuration is not helpful for this type of investigation.
But, it’s just computer time. I filled in the gaps with other projects. The bigger problem was that my technique was not working. No matter how many source_set targets I changed to static_library targets the linker kept finding multiple reasons to pull in ffmpeg.
My Python script works by scanning the verbose linker output and printing the chain of dependencies that lead to a particular object file (in this case rdft.obj, the one that defines the arrays) being pulled in. I decided that I needed to modify my tool to print out why each object file was being pulled in. So, I changed it to list which symbol caused a particular object file to be pulled in and I started seeing results like this:
rdft.obj referenced by
avfft.obj for symbol ff_rdft_init
avfft.obj referenced by
FFTFrameFFMPEG.obj for symbol av_rdft_calc
FFTFrameFFMPEG.obj referenced by
PeriodicWave.obj for symbol blink::FFTFrame::doInverseFFT
PeriodicWave.obj referenced by
SkGeometry.obj for symbol log2f
Looking at a few sets of these results I noticed some strange things. The crucial object file seemed to always be PeriodicWave.obj, and it was always being pulled in because some other object file needed a copy of log2f. This was peculiar because PeriodicWave.obj is a Blink (née WebKit) file, which shouldn’t be needed in chrome.dll. It’s also peculiar because log2f is a C Run-Time (CRT) function, which shouldn’t be supplied by Blink anyway.
PeriodicWave.cpp didn’t contain a definition for log2f which left me confused for a while but I eventually put the pieces together:
- MathExtra.h defined inline log2 and log2f function
- PeriodicWave.cpp included MathExtra.h and contained some calls to log2f
- When VC++ notices a call to an inline function it generates a non-inline version of that function, in case it doesn’t get inlined
- That non-inline version of log2f happened to get pulled in by the official builds which then pulled in the rest of PeriodicWave.obj, and the /OPT:REF optimization that is supposed to discard the unused pieces was unable to get rid of them
The definition of log2 and log2f in MathExtras.h were only ever intended to work around their absence in VC++ and Android. Since VC++ now had those functions the inline definitions were unnecessary, and apparently actively harmful. So I landed a pair of changes that removed them, first just for Windows and then for Android:
Don’t define log2 and log2f in blink (Windows only)
The one remaining mystery is why this only happened in the official builds. I don’t know the answer with 100% certainty, but I know enough about how symbol resolution works in linkers to have some ideas. Linkers start out with the set of object files that you have specified on the command line – the contents of these object files will be included in the final binary (unless the linker can prove that removing them makes no difference). Then the linker builds up a list of unresolved symbols and starts scanning through the set of libraries that were specified on the command line – the object files in these libraries will be included as-needed. Whenever a needed symbol is found in one of the library object files then that object file is added to the list to be linked – it becomes like one of the command-line object files. The process continues until all symbols are resolved.
The symbol search order is unspecified by C++ and due to the One Definition Rule (ODR) the search order should not matter. But, an obvious implementation is to repeatedly scan through the libraries, looking for symbols, until there are no more unresolved. This means that if you have two copies of the same symbol then it is undefined which you will get – it depends where you are when you first find that you need it. Since pulling in a symbol pulls in the rest of its object file this means that linking is fundamentally unpredictable. And, if you have ODR violations (as we did) then the results are not even defined.
So, the official builds perturbed the search algorithm enough that a different copy of log2f was found, and that led to unintended consequences.
Ideally the linker would notice the ODR violation, but doing so ends up being expensive – you have to look into every object file to see if any of functions are defined in different ways, and the definition of ‘different’ is not obvious. There are good reasons that the C++ standard does not require diagnostics for ODR violations.
I wrote about non-inlined inline functions and ODR violations in Chrome barely a month before this article, here.
Ideally chrome.dll wouldn’t even list Blink as something to be linked with. Another Chrome developer has since fixed that.
I still feel bad about not noticing that the arrays were in our official builds after I “fixed them” the first time. Luckily I’m the sort of developer who randomly downloads symbols for Chrome’s canary builds and runs analysis tools on them, so I eventually redeemed myself.
If you’re the sort of developer who likes browsing Chrome’s source then you can see the (tiny) change for Windows by typing “> git show 036e2ce” in a Chromium repo. A more accessible version can be found here.
The net effect of this change on the size of chrome.dll’s sections (in-memory size) was:
.text: -6656 bytes change
.rdata: 240 bytes change
.data: -786624 bytes change
.rodata: -592 bytes change
.reloc: -872 bytes change
Total change: -794504 bytes
More details of the techniques used can be found here. If you want to poke at some relevant chrome.dll PDBs then you can use retrievesymbols from UIforETW to grab them and then run SymbolSort or ShowGlobals on them. Just set _NT_SYMBOL_PATH to something like SRV*c:\symbols*https://msdl.microsoft.com/download/symbols and then run one of these:
> retrievesymbols B6A75B4A5FE4446CAB6AE159B22DD91F 1 chrome.dll.pdb – gyp build (September 24th, 2016)
> retrievesymbols 0D12FEA487E942BB8719672D45464C51 1 chrome.dll.pdb – gn build (October 30th, 2016)
> retrievesymbols C39CCDB899B34A7CBCC25F51E64B1F62 1 chrome.dll.pdb – recent gn build (January 13, 2017)
These weren’t the PDBs used for the original investigation, but they show the issue.