analyzing linker max vsize

mozilla-inbound is currently approval-only due to issues with Windows PGO builds. The short explanation is that we turn on aggressive code optimization for our Windows builds. This aggressive code optimization causes the linker that comes with Visual Studio to run out of virtual memory. The current situation is especially problematic because we can’t increase the amount of virtual memory the linker can access (unlike last time, where we “just” moved the builds to 64-bit machines).

We don’t really have a good handle on what causes these issues (other than the obvious “more code”), but at least we are tracking the linker’s vsize and we’ll soon have pretty pictures of the same. We hadn’t expected to have to deal with this problem for several more months. The graph below helps explain why we’re hitting this problem a little sooner than expected. The data for this graph was taken from the Windows nightly build logs.
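Scraping the vsize out of each nightly log can be sketched roughly as follows. The log line format here is an assumption for illustration; the actual build logs may report the figure differently.

```python
import re

# Hypothetical log line format -- the real nightly logs may phrase this
# differently; "linker max vsize: NNN" is an assumption for this sketch.
VSIZE_RE = re.compile(r"linker max vsize:\s*(\d+)")

def max_linker_vsize(log_text):
    """Return the largest linker vsize (in bytes) reported anywhere in a
    build log, or None if the log contains no matching line."""
    sizes = [int(m.group(1)) for m in VSIZE_RE.finditer(log_text)]
    return max(sizes) if sizes else None
```

Running this over each nightly's log and pairing the result with the build date yields the time series plotted above.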

Notice the massive spike in October, as well as the ~100MB worth of growth in early January. While the data is not especially fine-grained (nightly builds can include tens of changesets, and we’d really like information on the vsize growth on a per-changeset basis), looking at the biggest increases over the last ten months might prove helpful. There have been ~300 nightly builds since we started recording data; below is a list of the top 20 daily increases in linker max vsize. The date in the table is the date the nightly build was done; the newly-included changeset range is linked to for your perusal.
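Producing the top-20 table from the per-nightly series amounts to taking day-over-day differences and sorting. A minimal sketch, assuming the data is an ordered list of (date, vsize) pairs:

```python
def top_daily_increases(series, n=20):
    """Given a list of (date, vsize) pairs ordered by date, return the n
    largest day-over-day vsize increases as (date, delta) pairs, biggest
    first. Each delta is attributed to the date of the later build."""
    deltas = [
        (series[i][0], series[i][1] - series[i - 1][1])
        for i in range(1, len(series))
    ]
    deltas.sort(key=lambda pair: pair[1], reverse=True)
    return deltas[:n]
```

Note that because consecutive nightlies can be tens of changesets apart, each delta can only be attributed to a changeset range, not a single push.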

Mike Hommey suggested that trying to divine the whys and hows of extra memory usage would be a fruitless endeavor. Looking at the above pushlogs, I am inclined to agree with him. There’s nothing in any of them that jumps out. I didn’t try clicking through to individual changesets to figure out what might have added large chunks of code, though.

This entry (permalink) was posted on Tuesday, January 22, 2013, at 5:37 pm by Nathan Froyd. Filed in Uncategorized.

It’s faster to have as much stuff as you can packed into a single library; inter-library calls are relatively more expensive than intra-library calls. I think there might also be some disk I/O wins from having all the code in a single file (don’t have to seek around the disk to get to multiple files at startup).

What you suggest has been done in the past to get the linker’s required memory down to something manageable. We can now turn on aggressive optimizations on a per-directory basis in the source tree, and that’s somewhat easier than moving things out of libxul. There may come another point where we have to move code out of libxul again, though.

Is the absence of a webrtc merge in the pushlog merely an instance of pushlog brokenness, then? I see the strip commits, which I remember from the webrtc landing, but I don’t see anything associated with the landing itself.