On Monday, August 05, 2013 08:38:47 pm Ramkumar Ramachandra
wrote:
> This is the rough explanation I wrote down after reading
> it:
>
> So, the problem is that my .git/objects/pack is polluted
> with little packs every time I fetch (or push, if you're
> the server), and this is problematic from the
> perspective of an overtly (naively) aggressive gc that
> hammers out all fragmentation. So, on the first run,
> the little packfiles I have are all "consolidated" into
> big packfiles; you also write .keep files to say that
> "don't gc these big packs we just generated". In
> subsequent runs, the little packfiles from the fetch are
> absorbed into a pack that is immune to gc. You're also
> using a size heuristic, to consolidate similarly sized
> packfiles. You also have a --ratio to tweak the ratio
> of sizes.
>
> From: Martin Fick<mf...@codeaurora.org>
> See: https://gerrit-review.googlesource.com/#/c/35215/
> Thread:
> http://thread.gmane.org/gmane.comp.version-control.git/231555
> (Martin's emails are missing from the archive)
> ---

After analyzing today's data, I recognize that in some
circumstances the size estimation after consolidation can be
off by huge amounts. The script naively just adds the
current sizes together. This gives a very rough estimate
of the new packfile size, but sometimes it can be off by
over 2 orders of magnitude. :( While many new packfiles are
tiny (several K only), it seems like the larger new
packfiles have a terrible tendency to throw the estimate way
off (I suspect they simply have many duplicate objects).
But despite this poor estimate, the script still offers
drastic improvements over plain git gc.
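
To illustrate why plain summation can overshoot so badly when
packs share objects, here is a minimal Python sketch. It is not
taken from the script: the dict-of-object-sizes model and the
names summed_estimate and deduplicated_size are assumptions made
up for illustration, and it ignores delta compression entirely.

```python
# Hypothetical model: each pack is a dict mapping object id -> stored size
# in bytes. Real packfiles also delta-compress, which this sketch ignores.

def summed_estimate(packs):
    """Naive estimate: add up the total size of every pack."""
    return sum(sum(pack.values()) for pack in packs)


def deduplicated_size(packs):
    """Size if each distinct object were stored only once after merging."""
    merged = {}
    for pack in packs:
        merged.update(pack)
    return sum(merged.values())


# One existing pack plus three new packs that repeat the same 100 objects.
big = {f"obj{i}": 1000 for i in range(100)}
new_packs = [dict(big) for _ in range(3)]
packs = [big] + new_packs

print(summed_estimate(packs))    # 400000
print(deduplicated_size(packs))  # 100000
```

In this toy case the summed estimate is 4x too high; the more
overlap between packs, the larger the gap becomes.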

So it has me wondering whether there is a more accurate way
to estimate the new packfile size without wasting a ton of time.
If not, one approach which might be worth experimenting with
is to just assume that new packfiles have size 0! Then just
consolidate them with any other packfile which is ready for
consolidation, or if none are ready, with the smallest
packfile. I would not be surprised to see this work better
on average than the current summation.
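
A rough sketch of that "size 0" selection rule in Python, only to
make the idea concrete. The function name plan_consolidation and
its arguments are hypothetical, and merging with all ready packs
(rather than just one of them) is an assumption on my part.

```python
def plan_consolidation(new_packs, ready_packs, existing_sizes):
    """Return the list of packs to repack together.

    new_packs:      names of freshly received packs, treated as size 0
    ready_packs:    names of existing packs already due for consolidation
    existing_sizes: dict of existing pack name -> size in bytes
    """
    if not new_packs:
        return []  # nothing new to absorb
    if ready_packs:
        # Piggyback on a consolidation that would happen anyway.
        targets = sorted(ready_packs)
    elif existing_sizes:
        # Otherwise fold the new packs into the smallest existing pack.
        targets = [min(existing_sizes, key=existing_sizes.get)]
    else:
        targets = []
    return targets + list(new_packs)


sizes = {"a": 500, "b": 50}
print(plan_consolidation(["n1", "n2"], [], sizes))  # ['b', 'n1', 'n2']
print(plan_consolidation(["n1"], ["a"], sizes))     # ['a', 'n1']
```

Note that no size estimate for the new packs is needed at all
here, which is the whole point of the experiment.
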
-Martin
--
The Qualcomm Innovation Center, Inc. is a member of Code
Aurora Forum, hosted by The Linux Foundation