On Mon, Apr 28, 2014 at 02:18:44PM -0700, Shawn Pearce wrote:
> > The read penalty is not addressed here, so I still pay the cost of
> > hashing 14MB. But that's an easy problem. We could cache the validated
> > index in a daemon. Whenever git needs to load an index, it pokes the
> > daemon. The daemon verifies that the on-disk index still has the same
> > signature, then sends the in-memory index to git. When git updates the
> > index, it pokes the daemon again to update the in-memory index. Next
> > time git reads the index, it does not have to pay the I/O cost any more
> > (actually it does, but the cost is hidden away when you do not have to
> > read it yet).
>
> If we are going this far, maybe it is worthwhile building an mmap()
> region that the daemon exports to the git client and that holds the
> "in memory" format of the index. Clients would mmap this with PROT_READ
> and MAP_PRIVATE and can then quickly access the base file information
> without doing further validation, or copying the large(ish) data over
> a pipe.
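
To make the proposal above concrete, here is a rough sketch in Python
(Unix only) of the two halves: a daemon that checks the trailing SHA-1
once and publishes the validated bytes, and a client that maps them with
PROT_READ and MAP_PRIVATE. The function names, the export path and the
exported layout are all hypothetical; a real "in memory" format would not
simply be the on-disk payload, and nothing here is existing git code.

```python
import hashlib
import mmap
import os
import tempfile

TRAILER_LEN = hashlib.sha1().digest_size  # 20-byte SHA-1 trailer


def export_validated(raw: bytes, export_path: str) -> None:
    """Daemon side (hypothetical): pay the hashing cost once, then publish."""
    payload, trailer = raw[:-TRAILER_LEN], raw[-TRAILER_LEN:]
    if hashlib.sha1(payload).digest() != trailer:
        raise ValueError("index checksum mismatch")
    # Write to a temporary name and rename, so a client never maps a
    # half-written export; existing mappings keep the old file contents.
    tmp_path = export_path + ".tmp"
    with open(tmp_path, "wb") as f:
        f.write(payload)
    os.replace(tmp_path, export_path)


def map_exported(export_path: str) -> mmap.mmap:
    """Client side (hypothetical): read-only private mapping, no rehash."""
    with open(export_path, "rb") as f:
        # Length 0 maps the whole file; the mapping outlives the open file.
        return mmap.mmap(f.fileno(), 0,
                         flags=mmap.MAP_PRIVATE, prot=mmap.PROT_READ)


def demo() -> bytes:
    """Round trip with a toy payload; returns the first four mapped bytes."""
    payload = b"DIRC" + b"\x00" * 8  # toy stand-in for index content
    raw = payload + hashlib.sha1(payload).digest()
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "index.export")
        export_validated(raw, path)
        region = map_exported(path)
        try:
            return region[:4]
        finally:
            region.close()
```

The client never hashes anything and nothing is copied over a pipe; it
only touches the pages it actually reads. How the client learns that the
export is still current (the "poke the daemon" step) is left out of this
sketch.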