Diagnosing our recent failed bulk builds

From: John Marino <dragonflybsd@xxxxxxxxx>

Date: Fri, 09 Dec 2011 09:01:08 +0100

The last good bulk build run for x86_64 current was Oct 28. Since then,
two more runs have been performed, resulting in thousands of reported
package failures. The failures occurred at the checksum phase, where
*sometimes* the bulk build script could not find the digest. One
checksum failure can cascade to hundreds of packages (see math/pari).

It's likely that something about the bulk build setup has changed since
Oct 28. I read somewhere that Justin was using NFS to access the pkgsrc
directory. Is that the setup being used here? If so, when was it set
up like this?

Can we use rsync to make a local copy of the latest pkgsrc on each build
box and take NFS out of the equation? NFS issues could explain why the
bulk build can sometimes access the digest folder and sometimes can't.
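
One way to test the NFS theory before changing the setup would be to probe
the directory repeatedly and count how often it cannot be read. The sketch
below is only an illustration: the function name probe_path and the path
/usr/pkgsrc/pkgtools/digest are assumptions, not taken from the actual
build boxes, and an intermittent NFS problem may not show up in a short run.

```python
import os
import time


def probe_path(path, attempts=5, delay=0.0):
    """Try to list `path` several times; return the indices of failed attempts."""
    failures = []
    for i in range(attempts):
        try:
            # listdir has to read the directory, so a missing or
            # unreachable path raises OSError here.
            os.listdir(path)
        except OSError:
            failures.append(i)
        if delay:
            time.sleep(delay)
    return failures


if __name__ == "__main__":
    # Hypothetical location; substitute the real pkgsrc mount point.
    target = "/usr/pkgsrc/pkgtools/digest"
    missed = probe_path(target, attempts=20, delay=1.0)
    print(f"{len(missed)} of 20 probes failed: {missed}")
```

If the same probe never fails against a local rsync copy but fails now and
then against the NFS mount, that would point at NFS rather than at pkgsrc.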

There should be a significant improvement since the Oct 28 build report
on both platforms. It would be nice to figure out exactly what broke
with the bulk builds and get some updated reports here soon.