This reminds me of http://bugs.darcs.net/issue1153
From the second hackathon report:
Petr has developed a plan to solve these issues by ignoring bad cache
locations, warning about them, and deleting them when it is appropriate
to do so. He has begun studying the darcs caching source code to prepare
for future implementation work.
On Tue, Jul 21, 2009 at 16:18:05 +0000, Zooko wrote:
> administrator@SCBLR01SR001:~/trees/www$ darcs pull ../98.home.adm/www/
> Pulling from "/home/administrator/trees/98.home.adm/www"...
> The authenticity of host 'dev.allmydata.com (207.7.153.140)' can't be
> established.
> RSA key fingerprint is 92:d2:54:be:66:cd:20:77:54:9a:d8:dd:ea:40:0a:22.
> Are you sure you want to continue connecting (yes/no)? yes
> Tue Jul 21 08:10:43 PDT 2009 withheld@allmydata.com
> * Record assorted changes to www code for scancafe deployment
>
> 10:15
> why would pulling from localhost (relative path) cause it to ssh to
> another host?
> 10:15
> (!)
OK, for some reason darcs must have thought it needed to fetch patches
that it had not previously obtained.
Could I just make sure this is a --lazy repo?
Also, what are the contents of _darcs/prefs/sources?

Zooko wrote:
> New submission from Zooko <zooko@zooko.com>:
>
> from <name withheld>:
>
> darcs scares me sometimes
> 10:13
> administrator@SCBLR01SR001:~/trees/www$ darcs pull ../98.home.adm/www/
> Pulling from "/home/administrator/trees/98.home.adm/www"...
> The authenticity of host 'dev.allmydata.com (207.7.153.140)' can't be
> established.
> RSA key fingerprint is 92:d2:54:be:66:cd:20:77:54:9a:d8:dd:ea:40:0a:22.
> Are you sure you want to continue connecting (yes/no)? yes
> Tue Jul 21 08:10:43 PDT 2009 withheld@allmydata.com
> * Record assorted changes to www code for scancafe deployment
>
> 10:15
> why would pulling from localhost (relative path) cause it to ssh to
> another host?
I would assume it's an issue with a set of lazy repositories:
~/trees/98.home.adm/www/ may have patches listed in inventory that it
lacks in _darcs/patches/ (and which aren't cached locally for other
repos either). For whatever reason darcs thinks it needs those patch
files (or old tag inventory files?) to complete the pull so ~/trees/www
probably has a _darcs/prefs/sources line that includes the
dev.allmydata.com server.
It looks like darcs could use better status messages in such a case, but
other than that I don't think there is a bug/issue here.

I need a volunteer to produce a minimal scenario that could trigger this sort of
thing. It's likely enough to use local repositories.
I imagine the scenario would involve something like
darcs init --repo A
darcs get A B # maybe --lazy
darcs get --lazy B C
darcs pull --repo C B
# all of a sudden it wants patches from A: why?

Here's a minimal example showing how this can happen:
. ../tests/lib # Load some portability helpers.
rm -rf R S T log # Another script may have left a mess.
darcs init --hashed --repo R # Create our test repos.
darcs tag --repo R -m 1
darcs get --lazy R S
darcs tag --repo S -m 2
darcs get --lazy R T
darcs pull --repo T S -a --debug --verbose 2> log
not grep "R/_darcs" log
[Note that this uses the darcs regression testing infrastructure, so to make
this work, save this as tests/issue1503.sh and run cabal test tests/issue1503.sh]
In this example the repo you are pulling to (T) was originally gotten from the
'remote' repo R. Even though you are pulling from S, the fact that R came
first in the cache meant that you tried it first.
Zooko: by any chance could you verify that this matches your use case?
I think a simple solution to this would be to sort the caches so that local ones
come first.
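
To make the idea concrete, here is a rough sketch of such a sort applied to an
on-disk sources file using plain shell and awk. It is only an illustration, not
darcs code: the function name sort_sources_local_first is made up, and the rule
for what counts as "remote" (a location that starts with something like
user@host: or scheme: before any slash) is an assumption, not how darcs itself
classifies entries.

```shell
# Hypothetical helper: print a sources file with local entries first.
# Assumption: after the "cache:", "repo:" or "thisrepo:" prefix, a location is
# remote if it begins with "<something without / or :>:" (ssh-style or a URL).
sort_sources_local_first() {
    awk '
        {
            loc = $0
            sub(/^[a-z]+:/, "", loc)      # drop the entry-type prefix
            if (loc ~ "^[^/:]+:")
                remote[++r] = $0          # hold remote entries back
            else
                print                     # local entries keep their order
        }
        END { for (i = 1; i <= r; i++) print remote[i] }
    ' "$1"
}
```

Running sort_sources_local_first T/_darcs/prefs/sources prints the reordered
list to standard output and leaves the file itself untouched. The partition is
stable, so the relative order within the local group and within the remote
group is preserved.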

I retract my offer to submit the test case :-) It's going to be tricky to
automate the business of checking that it's not looking at a remote repo in the
suite. You can probably dump something into test/network, or maybe prefer the
repo we are actually pulling from to any other repo instead. In any case, I
leave the test to the implementor.

After scratching my head a bit over this issue, I finally found what's going
on here. What seems to cause all the mess is the _darcs/prefs/sources file
and the fact that its entries are tried before the repository we are pulling
from (even if that repository is local). A workaround that could work is to
check first whether the repo we are pulling from is local and, if so, add it
to the sources file. Note also that darcs tries the entries from top to
bottom, so why don't we keep the sources file sorted, with local entries
first and so on? (I'm still wondering whether this would break something.)
I used the test case Eric posted earlier and put the R repo on a server
(trying to make it look like the original report).
I'm attaching two files. In the file "log" you can see how it goes to the
server looking for the missing patches/inventories :(. Then I manually added
the repo S to the sources file, and you can see that the result is different
when trying to pull (file "log10"). The sources file looks like this:
abuiles@abuiles-laptop:~/linkProg/gsoc2010/testRepo$ cat T/_darcs/prefs/sources
cache:/home/abuiles/.darcs/cache
thisrepo:/media/my-disk/stuff/adolfo/programacion/gsoc2010/testRepo/T
repo:/media/my-disk/stuff/adolfo/programacion/gsoc2010/testRepo/S
repo:acadavid@lizarus.com:R
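
The manual edit above could be scripted roughly as follows. This is a sketch
under assumptions, not something darcs provides: add_local_source is a made-up
name, "remote" is guessed from the entry text (a location that starts with
user@host: or scheme: before any slash), and the new repo: line is placed just
before the first such entry, which mirrors the listing above.

```shell
# Hypothetical helper: add_local_source REPO SOURCES_FILE
# If REPO is a local directory, list it in SOURCES_FILE ahead of the first
# remote-looking entry (assumed heuristic), unless it is already listed.
add_local_source() {
    repo=$1
    sources=$2
    [ -d "$repo" ] || return 0                     # not local: leave file alone
    entry="repo:$(cd "$repo" >/dev/null && pwd)"   # absolute path of the repo
    grep -qxF -- "$entry" "$sources" && return 0   # already present
    awk -v entry="$entry" '
        {
            loc = $0
            sub(/^[a-z]+:/, "", loc)               # drop the entry-type prefix
            if (!done && loc ~ "^[^/:]+:") { print entry; done = 1 }
            print
        }
        END { if (!done) print entry }             # no remote entry: append
    ' "$sources" > "$sources.tmp" && mv "$sources.tmp" "$sources"
}
```

For the repositories in this test, the call would be
add_local_source S T/_darcs/prefs/sources, run before the pull.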


I doubt there's any harm in sorting the sources file (on disk). It
seems that at the very least, we should sort our internal listing of the
sources.
It's also worth checking/documenting what the interaction is between the
sources list mechanism, the default repo and any repos passed in on the
command line.
Note: I noticed an unrelated error calling darcs transfer-mode and
opened issue1854