By just looking at the patch in the thread, it seems to me the main component of it is to resend packets which were not answered. I don't think this is a good idea. While it is true that UDP packets can get lost and resending might help, most often lost packets mean that either our line or the receiver's line is congested, and in that case it is better not to bother this node anymore but to use another one - after all, certain "hot keyword" nodes will be flooded with requests, so it's expected that they cannot answer them all. If all clients start to re-ask x times, that makes the situation worse and won't increase answer rates. The resend time is way too short, btw. Also, this patch will of course increase overhead.

Yes, sometimes the resending policy could have the problem you worried about. But in our experiment, more than 10% of responses were obtained by resending request packets, so packet loss is the main problem. What's more, we indeed raised the lookup success ratio by about 10%.

Some Support, on 06 September 2011 - 11:06 PM, said:

By just looking at the patch in the thread, it seems to me the main component of it is to resend packets which were not answered. I don't think this is a good idea. While it is true that UDP packets can get lost and resending might help, most often lost packets mean that either our line or the receiver's line is congested, and in that case it is better not to bother this node anymore but to use another one - after all, certain "hot keyword" nodes will be flooded with requests, so it's expected that they cannot answer them all. If all clients start to re-ask x times, that makes the situation worse and won't increase answer rates. The resend time is way too short, btw. Also, this patch will of course increase overhead.

Yes, sometimes the resending policy could have the problem you worried about. But in our experiment, more than 10% of responses were obtained by resending request packets, so packet loss is the main problem.

That isn't a proper conclusion. Let's say one node is congested, so 50% of all packets sent to (or from) it are dropped. Now if you create a single client which resends all requests 3 times, there is a good chance that at least one packet/answer gets through and you get your reply - because you ask more often, you have a better chance. But consider now that all clients do the same: not only is your chance to receive an answer as low as before (or lower, if the node's downstream channel is congested), but you are also causing 3 times as much overhead for this node, as well as for all IPs which were part of the network before but aren't anymore (for example because a dynamic IP was reassigned to another user). These are three huge disadvantages in return for the (imho) very rare case that a packet really gets lost without congestion problems at the target or source node.

Some Support is right that the resends are a bad idea. They might be useful if the packet loss problem is on your end, but if it is so high that the 10-node redundancy isn't enough, then you have much bigger problems than inaccurate Kad lookups.

From my experience when experimenting with speeding up Kad searches by reducing the node lookup timeout to only cover 95% of the nodes, I noticed that it often used fewer lookups while the results were often of better quality. I haven't done any scientific measurements, but the tendency seems to point in the direction that congested (slow or non-responding) nodes are of limited use.

I've noticed that when you try to map a response to a node and it doesn't match the IP/port in the tried list, you attempt to find one where only the IP matches. Firstly, I have a hard time seeing when this could happen, and secondly, if a node responds from a different port, something must be wrong with it, as that's not according to the protocol, and it is probably not very useful either.

As I see it, you have tried to increase the success rate from individual nodes, when the protocol is designed with built-in redundancy: the information is stored on more than one node, and finding just one of them is enough. It seems overkill to me and adds overhead without any real benefit from what I can see. Maybe you have made some observations in addition to the ones you already mentioned that justify these changes. It would be nice to hear your reasoning!

Thx for your helpful suggestions. In our measurement, we found that 14% of responses were obtained by resending request packets. So the congestion of the network link probably does not matter. There is a technical report about the patch: http://www.rapidshar.../file-4767.html

netfinity, on 07 September 2011 - 05:14 PM, said:

Some Support is right that the resends are a bad idea. They might be useful if the packet loss problem is on your end, but if it is so high that the 10-node redundancy isn't enough, then you have much bigger problems than inaccurate Kad lookups.

From my experience when experimenting with speeding up Kad searches by reducing the node lookup timeout to only cover 95% of the nodes, I noticed that it often used fewer lookups while the results were often of better quality. I haven't done any scientific measurements, but the tendency seems to point in the direction that congested (slow or non-responding) nodes are of limited use.

I've noticed that when you try to map a response to a node and it doesn't match the IP/port in the tried list, you attempt to find one where only the IP matches. Firstly, I have a hard time seeing when this could happen, and secondly, if a node responds from a different port, something must be wrong with it, as that's not according to the protocol, and it is probably not very useful either.

As I see it, you have tried to increase the success rate from individual nodes, when the protocol is designed with built-in redundancy: the information is stored on more than one node, and finding just one of them is enough. It seems overkill to me and adds overhead without any real benefit from what I can see. Maybe you have made some observations in addition to the ones you already mentioned that justify these changes. It would be nice to hear your reasoning!

In our measurement, we found that 14% of responses were obtained by resending request packets. So the congestion of the network link probably does not matter

Repeating this doesn't make it true.

Also, on a side note, one of Kad's key security measures is to try to never send more packets to an unknown target than we have received that lead to such a request. This means that if we receive a malicious routing response with fake IPs, we will only send one packet to each such IP - the same amount the malicious node needed to send the routing answer. This is important to make sure that the Kad network cannot be used to amplify an attacker's available bandwidth for a DDoS attack. If you resend all requests x times, this increases the potential, as the attacker has to send one packet to you and in return you send x packets to the victim.
Now in this case it is still not a major issue, because the attack vectors via routing answers are weak, but it is still not a good idea security-wise (besides the other problems I have pointed out before).