In the Kademlia paper, the last paragraph of section 2.4 states that in order to properly handle highly unbalanced trees...
Kademlia nodes keep all valid contacts in a subtree of size at least k
nodes, even if this requires splitting buckets in which the node's own
ID does not reside.
However, the previous section of the paper seem to state that if a k-bucket already has k elements, that any further additions in to that k-bucket require removing the stalest node (pinging it first to see if its alive) or otherwise caching the addition until a slot becomes available in that k-bucket.
The paper seems to be contradicting itself with these two points.
Under what conditions should a k-bucket be split and why? It seems not practical to keep "all valid contacts" in the routing table, as the routing table would then get very large very quickly. The example talks about a tree that has many nodes starting with 001 and one node starting with 000. The node starting with 000 would have to continually split its k-bucket for 001 to hold every valid node starting with 001? In a 160-bit address space, wouldn't that end up potentially storing 2^157 nodes in 000's routing table??
The wording in the quoted block is also very confusing...
"in a subtree" -- in which subtree of the routing table?
"of size atleast k nodes" -- what metric are we using to determine size of the subtree? nodes in this case refers to kademlia nodes or k-buckets or something else?
See Question&Answers more detail:
os 与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…