This paper presents PASTRY, a scalable, distributed p2p location and
routing substrate. It serves as a general substrate for building p2p
applications. The goal of PASTRY is to form a robust, self-organizing
overlay network in the Internet seeking to minimize the message travel
distance.
Each node in the PASTRY has a unique, 128-bit nodeId. The set of existing
nodeIds is uniformly distributed. Each PASTRY node maintains a leaf set, a
routing table, and a neighborhood set. The leaf set of a given node
maintains nodes that are numerically close to the given node. The routing
table organizes the routing information using the appropriate prefix of
nodeId. The neighborhood set contains the IP addresses and nodeIds of
nodes that are physically closest to the given node according to the
proximity metric. The leaf set and the routing table are used for message
routing while the neighborhood set is used for maintaining locality
properties. The message routing is simple: if the key is within range of
leaf set, forward to the closest leaf; else forward to node that shares at
least one more digit with the key in its prefix than current nodeId. If no
such node exists, then forward to node that shares as many digits with the
key as current nodeId but numerically closer. The authors also discuss how
to efficiently and dynamically maintain the node state in the presence of
node arrivals, node departure, node failure and recovery. PASTRY has good
locality properties. In each step a message is routed to the nearest node
with a longer prefix match.
However, the following limitations cannot be ignored. First, in the
routing algorithm, if the key is not in the leaf set and no better match
nodeId, the message will be forwarded to a node that shares as a prefix
with the key as long as the current node, but is numerically closer to the
key than the current nodeId. It assumes such a node must exist. This is
not always true when the nodeId of the current node or its immediate
neighbor is already the numerically closest to the key, or L/2 adjacent
nodes in the leaf set have failed concurrently. Although the probability
of such a case is very low, there is a chance that a message routing may
fail because no node to forward the message.
Second, each entry in the routing table is a node whose nodeId have the
appropriate prefix. There are maybe multiple choices of such kind of
nodes. Which one is put in the routing table may affect the routing
performance. The authors do not specify the policy. I assume here is
randomly picking a nodeId among all the potential nodes.
Third, when a new node comes, it is assigned a nodeId first, and then is
inserted. There is a chance that the new nodeId is the same as an existing
node. In this case the new node should be assigned a new nodeId. This may
be a rare case but should be taken care of.
Last, the authors do not present the details of how to update node state
while node failure happens. I think it is not trivial. The failure may
cause incorrect node behavior. As a result, the performance will be
affected. Moreover, during the information update phase, the PASTRY is not
stable; message routing may be incorrect because of the inconsistent
routing information.
Received on Sun Nov 06 2005 - 19:59:56 EST
This archive was generated by hypermail 2.2.0 : Mon Nov 07 2005 - 02:16:25 EST