Pastry learning notes
Author: CNSS, 2004-8-19. Copyright reserved; when reprinting, please credit http://blog.9cbs.net/cnss
★ Pastry is a peer-to-peer network protocol with the following basic features:
1. Each node has a randomly generated 128-bit NodeID. When a node receives a message carrying a 128-bit key, it forwards the message toward the live node whose NodeID is numerically closest to the key. In a Pastry network of N nodes, the number of forwarding steps is O(log N), and the routing table each node has to maintain is also O(log N) in size. Every Pastry node a message passes through notifies the application through a callback, so the application can do some processing on the message.
2. Each Pastry node keeps track of the L nodes whose NodeIDs are numerically closest to its own (this collection is called the leaf set; half of its members have NodeIDs larger than the node's own and half smaller). Through callbacks, the application learns when a new node joins the leaf set, when a node fails, and when a node recovers.
3. Location on the Internet matters: when forwarding a message, Pastry tries to minimize the distance it travels, measured for example by ping delay. A Pastry network is decentralized, resilient, and self-organizing; it reconfigures itself automatically when nodes join, leave, or fail.
★ Several important parameters:
b: generally 1, 2, 3, or 4. Internally, the 128-bit ID is treated as a sequence of digits in base 2^b.
L: leaf set capacity, generally 2^b or 2^(b+1).
M: neighborhood set capacity, generally 2^b or 2^(b+1).
For convenience, N denotes the number of nodes in the network.
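To make the role of b concrete, here is a minimal Python sketch that splits a NodeID into base-2^b digits and lists the two typical capacities for L and M. The function names are hypothetical, chosen only for this illustration. When b does not divide the ID length (for example b = 3 with 128 bits), the sketch lets the most significant digit carry the leftover bits; that is an assumption of this sketch, not something stated in these notes.

```python
from typing import List, Tuple


def node_id_digits(node_id: int, b: int, bits: int = 128) -> List[int]:
    """Split a `bits`-bit NodeID into base-2^b digits, most significant first."""
    count = -(-bits // b)          # number of digits, rounded up
    mask = (1 << b) - 1            # selects one b-bit digit
    return [(node_id >> (b * (count - 1 - i))) & mask for i in range(count)]


def typical_set_sizes(b: int) -> Tuple[int, int]:
    """Return the two typical capacities for L and M: 2^b and 2^(b+1)."""
    return 1 << b, 1 << (b + 1)


if __name__ == "__main__":
    # A 32-bit ID is used here only to keep the printed output short.
    print(node_id_digits(0x10233102, b=4, bits=32))
    print(typical_set_sizes(4))
```

With b = 4 each digit is simply one hexadecimal digit of the ID, which is why that setting is convenient to read.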
★ Each node maintains three pieces of state, as shown in Figure 1:
1. Leaf set: holds the L nodes whose NodeIDs are numerically closest to the node's own, half of them smaller and half larger.
2. Routing table: has log_(2^b) N rows (rounded up), each with 2^b - 1 entries. An entry in row n refers to a node whose NodeID shares its first n digits with the current node and differs from it in digit n+1. In Figure 1, the shaded cell in each row shows the corresponding digit of the current node's own NodeID.
3. Neighborhood set: holds the node's direct neighbors (for example, the M nodes with the smallest ping times). It is not used to route messages, but to configure network services.
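The three pieces of state above could be held in a structure like the following. This is a minimal sketch, not Pastry's actual implementation: the class and field names are hypothetical, and for simplicity the leaf set is ordered by plain numeric distance and ignores the wrap-around of the circular ID space.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PastryState:
    node_id: int
    b: int = 4                       # digits are base 2^b
    id_bits: int = 128
    leaf_capacity: int = 16          # L
    neighborhood_capacity: int = 16  # M
    smaller_leaves: List[int] = field(default_factory=list)
    larger_leaves: List[int] = field(default_factory=list)
    neighborhood: List[int] = field(default_factory=list)
    routing_table: List[List[Optional[int]]] = field(default_factory=list)

    def __post_init__(self) -> None:
        if not self.routing_table:
            rows = -(-self.id_bits // self.b)   # one row per digit
            cols = 1 << self.b                  # one column per digit value
            # In each row, the column of this node's own digit stays None
            # (the "shaded" cell), leaving 2^b - 1 usable entries.
            self.routing_table = [[None] * cols for _ in range(rows)]

    def add_leaf(self, other: int) -> None:
        """Keep the L/2 numerically closest nodes on each side of node_id."""
        if other == self.node_id:
            return
        side = self.smaller_leaves if other < self.node_id else self.larger_leaves
        if other not in side:
            side.append(other)
        side.sort(key=lambda n: abs(n - self.node_id))
        del side[self.leaf_capacity // 2:]
```

A real node would allocate only about log_(2^b) N populated rows; the sketch allocates one row per digit so that any prefix length can be indexed directly.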
Figure 1: the state kept by a Pastry node (leaf set, routing table, neighborhood set).
★ The routing algorithm uses the following notation:
R (with subscript l and superscript i) denotes the routing table entry in row l, column i.
L (with subscript i) denotes the i-th closest NodeID in the leaf set.
D (with subscript l) denotes the value of the l-th digit of the key D.
shl(A, B) denotes the length of the prefix shared by A and B, in digits.
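Using that notation, one routing step can be sketched roughly as follows, based on my reading of the routing procedure in the Rowstron and Druschel paper. All function and parameter names are hypothetical. Two simplifications are assumptions of this sketch: distances are plain numeric differences (no wrap-around), and in the rare fallback case the numerically closest qualifying node is chosen, where any qualifying node would do.

```python
from typing import Iterable, List, Optional


def to_digits(value: int, b: int, bits: int) -> List[int]:
    """Base-2^b digits of value, most significant first."""
    count = -(-bits // b)
    mask = (1 << b) - 1
    return [(value >> (b * (count - 1 - i))) & mask for i in range(count)]


def shl(a: int, c: int, b: int, bits: int) -> int:
    """Length, in digits, of the prefix shared by a and c."""
    n = 0
    for x, y in zip(to_digits(a, b, bits), to_digits(c, b, bits)):
        if x != y:
            break
        n += 1
    return n


def next_hop(node_id: int, key: int, leaf_set: Iterable[int],
             routing_table: List[List[Optional[int]]],
             neighborhood: Iterable[int], b: int = 4, bits: int = 128) -> int:
    """Return the NodeID to forward to, or node_id if this node should deliver."""
    members = list(leaf_set) + [node_id]
    # Case 1: the key falls within the leaf set range, so pick the closest member.
    if min(members) <= key <= max(members):
        return min(members, key=lambda n: abs(n - key))
    l = shl(key, node_id, b, bits)
    key_digits = to_digits(key, b, bits)
    # Case 2: use the routing table entry R[l][D_l], which shares one more digit.
    entry = routing_table[l][key_digits[l]]
    if entry is not None:
        return entry
    # Case 3 (rare): any known node with an equally long shared prefix
    # that is numerically closer to the key than this node.
    known = set(members) | set(neighborhood)
    known |= {e for row in routing_table for e in row if e is not None}
    candidates = [t for t in known
                  if shl(t, key, b, bits) >= l
                  and abs(t - key) < abs(node_id - key)]
    return min(candidates, key=lambda t: abs(t - key)) if candidates else node_id
```

Each hop either lands in the leaf set range or extends the shared prefix by at least one digit, which is where the logarithmic step count comes from.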
★ Initialization of state: When node X joins Pastry, it first contacts a node A that is close to it in the network. Since A and X are near each other, X's neighborhood set is initialized from A's. Assume the join message, routed with X's NodeID as the key, passes through A, B, C, ... and finally arrives at Z, the node whose NodeID is numerically closest to X's. Because Z's leaf set is similar to the one X should have, X's leaf set is initialized from Z's; then, by querying the nodes with the largest and smallest NodeIDs in it, X obtains the L nodes its leaf set should contain. For the routing table, by its definition and by the routing algorithm, if X's routing table has i rows, then row 0 is taken from row 0 of A, row 1 from row 1 of B, row 2 from row 2 of C, and so on. If X and A have the same first digit, row 0 of A can be used directly as row 0 of X. If they differ, the shaded cell in A's row 0 is filled with A itself, and the shaded cell of X's row 0 moves to the column of X's first digit. The remaining rows follow by analogy.
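The routing table part of the join described above can be sketched as follows. This is only an illustration of the row-copying rule under stated assumptions: the function names are hypothetical, each path node is given as a (NodeID, routing table) pair, and the guard that skips a path node sharing too short a prefix with X is my own addition.

```python
from typing import List, Optional, Tuple

Table = List[List[Optional[int]]]


def to_digits(value: int, b: int, bits: int) -> List[int]:
    """Base-2^b digits of value, most significant first."""
    count = -(-bits // b)
    mask = (1 << b) - 1
    return [(value >> (b * (count - 1 - i))) & mask for i in range(count)]


def init_routing_table(x_id: int, path: List[Tuple[int, Table]],
                       b: int = 4, bits: int = 128) -> Table:
    """Build X's routing table: row i is copied from row i of the i-th path node."""
    rows, cols = -(-bits // b), 1 << b
    x_digits = to_digits(x_id, b, bits)
    table: Table = [[None] * cols for _ in range(rows)]
    for i, (p_id, p_table) in enumerate(path[:rows]):
        p_digits = to_digits(p_id, b, bits)
        if p_digits[:i] != x_digits[:i]:
            continue  # guard (assumption): row i is only valid if i digits are shared
        row = list(p_table[i])
        if p_digits[i] != x_digits[i]:
            # The path node left its own column blank; for X that cell is
            # a normal entry, and the path node itself fits there.
            row[p_digits[i]] = p_id
        row[x_digits[i]] = None  # X's own digit becomes the shaded cell
        table[i] = row
    return table
```

For example, with b = 2 and 8-bit IDs, a joining node with digits 1,2,0,3 that passes through A (digits 3,0,0,0) takes A's row 0, places A in column 3, and blanks column 1.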
★ Thoughts: A Pastry network is built on NodeIDs. It is an overlay detached from the physical network, and the IDs of that overlay are the basis of the routing algorithm. Others such as Kademlia and Chord route in the same way; Kademlia, for example, uses XOR to compute the distance between IDs. Pastry can route messages in a network with no central coordination, and does so efficiently: the theoretical number of routing steps is log_(2^b) N. So b determines the performance of the whole network. The larger b is, the fewer steps are needed, but the routing table also grows, and a bigger routing table not only takes more memory, it also means more nodes to probe when forwarding messages and more complicated cases to handle. b therefore has to be chosen to suit the network: on the Internet you might choose 4, while on a local network of small devices 2 is enough. Ha, I like flexible designs.

Each node can also choose its own leaf set size and neighborhood set size. On a low-powered device (such as an embedded system) you can pick smaller values; on a powerful device you can pick larger ones so that messages are delivered faster. This too is an elastic design.

According to the Pastry design, data can be stored on the k nodes whose NodeIDs are closest to the data's key (you can choose k yourself). To me, Pastry does not look well suited to data storage: first because of today's network capacity, second because of today's storage capacity, and third because of how lookups work. Imagine a user publishes the movie "Hacker Empire" (the Chinese title of The Matrix), and the hash key of the string "Hacker Empire" happens to be closest to your ID. The movie then lands on your computer through a call to the Pastry API (bad luck!). Worse, if another user searches with a string that is not exactly "Hacker Empire", its hash is certainly different, so he never learns that you have the movie.
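The trade-off in choosing b can be put into numbers with a small calculation. This is a sketch under the formulas used in these notes (about log_(2^b) N routing steps, and 2^b - 1 entries per routing table row); the function name and the network size of one million nodes are only illustrative assumptions.

```python
from typing import Tuple


def routing_cost(n_nodes: int, b: int) -> Tuple[int, int]:
    """Return (rows, entries): log_(2^b) N rounded up, and rows * (2^b - 1)."""
    rows, reach = 0, 1
    while reach < n_nodes:   # integer loop avoids floating-point log errors
        reach <<= b          # each extra digit multiplies the reach by 2^b
        rows += 1
    return rows, rows * ((1 << b) - 1)


if __name__ == "__main__":
    for b in (1, 2, 3, 4):
        rows, entries = routing_cost(1_000_000, b)
        print(f"b={b}: about {rows} steps, {entries} routing table entries")
```

For a million nodes, going from b = 2 to b = 4 halves the step count (10 to 5) while the routing table grows from 30 to 75 entries, which matches the observation that a larger b buys speed at the cost of state.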
So Pastry is better suited to delivering messages. For example, in a complex wireless environment where you cannot control how the various devices communicate with each other, you can still send A a message as long as you know A's NodeID. For this use, Pastry's mechanism needs a slight change: the NodeID is no longer random but fixed, like a MAC address, so that the network stays reliable even in an unknown environment.
Reference: A. Rowstron and P. Druschel, "Pastry: Scalable, distributed object location and routing for large-scale peer-to-peer systems".