Four of client source code analysis: PIECEPICKER class
Author: Ma Ying-jeou
Date: 2004-7-2
Rstevens at hotmail.com
Copyright, not allowable, no reproduced
PiecePicker is used to implement the "Selection Algorithm", and the selection algorithm has introduced in the "Incentives Build Robustness in Bittorrent", I will list the relevant content.
BT selection algorithm, combined with the following strategies.
l strict priority
The first policy selected by the segment is: Once a piece of sub-piece is requested, the sub-piece segment remaining is prioritized. This way, you can get a complete piece as possible.
l Leverage priority
Each Peer is prioritized to select the least items in the entire system to download, and those relatively more pieces in the system are placed later, so that the entire system tends to a better state. If you don't have this algorithm, everyone will download the most of the pieces, then these pieces will be increasing in the system, and those relatively few pieces in the system are still very small, and finally, some Peer is No longer have a piece of interest, then the system's participants are less and less, and the performance of the entire system will drop.
l Random first piece
One exception to "least priority" is when the download is just starting. At this point, the downloader does not have any pieces for uploading, so it is necessary to get a complete piece as soon as possible. And the least piece, usually only a PEER has, so it may be slower than those who have multiple PEERs. Therefore, the first piece is randomly selected until the first piece download is complete, switching to "minimum priority" policy.
l final stage mode
Sometimes, ask a piece from a very slow peer. In the middle phase of the download, this is not a problem, but it may be potentially delayed. In order to prevent this, in the final stage, Peer sends a piece of sub-pieces of a piece of PEERS to it. Once some sub-pieces are broken, then send Cancel messages to other peers, and cancel these sub-pieces Request to avoid waste of bandwidth. In fact, this method does not have a waste of how much bandwidth, and the end of the document has been downloaded very quickly.
Here is two questions that I think before analyzing:
Question 1: How to implement "strict priority"
Answer: Record the pieces that have been downloaded. Prioritize them.
Question 2: How to implement the "minimum priority" algorithm? That is, how do you go to count a piece of piece in the system?
Answer: Calculate. In the download process, you will not stop the Have message sent by other Peer, each HAVE message indicates that the other party has a piece. So, for each piece to maintain a counter, each of the corresponding counters plus 1 each. When selecting a piece, a minimum piece of the counter is selected.
In actual code, it can be seen that the variable started and seedstarted are used to implement "strict priority", they record those who have started downloading. The variable Numinterests is used to implement the "minimum preferred" algorithm.
The core function of the PIECEPICKER class is next (), which integrated a variety of policies to calculate the next piece that should be downloaded.
The difficulty of the PiecePicker class is the role of three variables Numinterests, Interests, POS_IN_INTERESTS. Because there is no comment, I think I have only understood their role, especially POS_IN_INTERESTS. Therefore, before analyzing the code, I combine the example to explain the role of these three variables. Suppose there are three pieces:
Numinterests:
The type is List, each piece corresponding to one, records the number of Have messages received by each piece. When initialization, Numinterests = [0, 0, 0].
INTERESTS:
The type is List, and each of its is a list. For example, in this example, when initialization, Interests = [0, 1, 2]], apparent, it only has one.
What is the role of Interests? Well, it is a bit difficult to express. Probably this: All the index numbers of all unfinished pieces are saved in Interests, when selection is selected, INTERESTS is available. Let's take two examples:
1, INTERESTS = [[0, 1], [2]]
2, INTERESTS = [[1], [0, 2]]]
In the first example, the pieces 0, 1 are located 0 items in Interests, and the pieces 2 are located in INTERESTS.
In the second example, the segment 1 is located 0 items located at Interests, and the pieces 0, 2 are located in INTERESTS.
No matter which case, there are 0, 1, 2 three pieces have not been downloaded yet.
So, what location should I be in Interests? The answer is based on the number of HAVE messages received by this piece, that is, Numinterests's value. For example, in the first example, it means that the number of HAVEs received by the pieces 0, 1 is 0, so the number of INTERESTS, and the number of HAVEs received by the clip 2 is 1, so in item 1. When initialization, Interests = [0, 1, 2]], indicating that the number of HAVEs received by the pieces 0, 1, 2 is 0.
Strange, why do you design this? The answer is the "minimum preferred" selection policy. We see, the more Have's pieces, in Interests, the longer the location. When performing a piece of selection, you may choose a piece from Interests (why it is possible, you can see it, you will take priority to adopt other policies, if other policies can not choose one, try to select from Interests). In this way, according to the index from small to large sequence, the less pieces of Have have, the more likely to be selected. We consider such an example:
INTERESTS = [[2, 3], [5, 0, 1], [], [], [4]]
The pieces 2, 3 have 0 Have, which cannot be selected. (At least one HAVE is considered).
The pieces 0, 1, 5 have 1 HAVE, so it will be preferred from them.
The clip 4 has 4 Have, so it is finally considered.
POS_IN_INTERESTS:
As mentioned above, it has a piece of the same HAVE number, in the same location in Interests. For example, in this example, 0, 1, 5 is in the first position. So what is the principle of this? The answer is randomly arranged. Therefore, it may be 0, 1, 5, or may be 1, 5, 0, or others. In order to record the exact location of a piece, POS_IN_INTERESTS is required. It is also a list, each piece has one, according to the above example, it should be: pOS_IN_INTERESTS = [1, 2, 0, 1, 0, 0]
What did you see? Ha ha
it mean,
The clip 0 is the first [5, 0, 1]
The piece 1 is the second of [5, 0, 1]
The piece 2 is the 0th of [2, 3]
The piece 3 is the first [2, 3]
The fragment 4 is the 0th of [4]
The piece 5 is the 0th of [5, 0, 1]
That's it, I don't know if I have clear.
# "Selection algorithm"
Class PiecePicker:
DEF __INIT __ (Self, Numpieces, Rarest_First_cutoff = 1):
Self.rarest_first_cutoff = rarest_first_cutoff
Self.Numpieces = Numpieces #
Self.interests = [Range (Numpieces)]
Self.pos_in_interests = range (numpIECES)
Self.numinterests = [0] * Numpieces
Self.started = []
Self.seedstarted = []
Self.numgot = 0 # Get a few pieces?
Self.scRambled = range (numpieces)
Shuffle (Self.ScRamble)
Received a HAVE message
DEF GOT_HAVE (Self, PIECE):
IF self.numinterests [piece] is none:
Return
Numint = Self.numinterests [Piece]
if Numint == Len (Self.INTERESTS) - 1:
Self.interests.Append ([])
Numinterests corresponds to a value plus 1.
Self.numinterests [PIECE] = 1
Adjust Interests and POS_IN_INTERESTS
Self._shift_over (Piece, Self.interests [Numint], Self.interests [Numint 1])
Lost a HAVE message? ? ? ? ?
Def Lost_Have (Self, PIECE):
IF self.numinterests [piece] is none:
Return
Numint = Self.numinterests [Piece]
Self.numinterests [PIECE] - = 1
Self._shift_over (Piece, Self.interests [Numint], Self.interests [Numint - 1])
Adjust INTERESTS and POS_IN_INTERESTS, have been analyzed in front.
Def _shift_over (Self, Piece, L1, L2):
P = Self.pos_in_interests [Piece]
L1 [P] = L1 [-1] self.pos_in_interests [l1 [-1]] = P
Del L1 [-1]
NEWP = Randrange (Len (L2) 1)
IF newp == LEN (L2):
Self.pos_in_interests [piece] = len (l2)
L2.Append (Piece)
Else:
OLD = L2 [newp]
Self.pos_in_interests [old] = LEN (L2)
L2.Append (OLD)
L2 [newp] = piece
Self.pos_in_interests [piece] = newp
Send a Requested message for a piece for a "strict priority" policy
Def Requested (Self, Piece, SEED = FALSE):
IF Piece Not in Self.Started:
Add the segment index number to started
Self.Started.Append (Piece)
If SEED AND PIECE NOT IN SELF.SEEDSTARTED:
Self.seedstarted.Append (Piece)
# If a piece is already got, then call this function
Def Complete (Self, PIECE):
Assert Self.numinterests [PIECE] is not none
Self.numgot = 1
l = self.interests [self.numinterests [piece]]
P = Self.pos_in_interests [Piece]
l [p] = L [-1]
Self.pos_in_interests [l [-1]] = P
Del L [-1]
Self.numinterests [Piece] = None
TRY:
Self.Started.Remove (Piece)
Self.seedstarted.remove (Piece)
Except ValueError:
PASS
Calculate the next selected piece
Def next (Self, Havefunc, SEED = FALSE):
Bests = None
Bestnum = 2 ** 30
First, according to the "strict priority" policy, select it from the snaps that have been downloaded.
IF SEED:
S = SELF.SEEDSTARTED
Else:
S = SELF.STARTED
The "strict priority" strategy is used in combination with the "minimum priority" policy. That is to say, in the "strict priority" piece, then choose a piece that satisfies "minimum priority". Note that "least priority" has a limit, if a piece is never received a Have message (that is, count 0), it cannot be selected. This judgment is done by the following Havefunc (i).
For i in s:
IF Havefunc (i):
IF self.numinterests [i] Bests = [i] Bestnum = self.numinterests [i] Elif self.numinterests [i] == bestnum: Bests.Append (i) After "strict priority" and "minimum priority" strategy, multiple candidate pieces may be returned, and one is randomly selected, returns. IF bests: Returns a value randomly from BESTS Return Choice (Bests) If the above steps, no one is selected. So randomly select one. This is probably the strategy of "random first piece". Because rarest_first_cutoff is set to 1 by default. That is, this policy is selected if a piece is not obtained. If the rarest_first_cutoff is set to 10, then this policy can be called "the top 10 pieces of random", huh, huh. If self.numgot For i in self.scRambled: IF Havefunc (i): Return I Return None If you cannot use the "random first piece" measure, INTERESTS is finally used. Here, I finally understand why INTERESTS is positioned with the value corresponding to Numinterests. Still "minimum priority" idea, because those pieces received less than HAVE messages are priority to be selected in Interests. For i in xrange (1, min (bestnum, len (self.interests)): For J in self.interests [i]: if Havefunc (J): Return J If you still can't choose, I have to return none. Return None DEF AM_I_COMPLETE (Self): Return self.numgot == Self.numpieces Who will add? DEF BUMP (Self, Piece): l = self.interests [self.numinterests [piece]] POS = Self.pos_in_interests [PIECE] Del L [POS] L.Append (Piece) For i in range (POS, LEN (L)): Self.pos_in_interests [l [i]] = i