This is an independent translation by Lonely Wave. The original text is at http://bitconjurer.org/bittorrent/protocol.html. Author: Bram Cohen. The translator, Lonely Wave, holds the rights to this translation, including the right to modify it. For non-commercial reproduction, please credit the translator.
Detailed BitTorrent Protocol
BitTorrent (BT) is a protocol for distributing files. It identifies content by URL and is designed to integrate seamlessly with the web. Its advantage over plain HTTP is that when several downloads of the same file run at the same time, the downloaders upload to each other, which lets the file source support a large number of simultaneous downloaders.
A BT file distribution consists of the following entities:
· An ordinary web server
· A static metainfo file
· A BT tracker
· An "original" downloader
· The end users' web browsers
· The end users' downloaders
Here it is assumed that a single file has many downloaders.

To set up a BT server, a host goes through the following steps:
1. Start running a tracker (skip this step if one is already running).
2. Start running an ordinary web server, such as Apache (skip this step if one is already running).
3. Associate the .torrent extension with the MIME type application/x-bittorrent on the web server (skip this step if already done).
4. Generate a metainfo (.torrent) file from the complete file to be published and the URL of the tracker.
5. Put the metainfo file on the web server.
6. Link to the metainfo (.torrent) file from a web page.
7. Start the original downloader, which already has the complete file (the origin).

To download with BT, a user does the following:
1. Install a BT client (skip this step if already installed).
2. Browse the web.
3. Click on a link to a .torrent file.
4. Select where to save the file locally and, in clients that support selective download, choose which files to download.
5. Wait for the download to complete.
6. Tell the downloader to exit (it keeps uploading until then).

The connectivity is as follows:
· The web site serves static files as usual, and launches the BT program on the client.
· The tracker receives information from all downloaders and gives each one a random list of peers. This is done over HTTP or HTTPS.
· Downloaders periodically check in with the tracker to report their progress, and upload to and download from each other over direct connections. These connections use the BitTorrent peer protocol, which runs over TCP.
· The original downloader only uploads and never downloads, because it already has the entire file. It is needed to get every part of the file into the network. For popular downloads, the original downloader can often stop uploading after a short time, because downloaders that have completed the file continue to upload it.
Metainfo files and tracker responses are both sent in a simple, efficient, and extensible format called bencoding (B encoding). Bencoded messages are nested dictionaries and lists (as in Python) that contain strings and integers. Extensibility comes from ignoring dictionary keys that are not recognized, so new features can be added later.

The bencoding rules are as follows:
• Strings are encoded as the string length (in decimal), a colon, and then the string. For example, the string 'spam' is encoded as '4:spam' (without the quotation marks), where 4 is the length of the string.
• Integers are encoded as an 'i', the number in decimal, and an 'e'. For example, i3e corresponds to 3 and i-3e corresponds to -3. Integers have no size limit. i-0e is invalid. Every encoding with a leading zero, such as i03e, is invalid, except i0e, which corresponds to 0.
• Lists are encoded as an 'l', followed by their elements (each already bencoded), followed by an 'e'. For example, l4:spam4:eggse corresponds to ['spam', 'eggs'].
• Dictionaries are encoded as a 'd', followed by alternating keys and their corresponding values, followed by an 'e'. For example, d3:cow3:moo4:spam4:eggse corresponds to {'cow': 'moo', 'spam': 'eggs'}, and d4:spaml1:a1:bee corresponds to {'spam': ['a', 'b']}. Keys must be strings and must appear in sorted order (sorted as raw strings, not alphanumerically).

The metainfo file is a bencoded dictionary with the following keys:

announce: The URL of the tracker.

info: This key maps to a dictionary containing the keys described below.

The name key maps to a string that is the suggested name for the downloaded file or directory. It is purely advisory.

The piece length key maps to the number of bytes in each piece the file is split into. For transfer, the file is split into fixed-size pieces that are all the same length, except the last one, which may be shorter. The piece length is almost always a power of two, most commonly 256 K (2^18).

The pieces key maps to a string whose length is a multiple of 20. It is divided into 20-byte strings, each of which is the SHA1 hash of the piece at the corresponding index.

There is also either a length key or a files key, but never both. If length is present, the download is a single file; otherwise it is a set of files in a directory structure. In the single-file case, length maps to the length of the file in bytes.

For the other keys, the multi-file case is treated as one large file formed by concatenating the files in the order they appear in the files list. The files key maps to a list of dictionaries, each containing the following keys:

length: The length of the file in bytes.

path: A list of strings giving the subdirectory names, the last of which is the file name.
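The bencoding rules and the single-file metainfo layout described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the reference implementation: the function names `bencode`, `piece_hashes`, and `make_metainfo` are hypothetical, text strings are assumed to be UTF-8, and the tracker URL in the usage note is a placeholder.

```python
import hashlib


def bencode(value):
    """Encode ints, strings, lists, and dicts as bencoded bytes."""
    if isinstance(value, bool):
        raise TypeError("booleans have no bencoded form")
    if isinstance(value, int):
        return b"i%de" % value                   # i3e, i-3e, i0e
    if isinstance(value, str):
        value = value.encode("utf-8")            # assumption: text is UTF-8
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)    # 4:spam
    if isinstance(value, list):
        return b"l" + b"".join(bencode(item) for item in value) + b"e"
    if isinstance(value, dict):
        items = []
        for key, item in value.items():
            if isinstance(key, str):
                key = key.encode("utf-8")
            if not isinstance(key, bytes):
                raise TypeError("dictionary keys must be strings")
            items.append((key, item))
        items.sort(key=lambda pair: pair[0])     # sorted as raw strings
        body = b"".join(bencode(k) + bencode(v) for k, v in items)
        return b"d" + body + b"e"
    raise TypeError("cannot bencode %r" % type(value))


def piece_hashes(data, piece_length):
    """Concatenate the 20-byte SHA1 digest of each fixed-size piece."""
    return b"".join(
        hashlib.sha1(data[start:start + piece_length]).digest()
        for start in range(0, len(data), piece_length)
    )


def make_metainfo(announce, name, data, piece_length=2 ** 18):
    """Build a single-file metainfo dictionary and return it bencoded."""
    info = {
        "name": name,
        "piece length": piece_length,
        "pieces": piece_hashes(data, piece_length),
        "length": len(data),                     # single-file case only
    }
    return bencode({"announce": announce, "info": info})
```

For example, `bencode({"cow": "moo", "spam": "eggs"})` returns `b'd3:cow3:moo4:spam4:eggse'`, matching the dictionary example in the rules. A call such as `make_metainfo("http://tracker.example/announce", "file.bin", data)` would produce the bytes of a .torrent file for `data`; a multi-file torrent would use a files list in place of length.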
A path list of zero length is an error. In the single-file case, the name key is the file name; in the multi-file case, it is the name of a directory.

Tracker queries are two-way. The tracker receives information through HTTP GET parameters and returns a bencoded message. Although the tracker has to run on a server, it could run well as, for example, an Apache module.

Tracker GET requests have the following keys:

info_hash: The 20-byte SHA1 hash of the bencoded form of the info value from the metainfo file. Note that the bencoded info value is a substring of the metainfo file. This value will almost certainly have to be escaped.

peer_id: A 20-byte string that this downloader uses as its ID. Each downloader generates its own ID when it starts a new download. This value will also almost certainly have to be escaped.

ip: An optional parameter giving the IP address (or DNS host name) of this peer. It is generally used for the original downloader when it runs on the same machine as the tracker.

port: The port this peer is listening on. The usual behavior is to try port 6881 first and, if that port is taken, to try 6882, then 6883, and so on, giving up after 6889.

uploaded: The total amount uploaded so far, encoded in base-ten ASCII.
downloaded: The total amount downloaded so far, encoded in base-ten ASCII.

left: The number of bytes this peer still has to download, encoded in base-ten ASCII. This number cannot be computed from the file length and the downloaded amount, because the download may be a resume, and some downloaded data may have failed the integrity check and had to be downloaded again.

event: An optional key that maps to started, completed, or stopped (or empty, which is the same as being absent). If it is absent, the request is one of the announcements made at regular intervals. An announcement with started is sent when a download first begins, and one with completed is sent when the download finishes. No completed is sent if the file was already complete at startup. Downloaders send an announcement with stopped when they cease downloading.

Tracker responses are also bencoded dictionaries. If a response has the key failure reason, it maps to a human-readable string explaining why the query failed, and no other keys are required. Otherwise, the response must have two keys: interval, which maps to the number of seconds the downloader should wait between regular requests, and peers. peers maps to a list of dictionaries, one per peer, each containing the keys peer id, ip, and port, which map to the peer's self-selected ID, its IP address or DNS host name as a string, and its port number. Note that downloaders may send requests outside the scheduled interval if an event happens or they need more peers.

If you want to extend metainfo files or tracker queries, please coordinate with Bram Cohen to make sure that all extensions are compatible.

The BitTorrent peer protocol runs over TCP. It performs well without setting any socket options.

Peer connections are symmetrical. Messages sent in both directions look the same, and data can flow in either direction.

The peer protocol refers to pieces of the file by index, as described in the metainfo file, starting at zero. When a peer finishes downloading a piece and has checked that its hash matches, it announces to all of its peers that it has that piece.
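To make the tracker GET keys listed earlier more concrete, here is a minimal sketch of how a client might assemble an announce URL. The helper name `build_announce_url` and the example tracker URL are hypothetical, and the sketch assumes the announce URL carries no query string of its own. It relies on `urllib.parse.urlencode` to percent-escape the raw 20-byte info_hash and peer_id values.

```python
from urllib.parse import urlencode

VALID_EVENTS = (None, "started", "completed", "stopped")


def build_announce_url(announce, info_hash, peer_id, port,
                       uploaded, downloaded, left, event=None, ip=None):
    """Return the tracker GET URL for one announcement."""
    if len(info_hash) != 20 or len(peer_id) != 20:
        raise ValueError("info_hash and peer_id must be exactly 20 bytes")
    if event not in VALID_EVENTS:
        raise ValueError("event must be started, completed, or stopped")
    params = [
        ("info_hash", info_hash),    # raw bytes, escaped by urlencode
        ("peer_id", peer_id),        # raw bytes, escaped by urlencode
        ("port", port),
        ("uploaded", uploaded),      # integers become base-ten ASCII
        ("downloaded", downloaded),
        ("left", left),
    ]
    if ip is not None:
        params.append(("ip", ip))
    if event is not None:            # omitted for regular announcements
        params.append(("event", event))
    # Assumption: `announce` has no query string, so "?" is the separator.
    return announce + "?" + urlencode(params)
```

A first announcement would pass `event="started"`; the periodic announcements in between leave `event` out, as the key description says.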
Each end of a connection holds two bits of state: choked or not, and interested or not. Choking is a notification that no data will be sent until unchoking happens. The reasoning and common techniques behind choking are explained later.

Data transfer takes place whenever one side is interested and the other side is not choking. Interest state must be kept up to date at all times: whenever a downloader has nothing it would currently request from a peer if unchoked, it must express lack of interest, even while it is choked. Implementing this properly is tricky, but it lets downloaders know which peers will start downloading immediately once unchoked. // Connections to peers that are both not interested and choked are gradually dropped.

Connections start out choked and not interested.
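The two bits of state held at each end can be modeled with four booleans, as in the minimal sketch below. The class and field names (`PeerConnectionState`, `am_choking`, and so on) are hypothetical and chosen only for illustration; the defaults encode the rule that connections start out choked and not interested.

```python
from dataclasses import dataclass


@dataclass
class PeerConnectionState:
    """Choke and interest bits for one connection, seen from our side."""
    am_choking: bool = True        # we are choking the peer
    am_interested: bool = False    # we want something the peer has
    peer_choking: bool = True      # the peer is choking us
    peer_interested: bool = False  # the peer wants something we have

    def can_download(self):
        """Data flows to us when we are interested and the peer is not choking."""
        return self.am_interested and not self.peer_choking

    def can_upload(self):
        """Data flows to the peer when it is interested and we are not choking."""
        return self.peer_interested and not self.am_choking

    def update_interest(self, peer_has_wanted_piece):
        """Keep interest current even while choked; return True if it changed."""
        changed = self.am_interested != peer_has_wanted_piece
        self.am_interested = peer_has_wanted_piece
        return changed
```

In this sketch, a `True` result from `update_interest` would be the cue to send an "interested" or "not interested" message, regardless of whether the peer is currently choking.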
When data is being transferred, downloaders should keep several piece requests queued at once to get good TCP performance (this is called "pipelining"). On the other side, requests that cannot be written to the TCP buffer immediately should be queued in memory rather than kept in an application-level network buffer, so that they can all be discarded when a choke happens.

The peer wire protocol consists of a handshake followed by a never-ending stream of length-prefixed messages. The handshake starts with character nineteen (decimal), followed by the string 'BitTorrent protocol'. The leading character is a length prefix, put there in the hope that other new protocols will do the same and so be easy to tell apart.

All later integers sent in the protocol are encoded as four bytes, big-endian.

After the fixed header come eight reserved bytes, which are all zero in current implementations. If you want to extend the protocol using these eight reserved bytes, please coordinate with Bram Cohen to make sure that all extensions are compatible.

Next comes the 20-byte SHA1 hash of the bencoded form of the info value from the metainfo file. (This is the same value that is announced to the tracker as info_hash, except that here it is sent raw rather than escaped.) If the two sides do not send the same value, they sever the connection. The one possible exception is a downloader that wants to serve multiple downloads over a single port: it may wait for the incoming connection to send its hash first, and then reply with the same hash if it is in its list.

After the hash comes the 20-byte peer ID that is reported in tracker requests and contained in the peer lists in tracker responses. If the receiving side's peer ID does not match the one the initiating side expects, the initiating side severs the connection.

That completes the handshake. Next comes an alternating stream of length prefixes and messages. Messages of length zero are keepalives and are ignored. Keepalives are generally sent once every two minutes, but timeouts can be much shorter when data is expected.
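The handshake layout described above adds up to 68 bytes: 1 length byte, the 19-byte protocol string, 8 reserved bytes, the 20-byte hash, and the 20-byte peer ID. The following sketch builds and parses such a handshake. The function names are hypothetical, and error handling is reduced to raising `ValueError`, where a real client would close the connection.

```python
PROTOCOL_NAME = b"BitTorrent protocol"   # 19 bytes, sent after its length
HANDSHAKE_LENGTH = 1 + len(PROTOCOL_NAME) + 8 + 20 + 20


def build_handshake(info_hash, peer_id, reserved=bytes(8)):
    """Return the handshake bytes for the given hash and peer ID."""
    if len(info_hash) != 20 or len(peer_id) != 20 or len(reserved) != 8:
        raise ValueError("info_hash/peer_id must be 20 bytes, reserved 8")
    return (bytes([len(PROTOCOL_NAME)]) + PROTOCOL_NAME
            + reserved + info_hash + peer_id)


def parse_handshake(data, expected_info_hash=None):
    """Split a handshake into (reserved, info_hash, peer_id)."""
    if len(data) != HANDSHAKE_LENGTH:
        raise ValueError("handshake must be exactly 68 bytes")
    if data[0] != len(PROTOCOL_NAME) or data[1:20] != PROTOCOL_NAME:
        raise ValueError("unknown protocol header")
    reserved, info_hash, peer_id = data[20:28], data[28:48], data[48:68]
    # Both sides must send the same hash, otherwise the connection is severed.
    if expected_info_hash is not None and info_hash != expected_info_hash:
        raise ValueError("info_hash mismatch")
    return reserved, info_hash, peer_id
```

A client serving several downloads on one port could call `parse_handshake` without `expected_info_hash`, look the returned hash up in its own list, and only then reply.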
All non-keepalive messages start with a single byte that gives their type. The possible values are:
· 0 - choke
· 1 - unchoke
· 2 - interested
· 3 - not interested
· 4 - have
· 5 - bitfield
· 6 - request
· 7 - piece
· 8 - cancel

"choke", "unchoke", "interested", and "not interested" messages have no payload.

"bitfield" is only ever sent as the first message. Its payload is a bitfield in which each index the downloader has is set to 1 and the rest are set to 0. Downloaders that have no data yet may skip the "bitfield" message. The first byte of the bitfield corresponds to indices 0-7 from high bit to low bit, the second byte to indices 8-15, and so on. Spare bits at the end are set to 0.

The payload of a "have" message is a single number: the index of the piece that the downloader has just completed and whose hash it has checked.

"request" messages contain an index, begin, and length. The last two are byte offsets. The length is generally a power of two unless it is truncated by the end of the file. Current implementations generally use 2^15, and close connections that request more than 2^17.

"cancel" messages have the same payload as "request" messages. They are generally sent only toward the end of a download, during what is called "endgame mode". When a download is almost complete, the last few pieces tend to all be downloaded from a single slow connection, which takes a very long time. To make sure the last pieces arrive quickly, once requests for all the pieces a downloader still lacks are pending, it sends requests for everything to every peer it is downloading from. To keep this from becoming very inefficient, it sends cancels to everyone else each time a piece arrives.

"piece" messages contain an index, begin, and piece. Note that they are correlated with "request" messages implicitly.
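The message framing and the bitfield layout above can be sketched as follows. The helper names are hypothetical. The sketch assumes the four-byte big-endian length prefix counts the type byte plus the payload, which follows from the statements that messages are length-prefixed and that a zero-length message is a keepalive.

```python
import struct

(CHOKE, UNCHOKE, INTERESTED, NOT_INTERESTED,
 HAVE, BITFIELD, REQUEST, PIECE, CANCEL) = range(9)

KEEPALIVE = struct.pack(">I", 0)   # a message of length zero


def encode_message(msg_type, payload=b""):
    """Prefix the type byte and payload with a 4-byte big-endian length."""
    return struct.pack(">IB", len(payload) + 1, msg_type) + payload


def encode_have(index):
    return encode_message(HAVE, struct.pack(">I", index))


def encode_request(index, begin, length=2 ** 15):
    return encode_message(REQUEST, struct.pack(">III", index, begin, length))


def encode_bitfield(have_indices, num_pieces):
    """Set one bit per owned piece, index 0 in the high bit of byte 0."""
    field = bytearray((num_pieces + 7) // 8)   # spare bits stay zero
    for index in have_indices:
        if not 0 <= index < num_pieces:
            raise ValueError("piece index out of range")
        field[index // 8] |= 0x80 >> (index % 8)
    return encode_message(BITFIELD, bytes(field))


def split_messages(buffer):
    """Return ([(type, payload), ...], leftover) for the bytes received so far."""
    messages = []
    offset = 0
    while len(buffer) - offset >= 4:
        (length,) = struct.unpack_from(">I", buffer, offset)
        if len(buffer) - offset - 4 < length:
            break                               # message not complete yet
        body = buffer[offset + 4:offset + 4 + length]
        offset += 4 + length
        if length == 0:
            continue                            # keepalive, ignored
        messages.append((body[0], body[1:]))
    return messages, buffer[offset:]
```

For instance, `encode_bitfield({0, 9}, 10)` carries the two payload bytes 0x80 and 0x40: piece 0 is the high bit of the first byte, and piece 9 is the second-highest bit of the second byte.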
An unexpected piece may arrive if choke and unchoke messages are sent in quick succession, if the transfer is very slow, or both.

Downloaders generally download pieces in random order, which does a reasonably good job of keeping them from holding a strict subset or superset of the pieces of any of their peers.

Choking is done for several reasons. TCP congestion control behaves very poorly when sending over many connections at once. Choking also lets each peer use a tit-for-tat-like algorithm to ensure that it gets a consistent download rate.

The choking algorithm described below is the currently deployed one.