BitTorrent is a protocol for distributing files. It identifies content by URL and is designed to integrate seamlessly with the web. Its advantage over plain HTTP is that when several downloaders fetch the same file at the same time, each downloader also uploads to the others, so the file source can support a very large number of users with only a modest increase in its own load. (Translator's note: because the load is spread across the whole system, the load on the machine that hosts the source file grows only slightly.)
A BT file distribution system consists of the following entities:
- an ordinary web server
- a static "metainfo" file
- a tracker server
- the end user's web browser
- the end user's downloader
Ideally, many end users download the same file at once. To serve a file, a host performs the following steps:
- Start running a tracker (or use one that is already running).
- Start running an ordinary web server, such as Apache (or use one that is already running).
- On the web server, associate the .torrent file extension with the MIME type application/x-bittorrent (or confirm that this is already done).
- Generate a metainfo (.torrent) file from the file to be shared and the URL of the tracker.
- Put the metainfo file on the web server.
- Link to the metainfo file from a web page.
- Start a downloader that already has the complete file (the 'origin', also called the 'seed').
To start downloading the file, an end user performs the following steps:
- Install BT (or have it installed already).
- Browse to the web page that offers the .torrent file.
- Click the link to the .torrent file. (Translator's note: BT then pops up a dialog.)
- Choose where to save the downloaded file, or select a partial download to resume.
- Wait for the download to complete.
- Tell the BT program to exit. (Until then, it keeps uploading to other users.)
The parts connect as follows. The web site serves static files as usual and launches the BT helper application (the client) on the user's machine. The tracker receives information from all downloaders and returns to each a random list of peers. This exchange takes place over HTTP or HTTPS. Downloaders check in with the tracker periodically to keep it informed of their progress, and they upload to and download from each other over direct connections. These connections use the BitTorrent peer protocol, which runs over TCP. The origin only uploads and never downloads, because it already has the complete file. An origin is required.
The metainfo file and the tracker responses both use a simple, efficient, and extensible format called bencoding, which can contain strings and integers. The format is extensible because dictionary keys that are not expected are ignored, so further options can be added easily.
The bencoding format is as follows:
- Strings are length-prefixed: the length in base ten, then a colon, then the string itself. For example, 4:spam corresponds to 'spam'.
- Integers are represented by an 'i', followed by the number in base ten, followed by an 'e'. For example, i3e corresponds to 3 and i-3e corresponds to -3. Integers have no size limit. i-0e is invalid. Any encoding with a leading zero, such as i03e, is invalid, other than i0e, which of course corresponds to 0.
- Lists are encoded as an 'l', followed by their elements (also bencoded), followed by an 'e'. For example, l4:spam4:eggse corresponds to ['spam', 'eggs'].
- Dictionaries are encoded as a 'd', followed by alternating keys and their corresponding values, followed by an 'e'. For example, d3:cow3:moo4:spam4:eggse corresponds to {'cow': 'moo', 'spam': 'eggs'}, and d4:spaml1:a1:bee corresponds to {'spam': ['a', 'b']}. Keys must be strings and must appear in sorted order (sorted as raw strings, not alphabetically).
The metainfo file is a bencoded dictionary with the following keys:
announce: The URL of the tracker.
info: A dictionary with the following keys:
name: A string suggesting a name under which to save the file. It is purely advisory; the file may be saved under a different name.
piece length: The number of bytes in each piece. For transfer, the file is split into fixed-size pieces, all of the same length except possibly the last. The piece length is almost always a power of two, most commonly 256K (BT versions before 3.2 used 1M as the default).
pieces: A string whose length is a multiple of 20. It is subdivided into 20-byte strings, each of which is the SHA1 hash of the piece at the corresponding index.
In addition, the info dictionary contains either a length key or a files key, but never both. If length is present, the download is a single file and length is the length of that file. If files is present, the download is a set of files in a directory.
For the purposes of the other keys, the multi-file case is treated as a single file formed by concatenating the files in the order they appear in the files list. Each entry in the files list is a dictionary with the following keys:
length: The length of the file.
path: A list of subdirectory names, the last of which is the actual file name (an empty list is not allowed).
In the single-file case, name is the name of the file; in the multi-file case, name is the name of the directory.
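The bencoding rules and the metainfo layout described above can be illustrated with a short Python sketch. This is a minimal example under stated assumptions, not the reference implementation: the function names `bencode` and `make_metainfo`, the tracker URL, and the choice of UTF-8 for text are all hypothetical, and only the single-file case is covered.

```python
import hashlib


def bencode(value):
    """Encode an int, bytes/str, list, or dict using the bencoding rules."""
    if isinstance(value, bool):
        raise TypeError("booleans are not part of bencoding")
    if isinstance(value, int):
        return b"i" + str(value).encode("ascii") + b"e"
    if isinstance(value, str):
        value = value.encode("utf-8")  # assumption: text is stored as UTF-8
    if isinstance(value, bytes):
        return str(len(value)).encode("ascii") + b":" + value
    if isinstance(value, list):
        return b"l" + b"".join(bencode(item) for item in value) + b"e"
    if isinstance(value, dict):
        # Keys must be strings and are written in raw byte order.
        items = sorted(
            ((k.encode("utf-8") if isinstance(k, str) else k, v)
             for k, v in value.items()),
            key=lambda pair: pair[0],
        )
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def make_metainfo(data, name, announce, piece_length=2 ** 18):
    """Build a single-file metainfo dictionary for the bytes in `data`."""
    # One 20-byte SHA1 digest per piece, concatenated in piece order.
    pieces = b"".join(
        hashlib.sha1(data[i:i + piece_length]).digest()
        for i in range(0, len(data), piece_length)
    )
    info = {
        "name": name,
        "piece length": piece_length,
        "pieces": pieces,
        "length": len(data),
    }
    return {"announce": announce, "info": info}


# Hypothetical payload and tracker URL, for illustration only.
meta = make_metainfo(b"x" * 600_000, "example.bin",
                     "http://tracker.example/announce")
torrent_bytes = bencode(meta)  # what would be written to a .torrent file
# The hash that identifies the download to trackers and peers.
info_hash = hashlib.sha1(bencode(meta["info"])).digest()
```

With the default 256K piece length, the 600,000-byte payload splits into three pieces, so the pieces string is 60 bytes long.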
Tracker queries. The tracker receives information through the parameters of an HTTP GET request and replies to the other side (that is, the downloader) with a bencoded message. Note that although the current tracker implementation includes its own web server, a tracker could run in a more lightweight way, for example as an Apache module.
A GET request sent to the tracker has the following keys:
info_hash: The 20-byte SHA1 hash of the bencoded info value from the metainfo file. This value will almost certainly need to be escaped. (Translator's note: some byte values cannot appear literally in a URL and must be percent-encoded.)
peer_id: The downloader's ID, a string of length 20. Each downloader generates its own ID at random at the start of a new download. This value will also almost certainly need to be escaped.
ip: An optional parameter giving the peer's IP address (or DNS name). It is generally used by the origin when it runs on the same machine as the tracker.
port: The port the peer is listening on. Downloaders commonly try to listen on port 6881 and, if that port is taken, try the following ports up to 6889, giving up if all of them are taken.
uploaded: The total amount uploaded so far, in decimal.
downloaded: The total amount downloaded so far, in decimal.
left: The amount still to be downloaded, in decimal. Note that this cannot be computed from the file length and the downloaded amount, because the download may be a resume, and some of the downloaded data may have failed an integrity check and had to be downloaded again.
event: An optional key whose value is one of started, completed, or stopped (or empty, which is the same as not being present). If this key is absent, the request is one of the announcements made at regular intervals. The value started is sent when a download first begins, and completed is sent when the download is complete. When a downloader stops downloading, it sends stopped.
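The request keys above could be assembled into a tracker URL roughly as follows. This is a sketch only: the function name `build_announce_url`, the tracker URL, and the example values are hypothetical, and the standard-library `urllib.parse.urlencode` is used to percent-escape the raw 20-byte values.

```python
import os
from urllib.parse import quote, urlencode


def build_announce_url(announce, info_hash, peer_id, port,
                       uploaded=0, downloaded=0, left=0, event=None):
    """Return the GET URL for one announcement to the tracker."""
    if len(info_hash) != 20 or len(peer_id) != 20:
        raise ValueError("info_hash and peer_id must be 20 bytes each")
    params = {
        "info_hash": info_hash,  # raw bytes; percent-escaped by urlencode
        "peer_id": peer_id,
        "port": port,
        "uploaded": uploaded,
        "downloaded": downloaded,
        "left": left,
    }
    if event is not None:  # omitted for the regular periodic announcements
        params["event"] = event
    separator = "&" if "?" in announce else "?"
    return announce + separator + urlencode(params, quote_via=quote)


# Hypothetical values: a made-up info hash and a freshly generated peer ID.
example_url = build_announce_url(
    "http://tracker.example/announce",
    info_hash=bytes(range(20)),
    peer_id=os.urandom(20),
    port=6881,
    left=600_000,
    event="started",
)
```

Passing `quote_via=quote` makes every unsafe byte appear as a `%XX` escape, which keeps arbitrary binary values such as the info hash intact in the query string.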
The tracker's response is a bencoded dictionary. If the response contains the key failure reason, its value is a string explaining why the query failed, and no other keys are required. Otherwise, the response must contain two keys:
interval: The time the downloader should wait between regular requests.
peers: A list of dictionaries, each with the keys peer id, ip, and port, which give the peer's self-selected ID, its IP address or DNS name, and its port number.
Note that a downloader may send a request ahead of schedule if an event occurs or if it needs more peers.
(The downloader sends a query to the tracker over HTTP; the tracker responds with a list of peers.)
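A downloader has to decode the tracker's bencoded reply before it can use the peer list. The following is a minimal sketch of such a decoder together with a response parser; the names `bdecode`, `_decode_at`, and `parse_tracker_response` are hypothetical, and the sample reply is invented for illustration.

```python
def _decode_at(data, pos):
    """Decode the bencoded value starting at `pos`; return (value, next_pos)."""
    lead = data[pos:pos + 1]
    if lead == b"i":
        end = data.index(b"e", pos)
        text = data[pos + 1:end]
        digits = text[1:] if text.startswith(b"-") else text
        # Reject i-0e, leading zeros, and anything that is not a decimal number.
        if (not digits.isdigit() or text == b"-0"
                or (digits[:1] == b"0" and len(digits) > 1)):
            raise ValueError("invalid integer")
        return int(text), end + 1
    if lead == b"l":
        items, pos = [], pos + 1
        while data[pos:pos + 1] != b"e":
            item, pos = _decode_at(data, pos)
            items.append(item)
        return items, pos + 1
    if lead == b"d":
        result, pos = {}, pos + 1
        while data[pos:pos + 1] != b"e":
            key, pos = _decode_at(data, pos)
            if not isinstance(key, bytes):
                raise ValueError("dictionary keys must be strings")
            value, pos = _decode_at(data, pos)
            result[key] = value
        return result, pos + 1
    if lead.isdigit():
        colon = data.index(b":", pos)
        length = int(data[pos:colon])
        start = colon + 1
        if start + length > len(data):
            raise ValueError("truncated string")
        return data[start:start + length], start + length
    raise ValueError(f"unexpected byte at offset {pos}")


def bdecode(data):
    """Decode exactly one bencoded value from `data` (bytes)."""
    value, end = _decode_at(data, 0)
    if end != len(data):
        raise ValueError("trailing data after bencoded value")
    return value


def parse_tracker_response(body):
    """Return (interval, peers); raise if the tracker reported a failure."""
    reply = bdecode(body)
    if b"failure reason" in reply:
        raise RuntimeError(reply[b"failure reason"].decode("utf-8", "replace"))
    peers = [
        (peer[b"ip"].decode("ascii"), peer[b"port"], peer[b"peer id"])
        for peer in reply[b"peers"]
    ]
    return reply[b"interval"], peers


# Invented sample reply: one peer at a documentation-range address.
sample = (b"d8:intervali1800e5:peersld2:ip9:192.0.2.17:peer id20:"
          + b"A" * 20 + b"4:porti6881e" + b"eee")
interval, peers = parse_tracker_response(sample)
```

The decoder returns strings as `bytes` rather than text, since values such as peer IDs and piece hashes are arbitrary binary data.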
If you want to extend the metainfo file or the tracker queries, please coordinate with Bram Cohen to make sure that all extensions remain compatible.
The BT peer protocol runs over TCP. It performs efficiently without setting any socket options. (Translator's note: the BT peer protocol is the protocol that peers use to exchange information with each other.) Peer connections are symmetrical: messages have the same form in both directions, and data can flow in either direction. When a peer finishes downloading a piece and has verified its integrity, it announces to all of its peers that it has that piece. Each end of a connection holds two bits of state: choked or not, and interested or not. Choking notifies the other side that no data will be sent until unchoking happens. The reasoning behind choking and the common techniques for it are explained later. Data transfer takes place whenever one side is interested and the other side is not choking. (That is, a peer that wants data from another peer must first mark the connection as interested, which in practice means sending a message. The other peer then decides whether to send data to it: if it has unchoked the requester, it may send data; otherwise it may not.) The interested state must be kept up to date at all times. Implementing this properly is tricky, but it lets a downloader know immediately which peers will start downloading once they are unchoked.
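The two status bits kept at each end of a connection can be modelled with a small class. This is only an illustrative sketch: the class and field names are hypothetical, and the initial values (choked and not interested) are an assumption taken from the published BitTorrent specification rather than from the text above.

```python
from dataclasses import dataclass


@dataclass
class ConnectionState:
    """Choke and interest bits for both directions of one peer connection."""
    am_choking: bool = True        # we are choking the remote peer
    am_interested: bool = False    # we want data from the remote peer
    peer_choking: bool = True      # the remote peer is choking us
    peer_interested: bool = False  # the remote peer wants data from us

    def can_download(self):
        # Data flows to us only if we are interested and the peer is not choking.
        return self.am_interested and not self.peer_choking

    def can_upload(self):
        # Data flows from us only if the peer is interested and we are not choking.
        return self.peer_interested and not self.am_choking


state = ConnectionState()
state.am_interested = True   # we sent 'interested'
state.peer_choking = False   # the peer sent 'unchoke'
ready = state.can_download()
```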
The peer protocol consists of a handshake followed by a never-ending stream of length-prefixed messages. The handshake starts with the byte 19 (decimal), followed by the string "BitTorrent protocol"; 19 is the length of that string. All integers sent later in the protocol are encoded as four bytes, big-endian. After the protocol name come 8 reserved bytes, which are currently all set to 0. Next comes the 20-byte SHA1 hash of the bencoded info value from the metainfo file. The receiving side compares it with the hash of its own info value. If the two differ, the other side is serving a file that the receiver does not want, so the receiver drops the connection.
Next comes the 20-byte peer id. This completes the handshake.
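The handshake layout described above (1 length byte, the 19-byte protocol name, 8 reserved bytes, the 20-byte info hash, and the 20-byte peer id, 68 bytes in total) can be sketched as follows. The function names and the sample hash and ID values are hypothetical.

```python
PSTR = b"BitTorrent protocol"
HANDSHAKE_LEN = 1 + len(PSTR) + 8 + 20 + 20  # 68 bytes in total


def build_handshake(info_hash, peer_id):
    """Return the 68-byte handshake for the given info hash and peer ID."""
    if len(info_hash) != 20 or len(peer_id) != 20:
        raise ValueError("info_hash and peer_id must be 20 bytes each")
    reserved = bytes(8)  # eight reserved bytes, all zero
    return bytes([len(PSTR)]) + PSTR + reserved + info_hash + peer_id


def parse_handshake(data, expected_info_hash):
    """Validate a received handshake and return the remote peer ID."""
    if len(data) != HANDSHAKE_LEN:
        raise ValueError("handshake must be 68 bytes")
    if data[0] != len(PSTR) or data[1:20] != PSTR:
        raise ValueError("not a BitTorrent handshake")
    info_hash, peer_id = data[28:48], data[48:68]
    if info_hash != expected_info_hash:
        # The other side is serving a different file: drop the connection.
        raise ValueError("info hash mismatch")
    return peer_id


# Made-up values for illustration.
sample_hash = bytes(range(20))
sample_peer = b"-XX0001-abcdefghijkl"
handshake = build_handshake(sample_hash, sample_peer)
remote_id = parse_handshake(handshake, sample_hash)
```

In a real client, a `ValueError` from `parse_handshake` would be the signal to close the socket.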
Next comes the stream of messages, each preceded by its length. Messages of length zero are keepalives and are ignored. A keepalive is generally sent once every 2 minutes.
Every other message starts with a single byte giving its type. The possible values are:
0 - choke
1 - unchoke
2 - interested
3 - not interested
4 - have
5 - bitfield
6 - request
7 - piece
8 - cancel
'choke', 'unchoke', 'interested', and 'not interested' carry no further data.
'bitfield' is only ever sent as the first message. Its payload is a bitmap: the bit for each piece the downloader has is set to 1, and all other bits are set to 0. A downloader that has no pieces yet may skip this message. (Translator's note: from this message, a peer learns which pieces the other side already has.)
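A 'bitfield' payload can be packed and unpacked as in the sketch below. The text above does not state the bit order; this example assumes the convention of the published BitTorrent specification, in which the high bit of the first byte stands for piece 0. The function names are hypothetical.

```python
def encode_bitfield(have, num_pieces):
    """Pack a set of piece indexes into a 'bitfield' payload."""
    field = bytearray((num_pieces + 7) // 8)
    for index in have:
        if not 0 <= index < num_pieces:
            raise ValueError(f"piece index {index} out of range")
        # Assumed bit order: high bit of byte 0 is piece 0.
        field[index // 8] |= 0x80 >> (index % 8)
    return bytes(field)


def decode_bitfield(field, num_pieces):
    """Return the set of piece indexes marked as present in `field`."""
    if len(field) != (num_pieces + 7) // 8:
        raise ValueError("bitfield has the wrong length")
    return {
        index for index in range(num_pieces)
        if field[index // 8] & (0x80 >> (index % 8))
    }


# A peer with pieces 0 and 9 out of 10.
payload = encode_bitfield({0, 9}, 10)
owned = decode_bitfield(payload, 10)
```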
The payload of a 'have' message is a single number: the index of the piece that the downloader has just finished downloading and has verified. (From this, you can see that peers learn about a newly completed piece almost immediately.)
A 'request' message contains an index, a start position, and a length. The length is generally a power of two. Current implementations use 2^15, and they close connections that request a length greater than 2^17. (A peer sends this message when it wants another peer to send it part of a piece.)
A 'cancel' message has the same payload as a 'request' message. Cancels are generally sent only as the download nears completion, in what is called 'endgame mode'. When a download is almost complete, the last few pieces tend to take a long time to arrive. To make sure they arrive quickly, the downloader sends requests for them to all of its peers. To keep this from becoming horribly inefficient, it sends a 'cancel' to the other peers as soon as a piece arrives. (In other words: I no longer need this piece, so if you were about to send it, do not bother. If the other side sends it anyway, the downloader simply ignores the duplicate data.)
A 'piece' message contains an index, a start position, and the actual data. Note that 'piece' messages are implicitly tied to 'request' messages (Translator's note: a 'piece' message is normally sent only in response to a 'request' message). An unexpected piece may arrive if 'choke' and 'unchoke' messages are sent in quick succession or if the transfer is going very slowly. (That is, a peer may sometimes receive data that it did not ask for or no longer wants.)
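The message framing described in this section (a four-byte big-endian length, one type byte, then the payload) can be sketched as follows. The function names are hypothetical, and the numeric type values are those of the published BitTorrent specification.

```python
import struct

(CHOKE, UNCHOKE, INTERESTED, NOT_INTERESTED,
 HAVE, BITFIELD, REQUEST, PIECE, CANCEL) = range(9)


def encode_message(msg_type=None, payload=b""):
    """Frame one message; msg_type=None yields a zero-length keepalive."""
    if msg_type is None:
        return struct.pack(">I", 0)
    return struct.pack(">IB", 1 + len(payload), msg_type) + payload


def encode_request(index, begin, length=2 ** 15):
    """A 'request' for `length` bytes of piece `index` starting at `begin`."""
    return encode_message(REQUEST, struct.pack(">III", index, begin, length))


def encode_piece(index, begin, block):
    """A 'piece' message carrying `block` for piece `index` at `begin`."""
    return encode_message(PIECE, struct.pack(">II", index, begin) + block)


def decode_messages(buffer):
    """Split `buffer` into (msg_type, payload) pairs plus unread leftover bytes."""
    messages, pos = [], 0
    while len(buffer) - pos >= 4:
        (length,) = struct.unpack_from(">I", buffer, pos)
        if len(buffer) - pos - 4 < length:
            break  # the rest of this message has not arrived yet
        body = buffer[pos + 4:pos + 4 + length]
        pos += 4 + length
        if length == 0:
            continue  # keepalive: ignored
        messages.append((body[0], body[1:]))
    return messages, buffer[pos:]


# A keepalive, an 'interested', a 'request', and a partially received 'piece'.
partial_piece = encode_piece(1, 0, b"abc")[:5]
stream = (encode_message() + encode_message(INTERESTED)
          + encode_request(1, 0) + partial_piece)
decoded, leftover = decode_messages(stream)
```

Returning the leftover bytes lets the caller keep them in a buffer and retry once more data has arrived on the TCP connection.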