Tracker Server Source Code Analysis 3: The HTTPHandler Class

zhaozj2021-02-16 103

Author: Ma Ying-jeou

Date: 2004-6-7

This article analyzes the HTTPHandler class, which is defined in the file HTTPHandler.py.

As the previous article showed, RawServer is only responsible for network I/O, that is, reading data from the network and writing data to it. How the data that has been read is parsed, and what data should be sent, is left to the Handler class. In a C++ implementation, Handler would be an interface class (providing a few virtual functions as its interface). Because Python is a dynamic language, there is no need to define such an interface class explicitly, so no such class actually exists. Any class that provides the following member functions can act as a Handler class and cooperate with RawServer:

external_connection_made(): called when a new connection is established

data_came_in(): called when there is data to read on a connection

connection_flushed(): called when the data queued on a connection has been sent

HTTPHandler is one such Handler class: it provides exactly this interface.
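The duck typing described above can be sketched as follows. This is only an illustration, not code from BitTorrent: the class name EchoHandler, the drive() function standing in for RawServer's event loop, and the parameter lists are assumptions; only the three method names come from the text.

```python
class EchoHandler:
    """Hypothetical handler: it only records the events it receives."""

    def __init__(self):
        self.events = []

    def external_connection_made(self, connection):
        # A new connection has been established.
        self.events.append(("made", connection))

    def data_came_in(self, connection, data):
        # Data has been read from the connection.
        self.events.append(("data", connection, data))

    def connection_flushed(self, connection):
        # Everything queued for sending has gone out.
        self.events.append(("flushed", connection))


def drive(handler, connection, chunks):
    """Stand-in for the RawServer event loop (assumed, greatly simplified)."""
    handler.external_connection_made(connection)
    for chunk in chunks:
        handler.data_came_in(connection, chunk)
    handler.connection_flushed(connection)


handler = EchoHandler()
drive(handler, "conn-1", [b"GET / HTTP/1.0\r\n", b"\r\n"])
```

No base class or interface declaration is involved: drive() works with any object that happens to have these three methods.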

HTTPHandler contains very little code, because it delegates the main work to HTTPConnection.

Let's look at these functions of the HTTPHandler class:

• external_connection_made():

Whenever a new connection is established, an HTTPConnection object is created for it.

• data_came_in():

When there is data to read on a connection, HTTPConnection::data_came_in() is called. Let's take a look at HTTPConnection::data_came_in().

We know that the BT client and the Tracker server communicate over the HTTP protocol. HTTP is divided into requests and responses; for the details of the protocol, see the relevant RFC documents. I will only give a brief outline here.

For the Tracker server, the data it reads is an HTTP request from the client.

An HTTP request is organized in lines, and each line ends with a carriage return and line feed, that is, the ASCII characters "\r" and "\n".

The first line is the request line, which carries the requested URL, for example:

GET /announce?ip=aaaaa;port=bbbbbb HTTP/1.0

This line is split by spaces into three parts.

The first part, GET, is the command. Other commands include POST, HEAD, and so on, of which GET is the most commonly used.

The second part is the requested URL, here /announce?ip=aaaaa;port=bbbbbb. In ordinary web browsing, the URL is the relative path on the web server of the page we want to view. Here, however, the URL is just a means of exchanging information: the client puts the information it wants to report to the Tracker into the URL, in this example ip and port. For more details, see the Tracker protocol section of the "BT Protocol Specification".

The third part is the version number of the HTTP protocol, which the program ignores.
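The three-part split can be sketched in a few lines. This is an illustration only: parse_request_line() is a hypothetical helper, not a function from the Tracker source, and it accepts both ";" (as in the example above) and "&" as parameter separators.

```python
import re


def parse_request_line(line):
    """Split 'COMMAND URL VERSION' and pull the parameters out of the URL."""
    words = line.split()
    if len(words) != 3:
        raise ValueError("expected 'COMMAND URL VERSION'")
    command, url, version = words
    # Everything after '?' carries the information reported by the client.
    path, _, query = url.partition("?")
    params = {}
    for pair in re.split("[;&]", query):
        if pair:
            key, _, value = pair.partition("=")
            params[key] = value
    return command, path, params, version


result = parse_request_line("GET /announce?ip=aaaaa;port=bbbbbb HTTP/1.0")
# result == ("GET", "/announce", {"ip": "aaaaa", "port": "bbbbbb"}, "HTTP/1.0")
```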

The lines that follow are the HTTP message headers, for example:

Host: www.sina.com.cn
Accept-Encoding: gzip

From the message headers, the Tracker server can learn some information about the client. The important one here is Accept-Encoding. If its value is gzip, the client can decompress data in gzip format, so the Tracker server can consider compressing the response data with gzip before sending it back, in order to reduce network traffic. We can see the corresponding handling in the code.
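The idea can be sketched as follows. The function name encode_body(), the headers dictionary with lower-case keys, and the fallback to the uncompressed body when compression does not make it smaller are assumptions made for this illustration, not a transcription of the Tracker code.

```python
import gzip


def encode_body(body, headers):
    """Return (payload, encoding) for a response body given request headers."""
    accepted = headers.get("accept-encoding", "")
    if "gzip" in accepted.lower():
        compressed = gzip.compress(body)
        # Compressing only pays off if the result is actually smaller.
        if len(compressed) < len(body):
            return compressed, "gzip"
    return body, "identity"


body = b"d8:intervali1800e5:peers" * 50
payload, encoding = encode_body(body, {"accept-encoding": "gzip"})
plain, plain_encoding = encode_body(body, {})
```

For a client that does not announce gzip support, the body is returned unchanged.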

The message headers are followed by a blank line, which marks their end. For the GET and HEAD commands, the end of the message headers also means the end of the whole client request. For the POST command, further data may follow. Since our Tracker server accepts only the GET and HEAD commands, protocol processing is complete as soon as a blank line is encountered.

HTTPConnection::data_came_in() uses a loop to parse the protocol:

The first step is to look for the end-of-line character:

i = self.buf.index('\n')

(I think looking only for "\n" is not rigorous; the code should look for the "\r\n" sequence.)

If it is not found, index() raises an exception, and the exception handler returns True, indicating that there is not enough data yet and more needs to be read.

If it is found, the string before position i is a complete line, so the protocol handler function is called. The code is:

self.next_func = self.next_func(val)

In the initialization of HTTPConnection there is this line of code:

self.next_func = self.read_type

next_func holds the current protocol handler function, so the first handler to be called is read_type(), which parses the first line of the client request. At the end of read_type() we see:

return self.read_header

This way, the next time next_func is called, it is read_header() that runs, and it parses the message headers of the HTTP protocol.
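The loop and the next_func hand-off can be put together in a small sketch. MiniConnection is a hypothetical, heavily simplified stand-in for HTTPConnection: the bodies of read_type() and read_header(), the done flag, and the meaning of a False return value are assumptions made for illustration, and the code is written for Python 3 with bytes buffers.

```python
class MiniConnection:
    """Simplified stand-in for HTTPConnection (illustrative only)."""

    def __init__(self):
        self.buf = b""
        self.path = None
        self.headers = {}
        self.done = False
        self.next_func = self.read_type  # first handler: the request line

    def read_type(self, line):
        words = line.split()
        if len(words) != 3:
            return None  # malformed request line
        command, self.path, _version = words
        if command not in (b"GET", b"HEAD"):
            return None  # unsupported command
        return self.read_header

    def read_header(self, line):
        if line == b"":
            self.done = True  # blank line: the request is complete
            return None
        name, _, value = line.partition(b":")
        self.headers[name.strip().lower()] = value.strip()
        return self.read_header

    def data_came_in(self, data):
        self.buf += data
        while True:
            try:
                i = self.buf.index(b"\n")
            except ValueError:
                return True  # no complete line yet: wait for more data
            line = self.buf[:i].rstrip(b"\r")
            self.buf = self.buf[i + 1:]
            # Each handler returns the handler for the next line.
            self.next_func = self.next_func(line)
            if self.next_func is None:
                return False  # request finished or rejected


conn = MiniConnection()
first = conn.data_came_in(b"GET /announce?ip=aaaaa;port=bbbbbb HTT")
second = conn.data_came_in(b"P/1.0\r\nAccept-Encoding: gzip\r\n\r\n")
```

The first call sees no complete line and asks for more data; the second call runs read_type() once and read_header() twice, ending at the blank line.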

Let's look at read_type() first.

It first saves the URL part of the GET command in self.path, because this is the most important information from the client and is used later.

It then checks whether the command is GET or HEAD. If it is neither, the data is invalid and the function returns None; otherwise it returns self.read_header.

Next, let's look at read_header().

The most important part here is the handling of the blank line, because a blank line marks the end of protocol parsing.

After checking whether the client supports gzip encoding, it calls:

r = self.handler.getfunc(self, self.path, self.headers)

Through several layers of indirection, getfunc() is actually Tracker::get(). In other words, actually analyzing the request from the client and deciding how to respond is up to Tracker. Yes, this is the Tracker we met in the first article of this Tracker source analysis series: right after the RawServer is created, a Tracker object is created. So to understand how the Tracker server really works, we need to dig into the Tracker class, which is the subject of the next article. The value returned by Tracker::get() is the data to be sent back to the client.

if r is not None:
    self.answer(r)

Finally, answer() is called to send this data to the client.

The analysis of answer() is left to the next article, which analyzes the Tracker class.

• connection_flushed():

The Tracker server uses non-blocking network I/O, so there is no guarantee that all the data to be sent goes out in a single operation.

This function checks whether all the data that needs to be sent on a connection has been sent. If so, it shuts down the sending side of the connection. (Why shut down only the sending side instead of closing the connection completely? I am not sure.)
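To make concrete what shutting down only the sending side means at the socket level, here is a small sketch. It is not the Tracker's code: socket.socketpair() stands in for a real client connection, blocking sockets are used for brevity, and send_and_half_close() is a hypothetical helper.

```python
import socket


def send_and_half_close(sock, data):
    """Send everything, then shut down only the sending direction."""
    sock.sendall(data)
    sock.shutdown(socket.SHUT_WR)


server, client = socket.socketpair()
send_and_half_close(server, b"response")

# The peer reads the whole response and then sees end-of-file.
received = b""
while True:
    chunk = client.recv(4096)
    if not chunk:
        break
    received += chunk

# The other direction is still open: the peer can keep sending.
client.sendall(b"late")
late = server.recv(4096)

server.close()
client.close()
```

After the half-close, the peer sees end-of-file once it has read the response, while data can still travel in the opposite direction until the socket is fully closed.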
