Tracker Server Source Code Analysis 2: The RawServer Class
Author: Ma Ying-jeou
Date: 2004-5-30
In this article we analyze RawServer and some related classes. The RawServer class is implemented in RawServer.py in the BitTorrent subdirectory.
RawServer's role is to implement a network server. For background on network programming, "UNIX Network Programming: Volume 1" is the classic reference; if the material here is unfamiliar, that book is recommended reading. RawServer implements an event-multiplexed, non-blocking network model. It uses the poll() function rather than the more common select() (the book also compares poll and select). The process is roughly as follows:
First, create a listening socket and add it to poll's event sources.
Then enter the service loop, namely:
Call poll(). This function blocks until an event occurs or the timeout expires, and then returns to the caller.
After poll() returns, check whether any tasks are waiting to be processed; if so, complete those tasks first. Then handle each event according to its type:
If it is a connection request (a POLLIN event on the listening socket), accept the request. If accept() succeeds, a connection to a client has been established, and the new socket returned by accept() is added to poll's event sources.
If data is readable on an established connection (a POLLIN event on the connected socket), read the data sent by the client and process it further.
If an established connection is ready for writing (a POLLOUT event on the connected socket), check whether there is data waiting to be sent; if there is, send it to the client.
(So the Tracker is a single-process server and does not use threads.)
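The loop described above can be sketched in a few lines of modern Python. This is only an illustration of the poll-based model, not RawServer's actual code: the names make_listener, poll_once, conns, and outbox are hypothetical, and upper-casing the input stands in for real request processing. It assumes a platform where select.poll is available (for example Linux).

```python
import select
import socket

def make_listener(host="127.0.0.1", port=0):
    """Create a non-blocking listening socket (port 0 lets the OS choose)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setblocking(False)
    listener.bind((host, port))
    listener.listen(5)
    return listener

def poll_once(poller, listener, conns, outbox, timeout_ms=100):
    """One pass of the service loop: wait for events, then dispatch them."""
    for fd, event in poller.poll(timeout_ms):
        if fd == listener.fileno():
            # POLLIN on the listening socket: a client is waiting to be accepted.
            conn, _addr = listener.accept()
            conn.setblocking(False)
            conns[conn.fileno()] = conn
            outbox[conn.fileno()] = b""
            poller.register(conn, select.POLLIN)
            continue
        conn = conns[fd]
        if event & select.POLLIN:
            data = conn.recv(4096)
            if not data:
                # An empty read means the peer closed the connection.
                poller.unregister(fd)
                conn.close()
                del conns[fd], outbox[fd]
                continue
            outbox[fd] += data.upper()  # stand-in for real request handling
            poller.modify(fd, select.POLLIN | select.POLLOUT)
        if event & select.POLLOUT and outbox[fd]:
            sent = conn.send(outbox[fd])
            outbox[fd] = outbox[fd][sent:]
            if not outbox[fd]:
                # Nothing left to send: stop asking for write events.
                poller.modify(fd, select.POLLIN)
```

A caller would register the listener with a select.poll() object for POLLIN and then call poll_once() repeatedly; everything runs in one thread of one process.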
Bram Cohen believes that maintainability is very important, and that an important part of making code easy to maintain is writing reusable classes. RawServer was designed with reusability fully in mind, which shows in two places:
1. Separating network I/O from data processing.
In an event-multiplexed network server, the network I/O part is usually fixed, while what happens to the data after it is read, the analysis and processing, varies. RawServer hands this variable part over to an abstract Handler class (no class by that name actually exists). For example, the Tracker server implementation uses the HTTPHandler class, and the BT client code analyzed later uses the Encoder class as its handler.
2. Using a task queue to abstract task processing.
RawServer maintains a task queue, unscheduled_tasks (actually a list of 2-tuples; the first item of each tuple is a function and the second is a timeout, that is, the delay before the function runs). During initialization, one task is added to this queue first: scan_for_timeouts(). As a result, the server periodically checks whether any connection has timed out. The member functions that RawServer exposes externally are:
- __init__(): the initialization function.
- add_task():
Adds a task to the task list (a task is a function combined with a specified timeout).
- bind():
First creates a socket, then sets the socket options SO_REUSEADDR and IP_TOS (for the exact meaning of these two options, see "UNIX Network Programming: Volume 1"), and sets the socket to non-blocking. Compared with blocking sockets, non-blocking sockets improve network I/O performance considerably, but they also make programming more complex. A server like the Tracker, which has to handle thousands of concurrent connections, has no choice but to use non-blocking sockets.
Then binds the socket to the specified IP address and port.
Finally, adds the socket to poll's event sources.
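The steps that bind() performs can be sketched as follows. This is a sketch under assumptions rather than the original code: the function name bind_listener, its parameters, the IP_TOS value 32, and the backlog of 5 are illustrative choices, and the listen() call is added because a listening socket needs it even though the text above does not mention it.

```python
import select
import socket

def bind_listener(poller, port, bind_addr="", reuse=False):
    """Create, configure, and register a non-blocking listening socket."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    if reuse:
        # Allow rebinding the port quickly after a restart.
        server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.setblocking(False)
    try:
        # IP_TOS is only a hint to the network; skip it where unsupported.
        server.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, 32)
    except (OSError, AttributeError):
        pass
    server.bind((bind_addr, port))
    server.listen(5)
    # Readability on a listening socket means a connection is pending.
    poller.register(server, select.POLLIN)
    return server
```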
- start_connection():
Actively establishes an outgoing connection. This function is used for NAT traversal; it will be explained when we analyze NAT traversal.
- listen_forever():
This function implements the network server processing loop described earlier. Its only parameter is handler, whose role is to encapsulate the specific processing of data.
listen_forever() hands the processing of network events over to handle_events().
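To make the handler idea concrete, here is a small sketch of the separation between the I/O layer and the data-processing layer. Only the callback names external_connection_made and data_came_in come from this article; the classes UpperCaseHandler and MiniServer and the simulate() method are hypothetical, and no real sockets are involved.

```python
class UpperCaseHandler:
    """Hypothetical handler: records connections and upper-cases input."""

    def __init__(self):
        self.log = []

    def external_connection_made(self, conn):
        self.log.append(("connected", conn))

    def data_came_in(self, conn, data):
        self.log.append(("data", conn, data.upper()))


class MiniServer:
    """Stand-in for the I/O layer; it knows only the handler's callbacks."""

    def __init__(self, handler):
        self.handler = handler

    def simulate(self, conn, chunks):
        # A real server would call these from its poll loop.
        self.handler.external_connection_made(conn)
        for chunk in chunks:
            self.handler.data_came_in(conn, chunk)
```

Swapping in a different handler object changes what the server does with the data without touching the I/O code, which is the reuse the design is aiming for.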
The other functions, including handle_events(), are internal functions (that is, they are not called directly from outside). Python has no protection mechanism such as public, protected, and private; the convention for naming a class's internal functions is a leading underscore, as in _close_dead() in RawServer.
- handle_events():
Event handling is divided by the three kinds of network event: connection events, read events, and write events.
if sock == self.server.fileno():
This code checks whether the socket on which the event occurred is the listening socket; if it is, this is a connection event.
Connection event processing:
Accept the connection with accept() and set the newly created socket to non-blocking.
Check whether the number of connections has reached the maximum (to limit the number of concurrent connections, a maximum must be specified when RawServer is initialized). If the maximum has been reached, close the newly created connection.
Otherwise, create a SingleSocket object from the new socket (SingleSocket encapsulates operations on the socket), add this object to the internal collection single_sockets for later use, and add the new socket to poll's event sources.
Finally, call the handler's external_connection_made() function; this function will be discussed when HTTPHandler is analyzed later.
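The connection-event steps can be sketched like this. It is a simplified illustration, not the original code: accept_connection and max_connects are hypothetical names, and the raw socket is stored directly where RawServer would wrap it in a SingleSocket object.

```python
import select
import socket

def accept_connection(server, poller, single_sockets, max_connects, handler=None):
    """Accept one pending connection, enforcing a connection cap.

    Returns the new socket, or None if the cap had been reached.
    """
    newsock, _addr = server.accept()
    newsock.setblocking(False)
    if len(single_sockets) >= max_connects:
        # Too many connections already: drop the new one immediately.
        newsock.close()
        return None
    single_sockets[newsock.fileno()] = newsock
    poller.register(newsock, select.POLLIN)
    if handler is not None:
        handler.external_connection_made(newsock)
    return newsock
```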
if (event & POLLIN) != 0:
This code checks whether the event is a read event.
Read event processing:
First refresh the last update time (last_hit).
Then read the data.
If nothing is read, the connection has been closed (in network programming, when the peer closes a connection a read event is triggered, but the read returns nothing).
Otherwise, call the handler's data_came_in() function to process the data that was read.
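The read path, including the "empty read means closed" rule, can be shown with a short sketch. The function handle_read, the state dictionary, and the on_data / on_close callbacks are hypothetical names introduced for illustration; the buffer size is an arbitrary choice.

```python
import socket
import time

def handle_read(conn, state, on_data, on_close):
    """Process a read event on an established connection."""
    state["last_hit"] = time.time()   # refresh the idle timer first
    data = conn.recv(100000)
    if data == b"":
        # Readable but empty: the peer has closed the connection.
        on_close(conn)
    else:
        on_data(conn, data)
```

With socket.socketpair() this is easy to try out: after one end calls close(), a read on the other end returns an empty bytes object, so on_close is invoked instead of on_data.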
if (event & POLLOUT) != 0 and s.socket is not None and not s.is_flushed():
This code checks whether the event is a write event and whether there really is data waiting to be sent. A write event occurs when a connection becomes writable.
Write event processing:
The actual code is in the try_write() function of SingleSocket.
On a non-blocking connection, a send of a given size may return after only part of the data has been sent, so after every write you must check whether the data was sent completely. If it was not, the remaining data has to be sent the next time a write event occurs. This is why the function is called try_write.
At the end, try_write() resets the poll event mask. If all the data has been sent, it listens only for read events (POLLIN); otherwise it also listens for write events (POLLOUT), so that as soon as the connection becomes writable, the remaining data can be sent.
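A minimal sketch of this partial-send logic follows. BufferedWriter is a hypothetical stand-in for the write side of SingleSocket, written for Python 3; only the method names try_write and is_flushed come from the article, and the buffer layout is an assumption.

```python
import select
import socket

class BufferedWriter:
    """Hypothetical stand-in for the write side of SingleSocket."""

    def __init__(self, sock, poller):
        self.sock = sock
        self.poller = poller
        self.buffer = []  # chunks still waiting to be sent

    def is_flushed(self):
        return not self.buffer

    def write(self, data):
        self.buffer.append(data)
        if len(self.buffer) == 1:
            self.try_write()

    def try_write(self):
        try:
            while self.buffer:
                sent = self.sock.send(self.buffer[0])
                if sent != len(self.buffer[0]):
                    # Partial send: keep the unsent tail for the next POLLOUT.
                    self.buffer[0] = self.buffer[0][sent:]
                    break
                del self.buffer[0]
        except BlockingIOError:
            pass  # kernel buffer is full; wait for the next write event
        # Re-arm poll: always read, and write only while data is pending.
        mask = select.POLLIN
        if not self.is_flushed():
            mask |= select.POLLOUT
        self.poller.register(self.sock, mask)
```

Re-registering a descriptor with select.poll replaces its event mask, so calling register() at the end of every try_write() is enough to switch POLLOUT on and off.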
- scan_for_timeouts():
A task handler. It first adds itself back to the unprocessed task queue, which guarantees that it will be called again after a period of time and so achieves the effect of a periodic call.
It checks whether each connection has gone longer than the specified time without being refreshed; if so, the connection is probably dead, and it is closed.
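The self-rescheduling pattern can be sketched as below. IdleReaper and its attributes are hypothetical; connections are represented only by an id and a last_hit timestamp, and "closing" is simulated by recording the id, where a real server would close the socket.

```python
import time

class IdleReaper:
    """Hypothetical holder of connections with a last_hit timestamp."""

    def __init__(self, timeout, check_interval, add_task, clock=time.time):
        self.timeout = timeout
        self.check_interval = check_interval
        self.add_task = add_task      # callable(func, delay)
        self.clock = clock
        self.last_hit = {}            # connection id -> last activity time
        self.closed = []

    def scan_for_timeouts(self):
        # Re-queue ourselves first so the check repeats periodically.
        self.add_task(self.scan_for_timeouts, self.check_interval)
        cutoff = self.clock() - self.timeout
        dead = [c for c, t in self.last_hit.items() if t < cutoff]
        for c in dead:
            del self.last_hit[c]
            self.closed.append(c)     # a real server would close the socket
```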
- pop_unscheduled():
Pops an unprocessed task from the task list.
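One way a task queue of (function, delay) pairs could work is sketched below. This is an assumption-laden illustration, not the original implementation: the TaskQueue class, the sorted funcs list, the sequence counter, and run_due() are all hypothetical; only unscheduled_tasks, add_task, and pop_unscheduled are names taken from the article.

```python
import time
from bisect import insort

class TaskQueue:
    """Hypothetical task queue holding (func, delay) pairs."""

    def __init__(self, clock=time.time):
        self.clock = clock
        self.unscheduled_tasks = []   # (func, delay) pairs, not yet timed
        self.funcs = []               # (due_time, seq, func), kept sorted
        self._seq = 0                 # tie-breaker so functions are never compared

    def add_task(self, func, delay):
        self.unscheduled_tasks.append((func, delay))

    def pop_unscheduled(self):
        # Turn each relative delay into an absolute due time.
        while self.unscheduled_tasks:
            func, delay = self.unscheduled_tasks.pop()
            self._seq += 1
            insort(self.funcs, (self.clock() + delay, self._seq, func))

    def run_due(self):
        self.pop_unscheduled()
        now = self.clock()
        while self.funcs and self.funcs[0][0] <= now:
            _due, _seq, func = self.funcs.pop(0)
            func()
```

A task that calls add_task() on itself while running lands in unscheduled_tasks again and is picked up by the next run_due() call, which gives the periodic behaviour that scan_for_timeouts() relies on.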
RawServer is used together with the SingleSocket class, an auxiliary class whose main purpose is to encapsulate handling of the socket; sending data, for instance, is delegated to it. The class is fairly simple, so you can read it yourself; I will not go into detail here.
That concludes the analysis of RawServer's implementation. Readers may still feel dizzy; there is no way around it, you have to read the source code, and when you run into problems, come back to this article and it will help. If you never read the source code yourself, it all remains theory on paper.
Let's summarize.
RawServer encapsulates the details of a network server and implements an event-multiplexed, non-blocking network model. It is mainly responsible for establishing new connections and for reading and sending data over the network; the specific processing of the data that is read is handed over to a Handler class. This separates network I/O from data processing and makes RawServer reusable. The Handler is passed in by the caller when listen_forever() is called; for the Tracker server it is HTTPHandler. With RawServer, the Tracker can run as a network server. In the next section we begin analyzing the HTTPHandler and Tracker classes, which implement the Tracker HTTP protocol.