A class that encapsulates the Winsock completion port model
Author: elssann@yahoo.com.cn
Please credit the original author when reprinting.
For network server development on Windows, the Winsock completion port model is without doubt the most efficient choice. It is built on Windows overlapped I/O and completion ports. The model itself is fairly simple, but to really master it you need some understanding of processes, threads, thread synchronization, the Winsock API, and the Windows I/O mechanism. If you are not familiar with these, I recommend several books: "Inside Windows 2000", "Windows Core Programming", "Win32 Multi-Thread Program Design", and "Windows Network Programming Technology".
Last year I wrote a web server in C using the completion port model. I then decided to rewrite this web server in C++, add some features, and improve the way it uses the completion port, for example by using AcceptEx instead of accept and by using a lookaside list to manage memory, so that the web server's performance improved considerably.
At the start of the rewrite I decided to wrap the completion port model in a reasonably general C++ class that can be used to develop all kinds of network servers: you simply inherit from the class and override two virtual functions to fit your needs. The web server rewrite was finished yesterday, so I am writing this article to summarize the completion port model and to introduce my class.
One: The completion port model
For a detailed introduction to completion ports and the Winsock completion port model, see the books listed above. Here I only describe some of my own experience with the model.
First, let us abstract the processing flow of the completion port model:
1: Create a completion port.
2: Create a thread A.
3: Thread A calls GetQueuedCompletionStatus() in a loop to fetch the results of I/O operations. This function blocks.
4: The main thread calls accept in a loop and waits for client connections.
5: When accept returns a new connection in the main thread, associate the new socket handle with the completion port using CreateIoCompletionPort, then issue an asynchronous WSASend or WSARecv call. Because the call is asynchronous, WSASend/WSARecv returns immediately; the actual sending or receiving of data is done by the Windows system.
6: The main thread continues with the next iteration and blocks in accept, waiting for the next client connection.
7: The Windows system finishes the WSASend or WSARecv operation and posts the result to the completion port.
8: GetQueuedCompletionStatus() in thread A returns immediately and takes the result of the just-completed WSASend/WSARecv from the completion port.
9: Thread A processes the data (if the processing is time consuming, it should be handed to a separate thread), then issues another WSASend/WSARecv and continues with the next iteration, blocking in GetQueuedCompletionStatus() again.
See the figure for details. The red lines mark the work done by the Windows system, which requires no intervention from our program.
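The nine steps can be mimicked without Winsock to make the moving parts visible. The following is only a portable sketch, not the Windows API: `CompletionQueue` is a hypothetical stand-in for a completion port, `post` plays the role of the system queuing a finished I/O (step 7), and `get` plays the role of GetQueuedCompletionStatus blocking in thread A (steps 3 and 8). All names here are assumptions made for illustration.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for one completed I/O operation.
struct Completion {
    int         socket_id;  // which connection the I/O belonged to
    std::string data;       // bytes "received"; empty is used as a stop marker
};

// Hypothetical stand-in for a completion port: a blocking FIFO queue.
class CompletionQueue {
public:
    // Plays the role of the system posting a finished I/O.
    void post(Completion c) {
        {
            std::lock_guard<std::mutex> lock(m_);
            q_.push(std::move(c));
        }
        cv_.notify_one();
    }
    // Plays the role of GetQueuedCompletionStatus: blocks until a result exists.
    Completion get() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return !q_.empty(); });
        assert(!q_.empty());
        Completion c = std::move(q_.front());
        q_.pop();
        return c;
    }
private:
    std::mutex              m_;
    std::condition_variable cv_;
    std::queue<Completion>  q_;
};

// Runs "thread A" over simulated completions and returns the replies it built.
std::vector<std::string> run_echo_demo(const std::vector<std::string>& inputs) {
    CompletionQueue port;
    std::vector<std::string> echoed;
    std::thread worker([&] {
        for (;;) {
            Completion c = port.get();           // blocks like step 3
            if (c.data.empty()) break;           // stop marker
            echoed.push_back("echo:" + c.data);  // step 9: process the data
        }
    });
    for (std::size_t i = 0; i < inputs.size(); ++i)
        port.post({static_cast<int>(i), inputs[i]});
    port.post({-1, ""});                         // ask the worker to stop
    worker.join();
    return echoed;
}
```

In a real server the queue is owned by the operating system and the entries are posted by the kernel when an overlapped WSASend/WSARecv finishes; the sketch only shows why thread A can stay blocked until there is work.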
In the final analysis, the completion port model comes down to this:
We keep issuing asynchronous WSASend/WSARecv I/O operations, and the actual I/O is carried out by the Windows system. When the system has finished the I/O, it posts the result to the completion port (if several I/O operations complete, they are queued in the completion port). In another thread we keep taking completed I/O results off the completion port, then issue further WSASend/WSARecv operations as needed. My web server follows this model. It supports GET and POST requests: GET is used to request HTML pages, and POST is used by the client browser to submit or query data. When a POST command is found, the work is handed to a thread pool.
Two: Several effective ways to improve completion port efficiency
1: Use AcceptEx instead of accept. AcceptEx is a Microsoft extension to Winsock. The difference between it and accept is that accept blocks and returns only after a client has connected, whereas AcceptEx is asynchronous and returns immediately, so we can issue several AcceptEx calls and let them wait for client connections. In addition, if we can expect the client to send data right after connecting (as a web server's client, the browser, does), we can pass a buffer to AcceptEx so that the first data from the client is received into the buffer as soon as the connection succeeds. Used this way, one AcceptEx call is equivalent to calling accept and then recv. Microsoft's extension functions are also optimized for the operating system and are more efficient than the standard Winsock API functions.
2: Use the SO_RCVBUF and SO_SNDBUF options on the socket to turn off the system buffers. Opinions differ on this method; for a detailed discussion see Chapter 9 of "Windows Core Programming". I will not go into detail here, and I have not used this method in my class.
3: Memory allocation. Every newly established socket needs a dynamically allocated "per-I/O data" structure and a "per-handle data" structure, which are released when the socket is closed. With thousands of clients connecting and disconnecting frequently, the program spends a lot of overhead on allocating and releasing memory. Here we can use a lookaside list. I first saw the lookaside list in a sample in Microsoft's Platform SDK. I did not understand it, and MSDN had no entry for it. Later I found it in the DDK documentation:
LOOKASIDE LIST
A system-managed queue from which entries of a fixed size can be allocated and into which entries can be deallocated dynamically. Callers of the Ex(ecutive) Support lookaside list routines can use a lookaside list to manage any dynamically sized set of fixed-size buffers or structures with caller-determined contents.
For example, the I/O Manager uses a lookaside list for fast allocation and deallocation of IRPs and MDLs. As another example, some of the system-supplied SCSI class drivers use lookaside lists to allocate and release memory for SRBs.
The name "lookaside list" is rather odd (perhaps I am just ill-informed; this was the first time I had seen it), but it is really just a memory management technique, similar to a memory pool. My own understanding is that it is a singly linked list. Each time memory is needed, first check whether the list is empty. If it is not empty, take an entry from the list instead of allocating a new one. If it is empty, allocate dynamically. After use, the data structure is not released but inserted back into the list for next time. This is much more efficient than allocating and releasing memory every time. In my program I manage memory with such a singly linked list.
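The linked-list idea described above can be sketched in a few lines. This is a minimal, single-threaded illustration under stated assumptions, not the class's actual code: `PerIoData`, its 4096-byte buffer, and `FreeList` are hypothetical names, and a real server would need a lock (or an interlocked list) because several threads acquire and release entries concurrently.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical per-I/O record; the real structure is not shown in the article.
struct PerIoData {
    char       buffer[4096];
    PerIoData* next;  // link used only while the record sits in the free list
};

// Single-threaded sketch of a lookaside-style free list.
class FreeList {
public:
    FreeList() : head_(nullptr), cached_(0), fresh_allocs_(0) {}
    FreeList(const FreeList&) = delete;
    FreeList& operator=(const FreeList&) = delete;

    // Memory is really freed only when the list itself goes away.
    ~FreeList() {
        while (head_ != nullptr) {
            PerIoData* next = head_->next;
            delete head_;
            head_ = next;
        }
    }

    // Take an entry from the list if one is cached, otherwise allocate.
    PerIoData* acquire() {
        if (head_ != nullptr) {
            PerIoData* p = head_;
            head_ = p->next;
            p->next = nullptr;
            --cached_;
            return p;
        }
        ++fresh_allocs_;
        PerIoData* p = new PerIoData;
        p->next = nullptr;
        return p;
    }

    // Instead of freeing, push the entry back for the next caller.
    void release(PerIoData* p) {
        assert(p != nullptr);
        p->next = head_;
        head_ = p;
        ++cached_;
    }

    std::size_t cached() const { return cached_; }
    std::size_t fresh_allocs() const { return fresh_allocs_; }

private:
    PerIoData*  head_;
    std::size_t cached_;        // entries currently waiting for reuse
    std::size_t fresh_allocs_;  // how many times new was actually called
};
```

With this shape, a connection that closes and a connection that opens shortly afterwards share the same record, so `new` runs only when the number of live connections grows past the previous peak.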
Using AcceptEx with a receive buffer has a side effect. If a client only connects and never sends anything, the AcceptEx call does not complete, so GetQueuedCompletionStatus never gets a result for it from the completion port. Many such connections will hurt the program's performance badly, so we need a way to check for them periodically: when a connection has been established for longer than a time limit we set and still has sent no data, we close it. The connection time can be obtained by calling getsockopt with SO_CONNECT_TIME.
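The periodic check can be reduced to a small decision function. On Windows the `connect_seconds` value would come from getsockopt with SO_CONNECT_TIME, which reports the number of seconds a socket has been connected, or 0xFFFFFFFF if it is not connected yet. To keep the sketch portable, those values are passed in rather than queried; `PendingAccept` and `pick_stale` are hypothetical names, and the caller is assumed to close each socket returned.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Value SO_CONNECT_TIME reports for a socket that has no connection yet.
const std::uint32_t kNotConnected = 0xFFFFFFFFu;

// Hypothetical record for one outstanding AcceptEx call.
struct PendingAccept {
    int           id;               // illustrative handle for the accept socket
    std::uint32_t connect_seconds;  // value SO_CONNECT_TIME would report
};

// Returns the ids of sockets that are connected but have stayed silent
// for longer than limit_seconds, so the caller can close them.
std::vector<int> pick_stale(const std::vector<PendingAccept>& pending,
                            std::uint32_t limit_seconds) {
    // A limit equal to the sentinel would make the comparison meaningless.
    assert(limit_seconds != kNotConnected);
    std::vector<int> stale;
    for (const PendingAccept& p : pending) {
        if (p.connect_seconds != kNotConnected &&
            p.connect_seconds > limit_seconds) {
            stale.push_back(p.id);
        }
    }
    return stale;
}
```

Sockets that still report the sentinel are left alone because no client has connected to them; they are simply AcceptEx calls waiting for work.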
Another point worth noting: we cannot post a very large number of AcceptEx calls to wait for clients, because that hurts the program's performance. At the same time, we need to add more AcceptEx calls once the ones we posted have been used up. To do this we can associate the FD_ACCEPT event with an event object and wait on that event with WaitForSingleObject. When the posted AcceptEx calls are exhausted and a new client tries to connect, the FD_ACCEPT event fires and the event object becomes signaled. WaitForSingleObject then returns, and we post enough AcceptEx calls again.
That concludes the introduction to the completion port model. Now let me introduce my class. After writing it, I used it to build an echo server:
int main()
{
    CompletionPortModel p;

    p.Init();
    p.AllocEventMessage();
    // Post the initial AcceptEx calls; give up if that fails.
    if (FALSE == p.PostAcceptEx())
    {
        return 1;
    }
    // Enter the class's main loop.
    p.ThreadLoop();
    return 0;
}
To use this class, you only need to derive a subclass from it and override the two virtual functions HandleData and DataAction. For applications that need to transfer data continuously (such as file transfer), users need to extend these two functions. For example, create a global queue, insert the data into the queue after it is taken from the completion port, and use another thread to process the queue.
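The derive-and-override pattern looks roughly like the following. The article does not show the signatures of HandleData and DataAction, so everything below is hypothetical: `PortModelBase` merely stands in for the author's class, the parameter types are invented, and the split of responsibilities between the two functions is an assumption made so that the example compiles on its own.

```cpp
#include <cassert>
#include <string>

// Hypothetical base class standing in for the article's class.
class PortModelBase {
public:
    virtual ~PortModelBase() {}

    // Assumed role: look at the bytes that just arrived and build a reply.
    virtual std::string HandleData(const std::string& received) = 0;

    // Assumed role: decide what happens next; true means "send the reply".
    virtual bool DataAction(const std::string& reply) = 0;

    // Stand-in for what a worker thread would do after one completed receive.
    std::string OnReceiveCompleted(const std::string& received) {
        std::string reply = HandleData(received);
        return DataAction(reply) ? reply : std::string();
    }
};

// An echo server only has to say "reply with whatever came in".
class EchoServer : public PortModelBase {
public:
    std::string HandleData(const std::string& received) override {
        return received;
    }
    bool DataAction(const std::string& reply) override {
        return !reply.empty();  // nothing to send for an empty message
    }
};
```

The point of the design is that the socket handling, the completion port, and the worker threads stay in the base class, while a subclass supplies only the protocol-specific decisions.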
Looking at the result, there are still many things to improve. For example, the class does not take multiprocessor machines into account, and it does not consider the case where a completion port thread blocks; to allow for that, one should create (number of CPUs * 2) completion port threads, and so on. Because my time and energy are limited, I have not improved the class further, and the program may contain unreasonable or incorrect parts. Your advice is very welcome.
Developing high-performance server programs is difficult. I remember chatting with an engineer at Tencent, who said that the hard part of developing Tencent QQ is not the client but the communication and synchronization between servers. Clustering and load balancing for server programs are very complicated problems. I am only a beginner in this area, and I hope more experts will share their experience. While writing this class I read through the examples in the latest Platform SDK and borrowed many of their ideas and methods.