Due to a little more understanding Switch statements and C / C to Const's treatment Xie Yubo --------------------------------- -------------
Reprint, please specify the original author, to the source ~~
------------------------------------------------
Some time ago, I saw Taiwan Li Wei in the "Borland Legend >> book in the" Borland Legend >> ", the message processing section has some of the following analysis: he said that in the message processing cycle, the general form is such MSG. msg; switch (msg) {case wm_xxxxxx: .... case wm_xxxxxxx: .... Case WM_XXXXXXX: ....}; Li Wei said that this model is very inefficient, in response to compilation, this C code produces the following assembly code CMP .... ..... JNZ .... ..... CMP .... ..... JNZ .... ..... CMP. ..... JNZ .... ..... If your case is much enough, for example, you have 10,000 messages to deal with, and unfortunately you put a most common news. The last one, then when this news is handled, it will first pass 10,000 CMP and JNZ, Li Wei believes that this is very very very inefficient, it is really inefficient to bear, no need to endure ~~ P
In the beginning, I think so, but the recent reading and experiments have found that this view is very sided. Talk about this problem today (all experiments are completed under the Linux platform)
First, see programs written with C / * ------------------ FileName: Ta.c -------------- - * / int switch_test_first (int x) {int res; switch (x) {case 100: res = 1; Break; Case 102: res = 2; Break; Case 103: Res = 3; Break;} Return Res;} Then, we compile it into assembled files (using -s switch) GCC-S TA.C with GCC will get the following assembly file (ta.c ".text.globl switch_test_first .type switch_test_first, @functionswitch_test_first: pushl% ebp movl% esp,% ebp subl $ 8,% esp movl 8 (% ebp),% eax .file "ta.c" .text.globl switch_test_first .type switch_test_first, @ functionswitch_test_first: pushl% ebp movl% ESP,% EBP SUBL $ 8,% ESP MOVL 8 (% EBP),% EAX MOVL% EAX, -8 (% EBP) CMPL $ 102, -8 (% EBP) // 1 JE .L4 // 2 CMPL $ 102, -8 (% EBP) // 3 jg .L8 // 4 CMPL $ 100, -8 (% EBP) // 5 JE .L3 // 6 jmp.l2 // 7.L8: CMPL $ 103, -8 (% EBP) JE .L5 JMP .L2.L3: MOVL $ 1, -4 (% EBP) JMP .L2.L4: MOVL $ 2, -4 (% EBP) JMP .l2. L5: MOVL $ 3, -4 (% EBP) .L2: MOVL-4 (% EBP),% EAX Leave ret.lfe1: .size switch_test_first, .lfe1-switch_test_first .ident "GCC: (GNU) 3.2.2 20030222 ( Red Hat Linux 3.2.2-5) "
Note some of the files // 1 ~ // 7, from this part, we can see that GCC does turn some CASE statements to Li Wei's way to handle it, we see code There is a lot of CMPL and JMP statements, which is equivalent to you using IF..EELSE .., but is it always like this? We will change the TA.C this file, add some CASE statements / * - ------------- FileName: new_ta.c ------------------- * / int switch_test_first (int x) {int res; switch (x) {CASE 100: res = 1; Break; Case 102: Res = 2; Break; Case 103: Res = 3; Break; Case 104: Res = 4; Break; Case 105: Res = 5; Break; Case 106: RES = 6; Break;} return res;} This new_ta.c is identical to the original TA.C, the only difference is that the number of case statements has changed, let's compile this file GCC - S New_TA.C The following is the assembled file we generated. File "new_ta.c" .text.glob l switch_test_first .type switch_test_first, @
Functionswitch_test_first: pushl% EBP MOVL% ESP,% EBP SUBL $ 8,% ESP MOVL 8 (% EBP),% EAX SUBL $ 100,% EAX MOVL% EAX, -8 (% EBP) CMPL $ 6, -8 (% EBP) JA .L2 MOVL-8 (% EBP),% EDX MOVL .L9 (,% EDX, 4),% EAX JMP *% Eax .section.Rodata .align 4 .align 4.l9: // a .long .l3. Long.l2 .long .l4 .long .l5 .long .l6 .long .l7 .long .l8 .text.l3: // 1 mow $ 1, -4 (% EBP) jmp.l2.l4: // 2 MOVL $ 2, -4 (% EBP) JMP .l2.l5: // 3 MOVL $ 3, -4 (% EBP) JMP .L2 // 4 .l6: MOVL $ 4, -4 (% EBP) JMP .l2 // 5 .L7: MOVL $ 5, -4 (% EBP) // 6 jmp.l2.l8: // 7 MOVL $ 6, -4 (% EBP) .l2: MOVL-4 (% EBP),% EAX Leave Ret.LFE1: .size switch_test_first, .lfe1-switch_test_first .ident "GCC: (GNU) 3.2.2 20030222 (Red Hat Linux 3.2.2-5) "
Carefully compare this latest new_ta.s with the front TA.s, the essence is all inside! First, new_ta.s is more than one in front of TA.s, and its // 1 ~ // 7 There is no numerous CMPL and JMP statement existing in the front ta.s file, then how is this code now realizes the jump in the switch statement? Let's take a closer look at the new .L9 part of it. .Section.Rodata .align 4 .align 4.l9: .long .l3 .long .l2 .long .l4 .long .l5 .long .l6 .long .l7 .long .l8 .text is obvious, .l9 part is One of our most common data structures - table, each of which is a label, and this label is precisely the entry label for each case statement! This is easy for us to think that it is likely to use a table to store all the entrances of all the CASE statements, and then directly detect the entry address of the corresponding CASE statement from this table when performing the switch statement, then Jump to the corresponding CASE statement to execute, just like hash_table. Is it true? Let's take a look at the code to enter Switch: Pushl% EBP MOVL% ESP,% EBP SUBL $ 8,% ESP MOVL 8 (% EBP),% EAX SUBL $ 100,% EAX MOVL% EAX, -8 (% EBP) CMPL $ 6 , -8 (% EBP) JA .L2 MOVL-8 (% EBP),% EDX MOVL .L9 (,% EDX, 4),% EAX // 1 JMP *% EAX // 2 Sure enough! First, at / / 1 according to the value of the% EDP (the value corresponding to the table), the inlet address of the corresponding CASE statement is found in the table of the CASE statement, and the address is stored in the% EAX, then passed // 2 (This is an indirect jump statement) to the address stored by the% EAX, that is, the corresponding CASE statement. C compiler, really smart!
Through this analysis, we can know the following two points: 1. When the case statement is less, the C compiler turns it to IF..EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEELSE .. Types for processing, using more CMP and JMP statements, and when Case statements are more When it is more, the C compiler will be a jump form, and it is directly jumped through the jump table, which makes the Switch have very high values, and it is hardly reduced due to the growth of the CASE statement. The problem that Li Wei is concerned is that it will not happen at all. 2. You can ask the following questions: 1. Why is the integer type in the case statement is not the rest of the type? This is because this value in the CASE statement is used to make a subscript of the jump watch, and therefore, of course, it must be an integer 2. Why is the case statement with direct access when not adding BREAK? This is because jump is entry Switch is calculated, not calculated in the CASE statement, the entire CASE statement group is a complete and continuous code, just Switch let it start from different locations.
The above content, there is a very detailed discussion in "Computer Systems A Programmer's Perspective", I can go to see it carefully ~~~
Since the CASE statement needs to be an integer's constant value, would we use the const type? For example, the following code:
const INT C_1 = 100; Const Int C_2 = 102;
Void Test (INT X) {Switch (x) {CASE CASE CASE CASE: X; Case C_2: --X;
This code is compiled with the C compiler. The compiler will prompt the error, but it will not be due to the C compiler, which is mainly due to C, with the C compiler to handle the control of Const this. Let's take a look at the following C program / * ------------- FileName: const_c.c --------- * / const I = 15;
Void f (int x) {x = a;} The GCC -S const_c.c is compiled with GCC and then look at its assembly file. File "const_c.c" .globl a .section.rodata .align 4 .type A, @ Object .size a, 4a: // 1 .long 15 .text.globl f.Type F, @ FunctionF: Pushl% EBP MOVL% ESP,% EBP MOVL A,% EAX / / 2 MOVL% EAX, 8 (% EBP) Leave ret.lfe1: .size f, .lfe1-f .ident "GCC: (GNU) 3.2.2 20030222 (Red Hat Linux 3.2.2-5)" Watch // 1, the C compiler is A assigned the address and set it to 15, and at // 2, it is assigned the value in this address to the% EAX, which is nothingord in the same common common variable rather than the Const variable.