The Underlying Operating Mechanism of IL Code: Loop Processing



Liu Qiang

Cambest@sohu.com

October 22, 2003

In the previous article we discussed the basic operating mechanism of IL code. In this article we look at how IL code handles loops in C#. The examples also touch on array handling and introduce a few new instructions. Others have written about these topics, and I have read a few of their articles, but I did not find the descriptions very clear, so I am taking this opportunity to write the material up properly. I hope it helps everyone understand .NET, and I hope it is also useful to designers who study virtual machine mechanisms.

As before, we start with a piece of C# code and then study its compiled IL code in detail. The C# method below contains three loops: a for loop, a while loop, and a foreach loop.

public int looptest()
{
    int i = 3;
    int j = 9;
    int s = 0;
    int k; // the statements above declare and initialize the variables

    for (k = 0; k <= i; k++)
    {
        s += k;
    } // for loop block

    k = 0;
    while (k < j)
    {
        s += k;
        k++;
    } // while loop block

    int[] array = {2, 3, 4, 5, 6, 7, 8, 9};
    foreach (int a in array)
    {
        s += a;
    } // foreach loop block

    return s;
}

What we want to work out here is how the C# compiler translates the source program to implement loops, or in other words how loops in the C# language are realized in IL. This is very helpful for a deeper understanding of the characteristics of C#. That alone is not enough, of course, so I will also introduce some related topics along the way.

Let us first see what IL code this method compiles to:

.method public hidebysig instance int32 looptest() cil managed
{
  // Code size 101 (0x65)
  .maxstack 3
  .locals init ([0] int32 i,
           [1] int32 j,
           [2] int32 s,
           [3] int32 k,
           [4] int32[] 'array',
           [5] int32 a,
           [6] int32 CS$00000003$00000000,
           // Local variable of the same type as the return value, maintained by the compiler
           // and used only to hold the return value. If the method returns void, there is no such variable.
           [7] int32[] CS$00000007$00000001,
           // Local variable holding the array reference for the foreach loop. In this example it corresponds to 'array'.
           [8] int32 CS$00000008$00000002
           // Local variable holding the array index. Used only by the foreach loop and maintained by the compiler.
           )
  IL_0000: ldc.i4.3
  IL_0001: stloc.0
  IL_0002: ldc.i4.s 9
  IL_0004: stloc.1
  IL_0005: ldc.i4.0
  IL_0006: stloc.2
  IL_0007: ldc.i4.0
  IL_0008: stloc.3
  IL_0009: br.s IL_0013
  IL_000b: ldloc.2
  IL_000c: ldloc.3
  IL_000d: add
  IL_000e: stloc.2
  IL_000f: ldloc.3
  IL_0010: ldc.i4.1
  IL_0011: add
  IL_0012: stloc.3
  IL_0013: ldloc.3
  IL_0014: ldloc.0
  IL_0015: ble.s IL_000b
  IL_0017: ldc.i4.0
  IL_0018: stloc.3
  IL_0019: br.s IL_0023
  IL_001b: ldloc.2
  IL_001c: ldloc.3
  IL_001d: add
  IL_001e: stloc.2
  IL_001f: ldloc.3
  IL_0020: ldc.i4.1
  IL_0021: add
  IL_0022: stloc.3
  IL_0023: ldloc.3
  IL_0024: ldloc.1
  IL_0025: blt.s IL_001b
  IL_0027: ldc.i4.8
  IL_0028: newarr [mscorlib]System.Int32
  // Create a System.Int32 array of length 8. The array elements map to System.Int32.
  IL_002d: dup
  IL_002e: ldtoken field valuetype '<PrivateImplementationDetails>'/'$$struct0x6000002-1' '<PrivateImplementationDetails>'::'$$method0x6000002-1'
  IL_0033: call void [mscorlib]System.Runtime.CompilerServices.RuntimeHelpers::InitializeArray(class [mscorlib]System.Array, valuetype [mscorlib]System.RuntimeFieldHandle)
  IL_0038: stloc.s 'array'
  IL_003a: ldloc.s 'array'
  IL_003c: stloc.s CS$00000007$00000001
  IL_003e: ldc.i4.0
  IL_003f: stloc.s CS$00000008$00000002
  IL_0041: br.s IL_0055
  IL_0043: ldloc.s CS$00000007$00000001
  IL_0045: ldloc.s CS$00000008$00000002
  IL_0047: ldelem.i4
  IL_0048: stloc.s a
  IL_004a: ldloc.2
  IL_004b: ldloc.s a
  IL_004d: add
  IL_004e: stloc.2
  IL_004f: ldloc.s CS$00000008$00000002
  IL_0051: ldc.i4.1
  IL_0052: add
  IL_0053: stloc.s CS$00000008$00000002
  IL_0055: ldloc.s CS$00000008$00000002
  IL_0057: ldloc.s CS$00000007$00000001
  IL_0059: ldlen
  IL_005a: conv.i4
  IL_005b: blt.s IL_0043
  IL_005d: ldloc.2
  IL_005e: stloc.s CS$00000003$00000000
  IL_0060: br.s IL_0062
  IL_0062: ldloc.s CS$00000003$00000000
  IL_0064: ret
} // end of method advanced::looptest

For function-related topics such as the .locals init directive, please refer to the "Function Related" article.

Instruction    Meaning                                      Mnemonic aid (*)
br.s           Unconditional jump, equivalent to jmp
blt.s          Branch if less than                          Lower Than
ble.s          Branch if less than or equal                 Lower or Equals
ldlen          Get the length of an array
ldelem.i4      Load the int32 element at the given index

Here the .locals init directive shows the same variable names as the source program. This is because a debug information file (*.pdb) was present in the same directory when the assembly was disassembled; otherwise the variables would appear as V_x (V_1, V_2, and so on). For more on local variables, see the "Function Related" article.

If you have experience with Win32 assembly programming, you are probably familiar with how loop control is implemented there. For example, to add up the integers from 10 to 100, we might write:

          mov ecx, 100      ; ECX holds the loop counter
          xor eax, eax      ; clear EAX (also resets the flags)
sum_loop: add eax, ecx      ; add the counter to the running total in EAX
          dec ecx           ; decrement the counter
cpr:      cmp ecx, 9        ; test ECX >= 10, i.e. ECX > 9
          jg  sum_loop      ; if true (greater), jump back to sum_loop

This differs from high-level languages (C/C++/Java/C#), where the condition of a for loop is written at the top of the loop, whereas in low-level languages such as MASM it is customary to test the loop condition at the end of the loop. So how does the C# compiler reconcile the position of the loop condition in C# with the position of the condition test in typical assembly code, and use IL to test the condition and implement the loop correctly? First, I want to point out that in a sequentially executed assembly language it is entirely possible to place the condition test at the head of the loop. Here is an IL version of the example above written that way:

.locals init ([0] int32 eax, [1] int32 ecx, [2] int32 ret_val)
        ldc.i4 100
        stloc.1         // mov ecx, 100
        ldc.i4.0
        stloc.0         // xor eax, eax (or mov eax, 0)
L_0000: ldloc.1
        ldc.i4 10
        blt.s L_0003    // ecx < 10? yes -> jump to L_0003; no -> go on
L_0001: ldloc.0
        ldloc.1
        add
        stloc.0         // these instructions implement eax = eax + ecx
        ldloc.1
        ldc.i4.1
        sub
        stloc.1         // these instructions implement ecx = ecx - 1
L_0002: br.s L_0000
L_0003: ...

Second, I want to explain why the compiler does not do it this way. There are two reasons. The first, seen from the compiler's side, is that it disrupts the normal logic. Take a condition such as (k <= j): with the test at the head of the loop, the code has to jump out of the loop when the condition is false and fall through when it is true, and an unconditional jump has to be placed at the end of the loop to return to the comparison at the head. Inverting (k <= j) is simple and intuitive for us, but it means extra work for the compiler, all the more so for complex Boolean expressions such as (k > j) && (k > 34) || (j <= 56). This adds to the compiler's burden, although not by much. The second reason is that the extra jump instructions make it harder for the compiler to resolve jump targets. As we know, an assembler already has to compute the offsets of labels and jump instructions in assembly language, and for a high-level language the job is more complicated. The compiler therefore takes the approach that is easier to understand and easier to implement.
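As a rough illustration of the two layouts discussed above, the following C# sketch spells both of them out with goto. The names LoopLayouts, TestAtBottom and TestAtTop are made up for this example and are not part of the original program, and the comments map the jumps to IL instructions only approximately.

```csharp
using System;

public static class LoopLayouts
{
    // Layout the compiler uses: jump to the test placed after the body.
    // The source condition (k <= i) is used unchanged.
    public static int TestAtBottom(int i)
    {
        int s = 0;
        int k = 0;
        goto Test;                  // roughly: br.s to the condition test
    Body:
        s += k;
        k++;
    Test:
        if (k <= i) goto Body;      // roughly: ble.s back to the body
        return s;
    }

    // Alternative layout: test at the head of the loop.
    // The condition has to be inverted, and an extra jump closes the loop.
    public static int TestAtTop(int i)
    {
        int s = 0;
        int k = 0;
    Test:
        if (k > i) goto Done;       // inverted form of (k <= i)
        s += k;
        k++;
        goto Test;                  // extra unconditional jump back to the head
    Done:
        return s;
    }

    public static void Main()
    {
        Console.WriteLine(TestAtBottom(3)); // prints 6
        Console.WriteLine(TestAtTop(3));    // prints 6
    }
}
```

Both methods compute the same sum; the difference is only where the comparison sits and whether the condition must be negated.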

Let us now look at how each of the three loops in the example is implemented. For the basic operating mechanism of IL code, see the article "The Underlying Operating Mechanism of IL Code". IL_0027 to IL_003f is the array initialization, which is harder to understand. We will set it aside for now, and I will cover it separately.

1. The for statement

We can see that IL_0000 to IL_0008 in the listing perform the variable initialization. The loop starts at IL_0009. IL_0009 is a direct (unconditional) jump to IL_0013. Let us look at what is there:

IL_0013: ldloc.3
IL_0014: ldloc.0
IL_0015: ble.s IL_000b

These instructions load local variable 3 (that is, k) and then local variable 0 (that is, i). They are followed by the compare-and-branch instruction ble.s. It is not hard to see that the three instructions compare k with i. If the comparison is true (less than or equal), control transfers to the loop body (IL_000b); otherwise execution falls through and the loop ends. The flow is smooth, simple and intuitive, and needs no further explanation. We can also see from this that the for statement tests the condition first and only then executes the loop body.

Instruction    Stack after execution (top on the left)
(before)       ...
ldloc.3        k ...
ldloc.0        i k ...
ble.s          ...

The load instructions push the variables onto the evaluation stack one by one, and ble.s compares them. Note that ble.s also pops both operands off the stack. This holds not only for ble.s but for the other conditional branch instructions as well.

2. The foreach statement

The foreach statement is handled in roughly the same way as the for statement. What interests us is how foreach handles the boundary condition. The foreach loop begins at IL_0041, where the same kind of unconditional jump takes us to IL_0055. Let us see what is there.

IL_0055: ldloc.s CS$00000008$00000002
IL_0057: ldloc.s CS$00000007$00000001
IL_0059: ldlen
IL_005a: conv.i4
IL_005b: blt.s IL_0043

As described earlier, CS$00000008$00000002 stores the array index and CS$00000007$00000001 is a reference to the array 'array'. The instructions from IL_0055 to IL_005b work as follows: first the current index is pushed onto the evaluation stack, then the array reference (an object reference) is pushed. The ldlen instruction takes the array reference and pushes the array length as a native unsigned integer, which conv.i4 converts to a 32-bit integer so that the index can be compared with it. If the index is less than the length, control transfers back to the loop body; otherwise the loop ends. We can also see from this that IL supports array operations directly by providing dedicated instructions for them.
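To make the foreach expansion concrete, here is a minimal C# sketch of what the IL above amounts to for an int array. It is an approximation written by hand, not compiler output; the names ForeachLowering, SumForeach, SumLowered, temp and index are assumptions chosen for this example, with temp and index standing in for the compiler-generated locals CS$00000007$00000001 and CS$00000008$00000002.

```csharp
using System;

public static class ForeachLowering
{
    // The loop as written in the source program.
    public static int SumForeach(int[] array)
    {
        int s = 0;
        foreach (int a in array)
        {
            s += a;
        }
        return s;
    }

    // Hand-written approximation of the compiled form.
    public static int SumLowered(int[] array)
    {
        int s = 0;
        int[] temp = array;          // copy of the array reference
        int index = 0;               // hidden index variable
        while (index < temp.Length)  // ldlen, conv.i4, blt.s
        {
            int a = temp[index];     // ldelem.i4
            s += a;
            index++;
        }
        return s;
    }

    public static void Main()
    {
        int[] array = { 2, 3, 4, 5, 6, 7, 8, 9 };
        Console.WriteLine(SumForeach(array)); // prints 44
        Console.WriteLine(SumLowered(array)); // prints 44
    }
}
```

Because the loop iterates over the copied reference, assigning a new array to the original variable inside the body would not change which elements are visited.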

3. The while and do-while statements

As the example shows, the while loop is handled in the same way as the for loop. There is no do-while example here, but you can imagine that it is handled much like the for statement. The difference to note is that a do-while loop does not begin with an unconditional jump to the condition test, as the for and foreach loops do. The body of a do-while loop is therefore always executed at least once.

In this article I introduced several conditional branch instructions and showed how the C# compiler handles the loops of the C# language. Strictly speaking, this article is not entirely about the underlying mechanism of IL, but this foundation is necessary for anyone who wants to understand IL in depth.
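The difference between the two loop forms can be observed directly from C#. The sketch below counts how many times each body runs when the condition is false from the start; the names LoopEntry, CountWhile and CountDoWhile are invented for this illustration.

```csharp
using System;

public static class LoopEntry
{
    // Condition is tested before the first iteration.
    public static int CountWhile(int start, int limit)
    {
        int runs = 0;
        int k = start;
        while (k < limit)
        {
            runs++;
            k++;
        }
        return runs;
    }

    // Body runs first; the condition is tested afterwards.
    public static int CountDoWhile(int start, int limit)
    {
        int runs = 0;
        int k = start;
        do
        {
            runs++;
            k++;
        } while (k < limit);
        return runs;
    }

    public static void Main()
    {
        Console.WriteLine(CountWhile(5, 3));   // prints 0
        Console.WriteLine(CountDoWhile(5, 3)); // prints 1
    }
}
```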
