Source Code Security in .NET (Part II: Intermediate Language)

zhaozj2021-02-17  65

To understand what happens when an application is built with VB.NET, we need an example that generates code and an assembly: open VS.NET, create a new Visual Basic project, add a Label to the form, then change the label's Text property to "Good Bye Visual Basic 6.0" (Figure 1), and name the application Goodbyevb6.

Figure 1

Before going deeper into .NET, we need some background knowledge and terminology. First, IL (intermediate language) is not a new concept: the VB and C compilers have generated and used IL for several years, but few people have disclosed it or documented it. One of the biggest changes in .NET is the code the compiler generates. Apart from the name, the new MSIL has very little in common with the IL of the VB6 compiler. So even if you have worked with IL before, you will have to start learning from scratch. Figure 2 shows an MSIL code snippet from the Goodbyevb6 application:

Figure 2

This snippet declares an evaluation stack of size 8 (in MSIL the stack size is counted in slots rather than bytes), pushes the this pointer onto the stack, and calls the get_Label1 method. Next, the code pushes the label text to be set onto the stack and calls the set_Text method.

Traditional CPUs use registers and a stack to do all their work. The execution engine provided by the CLR has only a stack, and it operates much like a Reverse Polish Notation calculator. If a procedure call takes several parameters, the execution engine pushes the parameters onto the stack before making the call. The return value of a function call is also passed back on the stack.
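To make the stack-only model concrete, here is a minimal Python sketch of a toy execution engine that runs the four steps described for the Goodbyevb6 snippet. It is an illustration under stated assumptions, not the CLR: the Form and Label classes, the run function, and the way call targets are spelled inside the opcode strings are all hypothetical. Only the instruction names ldarg.0 and ldstr are taken from MSIL.

```python
# Toy model of a stack-only execution engine. This is NOT the real CLR;
# Form, Label, run and the "call ..." opcode strings are hypothetical.

class Label:
    """Hypothetical stand-in for the label control on the form."""
    def __init__(self):
        self.text = ""


class Form:
    """Hypothetical stand-in for the Goodbyevb6 form."""
    def __init__(self):
        self.label1 = Label()


def run(program, this):
    """Execute (opcode, operand) pairs using a single evaluation stack."""
    stack = []  # the only working storage: there are no registers
    for op, arg in program:
        if op == "ldarg.0":            # push the 'this' reference
            stack.append(this)
        elif op == "ldstr":            # push a string constant
            stack.append(arg)
        elif op == "call get_Label1":
            form = stack.pop()         # the argument comes off the stack
            stack.append(form.label1)  # the return value goes back onto it
        elif op == "call set_Text":
            value = stack.pop()        # last-pushed argument is popped first
            label = stack.pop()
            label.text = value         # no return value, so nothing is pushed
        else:
            raise ValueError(f"unknown instruction: {op}")
    return stack


program = [
    ("ldarg.0", None),
    ("call get_Label1", None),
    ("ldstr", "Good Bye Visual Basic 6.0"),
    ("call set_Text", None),
]

form = Form()
leftover = run(program, form)
print(form.label1.text)
print(leftover)
```

Every call consumes its arguments from the stack and, if it returns something, leaves the result on the stack for the next instruction, which is why the stack is empty again once the label text has been set.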

Local variables in MSIL are easy to identify: they are declared with the .locals keyword. If debug symbols are available, you will see the variable names; otherwise, you will see generated names such as V_1 and V_2:

.locals init ([0] int32 X,
              [1] int32 Y,
              [2] float64 Z,
              [3] class System.String VB_T_String_0)

The ldarg instruction pushes an argument onto the stack, the ldc instruction pushes a numeric constant onto the stack, and the stloc instruction pops the value off the stack and stores it in the corresponding local variable:

//000064: Dim X As Integer = 100
IL_0001: ldc.i4.s 100
IL_0003: stloc.0

In this example, the constant 100 is pushed onto the stack as a 4-byte integer, and the value is then stored in the first local variable (index 0). For a complete description of the MSIL instructions, see the IL programmer's reference in the ilinstrset.doc file.
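The effect of these two instructions can be traced with a small Python sketch. It is a simplified model, not the CLR: the execute function and its local_count parameter are hypothetical, every local slot is modeled as the integer 0 regardless of its declared type, and ldloc is included only to show the reverse direction of stloc.

```python
# Toy model of ldc / stloc / ldloc. NOT the real CLR; execute() and
# local_count are hypothetical names used only for this illustration.

def execute(instructions, local_count):
    """Run textual instructions and return (stack, local_vars)."""
    stack = []
    # Simplification: every slot starts as integer 0, whatever its type.
    local_vars = [0] * local_count
    for ins in instructions:
        opcode, _, operand = ins.partition(" ")
        if opcode == "ldc.i4.s":            # push a small int32 constant
            stack.append(int(operand))
        elif opcode.startswith("stloc."):   # pop the top value into slot N
            local_vars[int(opcode.split(".")[1])] = stack.pop()
        elif opcode.startswith("ldloc."):   # push the value of slot N
            stack.append(local_vars[int(opcode.split(".")[1])])
        else:
            raise ValueError(f"unknown instruction: {ins}")
    return stack, local_vars


# Dim X As Integer = 100, with the four local slots declared earlier
stack, local_vars = execute(["ldc.i4.s 100", "stloc.0"], local_count=4)
print(local_vars)  # [100, 0, 0, 0]
print(stack)       # []
```

After stloc.0 runs, slot 0 (the variable X) holds 100 and the stack is empty again, because stloc removes the value it stores.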

All MSIL output in this article comes from the debug build of Goodbyevb6. Non-debug builds carry no source line numbers or variable names, but they still expose a great deal of useful information. Debug symbols are helpful when reading MSIL code, but they are not essential.

When we run the compiler, what it produces is not the executable file we are familiar with today, but an assembly. An assembly is a collection of files that can be deployed as a single unit. On today's Windows systems, we can think of a single executable file as an assembly. Strictly speaking, though, an assembly bundles the executable with all of its supporting files, including DLLs, graphics, resources, and help files.

In general, an assembly consists of at least two parts: an executable portion and a manifest (the word originally means a cargo or passenger list). The manifest is a list of all the files in the assembly. The executable portion of an assembly is also called a module. Conceptually, a module corresponds to a DLL or EXE file; each module contains its own metadata in addition to the metadata held by its parent assembly.

The assembly file format is an extension of the current Portable Executable (PE) format. As shown in Figure 3, the file begins with a standard PE header. Inside the file is the CLR header, which leads to the descriptive data needed to load the code into the process space, namely the metadata. Metadata gives the execution engine a great deal of information, including how to load the module, which supporting files it needs, how to load those files, and how to interact with COM and the .NET runtime. Metadata also describes the methods, interfaces, and classes contained in the module or assembly. The information in the metadata enables the JIT compiler to compile and run the module. At the same time, metadata exposes a large amount of internal information about the application, which makes it easier to recover valuable code from disassembled IL.

Figure 3
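The layout described above (a standard PE header first, with the CLR header reachable from it) can be probed with a short Python sketch. It relies on the published PE layout, in which slot 14 of the optional header's data directory points at the CLR header. The function name has_clr_header is hypothetical, and the sketch only checks for the presence of that entry; it does not parse the metadata itself.

```python
import struct

CLR_DIRECTORY_INDEX = 14  # data-directory slot that points at the CLR header


def has_clr_header(data: bytes) -> bool:
    """Return True if the PE image in `data` advertises a CLR header.

    Hypothetical helper for illustration; it performs no metadata parsing.
    """
    try:
        if data[:2] != b"MZ":                      # DOS stub signature
            return False
        (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
        if data[pe_offset:pe_offset + 4] != b"PE\0\0":
            return False
        optional = pe_offset + 4 + 20              # skip signature + COFF header
        (magic,) = struct.unpack_from("<H", data, optional)
        if magic == 0x10B:                         # PE32
            directories = optional + 96
        elif magic == 0x20B:                       # PE32+ (64-bit)
            directories = optional + 112
        else:
            return False
        # NumberOfRvaAndSizes sits just before the directory table
        (count,) = struct.unpack_from("<I", data, directories - 4)
        if count <= CLR_DIRECTORY_INDEX:
            return False
        rva, size = struct.unpack_from(
            "<II", data, directories + 8 * CLR_DIRECTORY_INDEX
        )
        return rva != 0 and size != 0
    except struct.error:                           # truncated or malformed file
        return False
```

Assuming the compiled application is saved as Goodbyevb6.exe (a hypothetical path), the check could be run as has_clr_header(open("Goodbyevb6.exe", "rb").read()). A native executable such as one built by VB6 would be expected to return False, because it carries no CLR header entry.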

A central concept in .NET is managed code. Managed code is code written specifically to run under the control of the CLR. It can be created in languages such as VB.NET, C#, and C++, but C++ is the only one of these that can also produce unmanaged code for the .NET platform. We cannot create managed code for the .NET platform with VB6, because VB6 compiles code into i386 machine code rather than IL. As with VB.NET, if you want managed code, the code must be compiled into IL.

Now let's look at the advantages of this new MSIL code. If code is compiled to MSIL, we can install and run it on any platform that supports the CLR. For now this may not sound very attractive, because very few platforms currently support .NET: only 32-bit Windows. But 64-bit platforms and .NET for Windows CE will soon add that support. Compiling code to MSIL lets us port applications seamlessly to all of these platforms and to new ones in the future.

Another advantage of MSIL is that the JIT compiler translates the MSIL code into machine instructions on the target machine itself, so it can optimize the code for that machine's specific characteristics. For example, it can optimize for special registers on the target machine, or use opcodes specific to the processor installed there. (For comparison, see the Compile tab of the VB6 Project Properties window.) Thanks to the metadata in the assembly, the JIT compiler knows what the code does and which platform it supports, so it can make optimization decisions quickly and improve the code's performance.

A further advantage involves two kinds of checking in .NET: validation and verification. Validation is a series of checks performed on a module to ensure that its metadata, MSIL code, and file format are well formed. Code that fails these checks could crash the execution engine or the JIT compiler. Once a module passes validation, its code is considered correct and can start running.

Verification takes place as the JIT compiler converts MSIL code into machine code: the code and its metadata are checked to ensure that the program does not access memory or other resources for which it lacks permission. Code that passes verification is type-safe. Verification can also be performed when a program is compiled directly to machine code, but unless it is done by the JIT compiler it is not 100 percent accurate, because the results depend on metadata from other assemblies. If the source is compiled directly to machine code, we run a risk: assemblies on the target machine may change, leaving the program no longer type-safe. Using the JIT compiler ensures that validation and verification are performed against the current versions of all related assemblies. These steps ensure that the executing program is always type-safe and always runs within its granted permissions. You can use the PEVerify tool in the .NET SDK to validate and verify code yourself.
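The idea behind type-safety verification can be illustrated with a toy checker in Python. It tracks only the types that would sit on the evaluation stack and rejects a store into a local variable of a different type. This is a deliberately tiny sketch, not the CLR verifier, which checks far more; the verify function and the tuple encoding of instructions are hypothetical, while the opcode names ldc.i4, ldc.r8, ldstr, and stloc come from MSIL.

```python
# Toy type checker in the spirit of verification. NOT the CLR verifier;
# verify() and the (opcode, operand) encoding are hypothetical.

def verify(local_types, instructions):
    """Return a list of error strings; an empty list means the check passed."""
    stack = []   # holds type names, never actual values
    errors = []
    for pos, (opcode, operand) in enumerate(instructions):
        if opcode == "ldc.i4":
            stack.append("int32")
        elif opcode == "ldc.r8":
            stack.append("float64")
        elif opcode == "ldstr":
            stack.append("string")
        elif opcode == "stloc":
            if not stack:
                errors.append(f"{pos}: stloc on an empty stack")
                continue
            found = stack.pop()
            expected = local_types[operand]
            if found != expected:
                errors.append(
                    f"{pos}: cannot store {found} in {expected} local {operand}"
                )
        else:
            errors.append(f"{pos}: unknown opcode {opcode}")
    return errors


# Local slots modeled on the .locals declaration shown earlier
local_types = ["int32", "int32", "float64", "string"]

good = [("ldc.i4", 100), ("stloc", 0)]
bad = [("ldstr", "hello"), ("stloc", 0)]

print(verify(local_types, good))  # []
print(verify(local_types, bad))   # ['1: cannot store string in int32 local 0']
```

Because the checker needs the declared types of the locals, it also shows in miniature why verification depends on metadata: if the declarations change, the same instruction sequence may stop being type-safe.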
