Research on Boxing and Unboxing in Visual C#
2004-09-15 Author: Source: 9CBS
Before discussing this topic, let us ask a few questions first, so that the theme we are exploring today is clear.
You have probably used System.Console and many other classes from the .NET class library countless times. So where do they come from? How does C# connect to them? Is there a mechanism or type system that supports our own extensions? What type system is available to us? If we programmers do not know the answers to these questions, C# will shut its door and leave us standing outside.
So let us put down the work at hand for a moment and take a serious look at the CTS (Common Type System), an important technology and the foundation of .NET. As the name suggests, the CTS is a common type system; it exists to define the rules that must be followed when applications declare and use types. Although you may already be familiar with this, I still have to emphasize that .NET divides all types in the system into two categories: value types and reference types. At this point you may be getting angry: all this talk, and you still have not reached the topic! Don't panic. Knowing the characteristics of the .NET type system does not mean that you really understand its principles and why it exists.
Most object-oriented languages have two kinds of types: primitive types (types built into the language, such as integers and enumerations) and classes. Although object-oriented technology has proven very capable in modular design and in modeling real-world entities, it has its problems, and this split type system is one of them. History tells us that having two groups of types causes many difficulties. The first is compatibility, which is also a point Microsoft has attacked. Most OO languages share this weakness: their primitive types have no common base, so they are not true objects, and they are not derived from one common base class. No wonder Anders Hejlsberg mocked them as "magic types".
Because of this shortcoming, when we want to write a method that accepts a parameter of any type the language supports, the same problem comes back to haunt us: incompatibility. Of course, for the C++ experts this may be no big deal. They will say proudly that all you need is a wrapper class with an overloaded constructor for each primitive type, and you are done! Fine, the two kinds of types can now coexist, but how do we get the value we care about back out of this magic? As a result, they confidently open Borland and skillfully write an overloaded function to fetch the result from the wrapper class. Brothers and sisters, under the historical conditions of the time your work was pioneering, but it came at a price: low efficiency. After all, C++ relies more on the object than on the domain. Accepting reality is wiser than fighting it to the death!

After such a long preamble, here is what I want to say: the CTS of the .NET environment makes life easier for us. First, everything in the CTS is an object; second, all objects derive from one base class, the System.Object type. This is the so-called singly rooted hierarchy; please refer to Microsoft's technical documentation for details on System.Object. Here we briefly discuss the two kinds of types mentioned above: value types and reference types.
One of the most important features of CTS value types is that they cannot be null; in other words, a variable of a value type always holds a value. In C#, value types include the primitive types, structures, and enumerations. It is worth emphasizing that when a value-type variable is passed, what is actually passed is the value of the variable, not a reference to an underlying object. This is quite different from a variable of a reference type.

A CTS reference type is like a type-safe pointer, and it can be null. Reference types include classes, interfaces, delegates, and arrays. In contrast to value types, when we create an instance of a reference type, the system allocates the value on the managed heap (reserving and locating the memory) and returns a reference to it; when the reference is null, it does not point to any object. This means that when we declare a variable of a reference type, we operate on the reference (address) held by the variable rather than on the data itself.

With that, the protagonist of this article finally makes its entrance. Comrades who are about to cough up blood, please bear with me a little longer. Let me ask a question: how can we extend such a dual type system effectively and improve its performance? Perhaps it was while discussing this problem at the blackboard that the folks in Seattle came up with the ideas of boxing (box) and unboxing (unbox). Simply put, boxing is the process of converting a value type into a reference type; the reverse is unboxing. (In fact, the idea itself is a simple one.) Below we discuss the boxing and unboxing processes in detail, and the questions raised above will be answered along the way.
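The difference between value semantics and reference semantics described above can be sketched with a small program. This is only an illustration: the types PointValue and PointRef and the class ValueVsReference are hypothetical names invented for this example, not part of the article's sample project.

```csharp
using System;

namespace StructApp
{
    // Hypothetical types used only for this illustration.
    struct PointValue { public int X; }   // value type
    class PointRef { public int X; }      // reference type

    public class ValueVsReference
    {
        static void Main(string[] args)
        {
            PointValue v1 = new PointValue();
            v1.X = 1;
            PointValue v2 = v1;   // assignment copies the value itself
            v2.X = 2;             // v1 is not affected

            PointRef r1 = new PointRef();
            r1.X = 1;
            PointRef r2 = r1;     // assignment copies the reference; one object, two names
            r2.X = 2;             // the change is visible through r1 as well

            Console.WriteLine("v1.X = {0}, v2.X = {1}", v1.X, v2.X); // v1.X = 1, v2.X = 2
            Console.WriteLine("r1.X = {0}, r2.X = {1}", r1.X, r2.X); // r1.X = 2, r2.X = 2

            r2 = null;            // a reference can be null; a struct variable cannot
            Console.WriteLine(r2 == null);                           // True
        }
    }
}
```

Incidentally, the int arguments passed to Console.WriteLine here are themselves boxed, which is the subject of the rest of this article.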
First, let's look at the boxing process. We need to do two things: 1. write the sample program; 2. open ILDASM (the MSIL disassembler). With that done, let's look at the following code:
    using System;

    namespace StructApp
    {
        /// <summary>
        /// Summary description for BoxAndUnBox.
        /// </summary>
        public class BoxAndUnBox
        {
            public BoxAndUnBox()
            {
                // TODO: Add constructor logic here
            }

            static void Main(string[] args)
            {
                double dubBox = 77.77;   // define a value-type variable
                object objBox = dubBox;  // box the variable's value into a reference object
                Console.WriteLine("The Value is '{0}' and The Boxed is {1}", dubBox, objBox.ToString());
            }
        }
    }
In this code we only need to pay attention to the two commented lines. In the first line we create a variable of type double (dubBox). According to the CTS rules, double is a primitive type, so dubBox is naturally a value-type variable. The second line actually does three jobs, which can be seen in the MSIL code below: the first step loads the value of dubBox, the second step converts the value type to a reference type, and the third step stores the result in objBox.
The MSIL code is as follows:
    .method private hidebysig static void Main(string[] args) cil managed
    {
      .entrypoint
      // Code size 40 (0x28)
      .maxstack 3
      .locals init ([0] float64 dubBox,
                    [1] object objBox)
      IL_0000: ldc.r8 77.769999999999996
      IL_0009: stloc.0
      IL_000a: ldloc.0
      IL_000b: box [mscorlib]System.Double
      IL_0010: stloc.1
      IL_0011: ldstr "The Value is '{0}' and The Boxed is {1}"
      IL_0016: ldloc.0
      IL_0017: box [mscorlib]System.Double
      IL_001c: ldloc.1
      IL_001d: callvirt instance string [mscorlib]System.Object::ToString()
      IL_0022: call void [mscorlib]System.Console::WriteLine(string, object, object)
      IL_0027: ret
    } // end of method BoxAndUnBox::Main

In the MSIL, lines IL_0000 to IL_0010 correspond to the two commented lines of code. With the MSIL documentation at hand, the reader should have no trouble following this low-level code, so here I focus on what happens when dubBox is boxed: (1) memory is allocated on the heap, equal to the size of dubBox plus the space taken by the structure of the boxed object; (2) the value of dubBox (77.769999999999996) is copied into the newly allocated heap memory; (3) the address of the new object is pushed onto the stack and stored in objBox, which now holds a reference of type Object.
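Step (2) above, the copy made during boxing, can be observed directly: once a value has been boxed, changing the original variable does not affect the boxed object. The following is a minimal sketch; the class BoxCopyDemo and the method BoxThenChange are hypothetical names introduced only for this illustration.

```csharp
using System;

namespace StructApp
{
    public class BoxCopyDemo
    {
        // Hypothetical helper: boxes a value, then changes the original
        // variable, and returns the boxed copy for inspection.
        public static object BoxThenChange(double original)
        {
            double dubBox = original;
            object objBox = dubBox;   // box: the value is copied into a new heap object
            dubBox = 11.11;           // changes only the local variable, not the boxed copy
            return objBox;
        }

        static void Main(string[] args)
        {
            object boxed = BoxThenChange(77.77);
            Console.WriteLine(boxed.GetType());         // System.Double
            Console.WriteLine((double)boxed == 77.77);  // True
        }
    }
}
```

The boxed object still reports its runtime type as System.Double, which is what makes the type check during unboxing possible.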
Unboxing is the inverse of boxing. It looks very simple, but in fact there is a lot worth thinking about. First, boxing needs no explicit type conversion, but unboxing requires one. This is because a reference of type object can refer to a value of any type, so the target type must be stated explicitly. (This is, of course, also a difference between computers and the human brain.) The conversion cannot escape the watchful eye of the CTS, whose standard is naturally its own set of rules. (There are enough of those rules to fill a chapter of their own.) Let's look at the following code:
    using System;

    namespace StructApp
    {
        /// <summary>
        /// Summary description for BoxAndUnBox.
        /// </summary>
        public class BoxAndUnBox
        {
            public BoxAndUnBox()
            {
                // TODO: Add constructor logic here
            }

            static void Main(string[] args)
            {
                double dubBox = 77.77;
                object objBox = dubBox;
                double dubUnBox = (double)objBox;  // unbox the reference object and return the value
                Console.WriteLine("The Value is '{0}' and The UnBoxed is {1}", dubBox, dubUnBox);
            }
        }
    }
Compared with the earlier boxing code, this version adds one line: double dubUnBox = (double)objBox;. The new line does four jobs, which again show up in the MSIL code: the first step pushes the reference held in objBox onto the stack; the second step converts the reference type to a value type; the third step indirectly loads the value onto the stack; the fourth step stores it in dubUnBox.
The MSIL code is as follows:
    .method private hidebysig static void Main(string[] args) cil managed
    {
      .entrypoint
      // Code size 48 (0x30)
      .maxstack 3
      .locals init ([0] float64 dubBox,
                    [1] object objBox,
                    [2] float64 dubUnBox)
      IL_0000: ldc.r8 77.769999999999996
      IL_0009: stloc.0
      IL_000a: ldloc.0
      IL_000b: box [mscorlib]System.Double
      IL_0010: stloc.1
      IL_0011: ldloc.1
      IL_0012: unbox [mscorlib]System.Double
      IL_0017: ldind.r8
      IL_0018: stloc.2
      IL_0019: ldstr "The Value is '{0}' and The UnBoxed is {1}"
      IL_001e: ldloc.0
      IL_001f: box [mscorlib]System.Double
      IL_0024: ldloc.2
      IL_0025: box [mscorlib]System.Double
      IL_002a: call void [mscorlib]System.Console::WriteLine(string, object, object)
      IL_002f: ret
    } // end of method BoxAndUnBox::Main

In the MSIL, lines IL_0011 to IL_0018 correspond to the newly added line of code. With the MSIL documentation at hand, the reader should have no trouble following this low-level code, so here I focus on what happens to objBox when it is unboxed: (1) the runtime first checks that the reference on the stack points to a valid object and that converting this object to the specified type is legal; if it is not, an exception is thrown; (2) once the type conversion is confirmed to be valid, a pointer to the value inside the object is returned.
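The type check in step (1) above can be seen in practice: unboxing a boxed double directly to int fails with System.InvalidCastException, because the cast must name the exact type that was boxed. The sketch below assumes a hypothetical class UnboxCheckDemo with a helper TryUnboxToInt, both invented for this illustration.

```csharp
using System;

namespace StructApp
{
    public class UnboxCheckDemo
    {
        // Hypothetical helper: tries to unbox objBox directly to int and
        // reports whether the runtime accepted the conversion.
        public static bool TryUnboxToInt(object objBox, out int result)
        {
            try
            {
                result = (int)objBox;   // unbox: only succeeds if a boxed int is inside
                return true;
            }
            catch (InvalidCastException)
            {
                result = 0;
                return false;
            }
        }

        static void Main(string[] args)
        {
            object objBox = 77.77;                               // a boxed double
            int n;
            Console.WriteLine(TryUnboxToInt(objBox, out n));     // False
            Console.WriteLine(TryUnboxToInt(42, out n));         // True: a boxed int

            int viaDouble = (int)(double)objBox;                 // unbox to double first, then convert
            Console.WriteLine(viaDouble);                        // 77
        }
    }
}
```

Note that the helper only handles the type mismatch; passing a null reference would raise a different exception (NullReferenceException), which is not caught here.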
So boxing and unboxing come down to this: all that effort just to put a 'value' into a 'box', and then even more effort to take it out again. How depressing! A careful reader comparing the code with the MSIL may also wonder why box appears two more times in the call to Console.WriteLine(). I admit I wanted to be lazy and skip this part, but since it has been spotted, let us face it boldly. This is the legendary "black-box operation"! Console.WriteLine has many overloads. The version used here takes a format string followed by two arguments, and the overload whose parameters are of type Object is the closest match the compiler can find. To match the prototype of that method, the compiler must box the value-type variables dubBox and dubUnBox (convert them to reference types). Therefore, to avoid the performance loss caused by unnecessary implicit boxing, it is best to box the value yourself before calling such overloaded methods, so that an object that is already boxed can be reused instead of boxing the same value again. We now improve the code above as follows:
    using System;

    namespace StructApp
    {
        /// <summary>
        /// Summary description for BoxAndUnBox.
        /// </summary>
        public class BoxAndUnBox
        {
            public BoxAndUnBox()
            {
                // TODO: Add constructor logic here
            }

            static void Main(string[] args)
            {
                double dubBox = 77.77;
                object objBox = dubBox;            // box once and keep the reference
                double dubUnBox = (double)objBox;
                object objUnBox = dubUnBox;        // box explicitly before the call
                Console.WriteLine("The Value is '{0}' and The UnBoxed is {1}", objBox, objUnBox);
            }
        }
    }

MSIL code:

    .method private hidebysig static void Main(string[] args) cil managed
    {
      .entrypoint
      // Code size 45 (0x2d)
      .maxstack 3
      .locals init ([0] float64 dubBox,
                    [1] object objBox,
                    [2] float64 dubUnBox,
                    [3] object objUnBox)
      IL_0000: ldc.r8 77.769999999999996
      IL_0009: stloc.0
      IL_000a: ldloc.0
      IL_000b: box [mscorlib]System.Double
      IL_0010: stloc.1
      IL_0011: ldloc.1
      IL_0012: unbox [mscorlib]System.Double
      IL_0017: ldind.r8
      IL_0018: stloc.2
      IL_0019: ldloc.2
      IL_001a: box [mscorlib]System.Double
      IL_001f: stloc.3
      IL_0020: ldstr "The Value is '{0}' and The UnBoxed is {1}"
      IL_0025: ldloc.1
      IL_0026: ldloc.3
      IL_0027: call void [mscorlib]System.Console::WriteLine(string, object, object)
      IL_002c: ret
    } // end of method BoxAndUnBox::Main

Dizzy yet? What is all this! By now some readers will have coughed up blood and others will have given up entirely. I believe that any comrade who has kept reading to this last "!" must be a good comrade. In fact, we can add one more speculation: reference types are the advanced types and value types are the primitive ones, while the 'box' is just a concept, an order, a set of rules, or more precisely a piece of logic. Primitive things are foundational, so their complexity and logic are not very high; advanced things are less stable and keep evolving and developing, which is why this logical 'box' will continue to be extended and improved. Following this idea, it is not hard to predict where we will need to put in work in the future and where success may lie.