Basic Principles of Type Use on the Microsoft .NET Platform

zhaozj, 2021-02-08

Microsoft .NET platform series of articles / Zhao Xiangning

In the previous installment, I introduced many basic type-related concepts of the Common Language Runtime (CLR) of the Microsoft .NET platform. I focused on how all types derive from System.Object and on the various casting mechanisms (such as the C# operators) available to programmers. Finally, I described how compilers use namespaces and how the CLR ignores them.

This article continues that discussion of type fundamentals. It starts with an introduction to simple types and then moves quickly to reference types and value types. Every developer needs a solid command of reference types and value types: misusing them while writing code can introduce bugs and cause performance problems.

Simple Types

Many compilers let you work with certain commonly used data types through simplified syntax. For example, in C# you could allocate an integer variable with the following syntax:

    int a = new int(5);

I'm sure you will find it awkward to declare and initialize an integer this way. Fortunately, many compilers (including the C# compiler) let you use the following syntax instead:

    int a = 5;

This makes the code much more readable. Whichever syntax is used, the generated intermediate language (IL) is the same. (Released C# compilers accept only the parameterless form, new int(), which yields 0; the one-argument form above is shown for illustration.)

Data types that the compiler supports directly are called simple data types. They map directly to types that exist in the base class library. For example, the C# int type maps directly to System.Int32.
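
A minimal sketch of this mapping, assuming a current C# compiler (where only the parameterless new int() form compiles). The class and method names are hypothetical and exist only for illustration:

```csharp
using System;

static class SimpleTypeDemo
{
    // Returns true when the C# keyword int and System.Int32 name the same type.
    public static bool IntIsInt32() => typeof(int) == typeof(System.Int32);

    static void Main()
    {
        int a = new int();      // explicit construction; yields 0
        int b = 5;              // the usual shorthand
        System.Int32 c = 5;     // same type, spelled with its base class library name

        Console.WriteLine(IntIsInt32());          // True
        Console.WriteLine(a.GetType().FullName);  // System.Int32
        Console.WriteLine(b == c);                // True
    }
}
```
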
So the following two lines of code are equivalent to the two lines shown earlier:

    System.Int32 a = new System.Int32(5);
    System.Int32 a = 5;

Figure 1 lists the simple data types in C# and the corresponding types in the base class library (other languages provide similar simple data types).

Reference Types and Value Types

When an object is allocated from the managed heap, the new operator returns the object's memory address. This address is usually stored in a variable. Such a variable is called a reference-type variable because it does not contain the object's actual bits; it contains a reference to them.

Working with reference types has some performance costs. First, the memory must be allocated from the managed heap, which can force a garbage collection. Second, reference types are always accessed through pointers. So each time you reference a member of an object on the heap, code must be generated and executed to dereference the pointer. This affects both the size of the program and its execution speed.

In addition to reference types, the virtual object system offers lightweight value types. Value-type objects cannot be allocated on the garbage-collected heap, and the variable that represents the object does not contain a pointer to the object; the variable contains the object itself. Because the variable contains the object, no pointer has to be dereferenced to manipulate it, which improves performance.

The code in Figure 2 illustrates the difference between reference types and value types. The Rectangle type is declared with struct instead of the more common class. In C#, a type declared with struct is a value type, and a type declared with class is a reference type. Other languages, such as C++, may use different syntax to describe value types and reference types.
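
Figure 2 is not reproduced here. A minimal sketch of the difference it describes might look like the following; the RectRef and RectVal types and the demo class are hypothetical names chosen for illustration, not the article's own listing:

```csharp
using System;

// Reference type (because of 'class'); hypothetical name.
class RectRef { public int X, Y, Width, Height; }

// Value type (because of 'struct'); hypothetical name.
struct RectVal { public int X, Y, Width, Height; }

static class RefVsValueDemo
{
    // Returns (X seen through the second reference, X held by the struct copy).
    public static (int viaReference, int viaCopy) Run()
    {
        RectRef rr1 = new RectRef();   // allocated on the managed heap
        RectVal rv1 = new RectVal();   // stored in the local variable itself

        rr1.X = 10;                    // goes through the reference
        rv1.X = 10;                    // changes the variable directly

        RectRef rr2 = rr1;             // copies only the reference
        RectVal rv2 = rv1;             // copies all the fields

        rr1.X = 20;                    // visible through rr2 as well
        rv1.X = 20;                    // rv2 is unaffected

        return (rr2.X, rv2.X);
    }

    static void Main()
    {
        var (viaReference, viaCopy) = Run();
        Console.WriteLine(viaReference);  // 20
        Console.WriteLine(viaCopy);       // 10
    }
}
```

The only difference between the two declarations is the keyword, yet assignment shares one object in the first case and duplicates the data in the second.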

Look again at the line of code shown earlier:

    System.Int32 a = new System.Int32(5);

When compiling this statement, the compiler detects that System.Int32 is a value type and optimizes the generated intermediate language (IL) so that this "object" is not allocated from the heap; instead, the object is placed in the local variable a on the thread's stack.

Where possible, you should use value types instead of reference types, because your application will perform better. In particular, a type is a good candidate for a value type when the following hold:

* It acts like a simple data type.
* It does not inherit from any other type.
* No other type is derived from it.
* Objects of the type are not passed frequently as method arguments, because that causes frequent memory copy operations that hurt performance. This is explained in more detail in the discussion of boxing and unboxing below.

The main advantage of value types is that they are not allocated on the managed heap. Compared with reference types, however, value types come with several limitations. The following points compare the two.

Value-type objects have two representations: an unboxed form and a boxed form. Reference-type objects are always in boxed form.

Value types are implicitly derived from System.ValueType. This type offers the same methods as those defined by System.Object. However, System.ValueType overrides Equals so that it returns true when the instance fields of the two objects match. In addition, System.ValueType overrides GetHashCode so that the hash code is generated from the values of the object's instance fields. When you define your own value type, it is strongly recommended that you override Equals and GetHashCode and provide explicit implementations.

Because a value type cannot be declared as a base class, no new value type or reference type can be derived from one.
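
The recommendation to override Equals and GetHashCode can be sketched as follows. The Point struct is hypothetical, implementing IEquatable<Point> is an optional addition that the text above does not mention, and the hash formula is just one simple choice:

```csharp
using System;

// Hypothetical value type used only for illustration.
struct Point : IEquatable<Point>
{
    public int X, Y;

    public Point(int x, int y) { X = x; Y = y; }

    // Strongly typed comparison; the argument does not need to be boxed.
    public bool Equals(Point other) => X == other.X && Y == other.Y;

    // Replaces the field-matching default inherited from System.ValueType.
    public override bool Equals(object obj) => obj is Point other && Equals(other);

    // Equal points must produce equal hash codes.
    public override int GetHashCode() => (X * 397) ^ Y;
}

static class EqualityDemo
{
    static void Main()
    {
        Point a = new Point(1, 2);
        Point b = new Point(1, 2);
        Point c = new Point(2, 1);

        Console.WriteLine(a.Equals(b));                         // True
        Console.WriteLine(a.Equals(c));                         // False
        Console.WriteLine(a.GetHashCode() == b.GetHashCode());  // True
    }
}
```
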
A value type should not declare virtual functions, cannot be abstract, and is implicitly sealed (a sealed type cannot be used as the base class of a new type).

A reference-type variable contains the address of an object in heap memory. By default, a reference-type variable is initialized to null, meaning that it does not currently point to a valid object. Attempting to use a null reference-type variable results in a NullReferenceException. By contrast, a value-type variable always contains a value of its underlying type, and by default all of its members are initialized to zero. A NullReferenceException cannot occur when you access a value type.

When you assign one value-type variable to another, the value is copied. When you assign one reference-type variable to another, only the memory address is copied.

It follows that two or more reference-type variables can refer to a single object in the heap. As a result, an operation performed through one variable can affect the object referenced by another variable. Each value-type variable, on the other hand, has its own copy of the object's data, so an operation on one value-type variable does not affect any other.

There are situations in which the runtime must initialize a value type but cannot call its default constructor. For example, this happens when a thread-local value type must be allocated and initialized the first time an unmanaged thread executes managed code. In this situation, the runtime cannot call the type's constructor, but it still guarantees that all members are initialized to zero or null. For this reason, it is recommended that you not define a parameterless constructor on a value type. In fact, the C# compiler described here (like some other compilers) treats this as an error and will not compile the code. This problem is rare, and it never occurs with reference types.
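
The default-initialization behavior described above can be checked with a small sketch. The PointRef and PointVal types are hypothetical; arrays are used only because their elements are never explicitly initialized:

```csharp
using System;

class PointRef { public int X, Y; }     // reference type (hypothetical)
struct PointVal { public int X, Y; }    // value type (hypothetical)

static class DefaultValueDemo
{
    // Returns true if reading through an uninitialized reference throws.
    public static bool NullReferenceThrows()
    {
        PointRef[] refs = new PointRef[1];   // the element defaults to null
        try
        {
            Console.WriteLine(refs[0].X);    // no object behind the reference
            return false;
        }
        catch (NullReferenceException)
        {
            return true;
        }
    }

    // Returns the X field of a value-type element that was never initialized.
    public static int DefaultValueField()
    {
        PointVal[] vals = new PointVal[1];   // the element is a zeroed PointVal
        return vals[0].X;
    }

    static void Main()
    {
        Console.WriteLine(NullReferenceThrows());  // True
        Console.WriteLine(DefaultValueField());    // 0
    }
}
```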

Parameterized constructors, for both value types and reference types, have no such restrictions.

Because an unboxed value type is not allocated on the heap, its storage is released as soon as the method that defines the instance is no longer active. In other words, an unboxed value-type object receives no notification when its memory is reclaimed. A boxed value type, however, would have its Finalize method called when it is garbage collected. You should never implement a Finalize method on a value type. As with parameterless constructors, C# treats this as an error and will not compile the source code.

In many cases, it is useful to treat a value type as a reference type. Suppose you want to create an ArrayList object (a type defined in the System.Collections namespace) to hold a set of points. See Figure 3.

On each iteration of the loop in that code, a Point value type is initialized and the point is then stored in the ArrayList. But think about it: what is actually stored in the ArrayList? Is it the Point structure, the address of the Point structure, or something else entirely? To find the answer, you must look up the Add method of ArrayList and see what type its parameter is defined as. Here you can see that Add is prototyped as:

    public virtual int Add(Object value)

Clearly, the parameter of Add is an Object. Object always identifies a reference type. But the code actually passes p, which is a Point value type. For this code to work, the Point value type must be converted into a true heap-managed object, and a reference to that object must be obtained.

Converting a value type to a reference type is called boxing.
Internally, the conversion works as follows:

1. Memory is allocated from the heap. Its size is the size of the value type plus the additional overhead needed to make it a true object: a virtual table pointer and a sync block pointer.
2. The bits of the value type are copied to the newly allocated heap memory.
3. The address of the object is returned. This address is now a reference type.

Some languages, such as C#, automatically generate the intermediate language (IL) code required to box a value type, but it is important to understand the internals of the boxing conversion in order to reason about code size and performance.

When the Add method is called, memory for a Point object is allocated on the heap. The members currently residing in the Point value type (p) are copied into the newly allocated Point object. The address of the Point object (a reference type) is returned and then passed to Add. This Point object stays on the heap until it is garbage collected. The Point value-type variable (p) can be reused or freed, because the ArrayList never knows anything about it. Boxing gives a unified view of the type system, in which a value of any type can ultimately be treated as an object.

The opposite of boxing, unboxing, retrieves a reference to the value type (the data fields) contained inside an object. Internally, it works as follows:

1. The CLR (Common Language Runtime) first makes sure that the reference-type variable is not null and that it refers to an object that is a boxed value of the desired value type. A null reference results in a NullReferenceException; a type mismatch results in an InvalidCastException.
2. If the types do match, a pointer to the value type contained inside the object is returned. The value type that this pointer refers to does not include the overhead usually associated with a true object, that is, the virtual table pointer and the sync block pointer.

Note that boxing always creates a new object and copies the bits of the unboxed value into it. Unboxing simply returns a pointer to the data inside an object: no memory copy occurs.
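
Figure 3 is not reproduced here. The following is a minimal sketch of the boxing and unboxing behavior described above, assuming a hypothetical Point struct with public X and Y fields; the helper method names are also made up for illustration:

```csharp
using System;
using System.Collections;

struct Point { public int X, Y; }   // hypothetical value type

static class BoxingDemo
{
    // Stores a Point in an ArrayList, then changes the original variable.
    // Returns the X value read back from the list.
    public static int StoredCopyIsIndependent()
    {
        ArrayList list = new ArrayList();
        Point p = new Point { X = 1, Y = 1 };

        list.Add(p);        // Add takes Object, so p is boxed: a heap copy is made
        p.X = 99;           // changes only the local, unboxed variable

        Point stored = (Point)list[0];   // unbox, then copy the fields out
        return stored.X;
    }

    // Returns true if unboxing to the wrong value type throws InvalidCastException.
    public static bool WrongTypeUnboxThrows()
    {
        object o = 5;       // a boxed Int32
        try
        {
            long wrong = (long)o;   // a boxed Int32 is not a boxed Int64
            Console.WriteLine(wrong);
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    static void Main()
    {
        Console.WriteLine(StoredCopyIsIndependent());  // 1
        Console.WriteLine(WrongTypeUnboxThrows());     // True
    }
}
```

The first result shows that the ArrayList holds its own boxed copy: changing p after the call to Add does not change what was stored.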

In practice, however, your code usually causes the data referred to by the unboxed reference to be copied anyway. The following code demonstrates boxing and unboxing:

    using System;

    class App
    {
        public static void Main()
        {
            Int32 v = 5;    // Create an unboxed value-type variable
            Object o = v;   // o refers to a boxed version of v
            v = 123;        // Change the unboxed value to 123
            Console.WriteLine(v + ", " + (Int32) o);    // Displays "123, 5"
        }
    }

From this code, can you guess how many boxing operations occur? You may be surprised to find that the answer is three! Let's analyze the code carefully to understand what really happens.

First, an unboxed Int32 value type, v, is created with the value 5. Then an Object reference type, o, is created and is meant to point to v. But a reference type must always point to an object on the heap, so C# generates the IL code that boxes v and stores the address of the boxed version of v in o. Next, the value 123 is stored in the unboxed value type v. This has no effect on the boxed version of v, so the boxed version keeps its value of 5. Later, the cast (Int32) o shows how o is unboxed (which returns a pointer to the data in o) and how the data in o is then copied into an unboxed value type.

Now comes the call to WriteLine. It needs a String object passed to it, but there is no String object. Instead there are three items: an unboxed Int32 value type (v), a string (", "), and a reference to a boxed Int32 (o). They must be combined somehow to form a String. To construct the String object, the C# compiler generates code that calls the static Concat method of the String type. There are several overloaded versions of Concat. They all do the same thing and differ only in the number of parameters.
Because a string is being formatted from three items, the compiler selects the following Concat method:

    public static String Concat(Object arg0, Object arg1, Object arg2);

The first parameter, arg0, is used to pass v. But v is an unboxed value and arg0 is an Object, so v must be boxed, and the address of the boxed v is passed as arg0. The second parameter, arg1, is the address of the string ", ", which is a String object. The last parameter is arg2. Here o (an Object reference) is cast to Int32. This creates a temporary Int32 value type that receives the unboxed version of the value that o currently refers to. This temporary value type must then be boxed again so that its memory address can be passed as arg2.

Once Concat is called, it calls the ToString method of each of the specified objects and concatenates the resulting strings. The String object returned from Concat is passed to WriteLine to display the final result.

Note that the generated IL would be more efficient if WriteLine were called as follows:

    Console.WriteLine(v + ", " + o);    // Displays "123, 5"

This line is identical to the previous version except that the (Int32) cast in front of o has been removed. It is more efficient because o is already a reference to an Object, and its address can be passed directly to Concat. Removing the cast avoids one unboxing operation and one boxing operation.
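
The calls described above can be written out by hand. This is a sketch of what the article says the compiler generates, not a dump of actual compiler output: current compilers may emit different IL for string concatenation, and the class and method names are hypothetical:

```csharp
using System;

static class ConcatDemo
{
    // Spells out the call described for: v + ", " + (Int32) o
    public static string WithCast()
    {
        Int32 v = 5;
        Object o = v;              // box 1: boxed copy of 5
        v = 123;                   // changes only the unboxed variable
        Int32 temp = (Int32)o;     // unbox o and copy the value into a temporary
        return String.Concat((Object)v, ", ", (Object)temp);  // boxes 2 and 3
    }

    // Spells out the call described for: v + ", " + o
    public static string WithoutCast()
    {
        Int32 v = 5;
        Object o = v;              // box 1
        v = 123;
        return String.Concat((Object)v, ", ", o);  // box 2 only; o is passed as is
    }

    static void Main()
    {
        Console.WriteLine(WithCast());     // 123, 5
        Console.WriteLine(WithoutCast());  // 123, 5
    }
}
```

Both methods produce the same string; the second simply skips the unbox-then-rebox round trip on o.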

When reprinting, please cite the original address: https://www.9cbs.com/read-1552.html
