C++/CLI Basic Types
Text: Stanley Lippman, Li Jianzhong
Overview: This article shows what adjustments Microsoft made when integrating the CLI type system with the ISO-C++ semantic framework, and how the priorities of the various cases that arose during the integration were adjusted where necessary. It also reminds readers of the issues to consider when re-engineering a native type as a CLI class.
The basic types supported by C++/CLI, such as int, double, and bool, can in some respects be regarded as the same as the corresponding ISO-C++ types: the same usage produces the same results for operations such as addition or assignment. But C++/CLI also introduces some new things for these basic types.
In the Common Type System (CTS), each basic type has a corresponding class in the System namespace (see Table 1). For example, int is completely equivalent to System::Int32. We can use either of the two to declare an integer:
int ival = 0;
Int32 ival2 = 0;
For portability, we recommend using the built-in keywords rather than the class names in the System namespace when working with these basic types.
Basic type            Class in System namespace   Note / Usage
bool                  System::Boolean             bool dirty = false;
char                  System::SByte               char sp = ' ';
signed char           System::SByte               signed char ch = -1;
unsigned char         System::Byte                unsigned char ch = '\0';
wchar_t               System::Char                wchar_t wch = ch;
short                 System::Int16               short s = ch;
unsigned short        System::UInt16              unsigned short s = 0xffff;
int                   System::Int32               int ival = s;
unsigned int          System::UInt32              unsigned int ui = 0xffffffff;
long                  System::Int32               long lval = ival;
unsigned long         System::UInt32              unsigned long ul = ui;
long long             System::Int64               long long etime = ui;
unsigned long long    System::UInt64              unsigned long long mtime = etime;
float                 System::Single              float f = 3.14f;
double                System::Double              double d = 3.14159;
long double           System::Double              long double d = 3.14159L;
Table 1. Basic types and their corresponding classes in the System namespace
The public static members of the classes in the System namespace can be accessed either through the built-in keyword or through the class name in the System namespace. For example, to obtain the range of a numeric type, we can use the built-in keyword to access its static properties MaxValue and MinValue:
int imaxval = int::MaxValue;
int iminval = Int32::MinValue;
Each numeric type supports a member function called Parse that converts a string into the value it represents. For example, given the following string:
String^ bonus = "$ 12,000.79";
calling Parse initializes mybonus to 12000.79:
double mybonus = Double::Parse( bonus, ns );
Here ns is a single NumberStyles enumerator or the bitwise OR of several. NumberStyles is an enumeration type in the System::Globalization namespace that describes what the string is allowed to contain, such as white space, currency symbols, decimal points, or commas. Consider the following code:
using namespace System;
using namespace System::Globalization;
double bonusString( String^ bonus )
{
    // Build up the set of elements the input string may contain.
    NumberStyles ns = NumberStyles::AllowLeadingWhite;
    ns |= NumberStyles::AllowCurrencySymbol;
    ns |= NumberStyles::AllowThousands;
    ns |= NumberStyles::AllowDecimalPoint;
    return Double::Parse( bonus, ns );
}
We can also use cast notation to convert explicitly between types:
int ival = (int) mybonus;
or use one of the conversion methods of the System::Convert class, such as ToDouble(), ToInt32(), ToDateTime(), and so on:
int ival2 = Convert::ToInt32( mybonus );
The two conversion methods differ: the explicit cast truncates, whereas the member functions of Convert round. In the example above, ival holds 12000 after the assignment, while ival2 holds 12001.
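The difference between truncating and rounding can be reproduced in ISO-C++, which may help when comparing native and managed results. The sketch below is only an analog, not the Convert implementation: the function names are hypothetical, and std::nearbyint under the default floating-point rounding mode is assumed as a stand-in for the rounding behavior (the .NET documentation describes Convert::ToInt32 for a double as rounding to the nearest integer, with exact halves going to the even number).

```cpp
#include <cassert>
#include <cmath>

// Explicit cast: the fractional part is simply discarded.
int truncate_to_int(double value) {
    return static_cast<int>(value);
}

// Round to the nearest integer. Under the default rounding mode,
// std::nearbyint resolves exact halves to the even neighbour.
int round_to_int(double value) {
    return static_cast<int>(std::nearbyint(value));
}

// Hypothetical self-check using the article's value.
void check_bonus_conversions() {
    const double mybonus = 12000.79;
    assert(truncate_to_int(mybonus) == 12000);
    assert(round_to_int(mybonus) == 12001);
}
```

Calling check_bonus_conversions() exercises both paths with the value from the article.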
We can also call members of the corresponding type directly on a literal, although this looks a little odd. For example, we can write:
Console::Write( "{0}: ", ( 5 ).ToString() );
Here ( 5 ).ToString() returns the string representation of the integer literal 5. Note that the parentheses around 5 are required: they make the compiler bind the member-selection dot operator to the integer 5 instead of parsing '5.' as a literal of type double, in which case the ToString() that follows would make no sense. Why would we ever write this? One possible reason is that passing a string to a member function of Console can be more efficient than passing the actual value.
For character and string literals we can also call member functions, just as with the integer above, but the behavior is slightly different. For example, the following code:
Console::WriteLine( ( 'a' ).ToString() );
prints 97 on the console, not the character 'a'. To print the character 'a', we need to cast it to System::Char:
Console::WriteLine( ( (wchar_t)'a' ).ToString() );
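ISO-C++ shows a comparable effect when a narrow character is formatted through an integer overload. The sketch below is an analog under stated assumptions, not the C++/CLI mechanism itself: code_of and text_of are hypothetical helper names, and the value 97 assumes an ASCII-compatible execution character set.

```cpp
#include <cassert>
#include <string>

// std::to_string has no overload taking char, so the argument is
// promoted to int and its numeric code is formatted.
std::string code_of(char c) {
    return std::to_string(c);
}

// Building a one-character string keeps the character itself.
std::string text_of(char c) {
    return std::string(1, c);
}

// Hypothetical self-check; assumes an ASCII-compatible character set.
void check_char_formatting() {
    assert(code_of('a') == "97");
    assert(text_of('a') == "a");
}
```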
C++/CLI applies special handling to string literals. To a certain extent, the type of a string literal in C++/CLI is closer to System::String than to a C-style character pointer. Obviously, this affects overload resolution. For example:
public ref class R {
public:
    void foo( System::String^ ); // (1)
    void foo( std::string );     // (2)
    void foo( const char* );     // (3)
};
void bar( R^ r )
{
    // Which foo is called?
    r->foo( "POOH" );
}
In ISO-C++, this call resolves to the third foo(), because a string literal is closer to const char* than to the string type of the ISO-C++ standard library. In C++/CLI, however, the call resolves to the first foo(), because the string literal is considered closer to System::String than to a character pointer. To understand why, let us take two steps back: first we look at how ISO-C++ and C++/CLI resolve a call to an overloaded function, and then at how each of them treats a string literal.
Resolving a call to an overloaded function typically involves three steps:
1. Select the set of candidate functions. Candidate functions are those that match the name of the called function within the lexical scope. For example, since we call foo() through an instance of R, functions named foo that are not members of R or of its base classes are not considered candidates. We therefore have three candidate functions, namely the three members of R named foo. If the candidate set obtained at this stage is empty, the call fails.
2. Select the set of viable functions from the candidate set. Viable functions are those candidates whose declared parameters match the arguments of the call in number and type. In our example, all three candidates are viable. If the viable set obtained at this stage is empty, the call also fails.
3. Select the best match from the viable set. This stage ranks the conversions from the arguments actually passed to the parameters declared by each viable function. The process is relatively simple for functions with a single parameter, but becomes relatively complicated for functions with several. If no function emerges as the best match, the call fails: the conversions from the arguments to the parameters of each viable function are considered equally good, and the call is ambiguous.
Two questions now face us: (1) What is the type of the argument we actually pass, "POOH"? (2) What algorithm is used to rank the type conversions?
In ISO-C++, the type of the string literal "POOH" is const char[5]; note the implicit terminating null character at the end of the string literal. Clearly no parameter in the example above has exactly this type, so some form of conversion must be applied. Under ISO-C++, the two candidate functions (2) and (3) therefore compete:
void foo( std::string ); // (2)
void foo( const char* ); // (3)
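The claim about the literal's type can be checked at compile time in ISO-C++. The following is a minimal sketch; literal_length is a hypothetical helper introduced only for illustration.

```cpp
#include <cstddef>
#include <type_traits>

// "POOH" has four visible characters plus the terminating null character.
static_assert(sizeof("POOH") == 5, "four characters plus the null terminator");

// A string literal is an lvalue, so decltype yields a reference to the array.
static_assert(std::is_same<decltype("POOH"), const char (&)[5]>::value,
              "a narrow string literal is an array of 5 const char");

// Hypothetical helper: reports the array length deduced from a literal.
template <std::size_t N>
constexpr std::size_t literal_length(const char (&)[N]) {
    return N;
}
```

If either static_assert did not hold, the translation unit would fail to compile, so a successful build is itself the check.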
So how does the compiler rank the viable functions? The C++ language defines a priority order over type conversions: if one conversion is better than another, it ranks ahead of it. In C++/CLI, we integrated the behavior of the CLI types into the standard conversion hierarchy of ISO-C++. The integrated hierarchy is as follows:
1) An exact match is best. Note that an exact match does not mean that the argument type and the declared parameter type are identical; they only need to be "close enough". As we will see below, "close enough" means something different for string literals in ISO-C++ and in C++/CLI.
2) Among standard conversions, a widening conversion (promotion) is better than a non-widening one. For example, widening a short to an int is better than converting an int to a double.
3) A standard conversion is better than a boxing conversion. For example, converting an int to a double is better than boxing an int into an Object.
4) A boxing conversion is better than an implicit user-defined conversion.
5) An implicit user-defined conversion is better than no conversion at all!
6) Otherwise, the user must use an explicit cast to request the desired conversion.
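Rules 2 and 5 of this hierarchy can be observed directly in ISO-C++; boxing (rules 3 and 4) has no native counterpart, so it is left out. The sketch below uses hypothetical names (pick, choose, Wrapper) and returns a label naming the overload that was selected.

```cpp
#include <cassert>
#include <string>

// Two viable overloads for a short argument.
std::string pick(int)    { return "int"; }     // reached by promotion
std::string pick(double) { return "double"; }  // reached by a standard conversion

// A hypothetical class with an implicit user-defined conversion from int.
struct Wrapper {
    Wrapper(int) {}
};

// Two viable overloads for an int argument.
std::string choose(double)  { return "double"; }   // standard conversion
std::string choose(Wrapper) { return "Wrapper"; }  // user-defined conversion

// Hypothetical self-check of the ranking.
void check_ranking() {
    short s = 1;
    assert(pick(s) == "int");       // promotion beats standard conversion
    assert(choose(1) == "double");  // standard beats user-defined conversion
}
```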
For the two ISO-C++ candidates above, converting the string literal to std::string falls under rule 5: the string constructor is invoked implicitly to create a temporary string object. Converting the string literal to const char* falls under rule 1. Rule 1 takes precedence over rule 5, so the function whose parameter is const char* wins the competition.
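The ISO-C++ outcome can be confirmed with a native stand-in for the class. This is a sketch under stated assumptions: NativeR and which_foo are hypothetical names, overload (1) is omitted because System::String^ does not exist in ISO-C++, and the functions return their label number so the selected overload is visible.

```cpp
#include <cassert>
#include <string>

// Native stand-in for the article's class R, without overload (1).
struct NativeR {
    int foo(std::string) { return 2; }  // (2) needs the std::string constructor
    int foo(const char*) { return 3; }  // (3) needs only array-to-pointer decay
};

// Returns the label of the overload selected for a string literal.
int which_foo() {
    NativeR r;
    return r.foo("POOH");
}

// Hypothetical self-check: ISO-C++ selects overload (3).
void check_iso_resolution() {
    assert(which_foo() == 3);
}
```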
The "exact match" of rule 1 is actually defined quite strictly. In total, four trivial conversions still count as an exact match, and even among these four there is a priority ordering, so that the behavior of the language is fully specified.
Most readers and programmers probably have little interest in such details, and usually have no need to dig into them. But these rules are necessary if the language is to behave intuitively and identically across different implementations. It is precisely because the language has this degree of "type awareness" that programmers can usually ignore such details.
Let us take a brief look at the four trivial conversions. Three of them are called lvalue transformations. An lvalue is a program entity that is addressable and can be assigned to. The fourth is a qualification conversion; adding a const qualifier to a type, for example, is such a conversion. The three lvalue transformations rank above the qualification conversion.
In our example, the native array-to-pointer conversion, from const char[5] to const char*, is an lvalue transformation. In most cases we do not even think of it as a conversion.
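Both points can be seen in ISO-C++: array-to-pointer decay on its own, and its precedence over decay combined with an added const. The sketch below uses hypothetical names (g, call_with_writable_buffer) and returns a label naming the overload that was selected.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// std::decay applies the array-to-pointer transformation to an array type.
static_assert(std::is_same<std::decay<const char[5]>::type, const char*>::value,
              "const char[5] decays to const char*");

std::string g(char*)       { return "char*"; }        // decay only
std::string g(const char*) { return "const char*"; }  // decay plus added const

// A writable array prefers the overload that needs no qualification conversion.
std::string call_with_writable_buffer() {
    char buffer[5] = "POOH";
    return g(buffer);
}

// Hypothetical self-check of both calls.
void check_trivial_conversions() {
    assert(call_with_writable_buffer() == "char*");
    assert(g("POOH") == "const char*");  // a literal is already const
}
```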
This lvalue transformation still applies in C++/CLI, but once we introduced the System::String class, converting a string literal to const char* was no longer the best match. In fact, in C++/CLI the string literal "POOH" is both a const char[5] (the type C++/CLI retains for the native type system) and a System::String (its type in the C++/CLI managed type system). Thus in C++/CLI a string literal and System::String are an exact match, which ranks above the trivial conversion from the literal to const char*. Some readers may be unhappy at this point and ask: "Why doesn't the ISO-C++ behavior satisfy C++/CLI's binding needs for string literals?" The reason is that string literals are a basic element of our programs, and the ISO-C++ behavior is counterintuitive in many cases. In fact, these rules were changed again and again over the preceding year before arriving at the result we see now.
This reflects a basic difference between the type systems of ISO-C++ and C++/CLI. In ISO-C++, types are independent of one another unless a relationship is made explicit in a class library. Hence there is no implicit type relationship between a string literal and std::string, even though they share the same abstract domain.
C++/CLI, by contrast, supports a unified type system. Every type, including the type of a literal value, is a subclass of Object. This is why we can call a method directly on a literal value or on an object of a built-in type. The type of the integer literal 5 is Int32, and the type of the string literal "POOH" is String. Treating a string literal as closer to a C-style string, or as a kind of C-style string, is therefore no longer appropriate.
The integrated conversion hierarchy lets a working ISO-C++ program keep the same behavior after it is compiled with the /clr compiler switch, while new C++/CLI programs that use CLI types exhibit the new priority rules when handling string literals. The length of this discussion may be out of proportion to the importance of the topic, but it reveals the work we did to integrate the CLI type system with the ISO-C++ semantic framework, and how we adjusted the priorities of the cases that arose during the integration where necessary. It also reminds us of the issues to consider when re-engineering a native class as a CLI class. For example, in some cases we have to redesign the set of overloaded functions that accept string literals, rather than simply adding an overload with a String parameter.
It is also important to note that String represents Unicode characters. Unlike ASCII, this requires two bytes per character. Although the type of a string literal in C++/CLI is String, this does not mean that a string literal in C++/CLI is necessarily resolved into a double-byte character stream. In native C++, we have to prefix the string literal with L to tell the compiler to treat it as a double-byte character stream. In C++/CLI, we still need to do this.
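The effect of the L prefix can be inspected in ISO-C++. The sketch below is illustrative only; narrow_bytes and wide_bytes are hypothetical names. One assumption to keep in mind: wchar_t is two bytes with Visual C++, but it is four bytes on many other toolchains, so the wide size is expressed in terms of sizeof(wchar_t) rather than as a fixed number.

```cpp
#include <cstddef>
#include <type_traits>

// With the L prefix, the literal becomes an array of wchar_t elements.
static_assert(std::is_same<decltype(L"POOH"), const wchar_t (&)[5]>::value,
              "a wide string literal is an array of 5 const wchar_t");

// Storage occupied by each form, terminator included.
constexpr std::size_t narrow_bytes = sizeof("POOH");
constexpr std::size_t wide_bytes   = sizeof(L"POOH");
```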