C++ implementation of discriminated unions

zhaozj, 2021-02-16

An implementation of discriminated unions in C++, by Andrei Alexandrescu.

Discriminated unions (also often called variant types or tagged unions) are data structures that store one object, which can be of any type from a given collection of types. Discriminated unions are very useful in applications such as interpreters, database programs, and data communication. Several implementations of discriminated unions in C++ have been published [1], [2]. This article describes a generic implementation of discriminated unions in C++ with the following features:

(1) The ability to specify the set of accepted types.
(2) No difference in treatment between built-in types and user-defined types.
(3) A "total decoherence" mode in which the user must handle every type the discriminated union can hold, or else get a compile-time error.
(4) High efficiency, obtained by avoiding the free store and by providing O(1) type conversions.

Features (1) and (3) are particularly novel relative to other implementations. This article uses the Loki C++ library [3] and assumes that the reader is familiar with it. Loki provides generic components for design patterns and common idioms, so that users do not have to start designing from scratch. The components used here are typelists, Visitor, and hierarchy generators.

1. Description

Many applications need to store data of unrelated types in a uniform format. Consider an application that communicates with a database. Each field stored in the database has a type, such as string, integer, decimal, floating-point number, or date. A query returns its result as a table whose columns have these types. When writing query functions, the C++ program must handle these types in a uniform format, so that query results can be stored uniformly (for example, in a two-dimensional array). Suppose there is an old C-based database API that puts the value of a field in a union able to hold every possible type. The union is accompanied by an integer or enumerated value (the tag) that tells which member of the union is currently valid. For example:

    // Example 1: a database field with a type tag, C style
    struct DatabaseField
    {
        enum TypeTag { typeNull, typeInt, typeDouble, typeString, typeDate } tag_;
        union
        {
            int fieldInt_;
            double fieldDouble_;
            char* fieldString_;
            Date typeDate_;
        };
    };

Needless to say, this approach is clumsy and error-prone. An object-oriented approach would represent the fields of the relational database described above with a polymorphic class hierarchy. This has the advantage of type safety. However, the advantage is offset by a clumsy design. Object-oriented programming (at least in a statically typed language like C++) presupposes a unified interface across a class hierarchy, which is unrealistic in this example: strings, integers, dates, and Booleans have no common interface. The end result can only be a large number of type casts, or a bloated and error-prone interface in which every operation is checked against every type at run time.

Another example of discriminated unions arises in dynamically typed languages (Lisp, Basic), in which the type of a variable depends on what has been assigned to it. These languages infer and check variable types based on context. The problems of implementing such types are similar to the problems raised by the database.
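As a rough illustration of why the C-style design in Example 1 is fragile, the following sketch uses a simplified tagged union. `TaggedField`, `describe`, and `makeInt` are hypothetical names introduced here, and the `Date` member is left out so that the block is self-contained.

```cpp
#include <cassert>
#include <string>

// Hypothetical, simplified version of Example 1 (no Date member).
struct TaggedField
{
    enum TypeTag { typeNull, typeInt, typeDouble, typeString } tag_;
    union
    {
        int fieldInt_;
        double fieldDouble_;
        const char* fieldString_;
    };
};

// Every consumer has to repeat a switch like this one. Nothing stops a
// caller from reading fieldDouble_ while tag_ says typeInt, and adding a
// new tag silently falls through to the last return.
std::string describe(const TaggedField& f)
{
    switch (f.tag_)
    {
    case TaggedField::typeNull:   return "null";
    case TaggedField::typeInt:    return "int: " + std::to_string(f.fieldInt_);
    case TaggedField::typeDouble: return "double";
    case TaggedField::typeString: return std::string("string: ") + f.fieldString_;
    }
    return "unknown tag";
}

// The tag and the union member must be kept in sync by hand.
TaggedField makeInt(int value)
{
    TaggedField f;
    f.tag_ = TaggedField::typeInt;
    f.fieldInt_ = value;
    return f;
}
```

The compiler checks none of the pairing between `tag_` and the active member, which is the gap the type-safe design is meant to close.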

2. Type-Safe Discriminated Unions

Example 1 shows a C-style discriminated union design. The following steps eliminate the weaknesses of that design.

* Introduce type safety. A tagged union provides no type safety. A type-safety mechanism is needed that does not sacrifice performance.
* Generalize the set of possible types. To make the discriminated union more general, the types it holds must not be hard-coded inside the union; the user must be able to configure them from outside.
* Support user-defined types just like built-in types.

There is a problem of detail with the fieldString_ member in Example 1. Typically, if tag_ indicates the string type, DatabaseField's destructor calls delete[] on it. Such a practice violates the generalization principle. (Translator's note: whether to call delete, delete[], or nothing at all depends entirely on the type of object stored in the union, which is why it violates the generalization principle.) The best way to store string fields is with a well-designed user-defined type such as std::string. However, C++ unions cannot store types that have constructors and destructors. A general implementation of discriminated unions must seamlessly support both kinds of types.

One possible way to specify the supported types of a general discriminated union is typelists. Typelists [7, 8, 9, 3, in chronological order] allow collections of types to be manipulated much like values. This article uses the Loki::Typelist class template defined in [3]. The skeleton of the Variant class to be implemented looks like this:

    template <class TList> class Variant { ... };

The user can define a Variant that holds a specified set of types by instantiating it with a typelist:

    // Example 2: a discriminated union implemented with typelists
    typedef Variant<TYPELIST_4(int, double, std::string, Date)> DatabaseField;

The Variant type must provide at least the following:

* Value semantics: default constructor, copy constructor, assignment operator, and a destructor that does not leak resources.
* Constructors that accept any type in the typelist.
* Assignment operators that accept any type in the typelist.
* Type-safe access to the stored value as any type contained in the typelist.
* A function that returns the type currently stored in the Variant.
* Efficient swapping of the values of two Variant objects, to support idioms and some STL algorithms.
* An efficient way to change the type of the Variant object when a type conversion exists, for example from int to double.

These requirements call for flexibility and consistency in Variant's operations.

3. Storage

A design could provide all of the features above, and more, at the cost of unnecessary free-store allocation. The implementation described in this article instead stores the object, whatever its type, directly inside the Variant object. To achieve this, Variant needs a compile-time algorithm that computes the maximum size of the types in a typelist. The algorithm is as follows:

    template <class TList> struct MaxSize;

    template <> struct MaxSize<Loki::NullType>
    {
        enum { result = 0 };
    };

    template <class Head, class Tail>
    struct MaxSize< Loki::Typelist<Head, Tail> >
    {
    private:
        enum { tailResult = size_t(MaxSize<Tail>::result) };
    public:
        enum { result = sizeof(Head) > tailResult ?
            sizeof(Head) : size_t(tailResult) };
    };

For any typelist TList, MaxSize<TList>::result yields the largest size of all the types contained in TList. Using MaxSize, Variant stores its data as follows (alignment issues are addressed in the next section):

    template <class TList>
    class Variant
    {
        enum { size = MaxSize<TList>::result };
        unsigned char buffer_[size];
        ...
    };

In addition, Variant must store the discriminator, that is, the tag that identifies the type of the object currently held in the raw buffer. A simple approach is to use an integer tag. For flexibility and speed, Variant instead stores a pointer to a table of static function pointers, an emulation of a virtual function table. Section 5 describes the implementation of the type tag.
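The MaxSize recursion can be tried in isolation. The sketch below is a minimal version that assumes stand-in definitions of `NullType` and `Typelist` in place of the Loki ones, and uses `static const` members instead of enums. `Big`, `Types`, and `RawStorage` are hypothetical names for illustration only.

```cpp
#include <cstddef>

// Stand-ins for Loki::NullType and Loki::Typelist (assumed shape).
struct NullType {};
template <class H, class T>
struct Typelist { typedef H Head; typedef T Tail; };

// Primary template: only the specializations below are defined.
template <class TList> struct MaxSize;

// The empty typelist has maximum size 0.
template <> struct MaxSize<NullType>
{
    static const std::size_t result = 0;
};

// The maximum of a non-empty list is the larger of sizeof(Head)
// and the maximum of the tail.
template <class Head, class Tail>
struct MaxSize< Typelist<Head, Tail> >
{
    static const std::size_t tailResult = MaxSize<Tail>::result;
    static const std::size_t result =
        sizeof(Head) > tailResult ? sizeof(Head) : tailResult;
};

struct Big { char bytes[100]; };
typedef Typelist<char, Typelist<Big, Typelist<int, NullType> > > Types;

// Raw storage large enough for any type in Types
// (alignment is not handled yet at this stage).
struct RawStorage
{
    unsigned char buffer_[MaxSize<Types>::result];
};
```

Because the whole computation happens at compile time, the buffer size is an ordinary constant and costs nothing at run time.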

4. Computing Alignment

There is no fully portable way to solve the alignment problem in C++, because the language lacks a suitable primitive (some compilers offer an __alignof__ keyword as an extension). However, alignment can be computed correctly on many platforms with some effort and a few reasonable assumptions. The alignment problem can be stated as follows: given a typelist, return a POD (Plain Old Data) type that guarantees proper alignment for any type in the typelist.

To compute alignment, first consider a typelist called TypesOfAllAlignments. It includes types with a variety of alignments:

a) all basic types
b) pointers to all basic types
c) pointers to functions
d) pointers to member variables
e) pointers to member functions
f) a class with virtual functions

For each type mentioned in (a) through (e), a POD structure containing a single member of that type is also added. The added structures are needed because on some compilers a structure is aligned differently from a basic type, even when its only content is that basic type.

The AlignmentCalculator algorithm computes the alignment for a typelist named TList. The steps are:

* Assign TypesOfAllAlignments to Temp.
* Compute the maximum size of all the types in TList, as described in Section 3, and store the result in the compile-time constant maxSize.
* Remove from Temp all types whose size is greater than maxSize.

The result is a union type that contains one member for each type left in Temp. Since Temp contains only POD types, putting them in a union is not a problem. (TList itself cannot be used to build the union, because TList may contain user-defined types with constructors and destructors.)

A simpler way to obtain correct alignment is to define a single union containing every type in TypesOfAllAlignments. That approach obtains the maximum possible alignment at the cost of allocating additional memory.
What is interesting about AlignmentCalculator is that it obtains the alignment without additional memory overhead or loss of compatibility. With AlignmentCalculator, Variant achieves aligned storage as follows:

    template <class TList,
        class Align = typename AlignmentCalculator<TList>::Result>
    class Variant
    {
        enum { size = MaxSize<TList>::result };
        union
        {
            unsigned char buffer_[size];
            Align dummy_;
        };
        ...
    };

The alignment type is specified by a template parameter whose default value is computed. This allows Variant users with different alignment needs to specify the required alignment type without changing the Variant code. The storage structure defined above allows Variant to store any built-in or user-defined type properly, without any unnecessary free-store allocation.
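The filtering idea behind AlignmentCalculator can be sketched roughly as follows. This is not the article's actual implementation: `FilterBySize`, `UnionOf`, `Candidates`, `AlignmentFor`, and `ShortStorage` are hypothetical names, the candidate list is deliberately much shorter than a real TypesOfAllAlignments, and C++11 facilities (`std::conditional`, `alignof`) are assumed for brevity.

```cpp
#include <cstddef>
#include <type_traits>

// Stand-ins for Loki::NullType and Loki::Typelist (assumed shape).
struct NullType {};
template <class H, class T>
struct Typelist { typedef H Head; typedef T Tail; };

// Keep only the candidate types whose size does not exceed maxSize.
template <class TList, std::size_t maxSize> struct FilterBySize;

template <std::size_t maxSize>
struct FilterBySize<NullType, maxSize> { typedef NullType Result; };

template <class H, class T, std::size_t maxSize>
struct FilterBySize<Typelist<H, T>, maxSize>
{
    typedef typename FilterBySize<T, maxSize>::Result TailResult;
    typedef typename std::conditional<(sizeof(H) <= maxSize),
        Typelist<H, TailResult>, TailResult>::type Result;
};

// A union with one member per type in a typelist of POD types.
template <class TList> union UnionOf;
template <> union UnionOf<NullType> {};
template <class H, class T>
union UnionOf< Typelist<H, T> > { H head_; UnionOf<T> tail_; };

// Deliberately short candidate list, for illustration only.
typedef Typelist<char, Typelist<short, Typelist<int, Typelist<long,
        Typelist<double, Typelist<long double, Typelist<void*,
        NullType> > > > > > > Candidates;

// The alignment type for a buffer of maxSize bytes: a union of every
// candidate that is no larger than the buffer itself.
template <std::size_t maxSize>
struct AlignmentFor
{
    typedef UnionOf<typename FilterBySize<Candidates, maxSize>::Result> Result;
};

// Storage for a variant whose largest type is a short.
struct ShortStorage
{
    union
    {
        unsigned char buffer_[sizeof(short)];
        AlignmentFor<sizeof(short)>::Result dummy_;
    };
};
```

Because candidates larger than the buffer are filtered out before the union is built, the alignment member does not force the storage to grow to the size of the largest candidate.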

5. Type Tag

As mentioned earlier, Variant discriminates and operates on its stored type through a pointer to a table of functions. This is a general and efficient approach. The table structure is as follows:

    template <...>
    class Variant
    {
        struct VTable
        {
            const std::type_info& (*typeId_)();
            void (*destroy_)(const Variant&);
            void (*clone_)(const Variant&, Variant&);
            void (*cloneTypeOnly_)(const Variant&, Variant&);
            void (*swap_)(void* lhs, void* rhs);
            bool (*changeType_[Loki::TL::Length<TList>::value])(Variant&);
            ...
        };
        VTable* vptr_;
        ...
    };

The names above hint at the similarity between this structure and the so-called "vtable" and "vptr" found in typical implementations of C++ virtual functions. The function pointers are:

* typeId_ points to a function that returns the std::type_info of the currently stored object.
* destroy_ points to a function that calls the destructor of the stored object.
* clone_ points to a function that copies the Variant object.
* cloneTypeOnly_ points to a function that copies an empty Variant object (the type is copied, not the value).
* swap_ swaps two Variant objects.
* changeType_ is a fixed-size array of function pointers; these functions change the type of the Variant object to one of the types in the typelist with which the Variant was instantiated.

For each type, a VTable needs function pointers with which to initialize its members. These are provided by the VTableImpl class template, also defined inside Variant:

    template <...>
    class Variant
    {
        template <class T>
        struct VTableImpl
        {
            static const std::type_info& typeId()
            {
                return typeid(T);
            }
            static void destroy(const Variant& var)
            {
                const T& data =
                    *reinterpret_cast<const T*>(&var.buffer_[0]);
                data.~T();
            }
            ...
            static VTable vtbl_;
        };
        ...
    };

For each type T, VTableImpl<T> has static member functions corresponding to all the VTable members, plus a static VTable member variable named vtbl_. When a Variant object is initialized, its vptr_ is set to point to the appropriate VTableImpl<T>::vtbl_; which instantiation of VTableImpl is used depends on the type T used to initialize the Variant.

The constructor uses the STATIC_CHECK macro defined in Loki, which checks a Boolean constant at compile time; if the constant is false, it produces a compile-time error.

    template <...>
    class Variant
    {
        ...
    public:
        template <class T>
        Variant(const T& val)
        {
            STATIC_CHECK((Loki::TL::IndexOf<TList, T>::value >= 0),
                Invalid_Type_Used_As_Initializer);
            new(&buffer_[0]) T(val);
            vptr_ = &VTableImpl<T>::vtbl_;
        }
    };

After the type of the argument has been validated, a T object is constructed in the space of buffer_, and the vptr_ member is set to point to the VTableImpl instance that handles objects of type T. Because buffer_ and vptr_ are initialized together, the functions of VTableImpl<T> can safely assume that buffer_ contains a T object and can access it through reinterpret_cast. Once vptr_ is correctly initialized, Variant can carry out all its operations through the function pointers stored in vtbl_.

Now consider the initialization of VTableImpl<T>::vtbl_. Each VTableImpl<T> must pass T to the VTable constructor. This can be done with Loki's simple Type2Type class template, which carries type information without the cost of creating a value of that type. The VTable constructor is a template that accepts a dummy parameter of type Loki::Type2Type<T>. It then initializes each function pointer with the address of the corresponding function defined in VTableImpl<T>:

    template <...>
    class Variant
    {
        ...
        struct VTable
        {
            template <class T>
            VTable(Loki::Type2Type<T> tt)
            {
                typeId_ = &VTableImpl<T>::typeId;
                destroy_ = &VTableImpl<T>::destroy;
                clone_ = &VTableImpl<T>::clone;
                cloneTypeOnly_ = &VTableImpl<T>::cloneTypeOnly;
                swap_ = &VTableImpl<T>::swap;
                init(changeType_, tt, TList()); // see below
            }
            ...
        };
    };
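To make the vtable emulation concrete, here is a minimal, self-contained sketch with only two table entries. `MiniVariant` is a hypothetical class, not the article's Variant: it uses a fixed 64-byte buffer instead of MaxSize, C++11 `alignas` instead of AlignmentCalculator, accepts any type that fits, and is non-copyable to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>
#include <typeinfo>

class MiniVariant
{
    // Hand-rolled "virtual function table": one pointer per operation.
    struct VTable
    {
        const std::type_info& (*typeId_)();
        void (*destroy_)(MiniVariant&);
    };

    // One set of static functions, and one static table, per stored type.
    template <class T>
    struct VTableImpl
    {
        static const std::type_info& typeId() { return typeid(T); }
        static void destroy(MiniVariant& v)
        {
            reinterpret_cast<T*>(v.buffer_)->~T();
        }
        static const VTable vtbl_;
    };

    alignas(std::max_align_t) unsigned char buffer_[64];
    const VTable* vptr_;

public:
    // Construct a T in place and point vptr_ at the matching table.
    template <class T>
    explicit MiniVariant(const T& val) : vptr_(&VTableImpl<T>::vtbl_)
    {
        static_assert(sizeof(T) <= sizeof(buffer_), "type too large");
        new (buffer_) T(val);
    }
    MiniVariant(const MiniVariant&) = delete;
    MiniVariant& operator=(const MiniVariant&) = delete;

    // All type-specific work goes through the function pointers.
    ~MiniVariant() { vptr_->destroy_(*this); }
    const std::type_info& typeId() const { return vptr_->typeId_(); }

    template <class T>
    T* getPtr()
    {
        return typeId() == typeid(T)
            ? reinterpret_cast<T*>(buffer_) : nullptr;
    }
};

// Definition of the per-type static table.
template <class T>
const MiniVariant::VTable MiniVariant::VTableImpl<T>::vtbl_ = {
    &MiniVariant::VTableImpl<T>::typeId,
    &MiniVariant::VTableImpl<T>::destroy
};
```

The object itself carries no virtual functions; the only per-object overhead beyond the buffer is the single `vptr_` pointer.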

6. Conversions

One VTable member deserves special attention:

    bool (*changeType_[Loki::TL::Length<TList>::value])(Variant&);

changeType_ is an array (of length Loki::TL::Length<TList>::value) of pointers to functions that take a Variant& parameter and return bool. The array is used to change the type held inside a Variant object. The semantics of the Nth function in the array are: if a conversion exists from the currently stored type to the Nth type in the typelist, the function converts the Variant object to that Nth type and returns true; otherwise, it returns false. Each type in the typelist has a slot in the array, so changing from any type to any other type costs O(1), independent of the number of types in the typelist.

Ideally, changeType_ and all the other members of the static VTable objects would be initialized at compile time. This is not actually possible, because in C++ the static array initialization syntax does not allow the initializer to be expanded by generic code at compile time. Whether a conversion exists is detected with the Loki::Conversion class template, described in [3]. Depending on the result, initialization selects either a function that performs the conversion and returns true, or a function that simply returns false. Here is the code:

    template <...>
    class Variant
    {
        ...
        struct VTable
        {
            ...
            template <class T, class TList>
            void init(bool (**pChangeType)(Variant&),
                Loki::Type2Type<T> tt, TList)
            {
                typedef typename TList::Head Head;
                typedef typename TList::Tail Tail;
                enum { canConvert =
                    Loki::Conversion<T, Head>::exists != 0 };
                *pChangeType = &VTableImpl<T>::template
                    Converter<Head, canConvert>::convert;
                init(pChangeType + 1, tt, Tail());
            }
            template <class T>
            void init(bool (**)(Variant&),
                Loki::Type2Type<T>, Loki::NullType)
            {
                // Nothing to do: stop the recursion
            }
        };
    };

The init function template is recursive at compile time. Each call to init initializes one function pointer and then recurses on the tail to initialize the remaining pointers. The recursion is driven by init's last parameter, a typelist, which shrinks with each recursive call. Finally, the init overload that accepts a Loki::NullType object (the typelist terminator) ends the recursion. Converter is a simple class template that is specialized on whether the type conversion exists or not; it provides a static function convert that performs the conversion.
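The table-filling recursion can be illustrated outside of Variant. The sketch below is a simplified stand-in, not the article's code: it uses `std::is_convertible` where the article uses Loki::Conversion, works on raw `void*` storage rather than a Variant, and the names `ConvertFn`, `Converter`, `initTable`, `Types`, and `ConversionTable` are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>

// Stand-ins for Loki::NullType, Loki::Typelist, Loki::TL::Length.
struct NullType {};
template <class H, class T>
struct Typelist { typedef H Head; typedef T Tail; };

template <class TList> struct Length;
template <> struct Length<NullType> { enum { value = 0 }; };
template <class H, class T>
struct Length< Typelist<H, T> > { enum { value = 1 + Length<T>::value }; };

typedef bool (*ConvertFn)(const void* from, void* to);

// Selected when From converts implicitly to To: convert and report success.
template <class From, class To,
          bool canConvert = std::is_convertible<From, To>::value>
struct Converter
{
    static bool convert(const void* from, void* to)
    {
        *static_cast<To*>(to) = *static_cast<const From*>(from);
        return true;
    }
};

// Selected when no conversion exists: report failure, touch nothing.
template <class From, class To>
struct Converter<From, To, false>
{
    static bool convert(const void*, void*) { return false; }
};

// Terminator overload: the typelist is exhausted, stop the recursion.
template <class From>
void initTable(ConvertFn*, NullType) {}

// Fill one slot for the head type, then recurse on the tail.
template <class From, class H, class T>
void initTable(ConvertFn* p, Typelist<H, T>)
{
    *p = &Converter<From, H>::convert;
    initTable<From>(p + 1, T());
}

typedef Typelist<int, Typelist<double, Typelist<std::string, NullType> > > Types;

// entries[N] converts a From object to the Nth type of Types, if possible.
template <class From>
struct ConversionTable
{
    ConvertFn entries[Length<Types>::value];
    ConversionTable() { initTable<From>(entries, Types()); }
};
```

Looking up a conversion is a single array index followed by an indirect call, which is where the constant-time cost comes from.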

7. Decoherence and Visitation

In this article, "decoherence" means obtaining the actual type of the object stored in a Variant. The term is borrowed from quantum theory, where it refers to the transition of a quantum system to a classical physical system. Similarly, for a Variant object, "decoherence" means finding the "classical" C++ type hidden by the Variant.

The simplest decoherence of a Variant is performed by the getPtr function template. If the Variant holds an int, Variant::getPtr<int>() returns a pointer to that int object; otherwise, it returns 0.

    template <...>
    class Variant
    {
        ...
    public:
        const std::type_info& typeId() const
        {
            return (vptr_->typeId_)();
        }
        template <class T>
        T* getPtr()
        {
            return typeId() == typeid(T)
                ? reinterpret_cast<T*>(&buffer_[0])
                : 0;
        }
    };

There is also a const version of getPtr. For convenience, two similar function templates named get are also provided; they return references rather than pointers. If the requested type does not match the type stored in the Variant, they throw std::runtime_error. In addition, if a non-const Variant stores a T and a const T is requested, the request fails; this behavior is a deliberate design decision. If you know which types to expect, getPtr and get give access to the object stored in a Variant:

    typedef Variant<TYPELIST_4(int, double, std::string, Date)> DatabaseField;
    DatabaseField fld;
    ...
    if (int* pInt = fld.getPtr<int>())
    {
        ... pInt points to the integer stored ...
    }
    else if (double* pDbl = fld.getPtr<double>())
    {
        ... pDbl points to the double stored ...
    }
    else if (std::string* pStr = fld.getPtr<std::string>())
    {
        ... pStr points to the string stored ...
    }
    else if (Date* pDate = fld.getPtr<Date>())
    {
        ... pDate points to the date stored ...
    }
    else
    {
        assert(false); // should not be here
    }

The problem with this kind of manual decoherence is that the code is fragile in the face of change. If a type is later added to the Variant, the compiler cannot ensure that every if/else chain handles it; the omission shows up only at run time, when the assert fires.

A better decoherence mechanism, one that the compiler can check, is to apply the Visitor pattern [5] to Variant. There is a strong connection between discriminated unions and the Visitor pattern, because Visitor provides the ability to perform different and unrelated operations on a set of types. Each operation becomes a concrete class derived from an abstract base class, and each type-specific part of an operation becomes a virtual member function. The dispatch mechanism ensures that the correct member function is called, based on the actual type being visited. Consider how Visitor can be used with Variant.

For a given Variant instantiation, the Visitor pattern prescribes an abstract class with one visit member function for each type the Variant can hold. For example, for the DatabaseField type of Example 2, the following interface must be defined:

    struct DatabaseFieldVisitor
    {
        virtual void visit(int&) = 0;
        virtual void visit(double&) = 0;
        virtual void visit(std::string&) = 0;
        virtual void visit(Date&) = 0;
    };

Each type in DatabaseField corresponds to a pure virtual function in DatabaseFieldVisitor. This means that each Variant instantiation must define a dedicated Visitor base class. However, this process can be automated, as described in detail in [3]. With Loki's Visitor, DatabaseFieldVisitor becomes:

    typedef Loki::CyclicVisitor<void,
        TYPELIST_4(int, double, std::string, Date)> DatabaseFieldVisitor;

Because Loki can define generic Visitor interfaces from typelists, the typedef above can be added to Variant itself, so that client code can use it:

    template <...>
    class Variant
    {
        ...
    public:
        typedef Loki::CyclicVisitor<void, TList> StrictVisitor;
    };

To use the Visitor pattern provided by Loki, Variant must define an accept function. This function takes a single parameter, a StrictVisitor, and forwards it to the stored type. This adds a new function pointer to VTable, which is then called from Variant's accept member function:

    template <...>
    class Variant
    {
        ...
    public:
        typedef Loki::CyclicVisitor<void, TList> StrictVisitor;
        void accept(StrictVisitor& visitor)
        {
            (vptr_->accept_)(*this, visitor);
        }
    };

The function bound to accept_ simply calls the correct visit overload on the StrictVisitor object. The net result is that a visitor is guaranteed to handle every type the Variant can hold. A client class derived from Variant<...>::StrictVisitor must implement all the visit member functions; otherwise, a compile-time error results.

However, some users may want to relax this restriction without giving up the benefits of visitation. For example, imagine a database function that performs complex numeric calculations on specific columns of a query result. The function is interested only in double and DatabaseNull (a placeholder type indicating a null field in a relational database). Being forced to write do-nothing handlers for all the other types, just to satisfy the requirement, is unpleasant; in this case a more sophisticated approach is needed. The Acyclic Visitor described in [6] addresses this need. Acyclic Visitor is a more flexible pattern that allows visitor objects to visit any subset of a given set of types. Acyclic Visitor is also covered in [3], which provides a generic implementation. Using that implementation, acyclic visitation for Variant is easy to add:

    template <...>
    class Variant
    {
        ...
    public:
        typedef Loki::BaseVisitor NonStrictVisitor;
        bool accept(NonStrictVisitor& visitor)
        {
            return (vptr_->acceptNonStrict_)(*this, visitor);
        }
    };

The two visitation methods coexist without affecting each other. The non-strict version of accept returns true if the stored type was actually visited. Variant also adds two more accept overloads: the const versions of the functions described above. Without these two additions, a const Variant object could not be visited.
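Strict visitation through a function-pointer table can be sketched as follows. This is a simplified illustration under stated assumptions, not Loki's CyclicVisitor: `FieldVisitor` is a hand-written interface for the fixed set {int, double, std::string}, and `Field` and `Describer` are hypothetical names. `Field` sizes its buffer for std::string and is non-copyable for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>

// Hand-written strict visitor interface: one pure virtual per type.
struct FieldVisitor
{
    virtual ~FieldVisitor() {}
    virtual void visit(int&) = 0;
    virtual void visit(double&) = 0;
    virtual void visit(std::string&) = 0;
};

class Field
{
    struct VTable
    {
        void (*destroy_)(Field&);
        void (*accept_)(Field&, FieldVisitor&);
    };

    template <class T>
    struct VTableImpl
    {
        static void destroy(Field& f) { f.as<T>().~T(); }
        // Overload resolution picks the visit() matching the stored type.
        static void accept(Field& f, FieldVisitor& v) { v.visit(f.as<T>()); }
        static const VTable vtbl_;
    };

    template <class T>
    T& as() { return *reinterpret_cast<T*>(buffer_); }

    alignas(std::max_align_t) unsigned char buffer_[sizeof(std::string)];
    const VTable* vptr_;

public:
    template <class T>
    explicit Field(const T& val) : vptr_(&VTableImpl<T>::vtbl_)
    {
        static_assert(sizeof(T) <= sizeof(buffer_), "type too large");
        new (buffer_) T(val);
    }
    Field(const Field&) = delete;
    Field& operator=(const Field&) = delete;
    ~Field() { vptr_->destroy_(*this); }

    void accept(FieldVisitor& visitor) { vptr_->accept_(*this, visitor); }
};

template <class T>
const Field::VTable Field::VTableImpl<T>::vtbl_ = {
    &Field::VTableImpl<T>::destroy,
    &Field::VTableImpl<T>::accept
};

// A concrete visitor must override every visit(); leaving one out keeps
// the class abstract, so it cannot be instantiated.
struct Describer : FieldVisitor
{
    std::string text;
    void visit(int&) { text = "int"; }
    void visit(double&) { text = "double"; }
    void visit(std::string& s) { text = "string:" + s; }
};
```

Compared with a chain of getPtr tests, adding a type to the interface here makes every incomplete visitor fail to compile rather than fail at run time.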

8. Variant-to-Variant Conversions

With visitation in place, some work that would otherwise be complex becomes simple. Consider the following example of conversion between Variants:

    typedef Variant<TYPELIST_4(int, double, std::string, Date)> DatabaseField;
    typedef Variant<...> FilteredData;
    ...
    DatabaseField dbField;
    ...
    FilteredData myData(dbField);

The goal is to initialize an instance of one Variant type from an instance of another. The conversion is allowed if the source Variant (here DatabaseField) holds a type that can be converted to a type of the target Variant (here FilteredData). For example, if dbField contains a std::string, myData should be initialized with that string. If dbField contains a Date, the initialization throws an exception (assuming that Date cannot be converted to a string or an integer). Finally, if dbField contains an int, an ambiguity exception is thrown, because int converts to both unsigned int and unsigned short.

In summary, the constructor that converts one Variant instantiation to another performs the following algorithm:

* If the type of the object stored in the source Variant is in the typelist of the target Variant, the target Variant stores a copy of the source's value, with the same actual type.
* If the type stored in the source Variant is implicitly and unambiguously convertible to a type in the target Variant's typelist, that conversion is performed.
* In all other cases, an exception is thrown.

Done naively, this could be very slow; visitation solves the problem. The target Variant's constructor visits the source Variant instance with a ConvertingVisitor. In its visit member functions, ConvertingVisitor performs, at compile time, the search and dispatch needed to initialize the target Variant. It throws an exception when the conversion is ambiguous or impossible. ConvertingVisitor is generated with Loki::GenLinearHierarchy [3], a facility that generates a complete class hierarchy. In this case, the class hierarchy is needed to implement all the visit overloads. Once again, Loki::Conversion is used to determine whether one type can be converted to another.

9. Variants Supporting Unbounded Types

Let us review Variant's templated constructor:

    template <...>
    class Variant
    {
        ...
    public:
        template <class T>
        Variant(const T& val)
        {
            STATIC_CHECK((Loki::TL::IndexOf<TList, T>::value >= 0),
                Invalid_Type_Used_As_Initializer);
            new(&buffer_[0]) T(val);
            vptr_ = &VTableImpl<T>::vtbl_;
        }
    };

The STATIC_CHECK in the constructor ensures that T belongs to the set of types accepted by the Variant. Consequently, every type the Variant accepts fits in buffer_ and matches its alignment. This policy opens the door to many interesting extensions.
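The membership test that the constructor relies on can be sketched in a few lines. The block below is a minimal stand-in for Loki::TL::IndexOf, with assumed local definitions of `NullType` and `Typelist`. `requireMember` and `FieldTypes` are hypothetical names, and C++11 `static_assert` takes the place of the STATIC_CHECK macro.

```cpp
// Stand-ins for Loki::NullType and Loki::Typelist (assumed shape).
struct NullType {};
template <class H, class T>
struct Typelist { typedef H Head; typedef T Tail; };

// IndexOf<TList, T>::value is the position of T in TList, or -1.
template <class TList, class T> struct IndexOf;

// T is not in the empty list.
template <class T>
struct IndexOf<NullType, T> { enum { value = -1 }; };

// T is the head of the list.
template <class T, class Tail>
struct IndexOf<Typelist<T, Tail>, T> { enum { value = 0 }; };

// Otherwise search the tail and shift the index by one if found.
template <class Head, class Tail, class T>
struct IndexOf<Typelist<Head, Tail>, T>
{
private:
    enum { temp = IndexOf<Tail, T>::value };
public:
    enum { value = (int(temp) == -1) ? -1 : 1 + int(temp) };
};

typedef Typelist<int, Typelist<double, NullType> > FieldTypes;

// Compile-time gate in the spirit of the constructor's check:
// instantiating it with a type outside TList is a compile error.
template <class TList, class T>
void requireMember()
{
    static_assert(int(IndexOf<TList, T>::value) >= 0,
        "Invalid_Type_Used_As_Initializer");
}
```

Calling `requireMember<FieldTypes, double>()` compiles, while `requireMember<FieldTypes, char>()` is rejected by the compiler.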

Types need no longer be limited to a predefined set: any type can be stored in a Variant as long as it satisfies the size and alignment constraints, including types that were not considered when the Variant was defined. If the Variant's space is large enough to hold a pointer or a smart pointer, the Variant becomes a packet that can hold data of any type. To support unbounded types, only one more template parameter needs to be added to Variant: a flag indicating whether the Variant holds bounded or unbounded types. The class definition and constructor become:

    template <class TList, bool Unbounded,
        class Align = typename AlignmentCalculator<TList>::Result>
    class Variant
    {
        ...
    public:
        template <class T>
        Variant(const T& val)
        {
            STATIC_CHECK(
                Unbounded && sizeof(T) <= sizeof(buffer_) ||
                    (Loki::TL::IndexOf<TList, T>::value >= 0),
                Invalid_Type_Used_As_Initializer);
            new(&buffer_[0]) T(val);
            vptr_ = &VTableImpl<T>::vtbl_;
        }
    };

This approach combines bounded and unbounded discriminated unions: the types in the typelist can be used, and other types not in the typelist can be stored as well. If you are not interested in any particular types, but want the storage overhead to stay below a fixed value, you can let TList contain only char[n]. For example, the following typedef defines a Variant that can store objects of at most 64 bytes:

    typedef Variant<TYPELIST_1(char[64]), true> Any;

Non-strict visitation, described in Section 7, provides a true and elegant means of decoherence for unbounded Variant instances.

10. Conclusion

This article has presented a fully generic implementation of discriminated unions. The implementation offers a range of features that make it flexible enough for a wide variety of applications. Variant's essential features are: the ability to specify the set of types that can be stored; uniform support for built-in and user-defined types; high efficiency, obtained by using function pointers and avoiding the free store; a feature-rich decoherence mechanism; and a powerful type conversion system.

11. Acknowledgments

The author (Andrei Alexandrescu) thanks Thant Tesman for his thorough review.

When reposting, please cite the original address: https://www.9cbs.com/read-23047.html
