Generic<Programming>: Discriminated Unions (2)

zhaozj, 2021-02-16

Andrei Alexandrescu

You know the one about syntactic sugar (language features added to improve readability without extending what the language itself can do)? It causes cancer of the semicolon [1]. Okay, enough jokes; we have a lot of ground to cover today, so let's get started. This installment continues the implementation of discriminated unions in C++. Today we will finish the discussion of alignment and then write some actual code for the implementation of Variant. Before that, let's review the progress made in the previous installment of "Generic<Programming>". After putting together a wish list for an implementation of discriminated unions, we discussed what the storage model should be. After running into alignment issues, we concluded that a storage model that is properly aligned, efficient, and exception-friendly at the same time is:

union
{
    unsigned char buffer_[neededSize];
    Align dummy_;
};

Here neededSize is the size of the largest type in the union, and Align is a POD [3] type that guarantees proper alignment. Depending on how well Align is chosen, the discriminated union's storage can be optimal, wastefully large, or defective. Even with the best Align, the implementation above is still not 100 percent portable for all types. In theory, a compiler could conform fully to the Standard and still not work properly with this code, because the Standard does not guarantee that every user-defined type ultimately has the alignment of some POD type. Such a compiler, however, is more likely to exist in the imagination of a wicked language lawyer than as a realistic language implementation.

Implementing an Alignment Calculator

That said, several effective alignment computation algorithms can be devised. You can see a couple of them implemented in the Boost libraries [4]. One that works well for computing the alignment of a type T is:

1. Start with a collection of all primitive types.
2. Eliminate from that collection the types larger than T.
3. The resulting Align is a C union containing each of the types left in the collection after step 2.

The basic idea behind this algorithm is that any user-defined type T ultimately contains only primitive types, so T's alignment requirement is the same as that of one of those types. Bigger types have bigger alignment requirements; hence it is reasonable to infer that an upper bound on T's alignment is the alignment of its largest member.
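The three steps can be tried by hand on a small case. The sketch below is only an illustration under stated assumptions: Sample, AlignForSample, and the two helper functions are hypothetical names that do not appear in the article, the candidate set is cut down to three primitives, and the check relies on the C++11 alignof operator, which the article's code predates.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical user-defined type that contains only primitive members.
struct Sample
{
    char   tag;
    double value;
    int    count;
};

// Steps 1 to 3 applied by hand with a tiny candidate set: char, int, and
// double are all members of Sample, so none of them is larger than Sample.
union AlignForSample
{
    char   c;
    int    i;
    double d;
};

// True when the hand-built union is at least as strictly aligned as Sample.
inline bool UnionCoversSample()
{
    return sizeof(double) <= sizeof(Sample)
        && alignof(AlignForSample) >= alignof(Sample);
}

// Usage: the bound is expected to hold on mainstream ABIs without packing.
inline void DemoAlignmentBound()
{
    assert(UnionCoversSample());
}
```

The union can be more strictly aligned than strictly necessary, but for the purpose of storage that is the safe direction to err in.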

This algorithm might "overalign" the result. For example, if T is char[5], the alignment of T might be 1. Yet if sizeof(int) is 4, Align will end up with the alignment of an int (most likely 4). Erring on the large side of alignment is not that harmful in most cases (erring on the small side is disastrous). Erring on the large side of size, on the other hand, wastes space. Step 2 of the algorithm ensures that only types whose size is less than or equal to T's size are selected. To implement the algorithm for computing Align, recall from the previous installment of "Generic<Programming>" that we have two appropriate tools at our disposal: typelists and ConfigurableUnion. Typelists let us manipulate collections of types, which takes care of steps 1 and 2. ConfigurableUnion generates a C-style union from a typelist, which takes care of step 3. It is worth noting that, in addition to all primitive types, the initial collection of types should include some simple structs, each containing only one member of a primitive type. We generate these stub structures from a simple template:

template <typename U> struct Structify { U dummy_; };

The reason for introducing these little structures is simple. Some compilers impose a stricter alignment on a structure containing only an int than on the int alone. This way, the compiler writer ensures that all user-defined types have the same alignment requirements, and that architectural decision in turn makes things easier for various parts of the compiler. We thus start with a typelist containing all primitive types and all "structified" types:

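class Unknown;

typedef cons<
    char,
    short int,
    int,
    long int,
    float,
    double,
    long double,
    char*,
    short int*,
    int*,
    long int*,
    float*,
    double*,
    long double*,
    void*,
    Unknown (*)(Unknown),
    Unknown* Unknown::*,
    Unknown (Unknown::*)(Unknown),
    Structify<char>,
    Structify<short int>,
    Structify<int>,
    Structify<long int>,
    Structify<float>,
    Structify<double>,
    Structify<long double>,
    Structify<char*>,
    Structify<short int*>,
    Structify<int*>,
    Structify<long int*>,
    Structify<float*>,
    Structify<double*>,
    Structify<long double*>,
    Structify<void*>,
    Structify<Unknown (*)(Unknown)>,
    Structify<Unknown* Unknown::*>,
    Structify<Unknown (Unknown::*)(Unknown)>
    >::type
    TypesOfAllAlignments;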

Okay, the primitive types are there and the structified types are there too; that much is predictable. But a few lines of the typelist above look as if someone did a little dance on the keyboard. The incriminating lines are:

Unknown (*)(Unknown)
Unknown* Unknown::*
Unknown (Unknown::*)(Unknown)

Their structified versions also appear at the bottom of the typelist. What are these things, and where did they come from? To make them look more familiar, let's give them names:

Unknown (*T1)(Unknown);
Unknown* Unknown::* T2;
Unknown (Unknown::* T3)(Unknown);

Aha! If you put typedef in front of each of the lines above, they reveal themselves in all the glory of the C declaration syntax. T1 is a pointer to a function that takes an Unknown and returns an Unknown; T2 is a pointer to a member of Unknown, where the member has type Unknown*; and T3 is a pointer to a member function of Unknown that takes an Unknown as its argument and returns an Unknown. The trick here is that the cunningly named Unknown is never actually defined. The compiler must therefore assume that Unknown is defined elsewhere and has the worst possible alignment requirements. (Otherwise, the compiler could optimize Unknown's layout. Optimization here works against generality.) Okay, here comes the fun part. Let's eliminate from TypesOfAllAlignments all the types that are larger than a given size, by writing the following template:

template <class TList, size_t size> struct ComputeAlignBound;

First we handle the limit case that ends the recursion, as any well-behaved recursive algorithm should:

template <size_t size>
struct ComputeAlignBound<null_typelist, size>
{
    typedef null_typelist Result;
};

Then we handle the general case:

template <class Head, class Tail, size_t size>
struct ComputeAlignBound<typelist<Head, Tail>, size>
{
    typedef typename ComputeAlignBound<Tail, size>::Result TailResult;
    typedef typename select<
        sizeof(Head) <= size,
        typelist<Head, TailResult>,
        TailResult>::result
        Result;
};

First, ComputeAlignBound applies itself recursively to the tail of the typelist and stores the outcome in TailResult. Then, if the size of Head is less than or equal to size, the result is a typelist formed from Head and TailResult. Otherwise, the result is TailResult alone, and Head is no longer included in the typelist. This type selection is performed by a little template called select. To save column space, I have to refer you to [5] for the details. It is a very useful little template, and if you are interested in generic programming in C++, you owe it to yourself to look it up. To put it all together, we need to tie up some loose ends. Let's recap what we have and what we need. We have the MaxSize template, which calculates the maximum size of all the types in a typelist. We have ConfigurableUnion, which builds a union out of a typelist. Finally, we have ComputeAlignBound, which yields the types whose alignment requirements are the same as, or more stringent than, those of any type in a typelist. Here is how to combine the services of these three templates:

template <typename TList>
class AlignedPOD
{
    enum { maxSize = MaxSize<TList>::result };
    typedef typename ComputeAlignBound<TypesOfAllAlignments, maxSize>::Result AlignTypes;
public:
    typedef ConfigurableUnion<AlignTypes> Result;
};

You can complete the encapsulation by putting ComputeAlignBound, Unknown, Structify, and ConfigurableUnion in a private namespace, so that these details are properly hidden.

Implementing Discriminated Unions: The Fake vtable Idiom

Let's get back to discriminated unions. At this point, half of this article has been spent on computing alignment. I hope the effort was not wasted: alignment is important in a host of low-level applications, and it also helps in the efficient implementation of higher-level abstractions. Now that we have storage for the objects in the union, we need the discriminator, the tag stored in Variant that tells what type the object inside really is. The discriminator must make it possible to perform certain type-specific operations on the contents of the union, such as type identification and type-safe access to the data. We have many design options. The simplest is to store an integral discriminator:

template <class TList, class Align = typename AlignedPOD<TList>::Result>
class Variant
{
    union
    {
        char buffer_[size];
        Align dummy_;
    };
    int discriminator_;
public:
    ...
};

The disadvantage of this solution is that integers are not very "smart". To do different things depending on the discriminator, there is no escaping a switch statement, and switch means coupling. Maybe we could use the int as an index into some table, but then why not directly use pointers into that table? (We will investigate that solution below.) The second solution is to use a polymorphic proxy object:

template <class TList, class Align = typename AlignedPOD<TList>::Result>
class Variant
{
    union
    {
        char buffer_[size];
        Align dummy_;
    };
    struct ImplBase
    {
        virtual const std::type_info& TypeId() = 0;
        virtual Variant Clone(const Variant&) = 0;
        virtual void Destroy(Variant&) = 0;
        ...
    };
    ImplBase* pImpl_;
public:
    ...
};

The idea here is to perform the various operations on the same data through a pointer to a polymorphic object. Depending on what data is actually stored in buffer_, the pointer points to a different concrete ImplBase, and that polymorphic object performs the type-specific operations. This approach is indeed very clean, except for one thing: efficiency. When a function such as Destroy is invoked, access goes through the following steps:

* Access the Variant object.
* Dereference pImpl_.
* Dereference pImpl_'s virtual function table, the so-called "vtable" [6].
* Index into the vtable to find the correct function.

Indexing into the vtable with a constant and accessing a field of an object (which ultimately is indexing too) are not a problem. The problem is dereferencing pointers: there are two levels of indirection between issuing the call and reaching the function. One level of indirection ought to be enough to dispatch a call to the handler for the correct type, so how can we achieve that?

(A confession: if you are wondering whether I forgot the two rules of optimization, (1) don't do it and (2) don't do it yet, I do remember them. So why optimize Variant? The answer is simple. This is not application code; this is a library, and moreover one that is supposed to emulate, and rival, a language feature. The more streamlined Variant is, the more you can rely on it for heavy-duty work; the sloppier its implementation, the more it becomes a toy that you would not use in a real program.)

A good approach is to emulate what the compiler does: fake a vtable and ensure that there is only one level of indirection. This leads to the following code:

template <class TList, class Align = typename AlignedPOD<TList>::Result>
class Variant
{
    union
    {
        char buffer_[size];
        Align dummy_;
    };
    struct VTable
    {
        const std::type_info& (*typeId_)();
        void (*destroy_)(const Variant&);
        void (*clone_)(const Variant&, Variant&);
    };
    VTable* vptr_;
public:
    ...
};

The VTable structure contains pointers to the various functions that access the Variant object. (Pay careful attention to the syntax used to define destroy_ and clone_: they are pointers to functions, stored inside VTable like ordinary data members. The C declaration syntax strikes again.) At the cost of some more implementation effort, the fake vtable offers only one level of indirection and a very flexible way of manipulating Variant objects. Let's now see how to initialize and use this fake vtable.
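The declaration syntax and the single level of indirection are easier to see in isolation. The following sketch is not part of Variant; CounterOps and the functions around it are hypothetical names chosen only to show a table of function pointers being called through one pointer.

```cpp
#include <cassert>

// Hypothetical table type; each member is data, namely a pointer to a function.
struct CounterOps
{
    int  (*get_)(const int&);
    void (*bump_)(int&);
};

int  GetValue(const int& n) { return n; }
void BumpValue(int& n)      { ++n; }

// One shared, statically allocated table, as a real vtable would be.
const CounterOps counterOps = { &GetValue, &BumpValue };

int BumpTwiceThroughTable(int start)
{
    const CounterOps* vptr = &counterOps;  // what each object would store
    (vptr->bump_)(start);                  // one pointer dereference, then the call
    (vptr->bump_)(start);
    return (vptr->get_)(start);
}

// Usage: starting from 40, two bumps give 42.
void DemoFakeTable()
{
    assert(BumpTwiceThroughTable(40) == 42);
}
```

Writing `void (*bump_)(int&);` without the parentheses around `*bump_` would instead declare a member function returning `void*`, which is why the syntax deserves the extra care.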

Initializing a Variant Object

Upon construction, Variant needs to initialize its vptr_ member. In addition, it needs to properly initialize each pointer inside the VTable. To this end, let's define a template VTableImpl<T>. This template defines a set of static functions that match the types of the function pointers in VTable.

... inside Variant ...
template <class T>
struct VTableImpl
{
    static const std::type_info& TypeId()
    {
        return typeid(T);
    }
    static void Destroy(const Variant& var)
    {
        const T& data = *reinterpret_cast<const T*>(&var.buffer_[0]);
        data.~T();
    }
    static void Clone(const Variant& src, Variant& dest)
    {
        new(&dest.buffer_[0]) T(*reinterpret_cast<const T*>(&src.buffer_[0]));
        dest.vptr_ = src.vptr_;
    }
};

Notice a couple of interesting aspects of VTableImpl's implementation:

* All functions are static, and those that operate on a Variant object take it by reference as their first parameter.
* Whenever access to the actual object of type T is needed, VTableImpl obtains it by casting &buffer_[0] to T*. In other words, all the functions in VTableImpl<T> assume that there is a T object seated inside buffer_.

Binding the function pointers to the functions in VTableImpl<T> and keeping them in sync with the contents of buffer_ is easy: that is the constructor's job.

... inside Variant ...
template <class T>
Variant(const T& val)
{
    new(&buffer_[0]) T(val);
    static VTable vtbl =
    {
        &VTableImpl<T>::TypeId,
        &VTableImpl<T>::Destroy,
        &VTableImpl<T>::Clone,
    };
    vptr_ = &vtbl;
}

Yes, it is that simple. We create a copy of the incoming val and seat it comfortably inside buffer_ by using the placement new operator. (We worked hard earlier to ensure that buffer_ is properly aligned to hold a T.) We then create a static VTable right on the spot and initialize it with pointers to the three static functions in VTableImpl<T>. All that is left is to initialize vptr_ with the address of the newly created VTable, and that is all there is to it. Well, not quite "all". Consider this:

typedef Variant<cons<int, double>::type> Number;
string s("Hello, World!");
Number weird(s);

The code above compiles, because Variant's constructor can be instantiated with string. However, Number was obviously meant to accept only int and double upon construction. So we need to ensure that Variant's templated constructor cannot be instantiated with any type other than those in Variant's typelist. Two helper tools come into play here. The first is typelist manipulation, which lets you compute at compile time the index of a type in a typelist; any typelist library is bound to offer that. The second is compile-time assertions, which render code non-compilable if a logical condition is not met. I do not want to repeat the details in this article; both Boost [7] and Loki [8] offer typelist manipulation and compile-time assertions, with only slight differences in syntax. If you want to implement the restriction from first principles, a simple tool will do:

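template <class TList, typename T>
struct EnsureOccurence
{
    typedef typename EnsureOccurence<typename TList::tail, T>::Result Result;
};

template <class T, class U>
struct EnsureOccurence<typelist<T, U>, T>
{
    typedef T Result; // can be T or any other type
};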

If you instantiate:

typedef EnsureOccurence<cons<int, double>::type, double>::Result Test;

then the first version of the template is instantiated, and it recursively instantiates the same template with the tail of the typelist. In this example, the second version, the specialization, then matches and ends the recursion. If, on the other hand, you write:

typedef EnsureOccurence<cons<int, double>::type, long int>::Result Test;

then the second version of EnsureOccurence never matches. The recursive instantiation exhausts the typelist, at which point the compiler is asked to instantiate null_typelist::tail, a type that does not exist, and compilation fails. With this in place, the revised templated constructor of Variant looks like this:

... inside Variant ...
template <class T>
Variant(const T& val)
{
    typedef typename EnsureOccurence<TList, T>::Result Test;
    new(&buffer_[0]) T(val);
    static VTable vtbl =
    {
        &VTableImpl<T>::TypeId,
        &VTableImpl<T>::Destroy,
        &VTableImpl<T>::Clone,
    };
    vptr_ = &vtbl;
}
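The guard can be exercised outside Variant as well. In the sketch below, the typelist is written out by hand, and IntOrDouble, Accept, and the sketch namespace are hypothetical names introduced only for illustration; std::is_same comes from the C++11 header <type_traits>.

```cpp
#include <cassert>
#include <type_traits>

namespace sketch {

struct null_typelist {};
template <class H, class T> struct typelist { typedef H head; typedef T tail; };

// Primary template: T is not at the head, so keep looking in the tail.
template <class TList, typename T>
struct EnsureOccurence
{
    typedef typename EnsureOccurence<typename TList::tail, T>::Result Result;
};

// Match: T is the head of the list, which ends the recursion.
template <class T, class U>
struct EnsureOccurence<typelist<T, U>, T>
{
    typedef T Result;
};

typedef typelist<int, typelist<double, null_typelist> > IntOrDouble;

// Hypothetical stand-in for Variant's constructor, guarded by the check.
template <class T>
bool Accept(const T&)
{
    typedef typename EnsureOccurence<IntOrDouble, T>::Result Test;  // compile-time guard
    return std::is_same<Test, T>::value;
}

// Accept(1) and Accept(2.5) compile. Accept(1L) would not compile, because
// the recursion would reach null_typelist::tail, which does not exist.
inline void DemoEnsureOccurence()
{
    assert(Accept(1));
    assert(Accept(2.5));
}

}  // namespace sketch
```

The error message a compiler gives for the rejected case mentions the missing tail rather than the real cause, which is one reason a compile-time assertion facility can be the friendlier choice.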

How about the default constructor? The conservative approach would be to disable it. However, a class without a proper default constructor is too "minimalistic"; for one thing, you cannot store Variants in STL containers. A sensible decision is to have the default constructor initialize the Variant with a default-constructed object of the first type in the typelist. A Variant cannot be left uninitialized, and defining the constructor this way lets the user put the "easiest to initialize" type first.

... inside Variant ...
Variant()
{
    typedef typename TList::head T;
    new(&buffer_[0]) T();
    static VTable vtbl =
    {
        &VTableImpl<T>::TypeId,
        &VTableImpl<T>::Destroy,
        &VTableImpl<T>::Clone,
    };
    vptr_ = &vtbl;
}

Eliminating the duplicated code between the templated constructor and the default constructor is left as an exercise for the reader.

Using the Fake vtable

Let's see how to expose the functionality in VTableImpl to Variant's users. For example, getting the type identifier of a Variant object looks like this:

template <class TList, class Align = typename AlignedPOD<TList>::Result>
class Variant
{
    ...
public:
    const std::type_info& TypeId() const
    {
        return (vptr_->typeId_)();
    }
    ...
};

In a single line, this code accesses the typeId_ function pointer through vptr_ and invokes the function it points to. A function that returns a typed pointer to the contents of the Variant object is now only six lines of code away:

... inside Variant ...
template <typename T>
T* GetPtr()
{
    return TypeId() == typeid(T)
        ? reinterpret_cast<T*>(&buffer_[0])
        : 0;
}

(In real code, you would also write the corresponding const version of this function.) The destructor is even simpler:

... inside Variant ...
~Variant()
{
    (vptr_->destroy_)(*this);
}

By now we have a nice little core for the Variant class template, one that constructs and destroys objects properly and gives access to the stored data in a type-safe manner. This little class is crying out for extensions, as we will see in the next installment of "Generic<Programming>".
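The following self-contained sketch pulls the core together in a form that compiles on its own. It departs from the article in ways that should be read as assumptions: MiniVariant is a hypothetical two-type stand-in for the typelist-based Variant, the C++11 alignas specifier replaces the AlignedPOD computation, Destroy takes a non-const reference, the Init helper is just one possible way to share code between constructors, and copy construction through clone_ is added so that the class is safe to copy.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>
#include <typeinfo>

template <class A, class B>
class MiniVariant
{
    static const std::size_t size = sizeof(A) > sizeof(B) ? sizeof(A) : sizeof(B);
    alignas(A) alignas(B) unsigned char buffer_[size];  // swapped in for AlignedPOD

    struct VTable
    {
        const std::type_info& (*typeId_)();
        void (*destroy_)(MiniVariant&);
        void (*clone_)(const MiniVariant&, MiniVariant&);
    };
    const VTable* vptr_;

    template <class T>
    struct VTableImpl
    {
        static const std::type_info& TypeId() { return typeid(T); }
        static void Destroy(MiniVariant& var)
        {
            reinterpret_cast<T*>(&var.buffer_[0])->~T();
        }
        static void Clone(const MiniVariant& src, MiniVariant& dest)
        {
            new (&dest.buffer_[0]) T(*reinterpret_cast<const T*>(&src.buffer_[0]));
            dest.vptr_ = src.vptr_;
        }
    };

    // Seats a copy of val in buffer_ and binds the matching static table.
    template <class T>
    void Init(const T& val)
    {
        new (&buffer_[0]) T(val);
        static const VTable vtbl = {
            &VTableImpl<T>::TypeId, &VTableImpl<T>::Destroy, &VTableImpl<T>::Clone
        };
        vptr_ = &vtbl;
    }

public:
    MiniVariant() { Init(A()); }             // first type is the default
    MiniVariant(const A& val) { Init(val); }
    MiniVariant(const B& val) { Init(val); }
    MiniVariant(const MiniVariant& other) { (other.vptr_->clone_)(other, *this); }
    MiniVariant& operator=(const MiniVariant&) = delete;
    ~MiniVariant() { (vptr_->destroy_)(*this); }

    const std::type_info& TypeId() const { return (vptr_->typeId_)(); }

    template <class T>
    T* GetPtr()
    {
        return TypeId() == typeid(T) ? reinterpret_cast<T*>(&buffer_[0]) : nullptr;
    }
};

inline void DemoMiniVariant()
{
    MiniVariant<int, std::string> n;                        // holds int()
    assert(n.TypeId() == typeid(int));
    MiniVariant<int, std::string> s(std::string("hello"));
    assert(s.GetPtr<int>() == nullptr);                     // wrong type: null
    MiniVariant<int, std::string> copy(s);                  // goes through clone_
    assert(*copy.GetPtr<std::string>() == "hello");
}
```

Restricting the constructors to A and B plays the role that EnsureOccurence plays in the typelist-based version: any other argument type is rejected at compile time.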
