Boost source note: boost :: Multi

xiaoxiao2021-03-06  47

Boost source note: boost :: Multi_Array

Xie Xuan / article

motivation

C is a free language that allows you to express your intentions, right? So we can get a one-dimensional array, you should also get a multi-dimensional array, right. Let's take an example:

INT * PONEDIMARR = New int [10]; // New 10 elements of one-dimensional array

PONEDIMARR [0] = 0; // Access

INT ** PTWODIMARR = New int [10] [20]; // Error!

PTWodimarr [0] [0] = 0; // Access

However, it is a pity that the behavior of the three four lines of code is not as if you think - although they look so "natural" from the grammar.

The problem here is that NEW INT [10] [20] is not an int ** type pointer, but the INT (*) [20] type pointer (this pointer is called a row pointer, "for it" 1 "It is equivalent to the size of a line (this example is 20), that is, let it point to the next line), so our code should be like this:

int (* ptwodimarr) [20] = new int [i] [20]; / / correct

PTWodimarr [1] [2] = 0; // Access

Note that PTWodimarr type --int (*) [20] is a very special type, it cannot be converted to int **, although the syntax form of the two index elements is "p [i] [j]" form However, the number of times the access memory is different, the semantics are different.

The most critical problem is: Creating a multi-dimensional array in this simple way, there is a biggest limit, except for the first dimension, other dimensions must be determined by the compile period. E.g:

INT (* pndimarr) [N2] [N3] [N4] = new int [n1] [n2] [N3] [N4];

Here, N2, N3, N4 must be compiled. Only N1 can be variables, this restriction is related to the indexing of the multi-dimensional array - whether the array of dimensions is linearly stored in memory, so:

PTWodimarr [i] [j] = 0;

The code generated by the compiler is similar to:

* ((int *) PTWodimarr i * 20 j) = 0;

20 is the line of two-dimensional arrays, the problem is that if the linearness of the two-dimensional array is also dynamic, the compiler will not generate code (what should 20 where 20 is located?). Based on this, C only allows the first dimension of multi-dimensional arrays to be dynamic.

Unfortunately, due to this limit, the multi-dimensional array in C has become a famous and unique thing in most cases. We can often see questions about multidimensional arrays on the forum. The core of this type of problem is: how to imitate a "fully dynamic" multi-dimensional array. It means that "complete dynamics" here is that all dimensions can be dynamic variables, not just the first dimension. The answers given in the forum are different, some are quite good, but either lack scalability (ie, the case of extending to N-dimensional), or in the form of an access element from the built-in multi-dimensional array Access form, either consume too much extra space. In the final analysis, we need a multi-dimensional array of multi-dimensionalities:

/ / Create an int type DIMS dimension group, DIM_SIZES is used to specify the size of each dimension, namely N1 * N2 * N3MULTI_ARRAY MA (DIM_SIZES [N1] [N2] [N3]);

MA [i] [j] [k] = value; / / to assign values ​​for elements of the J line k column column IPIN

MA [I] [j] = value; // Compile the wrong!

MA [I] = value; // Compile the wrong!

MA [i] [j] [k] [l] = value; // compile!

Such a MULTI_ARRAY can automatically manage memory, owns and build an interface consistent interface, and the size of each dimension can be variables - which is in line with our requirements. It seems that this Multi_Array is not difficult, but the truth is always unexpected. The following is an analysis of a Multi_Array implementation in Boost - you almost certainly find some unexpected (even amazing) local.

Multidimensional array implementation in BOOST - Boost :: Multi_Array

There is a powerful multiarray library for describing the multi-dimensional array in the Boost library. It implements a common, an interface consistent with the standard library, and has the same interface and behavior as a multi-dimensional array built in C . It is based on this versatile design, and the Multiarray library can have good compatibility between the standard library components and even user-defined generic components, and can work well.

In addition, Multiarray also provides an extremely useful feature such as changing size, reshaping, and view access to a multi-dimensional array, so that Multiarray is more than other ways to describe the multi-dimensional array (e.g.,: std :: vector >) More convenient, efficient.

Below we will deepen, go to the mystery of Boost :: Multi_Array - debug the sample program, tracking is one of the most effective means of analyzing the source code. We will start with the sample programs in the Multiarray documentation:

// Slight to the header file

Int main () {

// Create a three-dimensional array of sizes 3 × 4 × 2

#define DIMS 3 // Array is a few days

Typedef Boost :: Multi_Array array_type; // (1-1)

Array_Type A (Boost :: Extents [3] [4] [2]); // (1-2)

/ / Element is assigned to an array

A [1] [2] [0] = 120; // (1-3)

...

Return 0;

}

In the above code, the two template parameters of Boost :: Multi_Array represent the dimensions of the type and array of array elements, respectively. (1-2) is the constructing statement of the three-dimensional array object. Boost :: EXTENTS [3] [4] [2] is used to indicate the size of the array, the meaning here is: define a 3D array of 3 * 4 * 2. You are sure to doubt to Boost :: EXTENTS - Why can I use "[]"? What will happen if you use more or less? [] "? Below I will strip all the mystery of Boost :: EXTENTS layered -

EXTENTS - the way in consistent with the built-in array

Boost :: EXTENTS is a global object, in Base.hpp:

Typedf Detail :: Multi_Array :: EXTENT_GEN <0> EXTENT_GEN

...

Multi_Array_types :: eXtent_gen extents; // Note its type!

It can be seen that the type of Extents is Extent_Gen, the latter is located in eXtent_gen.hpp: // EXTENT_GEN.HPP

Template

Class extent_gen {

Range_list ranges_; // (2-1)

...

EXTENT_GEN (Const Extent_Gen & RHS, Const Range & A_RANGE)

{// (2-2)

... //

}

EXTENT_GEN Operator [] (INDEX IDX) / / (2-3)

{RETURN EXTENT_GEN (* this, Range (0, IDX));}

/ / Return an extent_gen, but the second template parameter increases 1

}

Boost :: EXTENT_GEN Heavy loaded Operator [] operator, but since there is only one "[]" here, why can we write "Extents [N1] [N2] [N3] [...]? Continue to see -

If Boost :: Extents [N1] [N2] [N3] is expanded to operate to operate, it is equivalent to:

Boost :: EXTENTS.OPERATOR [] (N1) .Operator [] (n2) .Operator [] (n3);

The type of boost :: Extents object is extent_gen <0> - its NumRANGES template parameter is 0. EXTENTS.OPERATOR [] (N1) Call (2-3), return an extent_gen , which is extent_gen <1>, because this type is also extent_gen, so it can still call Operator [] (N2), Returns Extent_Gen <2>, then calls Operator [] (N3) on the return value, and finally returns an Extent_Gen <3> type object.

Here, each of the Operator [] calls turn to (2-3), where the parameter IDX is put into the constructor (2-2) in the range of the parameter IDX, pay attention to the extent_gen type at this time. Constructor. As for Range (0, IDX), a subscriber section of [0, IDX) is represented.

Summarize the basic work mode of Extents - each pair it calls Operator [], will return an Extent_Gen type object, so for boost :: extents [n1] [n2] [n3], returned The type is:

EXTENT_GEN <1> => eXtent_gen <2> => EXTENT_GEN <3>

The last one is also the final return type --Extent_gen <3>. Its template parameter 3 indicates that this is specified for the subscript of the three-dimensional array. In its members ranges_, there are three sets of intervals of [0, N1), [0, N2), [0, N3). These three sets of intervals specify the subspace of the three dimensions of the multi_array object we define, it is worth noting that these intervals are closed backwards (this is the same as the interval of STL's iterator). When Boost :: Extents is ready, it is passed to the constructor of Multi_array, which is used to specify the substruction section of each dimension: // Multi_array.hpp

Explicit Multi_Array (const extent_gen & ranges):

Super_Type ((t *) initial_base_, ranges) {

Allocate_space (); // (2-5)

}

Here, MULTI_ARRAY accepts information in the RANGES parameter, takes out the subspace of each dimension, and then saves, and finally call allocate_space () to assign the underlying memory.

Use Extent_Gen Benefits - DO Things Right!

The construction process of the Boost :: Extents is consistent with the construction of the built-in multi-dimensional array, concise, and clear semantics. First, Boost :: Extents uses "[]", it is easy to think of the declaration of the bucket in the bucket, which is clearly clearly expressed in the meaning of the value in square brackets - indicating the substruction section of each dimension; The key or use Boost :: Extents, you can prevent the user from writing an error, for example:

Multi_Array A (Boost :: Extents [3] [4] [2] [5]);

//wrong! You cannot use a four-dimensional subscript to create a three-dimensional array!

The above statement is completely wrong, because Mult_Array is a three-dimensional array, but Boost :: extents has followed four "[]", which is obviously an error. This error is banned in the compile period, which is due to the grammatical level, Multi_array constructor can only accept the parameters of the extent_gen <3> type, and analyze the extents in front of our front, Boost :: Extents [ 3] [4] [2] [5] Returns an Extent_Gen <4> type object, which will result in compilation errors. This compile period is forced to prevent users from being accidentally committed (if you are doze down?), It is also very clear and clearly expressed (forced) the demand for semantics.

Another alternative and disadvantages

In addition, there is an alternative to declaring each dimension, which is to use so-called Collection Concept, for example:

// Declare a shape ("shape"), namely each dimension

Boost :: array shape = {{3, 4, 2}};

Array_Type B (Shape); // 3 * 4 * 2 3D array

This constructor will call the second constructor of Multi_Array:

// Multi_Array.hpp

Template

Explicit Multi_Array (Extentlist Const & Extents):

Super_Type ((t *) initial_base_, extents) {boost :: function_requires

Detail :: Multi_Array :: CollectionConcept > ();

Allocate_space (); // (2-6)

}

This constructor is the type of EXTENTS that is in line with the Collection Concept. The behavior of this constructor is the same as the constructor that accepts Extents_Gen - still takes out the RANGE of each dimension to save, and then assigns the underlying memory. As for the code at (2-4), it is not described here again in order to comply with the Collection Concept.

Put this way with a simple comparison in the way using Extent_Gen, it is easy to see that the advantages and disadvantages: use this way, it cannot guarantee the correctness of the compile period, for example:

Boost :: array shape = {{3, 4, 2, 5}}; // a four-dimensional array of Shape

Multi_Array a (shape); // can be compiled! !

Here, use a four-dimensional Shape to specify a three-dimensional multi_Array is inappropriate, but it passes compiled, which is because the constructor has no special requirements for its parameter extents, just treat it as an ordinary collection. The function will take out the desired range of the desired range from ExtensS from Extens, so that the constructor takes out the first three values ​​from the Shape as the subspace of the three dimensions, regardless of the shape There are several values. Such statements are unclear and even errors in semantics. However, since this constructor exists, the designer naturally has his truth, and the documentation is clearly shown that the maximum use of this constructor is the code of dimension-independent, and the Multi_Array librar is thinking before A constructor.

Multi_Array architecture

Regardless of which constructor is used, the process is similar - in which a series of subscript intervals are incoming into the base class, the same allocate_space () function is called after the base class configuration is completed (see (see (2 (2 (2 (2 (2 (2 (2 (2) -5) and (2-6)), allocate_space, as the name suggests, used to assign space for the element of the multi-dimensional array.

However, behind this seemingly simple construction process, it hides two-three-layer hierarchical structure. The structure and function of each layer are different. Some levels can be pulled out separately from other places alone. .

Let's take a look at Multi_Array's architecture. What is J1 First, Multi_Array inherits from Multi_Array_Ref:

// Multi_Array_ref.hpp

Template

Class Multi_Array_ref: // Multi_Array base class! !

Public const_multi_array_ref

{

TYPEDEF ConST_MULTI_ARRAY_REF Super_Type; ... ...

Explicit Multi_Array_ref (t * base, // Pointer to array storage space

Const Extent_gen & Ranges): // Detailed

Super_type (base, ranges) // forward the initialization task to the base class (3-1)

{}

...

}

And Multi_Array_ref is based on const_multi_array_ref:

// Multi_Array_ref.hpp

Class const_multi_array_ref: // multi_Array_ref base class! Management underlying storage!

Public MULTI_ARRAY_IMPL_BASE

{

...

Explicit const_multi_array_ref (TPTR Base,

Const Extent_gen & Ranges):

Base_ (base), Storage_ (c_storage_order ()) // (3-2)

{INIT_FROM_EXTENT_GEN (RANGES);

...

Storage_order_type storage _; // support multiple storage strategies! (3-3)

}

The path of MULTI_ARRAY is extended to (3-2) (constructor of multi_array_ref) (constructor) (Const_Multi_Array_ref) - This seems to be ended, because no parameters are passed to const_multi_array_ref Class multi_array_impl_base. But the heart is still confused: Why do there so many layers of inheritance structure? What is the mystery of such a class hierarchy design?

Multi-layer inheritance mystery - access interface && reuse

Go to the statement of the base class const_multi_array_ref, it seems to see some end:

Template <...>

Class const_multi_array_ref {

...

/ / And all the iterator interfaces consistent with all STL containers! !

Const_iterator begin () const;

Const_iterator End () const;

...

// and std :: vector uniform elements to access the interface! !

Const_reference Operator [] (INDEX I) Const;

...

}

Seeing these statements, is it a bit familiar? STL! Yes, the statement of these member functions is exactly consistent with the Container Concept in STL. That is, here is provided with an access interface consistent with the STL container, and the compatibility with STL is reflected here. Const_Multi_Array_ref is more "class," all access elements in const_multi_array_ref, and the member functions such as query array information returns Const's Reference or Iterator. And the declaration of MULTI_ARRAY_REF, which is only more access to elements than const_multi_array_ref, and query the corresponding Non-Const version member function of the array information.

So what is the responsibility of const_multi_array_ref Multi_array_Impl_base? Multi_array_impl_base is the implementation details, its role is only calculated based on the array information (const_multi_array_ref) calculating the offset, the step size, etc., that is, the multidimensional subscript Finally, conversion to a one-dimensional offset. The function of multi_ARRAY_IMPL_BASE -Value_accessor_n or value_accessor_one is to provide an access to the original data. This way of access is to convert multidimensional index into a one-dimensional offset of multidimensional indexes. As for why there are two versions of value_accessor_n and value_accessor_one, which will be explained in detail later. At this point, the rough architecture of Multi_Array has already appeared:

MULTI_ARRAY -> MULTI_ARRAY_REF -> const_multi_array_ref -> multi_Array_impl_base -> value_accessor_n / value_accessor_one

Each layer is a respective role:

¨ Multi_Array: Assign space for array elements to forward various operations to the base class.

¨ Multi_Array_ref: Provides a data access interface that is consistent with the STL container. It can also be used independently as an adapter.

¨ const_multi_array_ref: Provides a Const's STL Data Access Interface. You can also use it as a const adapter.

¨ MULTI_ARRAY_IMPL_BASE and its base class: The most underlying implementation provides a set of basic operations for raw data.

This architecture seems complicated, but provides high reuse, where (const_) Multi_Array_ref class can be independent as an Adapter used - for example::

INT A [24]; // 10 element array of one dimensions, located on the stack!

// Take a 3D array of 3 * 4 * 2: a 3 * 4 * 2 3D array:

Multi_Array_ref Arr_ref (a, boost :: extents [3] [4] [2]);

Arr_ref [i] [j] [k] = value; // and Multi_array use interface interface

Very simple! Since then, you don't have to simulate multi-dimensional arrays. Even a one-dimensional array on the stack, you can also "see" it is a multi-dimensional array. If you don't want Multi_Array from the assignment of memory, you can assign an array yourself (you can be on your stack or pile) and then package it into a multi-dimensional array with multi_array_ref.

Multi_Array's storage strategy

Next, let's take a look at Multi_Array storage policies, such as: C-style multi-dimensional storage mode is stored in line, and Fortran is the opposite, which is stored, or even users may have their own storage policy requirements. So how do you support a variety of storage strategies? The secret is the code (3-3), member storage _-- its type for const_multi_array_ref, the following declaration pointed out the "Original Friend" --General_Storage_Order ::

// Multi_Array_ref.hpp

...

TYPEDEF General_Storage_Order Storage_Order_Type; ... ...

// storage_order.hpp

Template

Class general_storage_order {

General_Storage_Order (const c_storage_order &) {// (4-1)

For (size_type i = 0; i! = NUMDIMS; i)

{Order_ [I] = NUMDIMS - 1 - I;

Ascending_.assign (true); / / The default is an ascending order

}

...

Boost :: array Order_; // Priority in each dimension

Boost :: array ascending _; // ascending alignment or descending order

}

In the constructor in (4-1), Order_ and Ascending_ are two arrays. By default, when the function (4-1) is executed, the elements in Ordering_ should be {NumdiMS-1, Numdims-2, ... 1, 0}, that is, NumdiMS-1 dimension is first, then NumdiMS-2D, ..., if these elements are used as the identity of each dimensional storage order - has a smaller ORDERING_ Value Dimensional Priority - then the storage method in this and C language is completely consistent, and ascending_ should not be used to indicate whether each dimension is ascended. In fact, General_Storage_Order has a template constructor that supports more generalized storage policies (such as Fortran, or user-defined storage policies). It does not make details here.

In addition to the storage policy, the construct of const_multi_array_ref also takes the contents in Extens to take the contents in the extens, and other variables at any other variables in other variables ((3-3) are set, and the specific details are not Take it again.

Now all information about a multi-dimensional array is already ready, it can be described as "all ownership, only 'space'". Multi_Array What to do is to call the allocate_space mentioned earlier to assign space in an array.

// Multi_Array.hpp

void allocate_space () {

...

Base_ = allocator_.allocate (this-> num_lements (), no_hint);

... std :: uninitialized_fill_n (base_, allocate_elements_, t ());

}

It turns out that in the bottom, the storage is still a stored storage for a one-dimensional array: Allocate_space uses allocator_ assigns a continuous space to store an element, where num_lements () returned is the size of the array of dimensions, ie the total element of the array Number. After the assignment, the first finger is assigned to a member base_, and std :: uninitialized_fill_n is responsible for the default initialization of the array. The construction of Multi_Array finally was finally made.

Consistent interface - Soul of GP

Another important feature of Multi_Array is to support the same access method as built-dimensional arrays, that is, Multi_Array supports access to array elements in continuous "[]". Take the assignment statement at the beginning of the sample code (1-3) as an example, let us see how Multi_Array supports this way of accessibility compatible with built-in arrays. // Multi_Array_ref.hpp

// Use Operator [] to access the element, what is the type Reference? Not T &!

Reference Operator [] (INDEX IDX) {

Return Super_Type :: Access (Boost :: Type (),

IDX, Origin (), THIS-> Shape (), this-> Strides (),

This-> index_bases ());

}

This call is transferred to value_accessor_n :: access (...) :: Access (...):

// base.hpp

// in class value_accessor_n

Template

Reference Access (Boost :: Type ,

Index IDX, TPTR Base, Const size_Type * Extents,

Const Index * Strides, Const Index * INDEX_BASE

{

TPTR newbase = base idx * stride [0];

Return Reference (Newbase, Extents 1,

StriDES 1, INDEX_BASE 1);

}

This process and extend_gen is very similar - a "Proxy", this is not actual element per call, this is not the actual element, because this time is still following "[]", so This avatar should also call Operator [], so that this process can go back until the back is not "[]". "

For example, if you access the elements in A in a [x1] [x2] [x3], it is equivalent to

A.operator [x1] .operator [x2] .Operator [x3] // Continuous call "[]"

These three Operator [] The type of call returns is:

Sub_Array -> SUB_ARRAY -> T &

Note that the second template parameter (dimension) of SUB_ARRAY is decremented, when it is decremented to 0, it indicates that the last dimension has been returned to the last dimension, so the last call "[]" returned to the element Quoted (this is just the first "[]" the number and the number of dimensions mentioned before and only when the number of dimensions is the same, otherwise you get the return value or SUB_ARRAY <.. .> (Rather than elements), how do you do this because of trying to call "[]" in T & S & I continues to call "[]"?

From the above recursive process, we can easily see: I really have access to the elements and return T & I to Sub_Array . So what is this SUB_ARRAY?

Sub_Array's Secret

Sub_Array defines in Sub_Array.hpp:

// Sub_Array.hpp

Template

Template

Class const_sub_Array:

Public MULTI_ARRAY_IMPL_BASE

//base.hpp

Template

Class Multi_Array_Impl_base: Public

Value_accessor_generator > :: type;

Hey, the original SUB_ARRAY finally inherits from multi_Array_Impl_base, the latter's base class is generated by value_accessor_generator - it generates different base types according to NUMDIMS:

// base.hpp

Template

Struct value_accessor_generator {

...

TypeDef Typename // If Numdims is 1, the type is Value_Accessor_one

MPL :: Apply_iF_c <(Dimensionality == 1),

Choose_Value_accessor_one ,

Choose_Value_accessor_n

> :: type type; // Put this type as the base class of multiple_Array_Impl_base!

}

Obviously, if Dimensionality == 1, then ":: type" is value_accessor_one , and only to value_accessor_one "[]" can return T &, otherwise, ":: type" is derived to value_accessor_n, this is just a " Proxy "," [] "will only return Sub_Array , so that it can continue this continuous call" [] "process until DimensionAlity is reduced to 1, the base class of Sub_Array It became Value_Accessor_one, and "[]" will return T &!

Take out elements

According to the above analysis, the task of taking the element is ultimately handed over to Value_Accessor_one, and its member function Access is as follows:

Template

Reference Access (Boost :: Type , Index IDX, TPTR Base,

Const size_type *, const index * strides,

Const index *) const {

Return * (base idx * strides [0]); // finally took out the data!

}

Here, the return type of Access is a reference to the element type in the T &, ie the array, so that the reference to the specified element can be returned to reach the purpose of accessing array elements. Seeing this, Multi_Array Access the array of access to array elements in Multi_Array, is basically clear, as for some details, especially the details of the calculation address, such as: the calculation of the offset, the use of the steps, etc. went. Maybe you will have such a doubts: Is it true that the ability of the "[] [] [] ..." access method to the multi-dimensional array "[] [] ..." is really so important? It is not so much vigorous, it is better to write such a multi-parameter method to overload Operator [] ("[i, j, ...] form)! Is such a big price really worth it? worth it! This is not doubtful. The most important manifestation of the ability to access the multi_array elements in-array access methods is: can treat Multi_Array as an built-in multi-dimensional array, and the consistency of the built-in type is the key to your class and generic algorithm. For example: the user wrote a function template,

Template

Returntype SomeAlgo (_3Array & MDA) {// can act in built-in multi-dimensional array

...

MDA [x] [y] [z] = MDA [z] [x] [y]; // The form of access to built-dimensional array!

...

}

Because there is a capacity to access the multi_array element in an array of access to the array, this SOMEALGO algorithm can be applied to the in-array and Multi_array (otherwise the user has to provide a separate overload version for Multi_Array), one, code Reusability, scalability is greatly improved.

effectiveness

Efficiency is C eternal theme, and the Multiarray library is no exception. On the execution of time efficiency, look at the access code of the Multiarray library to the array element, although the function calls the nested layer, but most calls are simple forwarding or compiler, under a mature modern C compiler, These forwarded function calling code should be easily optimized, so there is almost no loss in efficiency. In terms of space, due to a large number of template technologies, it is basically able to determine the content determined in the compile period, and there is no burden on the operating period. In general, the Boost.Multiarray library is indeed a rare efficient and universal multi-dimensional array.

Conclusion

This article only makes a concise analysis of Multi_Array's most basic function code. As the article starts, Multi_Array has many very useful features. If the reader wants to fully understand the Multi_Array's operational mechanism and implementation skills, please analyze in depth. Multi_Array's code, I believe you will be gainful!

转载请注明原文地址:https://www.9cbs.com/read-55706.html

New Post(0)