Boost source note: boost :: Multi_Array
Xie Xuan / article
motivation
C is a free language that allows you to express your intentions, right? So we can get a one-dimensional array, you should also get a multi-dimensional array, right. Let's take an example:
INT * PONEDIMARR = New int [10]; // New 10 elements of one-dimensional array
PONEDIMARR [0] = 0; // Access
INT ** PTWODIMARR = New int [10] [20]; // Error!
PTWodimarr [0] [0] = 0; // Access
However, it is a pity that the behavior of the three four lines of code is not as if you think - although they look so "natural" from the grammar.
The problem here is that NEW INT [10] [20] is not an int ** type pointer, but the INT (*) [20] type pointer (this pointer is called a row pointer, "for it" 1 "It is equivalent to the size of a line (this example is 20), that is, let it point to the next line), so our code should be like this:
int (* ptwodimarr) [20] = new int [i] [20]; / / correct
PTWodimarr [1] [2] = 0; // Access
Note that PTWodimarr type --int (*) [20] is a very special type, it cannot be converted to int **, although the syntax form of the two index elements is "p [i] [j]" form However, the number of times the access memory is different, the semantics are different.
The most critical problem is: Creating a multi-dimensional array in this simple way, there is a biggest limit, except for the first dimension, other dimensions must be determined by the compile period. E.g:
INT (* pndimarr) [N2] [N3] [N4] = new int [n1] [n2] [N3] [N4];
Here, N2, N3, N4 must be compiled. Only N1 can be variables, this restriction is related to the indexing of the multi-dimensional array - whether the array of dimensions is linearly stored in memory, so:
PTWodimarr [i] [j] = 0;
The code generated by the compiler is similar to:
* ((int *) PTWodimarr i * 20 j) = 0;
20 is the line of two-dimensional arrays, the problem is that if the linearness of the two-dimensional array is also dynamic, the compiler will not generate code (what should 20 where 20 is located?). Based on this, C only allows the first dimension of multi-dimensional arrays to be dynamic.
Unfortunately, due to this limit, the multi-dimensional array in C has become a famous and unique thing in most cases. We can often see questions about multidimensional arrays on the forum. The core of this type of problem is: how to imitate a "fully dynamic" multi-dimensional array. It means that "complete dynamics" here is that all dimensions can be dynamic variables, not just the first dimension. The answers given in the forum are different, some are quite good, but either lack scalability (ie, the case of extending to N-dimensional), or in the form of an access element from the built-in multi-dimensional array Access form, either consume too much extra space. In the final analysis, we need a multi-dimensional array of multi-dimensionalities:
/ / Create an int type DIMS dimension group, DIM_SIZES is used to specify the size of each dimension, namely N1 * N2 * N3MULTI_ARRAY
MA [i] [j] [k] = value; / / to assign values for elements of the J line k column column IPIN
MA [I] [j] = value; // Compile the wrong!
MA [I] = value; // Compile the wrong!
MA [i] [j] [k] [l] = value; // compile!
Such a MULTI_ARRAY can automatically manage memory, owns and build an interface consistent interface, and the size of each dimension can be variables - which is in line with our requirements. It seems that this Multi_Array is not difficult, but the truth is always unexpected. The following is an analysis of a Multi_Array implementation in Boost - you almost certainly find some unexpected (even amazing) local.
Multidimensional array implementation in BOOST - Boost :: Multi_Array
There is a powerful multiarray library for describing the multi-dimensional array in the Boost library. It implements a common, an interface consistent with the standard library, and has the same interface and behavior as a multi-dimensional array built in C . It is based on this versatile design, and the Multiarray library can have good compatibility between the standard library components and even user-defined generic components, and can work well.
In addition, Multiarray also provides an extremely useful feature such as changing size, reshaping, and view access to a multi-dimensional array, so that Multiarray is more than other ways to describe the multi-dimensional array (e.g.,: std :: vector
Below we will deepen, go to the mystery of Boost :: Multi_Array - debug the sample program, tracking is one of the most effective means of analyzing the source code. We will start with the sample programs in the Multiarray documentation:
// Slight to the header file
Int main () {
// Create a three-dimensional array of sizes 3 × 4 × 2
#define DIMS 3 // Array is a few days
Typedef Boost :: Multi_Array
Array_Type A (Boost :: Extents [3] [4] [2]); // (1-2)
/ / Element is assigned to an array
A [1] [2] [0] = 120; // (1-3)
...
Return 0;
}
In the above code, the two template parameters of Boost :: Multi_Array represent the dimensions of the type and array of array elements, respectively. (1-2) is the constructing statement of the three-dimensional array object. Boost :: EXTENTS [3] [4] [2] is used to indicate the size of the array, the meaning here is: define a 3D array of 3 * 4 * 2. You are sure to doubt to Boost :: EXTENTS - Why can I use "[]"? What will happen if you use more or less? [] "? Below I will strip all the mystery of Boost :: EXTENTS layered -
EXTENTS - the way in consistent with the built-in array
Boost :: EXTENTS is a global object, in Base.hpp:
Typedf Detail :: Multi_Array :: EXTENT_GEN <0> EXTENT_GEN
...
Multi_Array_types :: eXtent_gen extents; // Note its type!
It can be seen that the type of Extents is Extent_Gen, the latter is located in eXtent_gen.hpp: // EXTENT_GEN.HPP
Template
Class extent_gen {
Range_list ranges_; // (2-1)
...
EXTENT_GEN (Const Extent_Gen
{// (2-2)
... //
}
EXTENT_GEN
{RETURN EXTENT_GEN
/ / Return an extent_gen, but the second template parameter increases 1
}
Boost :: EXTENT_GEN Heavy loaded Operator [] operator, but since there is only one "[]" here, why can we write "Extents [N1] [N2] [N3] [...]? Continue to see -
If Boost :: Extents [N1] [N2] [N3] is expanded to operate to operate, it is equivalent to:
Boost :: EXTENTS.OPERATOR [] (N1) .Operator [] (n2) .Operator [] (n3);
The type of boost :: Extents object is extent_gen <0> - its NumRANGES template parameter is 0. EXTENTS.OPERATOR [] (N1) Call (2-3), return an extent_gen
Here, each of the Operator [] calls turn to (2-3), where the parameter IDX is put into the constructor (2-2) in the range of the parameter IDX, pay attention to the extent_gen
Summarize the basic work mode of Extents - each pair it calls Operator [], will return an Extent_Gen
EXTENT_GEN <1> => eXtent_gen <2> => EXTENT_GEN <3>
The last one is also the final return type --Extent_gen <3>. Its template parameter 3 indicates that this is specified for the subscript of the three-dimensional array. In its members ranges_, there are three sets of intervals of [0, N1), [0, N2), [0, N3). These three sets of intervals specify the subspace of the three dimensions of the multi_array object we define, it is worth noting that these intervals are closed backwards (this is the same as the interval of STL's iterator). When Boost :: Extents is ready, it is passed to the constructor of Multi_array, which is used to specify the substruction section of each dimension: // Multi_array.hpp
Explicit Multi_Array (const extent_gen
Super_Type ((t *) initial_base_, ranges) {
Allocate_space (); // (2-5)
}
Here, MULTI_ARRAY accepts information in the RANGES parameter, takes out the subspace of each dimension, and then saves, and finally call allocate_space () to assign the underlying memory.
Use Extent_Gen Benefits - DO Things Right!
The construction process of the Boost :: Extents is consistent with the construction of the built-in multi-dimensional array, concise, and clear semantics. First, Boost :: Extents uses "[]", it is easy to think of the declaration of the bucket in the bucket, which is clearly clearly expressed in the meaning of the value in square brackets - indicating the substruction section of each dimension; The key or use Boost :: Extents, you can prevent the user from writing an error, for example:
Multi_Array
//wrong! You cannot use a four-dimensional subscript to create a three-dimensional array!
The above statement is completely wrong, because Mult_Array is a three-dimensional array, but Boost :: extents has followed four "[]", which is obviously an error. This error is banned in the compile period, which is due to the grammatical level, Multi_array
Another alternative and disadvantages
In addition, there is an alternative to declaring each dimension, which is to use so-called Collection Concept, for example:
// Declare a shape ("shape"), namely each dimension
Boost :: array
Array_Type B (Shape); // 3 * 4 * 2 3D array
This constructor will call the second constructor of Multi_Array:
// Multi_Array.hpp
Template
Explicit Multi_Array (Extentlist Const & Extents):
Super_Type ((t *) initial_base_, extents) {boost :: function_requires / (2-4)
Detail :: Multi_Array :: CollectionConcept
Allocate_space (); // (2-6)
}
This constructor is the type of EXTENTS that is in line with the Collection Concept. The behavior of this constructor is the same as the constructor that accepts Extents_Gen - still takes out the RANGE of each dimension to save, and then assigns the underlying memory. As for the code at (2-4), it is not described here again in order to comply with the Collection Concept.
Put this way with a simple comparison in the way using Extent_Gen, it is easy to see that the advantages and disadvantages: use this way, it cannot guarantee the correctness of the compile period, for example:
Boost :: array
Multi_Array
Here, use a four-dimensional Shape to specify a three-dimensional multi_Array is inappropriate, but it passes compiled, which is because the constructor has no special requirements for its parameter extents, just treat it as an ordinary collection. The function will take out the desired range of the desired range from ExtensS from Extens, so that the constructor takes out the first three values from the Shape as the subspace of the three dimensions, regardless of the shape There are several values. Such statements are unclear and even errors in semantics. However, since this constructor exists, the designer naturally has his truth, and the documentation is clearly shown that the maximum use of this constructor is the code of dimension-independent, and the Multi_Array librar is thinking before A constructor.
Multi_Array architecture
Regardless of which constructor is used, the process is similar - in which a series of subscript intervals are incoming into the base class, the same allocate_space () function is called after the base class configuration is completed (see (see (2 (2 (2 (2 (2 (2 (2 (2) -5) and (2-6)), allocate_space, as the name suggests, used to assign space for the element of the multi-dimensional array.
However, behind this seemingly simple construction process, it hides two-three-layer hierarchical structure. The structure and function of each layer are different. Some levels can be pulled out separately from other places alone. .
Let's take a look at Multi_Array's architecture. What is J1 First, Multi_Array inherits from Multi_Array_Ref:
// Multi_Array_ref.hpp
Template
Class Multi_Array_ref: // Multi_Array base class! !
Public const_multi_array_ref
{
TYPEDEF ConST_MULTI_ARRAY_REF
Explicit Multi_Array_ref (t * base, // Pointer to array storage space
Const Extent_gen
Super_type (base, ranges) // forward the initialization task to the base class (3-1)
{}
...
}
And Multi_Array_ref is based on const_multi_array_ref:
// Multi_Array_ref.hpp
Class const_multi_array_ref: // multi_Array_ref base class! Management underlying storage!
Public MULTI_ARRAY_IMPL_BASE
{
...
Explicit const_multi_array_ref (TPTR Base,
Const Extent_gen
Base_ (base), Storage_ (c_storage_order ()) // (3-2)
{INIT_FROM_EXTENT_GEN (RANGES);
...
Storage_order_type storage _; // support multiple storage strategies! (3-3)
}
The path of MULTI_ARRAY is extended to (3-2) (constructor of multi_array_ref) (constructor) (Const_Multi_Array_ref) - This seems to be ended, because no parameters are passed to const_multi_array_ref Class multi_array_impl_base. But the heart is still confused: Why do there so many layers of inheritance structure? What is the mystery of such a class hierarchy design?
Multi-layer inheritance mystery - access interface && reuse
Go to the statement of the base class const_multi_array_ref, it seems to see some end:
Template <...>
Class const_multi_array_ref {
...
/ / And all the iterator interfaces consistent with all STL containers! !
Const_iterator begin () const;
Const_iterator End () const;
...
// and std :: vector uniform elements to access the interface! !
Const_reference Operator [] (INDEX I) Const;
...
}
Seeing these statements, is it a bit familiar? STL! Yes, the statement of these member functions is exactly consistent with the Container Concept in STL. That is, here is provided with an access interface consistent with the STL container, and the compatibility with STL is reflected here. Const_Multi_Array_ref is more "class," all access elements in const_multi_array_ref, and the member functions such as query array information returns Const's Reference or Iterator. And the declaration of MULTI_ARRAY_REF, which is only more access to elements than const_multi_array_ref, and query the corresponding Non-Const version member function of the array information.
So what is the responsibility of const_multi_array_ref Multi_array_Impl_base? Multi_array_impl_base is the implementation details, its role is only calculated based on the array information (const_multi_array_ref) calculating the offset, the step size, etc., that is, the multidimensional subscript Finally, conversion to a one-dimensional offset. The function of multi_ARRAY_IMPL_BASE -Value_accessor_n or value_accessor_one is to provide an access to the original data. This way of access is to convert multidimensional index into a one-dimensional offset of multidimensional indexes. As for why there are two versions of value_accessor_n and value_accessor_one, which will be explained in detail later. At this point, the rough architecture of Multi_Array has already appeared:
MULTI_ARRAY -> MULTI_ARRAY_REF -> const_multi_array_ref -> multi_Array_impl_base -> value_accessor_n / value_accessor_one
Each layer is a respective role:
¨ Multi_Array: Assign space for array elements to forward various operations to the base class.
¨ Multi_Array_ref: Provides a data access interface that is consistent with the STL container. It can also be used independently as an adapter.
¨ const_multi_array_ref: Provides a Const's STL Data Access Interface. You can also use it as a const adapter.
¨ MULTI_ARRAY_IMPL_BASE and its base class: The most underlying implementation provides a set of basic operations for raw data.
This architecture seems complicated, but provides high reuse, where (const_) Multi_Array_ref class can be independent as an Adapter used - for example::
INT A [24]; // 10 element array of one dimensions, located on the stack!
// Take a 3D array of 3 * 4 * 2: a 3 * 4 * 2 3D array:
Multi_Array_ref
Arr_ref [i] [j] [k] = value; // and Multi_array use interface interface
Very simple! Since then, you don't have to simulate multi-dimensional arrays. Even a one-dimensional array on the stack, you can also "see" it is a multi-dimensional array. If you don't want Multi_Array from the assignment of memory, you can assign an array yourself (you can be on your stack or pile) and then package it into a multi-dimensional array with multi_array_ref.
Multi_Array's storage strategy
Next, let's take a look at Multi_Array storage policies, such as: C-style multi-dimensional storage mode is stored in line, and Fortran is the opposite, which is stored, or even users may have their own storage policy requirements. So how do you support a variety of storage strategies? The secret is the code (3-3), member storage _-- its type for const_multi_array_ref, the following declaration pointed out the "Original Friend" --General_Storage_Order
// Multi_Array_ref.hpp
...
TYPEDEF General_Storage_Order
// storage_order.hpp
Template
Class general_storage_order {
General_Storage_Order (const c_storage_order &) {// (4-1)
For (size_type i = 0; i! = NUMDIMS; i)
{Order_ [I] = NUMDIMS - 1 - I;
Ascending_.assign (true); / / The default is an ascending order
}
...
Boost :: array
Boost :: array
}
In the constructor in (4-1), Order_ and Ascending_ are two arrays. By default, when the function (4-1) is executed, the elements in Ordering_ should be {NumdiMS-1, Numdims-2, ... 1, 0}, that is, NumdiMS-1 dimension is first, then NumdiMS-2D, ..., if these elements are used as the identity of each dimensional storage order - has a smaller ORDERING_ Value Dimensional Priority - then the storage method in this and C language is completely consistent, and ascending_ should not be used to indicate whether each dimension is ascended. In fact, General_Storage_Order has a template constructor that supports more generalized storage policies (such as Fortran, or user-defined storage policies). It does not make details here.
In addition to the storage policy, the construct of const_multi_array_ref also takes the contents in Extens to take the contents in the extens, and other variables at any other variables in other variables ((3-3) are set, and the specific details are not Take it again.
Now all information about a multi-dimensional array is already ready, it can be described as "all ownership, only 'space'". Multi_Array What to do is to call the allocate_space mentioned earlier to assign space in an array.
// Multi_Array.hpp
void allocate_space () {
...
Base_ = allocator_.allocate (this-> num_lements (), no_hint);
... std :: uninitialized_fill_n (base_, allocate_elements_, t ());
}
It turns out that in the bottom, the storage is still a stored storage for a one-dimensional array: Allocate_space uses allocator_ assigns a continuous space to store an element, where num_lements () returned is the size of the array of dimensions, ie the total element of the array Number. After the assignment, the first finger is assigned to a member base_, and std :: uninitialized_fill_n is responsible for the default initialization of the array. The construction of Multi_Array finally was finally made.
Consistent interface - Soul of GP
Another important feature of Multi_Array is to support the same access method as built-dimensional arrays, that is, Multi_Array supports access to array elements in continuous "[]". Take the assignment statement at the beginning of the sample code (1-3) as an example, let us see how Multi_Array supports this way of accessibility compatible with built-in arrays. // Multi_Array_ref.hpp
// Use Operator [] to access the element, what is the type Reference? Not T &!
Reference Operator [] (INDEX IDX) {
Return Super_Type :: Access (Boost :: Type
IDX, Origin (), THIS-> Shape (), this-> Strides (),
This-> index_bases ());
}
This call is transferred to value_accessor_n :: access (...) :: Access (...):
// base.hpp
// in class value_accessor_n
Template
Reference Access (Boost :: Type
Index IDX, TPTR Base, Const size_Type * Extents,
Const Index * Strides, Const Index * INDEX_BASE
{
TPTR newbase = base idx * stride [0];
Return Reference (Newbase, Extents 1,
StriDES 1, INDEX_BASE 1);
}
This process and extend_gen is very similar - a "Proxy", this is not actual element per call, this is not the actual element, because this time is still following "[]", so This avatar should also call Operator [], so that this process can go back until the back is not "[]". "
For example, if you access the elements in A in a [x1] [x2] [x3], it is equivalent to
A.operator [x1] .operator [x2] .Operator [x3] // Continuous call "[]"
These three Operator [] The type of call returns is:
Sub_Array
Note that the second template parameter (dimension) of SUB_ARRAY is decremented, when it is decremented to 0, it indicates that the last dimension has been returned to the last dimension, so the last call "[]" returned to the element Quoted (this is just the first "[]" the number and the number of dimensions mentioned before and only when the number of dimensions is the same, otherwise you get the return value or SUB_ARRAY <.. .> (Rather than elements), how do you do this because of trying to call "[]" in T & S & I continues to call "[]"?
From the above recursive process, we can easily see: I really have access to the elements and return T & I to Sub_Array
Sub_Array's Secret
Sub_Array defines in Sub_Array.hpp:
// Sub_Array.hpp
Template
Template
Class const_sub_Array:
Public MULTI_ARRAY_IMPL_BASE
//base.hpp
Template
Class Multi_Array_Impl_base: Public
Value_accessor_generator
Hey, the original SUB_ARRAY finally inherits from multi_Array_Impl_base, the latter's base class is generated by value_accessor_generator - it generates different base types according to NUMDIMS:
// base.hpp
Template
Struct value_accessor_generator {
...
TypeDef Typename // If Numdims is 1, the type is Value_Accessor_one
MPL :: Apply_iF_c <(Dimensionality == 1),
Choose_Value_accessor_one
Choose_Value_accessor_n
> :: type type; // Put this type as the base class of multiple_Array_Impl_base!
}
Obviously, if Dimensionality == 1, then ":: type" is value_accessor_one
Take out elements
According to the above analysis, the task of taking the element is ultimately handed over to Value_Accessor_one, and its member function Access is as follows:
Template
Reference Access (Boost :: Type
Const size_type *, const index * strides,
Const index *) const {
Return * (base idx * strides [0]); // finally took out the data!
}
Here, the return type of Access is a reference to the element type in the T &, ie the array, so that the reference to the specified element can be returned to reach the purpose of accessing array elements. Seeing this, Multi_Array Access the array of access to array elements in Multi_Array, is basically clear, as for some details, especially the details of the calculation address, such as: the calculation of the offset, the use of the steps, etc. went. Maybe you will have such a doubts: Is it true that the ability of the "[] [] [] ..." access method to the multi-dimensional array "[] [] ..." is really so important? It is not so much vigorous, it is better to write such a multi-parameter method to overload Operator [] ("[i, j, ...] form)! Is such a big price really worth it? worth it! This is not doubtful. The most important manifestation of the ability to access the multi_array elements in-array access methods is: can treat Multi_Array as an built-in multi-dimensional array, and the consistency of the built-in type is the key to your class and generic algorithm. For example: the user wrote a function template,
Template
Returntype SomeAlgo (_3Array & MDA) {// can act in built-in multi-dimensional array
...
MDA [x] [y] [z] = MDA [z] [x] [y]; // The form of access to built-dimensional array!
...
}
Because there is a capacity to access the multi_array element in an array of access to the array, this SOMEALGO algorithm can be applied to the in-array and Multi_array (otherwise the user has to provide a separate overload version for Multi_Array), one, code Reusability, scalability is greatly improved.
effectiveness
Efficiency is C eternal theme, and the Multiarray library is no exception. On the execution of time efficiency, look at the access code of the Multiarray library to the array element, although the function calls the nested layer, but most calls are simple forwarding or compiler, under a mature modern C compiler, These forwarded function calling code should be easily optimized, so there is almost no loss in efficiency. In terms of space, due to a large number of template technologies, it is basically able to determine the content determined in the compile period, and there is no burden on the operating period. In general, the Boost.Multiarray library is indeed a rare efficient and universal multi-dimensional array.
Conclusion
This article only makes a concise analysis of Multi_Array's most basic function code. As the article starts, Multi_Array has many very useful features. If the reader wants to fully understand the Multi_Array's operational mechanism and implementation skills, please analyze in depth. Multi_Array's code, I believe you will be gainful!