Effective Standard C++ Library: for_each() vs. transform()
Klaus Kreft and Angelika Langer
http://www.cuj.com/experts/1902/langer.htm?topic=experts
NOTE: Article Updated on January 5, 2001
for_each() and transform()
The generic algorithms for_each() and transform() are often regarded as very similar: both apply an operation (supplied as a functor) to each element of an input range. The only difference, so the story goes, is that for_each() ignores the operation's return value, while transform() copies it to an element of the output range. This view is a very common oversimplification. According to the standard, the two algorithms differ fundamentally. The purpose of this installment of the column is to explain the conceptual differences between the two algorithms and to point out the potential portability problems that the oversimplified view can cause.
Reading the standard
Before we get into the discussion proper, let us look at what the standard says [Note 1].
for_each. The for_each() algorithm is specified in the section on non-modifying sequence operations.
template <class InputIterator, class Function>
Function for_each(InputIterator first, InputIterator last, Function f);
• Effects: Applies f to the result of dereferencing every iterator in the range [first, last), starting from first and proceeding to last - 1.
• Returns: f.
• Complexity: Applies f exactly last - first times.
• Notes: If f returns a result, the result is ignored.
transform. The transform() algorithm is specified in the section on mutating sequence operations. It comes in two versions: one (unary transform) works on a single input sequence, the other (binary transform) takes two input sequences. Since we want to compare for_each() with transform(), we consider only the unary transform.
template <class InputIterator, class OutputIterator, class UnaryOperation>
OutputIterator transform(InputIterator first, InputIterator last,
                         OutputIterator result, UnaryOperation op);
• Effects: Assigns through every iterator i in the range [result, result + (last - first)) a new value equal to op(*(first + (i - result))).
• Requires: op shall not have any side effects [Note 2].
• Returns: result + (last - first).
• Complexity: Exactly last - first applications of op.
• Notes: result may be equal to first.
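To illustrate the specification of the unary transform, here is a minimal sketch that applies a side-effect-free operation in place, that is, with result equal to first. The functor ToUpperAscii and the function upper_in_place are hypothetical names chosen for this illustration, and the character arithmetic assumes an ASCII-compatible character set.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Hypothetical operation without side effects: it only inspects its
// argument and produces a return value (assumes ASCII-compatible chars).
struct ToUpperAscii {
    char operator()(char c) const {
        return (c >= 'a' && c <= 'z') ? static_cast<char>(c - 'a' + 'A') : c;
    }
};

// In-place use of unary transform: the output range starts at first.
std::string upper_in_place(std::string s) {
    std::string::iterator end =
        std::transform(s.begin(), s.end(), s.begin(), ToUpperAscii());
    assert(end == s.end());  // returns result + (last - first)
    return s;
}
```

Because the operation depends only on its argument, the result is the same no matter in which order the elements are processed.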
Indeed, both the unary transform algorithm and the for_each() algorithm apply a unary operation to each element of the input range exactly once. Apart from that, they have little in common. The differences include:
• for_each() is a non-modifying algorithm; transform() is a mutating algorithm.
• for_each() ignores the return value of the operation; transform() assigns the return value to successive elements of the output range.
• for_each() returns a copy of the functor; transform() returns an iterator to the end of the output range.
• for_each() applies the operation in a definite order, from the beginning of the input range to its end; transform() makes no such promise.
• The operation passed to transform() must not have any side effects; there is no such restriction for the operation passed to for_each().
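The first three differences can be seen in a small sketch. The functor Square and both helper functions are hypothetical names introduced only for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical functor: returns the square of its argument, no side effects.
struct Square {
    int operator()(int x) const { return x * x; }
};

// for_each: the return values of Square are ignored, nothing is assigned,
// so the sequence comes back unchanged.
std::vector<int> apply_with_for_each(std::vector<int> v) {
    std::for_each(v.begin(), v.end(), Square());
    return v;
}

// transform: each return value is assigned to an element of the output range,
// and the algorithm returns the end of that range.
std::vector<int> apply_with_transform(const std::vector<int>& v) {
    std::vector<int> out(v.size());
    std::vector<int>::iterator end =
        std::transform(v.begin(), v.end(), out.begin(), Square());
    assert(end == out.end());
    return out;
}
```

Calling for_each() with a functor like Square is legal but has no observable effect, which is exactly why for_each() depends on functors that do have side effects.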
Let us see what these differences mean and why they exist.
Intent
When we use a generic algorithm, we expect it to have an effect; otherwise the call would be pointless. Typical effects are producing a return value and modifying the elements of a sequence.
Return values. Typical return values produced by generic algorithms are a Boolean value (as in includes()), a count (as in count_if()), an iterator pointing to a particular element of the input sequence, an iterator pointing to the end of the output sequence (as in copy()), or a pair of iterators (as in equal_range()). Most generic algorithms produce a return value; only a few do not (for example fill(), replace(), sort(), and swap()).
Depending on whether they modify the elements of a sequence, algorithms are classified as mutating (or modifying) algorithms and inspecting (or non-modifying) algorithms.
Mutators. Algorithms such as remove(), replace(), copy(), and sort() actively produce a side effect: they modify elements of a sequence. Typically they do so by assigning a new value through a dereferenced iterator. copy(), for example, assigns the elements of the input range to the elements of the output range. If the sequence being modified is the input sequence, the standard calls the algorithm an in-place algorithm; if the modification goes to an output range, it calls it a copy algorithm. For example, replace_if() is an in-place algorithm, while replace_copy_if() is a copy algorithm.
Inspectors. Non-modifying algorithms, in contrast, do not assign to any element. Examples are find_if(), count_if(), and search(). A non-modifying algorithm modifies no elements; it only produces a return value.
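The three categories can be put side by side in a short sketch. The predicate IsNegative and the three wrapper functions are hypothetical names used only for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

struct IsNegative {
    bool operator()(int x) const { return x < 0; }
};

// In-place mutator: assigns to elements of the input sequence itself.
void zero_negatives_in_place(std::vector<int>& v) {
    std::replace_if(v.begin(), v.end(), IsNegative(), 0);
}

// Copying mutator: leaves the input alone and assigns to an output range.
std::vector<int> zero_negatives_copy(const std::vector<int>& v) {
    std::vector<int> out;
    std::replace_copy_if(v.begin(), v.end(), std::back_inserter(out),
                         IsNegative(), 0);
    assert(out.size() == v.size());  // one output element per input element
    return out;
}

// Inspector: assigns to nothing, its only effect is the return value.
std::ptrdiff_t count_negatives(const std::vector<int>& v) {
    return std::count_if(v.begin(), v.end(), IsNegative());
}
```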
In this sense, transform() is a mutating copy algorithm, because it modifies elements: it assigns the results of the functor to the elements of the output range. for_each() is a non-modifying algorithm, because it does not assign to any element.
As mentioned above, the sole purpose of a non-modifying algorithm is to produce a return value. for_each() is a non-modifying algorithm, and it returns the functor that was passed to it. One might ask: if it modifies no elements and merely returns what it received, what is the point of applying for_each() to the elements of a sequence? Does for_each() have any effect at all? Indeed, for_each() does not actively produce any effect of its own. The real purpose of calling for_each() lies in the effects that the functor produces when it is applied to each element. More precisely, the functor can produce an effect by modifying the elements of the input range, or by modifying itself while it is being called. For this reason, the operation passed to for_each() is not restricted with regard to side effects; calling for_each() with a functor that has no side effects would be completely pointless. This is fundamentally different from the operation passed to transform(): according to the standard, that operation must not have any side effects. Here we see the fundamental difference between for_each() and transform(): for_each() relies on the side effects of the functor, while transform() produces its own effect and forbids the functor to produce any side effects.
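A minimal sketch of a functor that produces its effect by modifying itself: for_each() takes the functor by value and returns a copy of it after the last call, so the accumulated state is read from the return value. The class SumAndCount and the function summarize are hypothetical names for this illustration.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical functor whose only effect is a side effect on itself:
// it accumulates the sum and the number of the elements it is applied to.
class SumAndCount {
public:
    SumAndCount() : sum_(0), count_(0) {}
    void operator()(int x) { sum_ += x; ++count_; }
    long sum() const { return sum_; }
    int count() const { return count_; }
private:
    long sum_;
    int count_;
};

// The functor passed in is copied; the state after the last invocation
// is only available through the functor that for_each() returns.
SumAndCount summarize(const std::vector<int>& v) {
    return std::for_each(v.begin(), v.end(), SumAndCount());
}
```

Passing the same functor to transform() would violate the requirement that the operation have no side effects, even though the side effect looks harmless.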
This difference also explains why for_each() guarantees both the order and the number of invocations of the functor. When an operation has side effects, we need to know how often and in which order it is invoked, because its effect may depend on both. transform(), on the other hand, forbids its functor any side effects; it guarantees only the number of invocations and says nothing about their order.
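The following sketch shows a functor whose effect depends on the order of invocation. With for_each() the outcome is well defined, because the standard fixes the order from first to last. The class Joiner and the function join are hypothetical names for this illustration.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical order-sensitive functor: it appends each word to a string
// owned by the caller, so the result depends on the order of the calls.
class Joiner {
public:
    explicit Joiner(std::string* out) : out_(out) {}
    void operator()(const std::string& word) {
        if (!out_->empty())
            *out_ += ' ';
        *out_ += word;
    }
private:
    std::string* out_;
};

// for_each() invokes the functor for first, first + 1, ..., last - 1,
// so the words are joined in sequence order.
std::string join(const std::vector<std::string>& words) {
    std::string result;
    std::for_each(words.begin(), words.end(), Joiner(&result));
    return result;
}
```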
Consequences
Let us consider the consequences of the way the standard specifies for_each() and transform(). It turns out that the simple notion of "very similar algorithms that differ only in what they do with the return value of the functor" breaks down in many cases.
Side effects
A functor with side effects can be passed to for_each(), but not to transform(). The intent of the standard is that for_each() makes no sense with a functor that has no side effects, while transform() needs nothing from its functor beyond the return value. According to the standard, a functor passed to for_each() may have arbitrary side effects, and a functor passed to transform() must never have any. In practice, both rules lead to surprises.
The side effects of a functor can be harmless, such as writing a message to a stream or modifying its own data members, and need not interfere with the effect of the generic algorithm itself. Even so, such a functor must not be passed to transform(), because it violates the standard's requirement. On the other hand, common sense says that a functor is not free to do whatever it likes. The side effects produced by a functor must not interfere with the operation of the generic algorithm; invalidating the iterators or the sequence that the algorithm is working on, for example, is disastrous. Even a functor for for_each() must obey this common-sense rule, although the standard does not say so.
Call order
An order-sensitive functor can be passed to for_each(), but it makes no sense to pass one to transform(). The standard says nothing about the order in which transform() invokes the functor. Consequently, the result of passing an order-sensitive operation to transform() is unpredictable, whereas for for_each() it is well defined.
Why does this cause problems in practice? Let us study an example.
A concrete example
Suppose we have the following situation: we have a directory that holds names and associated information, implemented as a map container. In addition, we have a file that contains a list of names. For every name that appears in this file, the corresponding entry must be removed from the directory. How do we solve this problem?
A first idea might be to use the generic algorithms remove_if() and remove_copy_if(): remove the entries whose names are listed in the file (and copy them to another map). This does not work, because these are mutating algorithms, and remove_if() in particular tries to assign to elements of the input sequence through dereferenced iterators. A map container, however, does not allow assignment to its elements; each element is a pair of a const key and an associated value, and the const key cannot be changed. For this reason, a program that tries to apply remove_if() to a map does not compile. Instead of remove_if() and remove_copy_if(), the map's erase() member function is the better way to remove elements.
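As a minimal sketch of the erase() member function, the following loop removes a list of names from a map by key. The typedef Directory, the mapped type int, and the function erase_names are placeholders for this illustration; the article does not show the type of the associated information.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Placeholder directory type: int stands in for the associated information.
typedef std::map<std::string, int> Directory;

// Map elements are pair<const Key, T> and cannot be assigned to, which is
// what remove_if() would need. erase(key) removes the entry directly and
// returns the number of elements removed (0 or 1 for std::map).
std::size_t erase_names(Directory& dir, const std::vector<std::string>& names) {
    std::size_t erased = 0;
    for (std::vector<std::string>::const_iterator it = names.begin();
         it != names.end(); ++it)
        erased += dir.erase(*it);
    return erased;
}
```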
Using for_each()
Let us take a different approach and use for_each(): for each name in the file, apply a functor that erases the corresponding entry from the map. The functor could look like this:
template <class MapT>
class eraseFct {
public:
  eraseFct(MapT* m) : theMap(m) {}
  void operator()(string nam)
  { typename MapT::iterator iter = theMap->find(nam);
    if (iter == theMap->end())
      throw invalid_argument(nam);
    theMap->erase(iter);
  }
private:
  MapT* theMap;
};
template <class MapT>
eraseFct<MapT> eraser(MapT* m)
{ return eraseFct<MapT>(m); }
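The functor can be used like this:
map<string, Info> directory_1;   // Info: placeholder for the associated-information type, which is not shown
// ... populate directory_1 ...
ifstream infile("toBeErased.txt");
for_each(istream_iterator<string>(infile), istream_iterator<string>(),
         eraser(&directory_1));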
Applied with for_each(), the functor has the desired effect. The functor's side effect is the modification of the map that its data member theMap points to. Note that this side effect is not order-sensitive, so the guarantee regarding the order of invocation is not needed here. Moreover, the side effect does not interfere with the operation of the generic algorithm, because the functor does not attempt to modify the input or output iterators or sequences.
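For readers who want to try this out, here is a self-contained sketch of the same idea. It reads the names from a string stream instead of a file so that it runs without external input; the function erase_listed and the use of std::istringstream are assumptions made for this illustration, not part of the original example.

```cpp
#include <algorithm>
#include <iterator>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>

template <class MapT>
class eraseFct {
public:
    explicit eraseFct(MapT* m) : theMap(m) {}
    void operator()(const std::string& nam) {
        typename MapT::iterator iter = theMap->find(nam);
        if (iter == theMap->end())
            throw std::invalid_argument(nam);
        theMap->erase(iter);   // side effect: modifies the map
    }
private:
    MapT* theMap;
};

template <class MapT>
eraseFct<MapT> eraser(MapT* m) { return eraseFct<MapT>(m); }

// Hypothetical wrapper: erases every whitespace-separated name in `names`
// from `dir`; an unknown name raises std::invalid_argument.
template <class MapT>
void erase_listed(MapT& dir, const std::string& names) {
    std::istringstream in(names);   // stands in for the ifstream of the text
    std::for_each(std::istream_iterator<std::string>(in),
                  std::istream_iterator<std::string>(),
                  eraser(&dir));
}
```

The functor modifies the map, not the input stream that for_each() iterates over, so it does not interfere with the algorithm itself.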
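So far, so good. Now imagine a slightly different situation: instead of simply removing entries from the directory, we now want to split the directory, that is, move the entries named in the file out of the directory and into a second one. How do we solve this new problem?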
Using transform()
The first, intuitive idea is to apply transform(): for each name that appears in the file, apply a functor that erases the corresponding entry from the map and returns it, so that the entry can be stored in another map.
We modify the initial functor so that it can be used with transform(). The main difference is that the modified functor returns the value of the removed element, so that transform() can store this value in the output sequence. The necessary modifications are marked with comments in the implementation:
template <class MapT>
class eraseFct {
public:
  eraseFct(MapT* m) : theMap(m) {}
  typename MapT::value_type operator()(string nam)   // changed: returns the entry
  { typename MapT::iterator iter = theMap->find(nam);
    if (iter == theMap->end())
      throw invalid_argument(nam);
    typename MapT::value_type res = *iter;           // added
    theMap->erase(iter);
    return res;                                      // added
  }
private:
  MapT* theMap;
};
template <class MapT>
eraseFct<MapT> eraser(MapT* m)
{ return eraseFct<MapT>(m); }
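It can be used like this:
map<string, Info> directory_2;   // Info: placeholder for the associated-information type, which is not shown
transform(istream_iterator<string>(infile), istream_iterator<string>(),
          inserter(directory_2, directory_2.end()),
          eraser(&directory_1));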
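We can also use it where we used the initial functor, to solve the initial problem of simply removing the entries:
map<string, Info> directory_1;   // Info: placeholder for the associated-information type, which is not shown
// ... populate directory_1 ...
ifstream infile("toBeErased.txt");
for_each(istream_iterator<string>(infile), istream_iterator<string>(),
         eraser(&directory_1));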
With for_each(), the modified functor works without problems, and it solves the initial problem just as well as the original functor does. for_each() simply ignores the functor's return value, so the effect is the same as with the original functor. With transform(), the situation is surprisingly different. The behavior of this functor, when passed to transform(), is neither predictable nor portable, because the standard allows transform() to be used only with functors that have no side effects, and our functor does have a side effect: it removes an element from the map that its data member points to.
Here we see the fundamental difference between for_each() and transform(). Describing the two algorithms as very similar (the only difference being that for_each() ignores the functor's return value and transform() does not) is misleading. In fact, the two algorithms use different kinds of functors: one relies on functors with side effects, the other requires functors without side effects.
Theory vs. practice
The standard forbids passing a functor with side effects to transform(). The reason is that the standard wants to give library implementers room for optimizations. Requiring that a transformator have no side effects at all is a very strong requirement. There is not much that a transformator is allowed to do: it cannot modify its own data members; it cannot modify temporaries; it cannot call any function that has side effects (such as writing to a stream); it cannot even read the value of a volatile variable. All it can do is inspect its argument and other const, non-volatile objects, call functions without side effects, and produce a return value. Under these restrictions, a library implementation can safely apply optimizations.
One conceivable optimization is to execute transform() in concurrent threads. A functor without side effects is thread-safe: since it causes no changes in the execution environment, there is no potential for conflict when it is invoked from multiple threads. Such an optimized transform() would undoubtedly break our example.
The transformator in our example erases an element from a map, which is not an atomic operation. One thread might be in the middle of erasing an element while another thread checks for the end of the map, which the first thread may change in the meantime, and the second thread would crash. This is a race condition, and it stems from the fact that our transformator violates the requirement of having no side effects.
In practice, you will find that with most standard library implementations, passing a functor with side effects to transform() works fine and produces the expected, reliable results. In fact, none of the library implementations we know of takes advantage of the freedom that the standard grants. Even so, keep in mind: strictly speaking, a transformator with side effects is not portable.
So what can we use in place of transform() in a portable program? Two approaches come to mind: implement a relaxed version of transform() of our own and use it instead of the standard transform() algorithm, or use for_each() instead.
Implementing your own version of transform()
We can implement our own version of transform(), which invokes the functor starting at the beginning of the input sequence and proceeding to its end, and which allows the functor to have arbitrary side effects (except side effects that invalidate the input or output iterators or sequences). Here is a possible implementation:
template <class InputIterator, class OutputIterator, class Transformator>
OutputIterator relaxed_transform(InputIterator first, InputIterator last,
                                 OutputIterator result, Transformator trans)
{
  for ( ; first != last; ++first, ++result)
    *result = trans(*first);
  return result;
}
This is the implementation that you will find in most standard libraries anyway, but using your own version is safer, because it is a portable solution. The algorithm above can be specified as follows:
template <class InputIterator, class OutputIterator, class Transformator>
OutputIterator relaxed_transform(InputIterator first, InputIterator last,
                                 OutputIterator result, Transformator trans);
• Effects: Applies trans to the result of dereferencing every iterator in the range [first, last), starting from first and proceeding to last - 1, and assigns through every iterator i in the range [result, result + (last - first)) the return value of trans(*(first + (i - result))).
• Requires: trans shall not have any side effects that invalidate any iterator in the ranges [first, last) and [result, result + (last - first)).
• Returns: result + (last - first).
• Complexity: Exactly last - first applications of trans and exactly last - first assignments.
• Notes: result may be equal to first.
When result is equal to first, the input sequence and the output sequence are the same, and the algorithm is used as an in-place algorithm. In this case the user must be aware that any modification the functor makes to an input element will be overwritten by the subsequent assignment to that element. This is a small pitfall, but a user who writes a functor that modifies the input elements is unlikely to use that functor with an in-place transformation anyway.
The purpose and benefit of a custom relaxed_transform() algorithm is that it makes portable programs easy to write. The downside is that potential optimizations are not available in the relaxed custom version.
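A small sketch of the relaxed algorithm at work: the transformator below modifies its own counter and depends on the order of invocation, which would not be permitted with the standard transform() but is well defined here. The class Numberer and the function number_words are hypothetical names for this illustration.

```cpp
#include <cassert>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Relaxed variant of transform() as described in the text: it invokes trans
// strictly from first to last and tolerates side effects in trans.
template <class InputIterator, class OutputIterator, class Transformator>
OutputIterator relaxed_transform(InputIterator first, InputIterator last,
                                 OutputIterator result, Transformator trans)
{
    for (; first != last; ++first, ++result)
        *result = trans(*first);
    return result;
}

// Hypothetical order-sensitive transformator: it increments its own counter
// (a side effect) and prefixes each word with its position in the sequence.
class Numberer {
public:
    Numberer() : next_(1) {}
    std::string operator()(const std::string& word) {
        std::ostringstream os;
        os << next_++ << ':' << word;
        return os.str();
    }
private:
    int next_;
};

std::vector<std::string> number_words(const std::vector<std::string>& words) {
    std::vector<std::string> out;
    relaxed_transform(words.begin(), words.end(),
                      std::back_inserter(out), Numberer());
    assert(out.size() == words.size());  // one assignment per input element
    return out;
}
```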
Using for_each() instead
The other option is to use the for_each() algorithm whenever side effects are needed. We can re-implement the functor so that it produces all the desired side effects itself, including the one that transform() used to produce; that is, it removes an entry from one directory and adds it to the other. Here is the rewritten functor:
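template <class MapT>
class moveFct {
public:
  moveFct(MapT* m1, MapT* m2) : theMap1(m1), theMap2(m2) {}
  void operator()(string nam)
  { typename MapT::iterator iter = theMap1->find(nam);
    if (iter == theMap1->end())
      throw invalid_argument(nam);
    theMap2->insert(*iter);
    theMap1->erase(iter);
  }
private:
  MapT* theMap1;
  MapT* theMap2;
};
// The helper's name did not survive in this copy of the text;
// "mover" is assumed here by analogy with eraser.
template <class MapT>
moveFct<MapT> mover(MapT* m1, MapT* m2)
{ return moveFct<MapT>(m1, m2); }
It can be used like this:
map<string, Info> directory_1, directory_2;   // Info: placeholder for the associated-information type, which is not shown
// ... populate directory_1 ...
ifstream infile("toBeErased.txt");
for_each(istream_iterator<string>(infile), istream_iterator<string>(),
         mover(&directory_1, &directory_2));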
This solution leads to a general recommendation: we should use for_each() rather than transform() for all transformations that are order-sensitive or have side effects.
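The recommendation can be tried out with the following self-contained sketch of the moving functor. The helper name mover, the typedef Directory with int as a placeholder for the associated information, and the function split_directory, which reads the names from a string stream rather than a file, are all assumptions made for this illustration.

```cpp
#include <algorithm>
#include <iterator>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>

template <class MapT>
class moveFct {
public:
    moveFct(MapT* m1, MapT* m2) : theMap1(m1), theMap2(m2) {}
    void operator()(const std::string& nam) {
        typename MapT::iterator iter = theMap1->find(nam);
        if (iter == theMap1->end())
            throw std::invalid_argument(nam);
        theMap2->insert(*iter);   // copy the entry into the second map ...
        theMap1->erase(iter);     // ... then remove it from the first
    }
private:
    MapT* theMap1;
    MapT* theMap2;
};

// Assumed helper name, chosen by analogy with eraser.
template <class MapT>
moveFct<MapT> mover(MapT* m1, MapT* m2) { return moveFct<MapT>(m1, m2); }

// Placeholder directory type: int stands in for the associated information.
typedef std::map<std::string, int> Directory;

// Moves every whitespace-separated name in `names` from `from` to `to`.
void split_directory(Directory& from, Directory& to, const std::string& names) {
    std::istringstream in(names);   // stands in for the ifstream of the text
    std::for_each(std::istream_iterator<std::string>(in),
                  std::istream_iterator<std::string>(),
                  mover(&from, &to));
}
```

All effects come from the functor, which is exactly the situation for_each() is specified for, so this version does not rely on any unspecified behavior of the algorithm.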
Summary
It is a common misconception that the only difference between the generic algorithms for_each() and transform() is that for_each() ignores the return value of the operation while transform() copies it to an element of the output range. A more fundamental difference between the two algorithms is that transform() is restricted to functors without side effects, whereas for_each() is much more permissive towards its functor.
In fact, for_each() is exceptional among the generic algorithms in the standard library: it is the only algorithm that both guarantees the order and number of invocations of the functor and permits side effects (including modification of the elements of the input sequence). In detail:
• for_each() is one of the few algorithms in the standard library that make a guarantee regarding the order of invocation of the functor [Note 3]. This allows order-sensitive functors to be used. Passing an order-sensitive functor to an algorithm without such a guarantee is pointless, because the result is unpredictable.
• for_each() is the only algorithm that returns the functor. This makes it possible to accumulate information in the data members of the functor and to retrieve that information after the algorithm has run. Passing such a functor to other algorithms requires that the algorithm take the functor by reference, which is more difficult, because it involves explicit function template argument specification (WQ note: see "C++ Standard Library", p. 298; the form is transform<InputIterator, OutputIterator, Transformator&>(...)), and you will also have to fight with deficiencies of the library [Note 4].
• for_each() is one of the few algorithms that do not restrict the side effects of the functor [Note 5]. This gives users great flexibility in implementing functors. All other algorithms impose various requirements on the functors they use.
In the next installment of this column, we will discuss unary predicates, another category of functor in the standard library. We will see which side effects they should or should not have.
Notes and references
[1] INTERNATIONAL STANDARD. Programming languages - C++. ISO/IEC IS 14882:1998(E).
[2] The standard defines a side effect as follows: Accessing an object designated by a volatile lvalue, modifying an object, calling a library I/O function, or calling a function that does any of those operations are all side effects, which are changes in the state of the execution environment.
[3] The other algorithms that give a guarantee regarding the order of invocation are the generalized numeric operations accumulate, inner_product, partial_sum, and adjacent_difference defined in header file <numeric>.
[4] Klaus Kreft and Angelika Langer. "Effective Standard C++ Library: Explicit Function Template Argument Specification and the STL," C++ Experts Forum.
[5] The other algorithm that does not restrict the side effects of its function object is generate.