Effective Standard C++ Library: Unary Predicates in the STL
Klaus Kreft and Angelika Langer
http://www.cuj.com/experts/1904/toc.htm?topic=experts
Several generic algorithms in the standard library use a unary predicate. Examples are the algorithms with the _if suffix, such as count_if(), find_if(), remove_if(), and replace_if(), but there are also algorithms without the suffix that take one, such as partition(). In this column, we take a close look at unary predicates to see what they may and may not do.
Let us look at how the standard defines a unary predicate. The standard refers to it simply as Predicate [1].
Unary Predicate
The Predicate parameter is used whenever an algorithm expects a function object that, when applied to the result of dereferencing the corresponding iterator, returns a value testable as true. In other words, if an algorithm takes Predicate pred as its argument and first as its iterator argument, it should work correctly in the construct if (pred(*first)) {...}.
The function object pred shall not apply any non-const function through the dereferenced iterator.
This function object may be a pointer to a function, or an object of a type with an appropriate function call operator, operator().
From this description and from an inspection of the generic algorithms that use a unary predicate (listed later in this article), we can identify a number of typical properties of a unary predicate. We discuss each property in detail in this article. The properties are:
Basic properties
1. A unary predicate must be callable.
2. A unary predicate must accept one argument and return a value that is convertible to bool.
3. A unary predicate need not be copyable.
Side effect properties
4. A unary predicate must not modify its argument.
5. A unary predicate must not invalidate the sequence or the iterators that the algorithm is operating on.
6. A unary predicate may have any side effects other than those ruled out by 4 and 5.
Other properties
7. A unary predicate must be insensitive to order, which means that the effect of calling the predicate must not depend on the order in which elements are passed to it.
8. A unary predicate need not produce the same result for different calls with the same argument.
Let us examine each of these properties in turn.
Basic properties 1, 2, and 3
A unary predicate must be callable, need not be copyable, and must accept one argument and return a Boolean value.
These properties are obvious when we look at how the standard says a unary predicate is used in an algorithm (it should work correctly in the construct if (pred(*first)) {...}). Here is a typical implementation of a generic algorithm that demonstrates the use of a unary predicate:
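template <class InputIterator, class Predicate>
typename iterator_traits<InputIterator>::difference_type
count_if(InputIterator first, InputIterator last, Predicate pred) {
  typename iterator_traits<InputIterator>::difference_type n = 0;
  for ( ; first != last; ++first)
    if (pred(*first))    // the predicate is called like a function
      ++n;
  return n;
}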
In other words, a unary predicate is invoked like a function. The "callable" requirement is met by a pointer to a function and also by an object (or a reference to an object) of a type that has a function call operator. When the predicate is called, it is passed one argument. This argument is the result of dereferencing the iterator, that is, a reference to an element in the sequence. The return value is used as a conditional expression and must therefore be convertible to bool. This is a complete description of the purpose of a unary predicate: it is called to produce a Boolean result for an element of the sequence.
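As a minimal sketch of the two callable forms just described, the following passes first a function pointer and then a function object to count_if(). The names isEven, GreaterThan, countEven, and countGreater are chosen for illustration only and do not come from the standard.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// A plain function; its address serves as the unary predicate.
bool isEven(int elem) { return elem % 2 == 0; }

// A function object type (hypothetical name) with operator().
class GreaterThan {
public:
  explicit GreaterThan(int limit) : theLimit(limit) {}
  bool operator()(int elem) const { return elem > theLimit; }
private:
  int theLimit;
};

// Predicate supplied as a pointer to a function.
std::ptrdiff_t countEven(const std::vector<int>& v) {
  return std::count_if(v.begin(), v.end(), isEven);
}

// Predicate supplied as a temporary function object.
std::ptrdiff_t countGreater(const std::vector<int>& v, int limit) {
  return std::count_if(v.begin(), v.end(), GreaterThan(limit));
}
```

In both cases the algorithm only ever evaluates pred(*first) and tests the result.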
In particular, there are no requirements on the copy semantics of a unary predicate. It need not be copyable at all. As a general rule, a generic algorithm must not rely on any property of the objects it uses that is not explicitly required. This means that an algorithm must never copy the predicate, because the user is not required to provide any reasonable copy semantics for it. It should be fine to declare the copy constructor and the copy assignment operator as private members and pass the predicate object by reference. This should not break any generic algorithm. In practice, you will find library implementations that copy the predicate, although they should not. The surprising effects of such an implementation have been discussed in an article in C++ Report [2]. Meanwhile, some library implementations have removed this restriction and behave as expected; an example is Metrowerks CodeWarrior 6.0. Given the differences between library implementations, all we can say is that it is best to avoid unary predicates with "interesting" copy semantics or without copy semantics.
In practice, most predicates have normal copy semantics anyway. This is because we usually pass the predicate to the generic algorithm by value, and to do so the predicate must be copyable. Non-copyable predicates are useful, but they are uncommon because they must be passed by reference, which requires care and rather unusual template syntax. We discussed passing functors by reference in a previous column: how to do it and why we might want to [3]. We will not discuss it further here. Let us continue with the remaining properties of predicates and discuss the side effects produced by a unary predicate.
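For illustration only, here is one way to hand a non-copyable predicate to an algorithm by reference. Note that this sketch uses std::ref from C++11, which postdates this column and is not the explicit template argument technique discussed in [3]. CallCounter and countNegatives are hypothetical names.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// A predicate that cannot be copied and that records how often it is called.
class CallCounter {
public:
  CallCounter() : theCalls(0) {}
  CallCounter(const CallCounter&) = delete;
  CallCounter& operator=(const CallCounter&) = delete;
  bool operator()(int elem) { ++theCalls; return elem < 0; }
  int calls() const { return theCalls; }
private:
  int theCalls;
};

// Returns (number of negative elements, number of predicate calls).
// std::ref wraps the predicate in a copyable reference_wrapper, so the
// algorithm may copy the wrapper while all calls reach the one original object.
std::pair<std::ptrdiff_t, int> countNegatives(const std::vector<int>& v) {
  CallCounter pred;
  std::ptrdiff_t n = std::count_if(v.begin(), v.end(), std::ref(pred));
  return std::make_pair(n, pred.calls());
}
```

Because the wrapper forwards each call, the call count observed after the algorithm returns is the one accumulated during the traversal, whether or not the library copies its predicate argument internally.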
Side effect properties 4, 5, and 6
A unary predicate may have any side effects, except modifying its argument and invalidating the sequence or the iterators that the generic algorithm is operating on.
The standard prohibits these side effects but allows all others. Why? To understand this requirement, consider what happens inside a generic algorithm that uses a unary predicate. There are two entities that produce side effects: the algorithm itself and the unary predicate. The algorithm traverses the input sequence, inspects the elements, passes them as arguments to the unary predicate, modifies and copies them, and may produce other side effects. The unary predicate receives a reference to an element, possibly inspects and modifies that element, and may produce other side effects. Naturally, these activities can conflict with each other. For this reason, let us classify the side effects produced by a unary predicate as harmful, potentially harmful, or harmless, depending on their potential for conflict.
Harmful side effects
Harmful side effects invalidate the sequence or the iterators that the generic algorithm is operating on. (A functor must never produce harmful side effects. This is a general rule for all functors, not just for predicates. The standard does not explicitly prohibit such side effects, probably because this is considered "common sense".) An example of a harmful predicate is one that holds a pointer or reference to the sequence the algorithm operates on and uses it to erase elements. Erasing elements can invalidate the iterators passed to the algorithm (to designate the input or output sequence), in which case the algorithm will most likely crash the program.
Erasing elements is a rather conspicuous source of failure, but sometimes the invalidation of a sequence is less obvious. If an algorithm relies on the order of the sequence and the predicate intentionally or inadvertently changes that order when it is called, the results are unpredictable.
In any case, predicates with harmful side effects must be avoided absolutely. As a rule, never use a predicate that invalidates the sequence or any iterator the generic algorithm is operating on.
Potentially harmful side effects
This kind of side effect is explicitly prohibited by the standard. All unary predicates that apply non-const functions to their argument fall into this category, because they modify elements of the sequence. Let us call them mutating unary predicates. (Note the difference between "modifying the sequence" (harmful) and "modifying elements of the sequence" (only potentially harmful): "modifying the sequence" means inserting, erasing, or moving elements so that an iterator or an iterator range becomes invalid. "Modifying elements of the sequence" means that elements are accessed and their contents changed, but no iterator is invalidated.)
The potential harm of a mutating unary predicate stems from the fact that the predicate is not the only entity that accesses and changes elements of the input sequence. The generic algorithm itself may try to modify the same elements. In that case there are two sources of side effects (the algorithm and the predicate), and they may conflict.
When does such a conflict occur? Not all generic algorithms modify elements of the input sequence, but some do. The algorithms fall into several categories: non-mutating algorithms and mutating algorithms, and the mutating algorithms can be further divided into in-place algorithms and copying algorithms. Non-mutating algorithms (for example, count_if()) only inspect elements; they do not change anything. Mutating copying algorithms (for example, replace_copy_if()) do not modify elements of the input sequence but copy them to an output sequence; they modify elements of the output sequence. Mutating in-place algorithms (for example, replace_if()) modify elements "in place", which means they modify elements of the input sequence; these are the critical ones. Hence the potential conflict between predicate and algorithm arises when a mutating in-place algorithm is combined with a mutating unary predicate.
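The three categories can be illustrated with one predicate and the three algorithms named above. This is a sketch; isNegative and the wrapper function names are assumptions made for the example.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <vector>

bool isNegative(int elem) { return elem < 0; }

// Non-mutating: count_if() only inspects the elements.
std::ptrdiff_t countNegative(const std::vector<int>& v) {
  return std::count_if(v.begin(), v.end(), isNegative);
}

// Mutating, copying: replace_copy_if() leaves the input untouched and
// writes the (possibly replaced) values to an output sequence.
std::vector<int> zeroedCopy(const std::vector<int>& v) {
  std::vector<int> out;
  std::replace_copy_if(v.begin(), v.end(), std::back_inserter(out),
                       isNegative, 0);
  return out;
}

// Mutating, in-place: replace_if() overwrites elements of the input itself.
void zeroInPlace(std::vector<int>& v) {
  std::replace_if(v.begin(), v.end(), isNegative, 0);
}
```

Only the last function writes to the same elements that the predicate is given, which is where a mutating predicate would collide with the algorithm.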
Modifying the same element twice raises two questions: which modification is performed first, and will it be overwritten by the second? Is the result fully predictable? To avoid such conflicts between algorithm and predicate, the standard requires that a unary predicate never modify the element of the input sequence that is passed to it as an argument. Note that this ban on mutating side effects applies to all unary predicates, not just to those passed to mutating in-place algorithms.
This restriction is often reflected in the predicate's function signature: typically, a unary predicate takes its argument by value or by reference to const, which ensures that the argument (the element of the input sequence) is not modified.
Harmless side effects
Last but not least, a predicate can have harmless side effects. All non-modifying access to elements of the sequence falls into this category. The predicate may call const functions on its argument; that is, it may inspect the element but not modify it. In addition, a unary predicate may modify objects other than its argument. For example, it may have data members and change their values. Or it may refer to unrelated sequences of elements and modify them. As long as the sequence changed by the predicate is not the sequence the algorithm is operating on, this is harmless.
Why do we care?
We have seen the different kinds of side effects a predicate can produce, and that many of them are prohibited by the standard. Why would we want to use predicates with side effects in the first place? Are predicates with side effects rare, or are they quite common?
Well, it depends. If you look at the predicates in C++ textbooks, you will find predicates such as isEven, defined as bool isEven(int elem) { return elem % 2 == 0; }, or expressions built with bind2nd(greater<...>(), ...). Predicates of this kind have no side effects at all.
In practice, things are different. For example, we care about efficiency. Obviously, it is faster to perform several operations in a single traversal of a long sequence than to traverse the sequence repeatedly. Consider some examples.
Suppose we have a container that holds our customers. We want to count the frequent customers for internal statistics, and we want to build a mailing list, because we want to send a promotional mailing to the frequent customers. However, the mailing list must not exceed a limit of 5,000 entries. That is the task. How do we accomplish it? One possible approach is a unary predicate that yields true for frequent customers and accumulates the information for the mailing list as it goes. When it is passed to the count_if() algorithm, it produces the desired count and (as a side effect) builds the mailing list. Such a unary predicate strictly follows the rules. It takes a const reference to a customer, inspects the customer, and produces a harmless side effect: the mailing list.
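A minimal sketch of such a predicate might look as follows. The Customer type, its fields, and the threshold parameters are assumptions made for the example; the column does not specify them. The predicate holds a pointer to an external list, so copies of the predicate made by the library all append to the same list.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical customer record; the field names are assumptions.
struct Customer {
  std::string address;
  int visits;
};

// Yields true for frequent customers and, as a harmless side effect,
// appends their addresses to an external mailing list up to a maximum size.
class FrequentCustomer {
public:
  FrequentCustomer(int minVisits, std::vector<std::string>* list,
                   std::size_t maxSize)
    : theMinVisits(minVisits), theList(list), theMaxSize(maxSize) {}
  bool operator()(const Customer& c) const {  // the argument is only inspected
    if (c.visits < theMinVisits) return false;
    if (theList->size() < theMaxSize) theList->push_back(c.address);
    return true;
  }
private:
  int theMinVisits;
  std::vector<std::string>* theList;  // shared by all copies of the predicate
  std::size_t theMaxSize;
};

// One traversal yields both the count and the mailing list.
std::ptrdiff_t countFrequent(const std::vector<Customer>& customers,
                             std::vector<std::string>& mailingList,
                             int minVisits, std::size_t maxSize) {
  return std::count_if(customers.begin(), customers.end(),
                       FrequentCustomer(minVisits, &mailingList, maxSize));
}
```

Which customers end up on a capped list depends on the order in which the algorithm visits the elements, so only the size of the list should be relied upon.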
Let us consider another, similar example. We need to remove all infrequent customers from the customer data and update the discount records of the remaining frequent customers. Once again we strive for efficiency and would like to do both in a single traversal of the customer data. Following the previous approach, we might try to supply remove_if() with a unary predicate that yields true for infrequent customers (so that the algorithm removes them) and adds the discount information to the remaining customers. In contrast to the earlier example, this is not legal, because modifying elements of the input sequence is prohibited. Remember: a predicate must never modify its argument. But that is exactly what we want: we want to update the remaining elements of the sequence. So how do we do it?
There is not much we can do. A thorough study of the algorithms chapter of the standard shows that for_each() is the only generic algorithm that accepts a functor and allows it to modify its argument. (We discussed for_each() in a previous column [4].) For this reason, our choice of algorithm is severely limited when the task is to modify elements of the input sequence; for_each() is basically the only option. As a result, we must split the task into a non-modifying part (implemented as a unary predicate for remove_if()) and a modifying part (implemented as a functor for for_each()). The extra traversal of the sequence is unavoidable, and so is the resulting loss of efficiency.
The alternatives are:
- Supply for_each() with a functor that both removes and modifies elements (this duplicates the functionality of remove_if(), which is certainly not what we want).
- Use a user-defined version of remove_if() that permits modification of the elements of the input sequence (this is feasible; even a copy of the standard algorithm may do, depending on how it is implemented).
- Write the loop by hand (ignoring all standard generic algorithms).
The bottom line is: if elements of the input sequence must be modified, a unary predicate cannot be used.
If such modifications are needed (for example, for reasons of efficiency) and we want to use the standard generic algorithms, then we must split the task into a modifying and a non-modifying part and must accept multiple traversals of the input sequence.
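The decomposition into a non-modifying pass and a modifying pass could be sketched like this. The Customer fields, the visit threshold, and the size of the discount increment are assumptions made for the example.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical customer record; the field names are assumptions.
struct Customer {
  int visits;
  int discount;
};

const int kMinVisits = 5;  // assumed threshold for a "frequent" customer

// Non-modifying part: a legal unary predicate for remove_if().
bool isInfrequent(const Customer& c) { return c.visits < kMinVisits; }

// Modifying part: a functor for for_each(), which may change its argument.
void raiseDiscount(Customer& c) { c.discount += 1; }

void pruneAndReward(std::vector<Customer>& customers) {
  // First traversal: drop the infrequent customers.
  customers.erase(
      std::remove_if(customers.begin(), customers.end(), isInfrequent),
      customers.end());
  // Second traversal: update the customers that remain.
  std::for_each(customers.begin(), customers.end(), raiseDiscount);
}
```

The second traversal only covers the remaining elements, but it is still an additional pass over the data.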
Property 7
A unary predicate must be insensitive to order; that is, its effect must not depend on the order of invocation.
Two further aspects are related to the side effects of a unary predicate: the order and the number of invocations of the predicate. If a side effect is produced each time the predicate is applied to an element of the input sequence, we want to know how often and in which order the side effects are produced. For example, in our mailing list example we accumulated a count in order to cap the size of the generated list. Naturally, it matters whether the predicate is applied exactly once to each element, and the order may make a difference as well. Depending on the nature of the side effect, the order and the number of invocations play a greater or lesser role.
The number of invocations is precisely specified: algorithms such as count_if() or remove_if() apply the predicate exactly once to each element of the input sequence. The order of invocation is a different matter: none of the generic algorithms that take a predicate specify the order in which elements are supplied to the predicate. Hence a unary predicate must not depend on the order of invocation. If we use a predicate that depends on the order of invocation, the result is unpredictable.
Here is an example: an (order-sensitive) predicate whose result depends on how many times it has been called. It yields false on every nth call and true on all other calls:
class Nth {
public:
  Nth(int n) : theN(n), theCnt(0) {}
  bool operator()(int)
  { return (++theCnt) % theN; }   // zero (false) on every nth call
private:
  const int theN;
  int theCnt;                     // number of calls so far
};
If we pass Nth(3) to a generic algorithm such as remove_copy_if(), we might expect it to copy every third element of the input sequence to the output sequence. But this is not guaranteed, because the elements of the sequence are not necessarily processed in any particular order. All we can be sure of is that one third of the elements are copied from the input sequence to the output sequence.
Why does the standard give no guarantee for the order in which a unary predicate is invoked? Because some generic algorithms can be optimized for certain categories of iterators. For example, if the iterators are input iterators, an algorithm can only step through the sequence from beginning to end, whereas random access iterators allow arbitrary jumps. Because the standard does not want to rule out such optimizations, it gives no guarantee for the order of invocation of a unary predicate. As a consequence for users of the STL, a unary predicate must not depend on the order in which the elements of the sequence are supplied. If we want to use an order-sensitive predicate, we must implement our own generic algorithm that guarantees the order of invocation of the unary predicate.
This brings us to the last observation about unary predicates.
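A user-defined algorithm with a guaranteed order of invocation might look like the following sketch. The name ordered_remove_copy_if is hypothetical; the function simply spells out a front-to-back loop and takes the predicate by reference so that no copy of its state is involved. The Nth class is repeated (with an explicit comparison) to keep the example self-contained.

```cpp
#include <iterator>
#include <vector>

// Order-sensitive predicate: yields false on every nth call.
class Nth {
public:
  explicit Nth(int n) : theN(n), theCnt(0) {}
  bool operator()(int) { return (++theCnt) % theN != 0; }
private:
  const int theN;
  int theCnt;
};

// Hypothetical variant of remove_copy_if() that documents its own guarantee:
// the predicate is applied exactly once per element, strictly front to back.
template <class InputIterator, class OutputIterator, class Predicate>
OutputIterator ordered_remove_copy_if(InputIterator first, InputIterator last,
                                      OutputIterator result, Predicate& pred) {
  for (; first != last; ++first)
    if (!pred(*first))
      *result++ = *first;
  return result;
}

// Copies every third element of v, relying on the guaranteed order.
std::vector<int> everyThird(const std::vector<int>& v) {
  std::vector<int> out;
  Nth pred(3);
  ordered_remove_copy_if(v.begin(), v.end(), std::back_inserter(out), pred);
  return out;
}
```

With the order fixed by our own algorithm, the order-sensitive predicate now has a well-defined effect.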
Property 8
A unary predicate need not produce the same result for different calls with the same argument.
This property may sound odd. We include it in the list because we have noticed that predicates are sometimes assumed to behave "stably", that is, to produce the same result each time they are applied to the same or an "equal/equivalent" argument. This assumption does not hold; the standard does not state any such requirement.
So why is it sometimes assumed that a unary predicate must have "stable" behavior? Because it would give an implementation a lot of freedom. With "stable" behavior, an algorithm would not have to care how many times the predicate is called for the same element, because the result would always be the same. Nor would it matter whether the predicate is passed a reference to the element in the sequence or a temporary copy of it (that is, an "equal/equivalent" element).
However, the STL does not require "stable" behavior of a unary predicate. It is perfectly sensible to define an "unstable" predicate. An example is a predicate that yields true for all elements with a certain property until a limit is reached. It could be used with remove_copy_if() to copy at most a maximum number of elements with a given property from the input sequence to the output sequence.
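As a sketch of such an "unstable" predicate, the following yields true for even elements until a limit is reached and false afterwards. To keep the meaning "true means copy", the example uses std::copy_if (C++11) instead of remove_copy_if(), whose predicate has the opposite sense. The names EvenUpTo and copyEvens are assumptions made for the example.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Yields true for even elements, but only for the first `limit` of them.
// The same argument can therefore produce different results over time.
class EvenUpTo {
public:
  explicit EvenUpTo(int limit) : theLimit(limit), theHits(0) {}
  bool operator()(int elem) {
    if (elem % 2 != 0 || theHits >= theLimit) return false;
    ++theHits;
    return true;
  }
private:
  int theLimit;
  int theHits;  // number of elements accepted so far
};

// Copies at most `limit` even elements from v.
std::vector<int> copyEvens(const std::vector<int>& v, int limit) {
  std::vector<int> out;
  std::copy_if(v.begin(), v.end(), std::back_inserter(out), EvenUpTo(limit));
  return out;
}
```

For an input of four equal even values and a limit of two, the predicate accepts two of them and rejects the other two, even though all four arguments compare equal.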
Summary
The following generic algorithms in the standard library use a unary predicate: replace_if(), remove_if(), partition(), stable_partition(), replace_copy_if(), remove_copy_if(), count_if(), and find_if().
A unary predicate with the following properties can be used with any of these algorithms, and the result is portable and predictable.
- A unary predicate must be callable and must accept one argument and return a Boolean value, but it need not have any particular copy semantics. (Some library implementations are more restrictive here, because they require certain copy semantics.)
- A unary predicate must never modify its argument and must not invalidate the sequence or the iterators that the algorithm is operating on, but it may have any other side effects.
- A unary predicate must not depend on the order of invocation, and it may produce different results for different calls with the same argument.
References
[1] International Standard. Programming Languages - C++. ISO/IEC IS 14882:1998(E).
[2] Nicolai M. Josuttis. "Predicates vs. Function Objects," C++ Report, June 2000.
[3] Klaus Kreft and Angelika Langer. "Effective Standard C++ Library: Explicit Function Template Argument Specification and the STL," C/C++ Users Journal, December 2000, http://www.cuj.com/experts/1812/langer.html.
[4] Klaus Kreft and Angelika Langer. "Effective Standard C++ Library: for_each vs. Transform," C/C++ Users Journal, February 2001, http://www.cuj.com/experts/1902/langer.html.