The Standard Librarian: Bitsets and Bit Vectors
Matt Austern
Http://www.cuj.com/experts/1905/AUSTERN.HTM?topic=Experts
In C++, you can twiddle bits to your heart's content, without even resorting to macros.
-------------------------------------------------- ----------------------------
Anyone who has programmed in C is familiar with a common way of handling a set of Boolean options: pack them into a single word, using one bit for each option. For example, to set the permissions of a UNIX file, you might write:
chmod("my_file",
      S_IWUSR | S_IRUSR |
      S_IRGRP | S_IROTH);
Each constant corresponds to one bit; you can specify several options at once by combining them with bitwise or.
Packing multiple options into a single word is very common. The trick is used throughout the UNIX and Win32 APIs and in the ios_base formatting flags of the C++ standard library, and some form of it is likely to turn up in any large program. Sets of bits are important.
It is easy to see why this technique is so common: the alternative is an array or a structure with a separate field for each option, which is clumsy and wastes memory. Sometimes, however, packing bits into a word causes trouble of its own. First, some operations are awkward: setting a named bit is straightforward (flags |= S_IRGRP), but clearing one (flags &= ~S_IWGRP) is uglier. You test whether a bit is set by masking it, as in if (flags & S_IWUSR), but it is a mistake to make the test "explicit" by writing if ((flags & S_IWUSR) == true) or, worse, if (flags & S_IWUSR == true). Working with numbered bits rather than named bits is just as clumsy: you need expressions like flags &= ~(1 << n), usually with a cast added. Finally, this technique makes it hard to have very many options: you are limited to the number of bits in a long.
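To see why the "explicit" tests go wrong, it helps to spell out what the compiler actually evaluates. The following is a minimal sketch, not taken from any real API: the FLAG_* constants and the function names are hypothetical stand-ins for values such as S_IWUSR.

```cpp
// Hypothetical option flags, one bit each (stand-ins for constants like S_IWUSR).
const unsigned FLAG_READ  = 1u << 0;
const unsigned FLAG_WRITE = 1u << 1;
const unsigned FLAG_EXEC  = 1u << 2;

// The correct test: any nonzero result means the bit is set.
bool has_flag(unsigned flags, unsigned flag) {
  return (flags & flag) != 0;
}

// What "(flags & flag) == true" computes: true converts to 1, so the
// comparison succeeds only for the flag whose value happens to be 1.
bool explicit_test(unsigned flags, unsigned flag) {
  return (flags & flag) == 1u;
}

// What "flags & flag == true" computes: == binds more tightly than &,
// so the expression is really flags & (flag == true).
bool precedence_test(unsigned flags, unsigned flag) {
  return (flags & (flag == 1u ? 1u : 0u)) != 0;
}

// Clearing a numbered bit, as in flags &= ~(1 << n).
unsigned clear_bit(unsigned flags, unsigned n) {
  return flags & ~(1u << n);
}
```

With flags equal to FLAG_READ | FLAG_WRITE, has_flag reports FLAG_WRITE as set, while both of the other tests report it as clear, because the masked value is 2 rather than 1.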
Because sets of bits are so important, the C++ standard library provides explicit support for them; in fact, it provides several kinds of support. Sometimes you will still want to use low-level bit operations (and you have to, if you are interfacing with a C API), but in most cases the C++ library versions are more suitable. They have a few minor problems, but most of those are easy to work around.
Bitset
The class std::bitset appears in clause 23 of the C++ Standard, among the associative containers. That is not where it belongs: bitset has nothing in common with associative containers such as set and map, and it does not even satisfy the most basic requirements of an STL container. It is better to think of a bitset as an integer whose bits can be accessed individually, but one that is not limited to the length of a long. The size of a bitset is fixed at compile time (the number of bits is a template parameter), but there is no upper limit: bitset<32> is 32 bits long, and bitset<1000> is 1000 bits long. The integer operations you are used to still work on bitset, and a few more have been added for convenience. For example, you can write b1 ^ b2 to perform a bitwise exclusive or (provided, at least, that b1 and b2 have the same length). There are two different interfaces for manipulating individual bits: you can set the nth bit with b.set(n), clear it with b.reset(n), and test it with if (b.test(n)); or, almost equivalently, you can treat the bitset as an array and write b[n] = true, b[n] = false, and if (b[n]). ("Almost" because there is one small difference: the array version does no range checking, whereas set(), reset(), and test() do. If the argument you pass to set(), reset(), or test() is too large, you get an out_of_range exception.)
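The two interfaces can be compared side by side. This is only a sketch: the helper names (make_example, test_throws, differing_bits) are invented for illustration, and the only library facilities assumed are std::bitset's set(), reset(), test(), operator[], and operator^.

```cpp
#include <bitset>
#include <cstddef>
#include <stdexcept>

// Returns true if test(pos) rejects the position with std::out_of_range.
template <std::size_t N>
bool test_throws(const std::bitset<N>& b, std::size_t pos) {
  try {
    (void)b.test(pos);
  } catch (const std::out_of_range&) {
    return true;
  }
  return false;
}

// Builds a 1000-bit set using both interfaces.
std::bitset<1000> make_example() {
  std::bitset<1000> b;
  b.set(999);    // member-function interface: range checked
  b[3] = true;   // array interface: no range checking
  b.set(5);
  b.reset(5);    // clear bit 5 again
  return b;
}

// Bits that differ between two bitsets of the same length.
std::bitset<8> differing_bits(const std::bitset<8>& x,
                              const std::bitset<8>& y) {
  return x ^ y;  // bitwise exclusive or
}
```

Calling test_throws(b, 1000) on the 1000-bit set reports true, since the valid positions run from 0 to 999.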
If the bitset is small enough, you can use it as an integer: there is a constructor that creates a bitset from an unsigned long, and a member function, to_ulong(), that extracts an unsigned long from a bitset. Naturally, you cannot use that constructor directly to initialize bits beyond the range of unsigned long, and you cannot use to_ulong() to extract them. (If you try, and any bit that does not fit in an unsigned long is set, to_ulong() throws an exception.) If necessary, however, you can get around these restrictions with shifts and masks:
const int N =
  sizeof(unsigned long) * CHAR_BIT;
unsigned long high = 0x7b62;
unsigned long low = 0x1430;
std::bitset<2 * N> b
  = (std::bitset<2 * N>(high) << N) |
    std::bitset<2 * N>(low);
...
const std::bitset<2 * N>
  mask((unsigned long)(-1));
low = (b & mask).to_ulong();
high = (b >> N).to_ulong();
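Wrapped in functions, the fragment above becomes something you can experiment with. The names kUlongBits, Wide, join, low_half, high_half, and to_ulong_overflows are made up for this sketch; it assumes only that to_ulong() throws std::overflow_error when the value does not fit in an unsigned long.

```cpp
#include <bitset>
#include <climits>
#include <cstddef>
#include <stdexcept>

// Number of bits in one unsigned long (hypothetical name).
const std::size_t kUlongBits = sizeof(unsigned long) * CHAR_BIT;

// A bitset twice as wide as unsigned long.
typedef std::bitset<2 * kUlongBits> Wide;

// Pack two unsigned longs into one wide bitset.
Wide join(unsigned long high, unsigned long low) {
  return (Wide(high) << kUlongBits) | Wide(low);
}

// Extract the low half by masking off everything above it.
unsigned long low_half(const Wide& b) {
  const Wide mask(~0UL);  // every bit of one unsigned long set
  return (b & mask).to_ulong();
}

// Extract the high half by shifting it down first.
unsigned long high_half(const Wide& b) {
  return (b >> kUlongBits).to_ulong();
}

// True if converting the whole bitset directly would overflow.
bool to_ulong_overflows(const Wide& b) {
  try {
    (void)b.to_ulong();
  } catch (const std::overflow_error&) {
    return true;
  }
  return false;
}
```

The design point is that each call to to_ulong() only ever sees a value whose upper half is already zero, so the exception described above never fires.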
Bit 0 is defined as the least significant bit, so, for example, if you write:
std::bitset<4> b(0xa);
the bits that are set are b[1] and b[3].
It is easy to replace traditional option flags with a bitset: just declare bitset objects in the header file in place of the integer constants. We have already seen two advantages of using bitset: you can have more flags than there are bits in a long, and you get an easier and safer way to manipulate individual bits. Another is that bitset gives you a mechanism for converting between bitsets and text. First, bitset provides the usual I/O operations. This program,
#include <bitset>
#include <iostream>
int main() {
  std::bitset<12> b(3432);
  std::cout << "3432 in binary is "
            << b << std::endl;
}
gives the obvious result:
3432 in binary is 110101101000
Input works the same way: it reads a string of '1' and '0' characters and converts it into a bitset.
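Here is a small sketch of the input side, reading from a string stream rather than from std::cin so that it is easy to try out. The function names read_bits and write_bits are hypothetical.

```cpp
#include <bitset>
#include <sstream>
#include <string>

// Reads up to 8 characters of '0' and '1' from text.
// ok is set to false if no such characters could be read.
std::bitset<8> read_bits(const std::string& text, bool& ok) {
  std::istringstream in(text);
  std::bitset<8> b;
  in >> b;            // stops at the first character that is not '0' or '1'
  ok = !in.fail();
  return b;
}

// Formats a bitset the way operator<< prints it: highest bit first.
std::string write_bits(const std::bitset<8>& b) {
  std::ostringstream out;
  out << b;
  return out.str();
}
```

Reading "1010" yields a bitset whose value is 10; writing it back out produces all eight bits, including the leading zeros.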
Second, you can convert bitsets to and from strings: there is a constructor that takes a string argument, and a bitset<>::to_string() member function. Alas, useful as these conversions are, the details make them very inconvenient. Both the string constructor and the to_string() member function are member templates, because in the standard library std::basic_string is itself a template; the familiar string class, std::string, is just an alias for basic_string<char>.
The generality of these member templates runs afoul of some unfortunate rules of C++. You must write:
std::bitset<6>(std::string("110101"))
instead of
std::bitset<6>("110101");
If you pass the string literal "110101" directly, the compiler reports an error, because it cannot tell which instantiation of the member template you want. Similarly, if b is a bitset, you can't just write:
std::string s = b.to_string();
You must use this horrifying form instead:
std::string s
  = b.template to_string<char,
      std::char_traits<char>,
      std::allocator<char> >();
(Yes, that ridiculous-looking template keyword really is necessary.)
Of course, in practice you shouldn't pollute your code with anything like this. Unless you really need to work with multiple character types, you can encapsulate the horrible syntactic details in helper functions:
template <std::size_t N>
std::bitset<N>
from_string(const std::string& s) {
  return std::bitset<N>(s);
}
template <std::size_t N>
std::string
to_string(const std::bitset<N>& b) {
  return b.template to_string<char,
           std::char_traits<char>,
           std::allocator<char> >();
}
vector<bool>
bitset does have one important limitation: its size is fixed. A bitset can be longer than a long, but you must specify its size in advance. That is fine for a set of option flags, but it makes bitset unsuitable for other purposes. Suppose you are processing a huge collection of items in some complicated order, and you need to keep track of which ones you have already seen. That calls for an array of Boolean values, and there is good reason to use a "packed" array with one bit per element, but bitset is no longer a reasonable choice: you won't know how many items you are dealing with until run time, and items may even be added or removed.
The other mechanism in the C++ standard library for managing sets of bits is vector<bool>, a specialization of vector. Although vector<bool> does not provide bitset's bitwise operators, you can get the same effect with STL algorithms:
std::vector<bool> v3(v1.size());
std::transform(v1.begin(), v1.end(),
               v2.begin(), v3.begin(),
               std::logical_and<bool>());
Similarly, you can output a vector<bool> with the copy algorithm:
std::copy(v.rbegin(), v.rend(),
          std::ostream_iterator<bool>(std::cout));
(This code relies on the fact that, by default, bool output uses "1" and "0" rather than "true" and "false". Note also that we use rbegin() and rend() to copy the vector<bool> in reverse, so that element 0 is printed last, as bit 0 is for a bitset.)
Whenever possible, you should use bitset rather than vector<bool>. It may seem that there is one situation in which you have to use vector<bool> anyway: when you need an STL container interface. However, you shouldn't let that shortcoming stop you! Although bitset does not have an STL container interface, it is still a perfectly good (fixed-size) container.
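For experimenting with the vector<bool> fragments shown earlier, they can be wrapped in two small functions. This is a sketch under the assumption that both vectors have the same size; bit_and and to_text are names invented here, not library functions.

```cpp
#include <algorithm>
#include <functional>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Element-wise AND of two bit vectors of equal size.
std::vector<bool> bit_and(const std::vector<bool>& v1,
                          const std::vector<bool>& v2) {
  std::vector<bool> v3(v1.size());
  std::transform(v1.begin(), v1.end(), v2.begin(), v3.begin(),
                 std::logical_and<bool>());
  return v3;
}

// Formats v as '1' and '0' characters with element 0 rightmost,
// matching the order in which a bitset is printed.
std::string to_text(const std::vector<bool>& v) {
  std::ostringstream out;
  std::copy(v.rbegin(), v.rend(), std::ostream_iterator<bool>(out));
  return out.str();
}
```

Because these are ordinary vectors, the result can still grow with push_back(), which a bitset cannot do.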
If it makes sense for your program, and if you need iterators, you can define a simple "index iterator" adaptor that turns iterator notation (such as *i) into array notation (such as b[n]). The implementation is straightforward: maintain an index and a pointer to the container. The details, most of which are the same as for any random access iterator, are shown in Listing 1. We also define some non-member helper functions, begin() and end(), which take a bitset as an argument. (The iterator shown in Listing 1 is not as general as it could be: if we were willing to accept a slightly more cumbersome interface, we could define a class that works with any array-like type. A general-purpose index iterator adaptor is often useful when dealing with pre-STL containers, and sometimes even with STL containers such as vector.)
With bitset_iterator, a bitset can now interoperate with STL components. For example, you can copy a bitset into a vector<bool>:
std::bitset<10> b;
...
std::vector<bool> v(begin(b), end(b));
However, if you have read Listing 1 carefully, you may have noticed a problem with bitset_iterator: the name is a lie, because bitset_iterator isn't really an iterator. If i is an iterator, then *i is supposed to return a reference to the object that i points to. bitset_iterator doesn't do that: the const version of bitset_iterator returns bool, not const bool&, and the mutable version returns a proxy object of type bitset<>::reference, not bool&. Because bits are not individually addressable, this is the best we can do; in fact, vector<bool>'s iterators use proxy objects in the same way, for the same reason.
Summary
Arrays of Boolean values are common in large programs, and the C++ standard library provides several ways to represent them. I have not covered every possibility: for example, you could also use valarray<bool>. In many cases, though, the simplest choice is std::bitset.
If you know the size of your Boolean array at compile time, or can at least specify a reasonable upper bound, then bitset is simpler and more efficient. bitset's interface has some annoying warts, but they are easy to work around with a few helper functions.
Listing 1 - bitset_iterator, an iterator adaptor class for std::bitset
#include <bitset>
#include <cstddef>
#include <iterator>

// Compile-time selection: IF<flag, A, B>::val is A if flag is true, else B.
template <bool flag, class IfTrue, class IfFalse>
struct IF;

template <class IfTrue, class IfFalse>
struct IF<true, IfTrue, IfFalse> {
  typedef IfTrue val;
};

template <class IfTrue, class IfFalse>
struct IF<false, IfTrue, IfFalse> {
  typedef IfFalse val;
};

// is_const selects between iterating over a const and a mutable bitset.
template <std::size_t N, bool is_const>
class bitset_iterator {
private:
  typedef std::bitset<N> Bitset;
  typedef typename IF<is_const, const Bitset, Bitset>::val qBitset;

  // The const iterator is constructed from the mutable one,
  // so it needs access to the other specialization's members.
  friend class bitset_iterator<N, !is_const>;

public:
  typedef std::random_access_iterator_tag iterator_category;
  typedef bool value_type;
  typedef std::ptrdiff_t difference_type;
  typedef typename IF<is_const, const bool*, bool*>::val pointer;
  typedef typename IF<is_const,
                      bool,
                      typename Bitset::reference>::val reference;

private:
  qBitset* B;
  std::size_t n;

public:
  bitset_iterator() : B(), n() {}
  bitset_iterator(qBitset& b, std::size_t sz)
    : B(&b), n(sz) {}
  bitset_iterator(const bitset_iterator<N, false>& x)
    : B(x.B), n(x.n) {}
  bitset_iterator& operator=(const bitset_iterator& x) {
    B = x.B;
    n = x.n;
    return *this;
  }

public:
  reference operator*() const { return (*B)[n]; }
  reference operator[](std::ptrdiff_t x) const {
    return (*B)[n + x];
  }

  bitset_iterator& operator++() { ++n; return *this; }
  bitset_iterator operator++(int) {
    ++n;
    return bitset_iterator(*B, n - 1);
  }
  bitset_iterator& operator--() { --n; return *this; }
  bitset_iterator operator--(int) {
    --n;
    return bitset_iterator(*B, n + 1);
  }

  bitset_iterator operator+(std::ptrdiff_t x) const {
    return bitset_iterator(*B, n + x);
  }
  bitset_iterator& operator+=(std::ptrdiff_t x) {
    n += x;
    return *this;
  }
  bitset_iterator operator-(std::ptrdiff_t x) const {
    return bitset_iterator(*B, n - x);
  }
  bitset_iterator& operator-=(std::ptrdiff_t x) {
    n -= x;
    return *this;
  }

public:
  friend bool operator==(bitset_iterator x, bitset_iterator y) {
    return x.B == y.B && x.n == y.n;
  }
  friend bool operator!=(bitset_iterator x, bitset_iterator y) {
    return !(x == y);
  }
  friend bool operator<(bitset_iterator x, bitset_iterator y) {
    return x.n < y.n;
  }
  friend bool operator>(bitset_iterator x, bitset_iterator y) {
    return y < x;
  }
  friend bool operator<=(bitset_iterator x, bitset_iterator y) {
    return !(y < x);
  }
  friend bool operator>=(bitset_iterator x, bitset_iterator y) {
    return !(x < y);
  }
  friend std::ptrdiff_t operator-(bitset_iterator x, bitset_iterator y) {
    return x.n - y.n;
  }
  friend bitset_iterator operator+(std::ptrdiff_t n1, bitset_iterator x) {
    return bitset_iterator(*x.B, x.n + n1);
  }
};

template <std::size_t N>
bitset_iterator<N, true> begin(const std::bitset<N>& b) {
  return bitset_iterator<N, true>(b, 0);
}

template <std::size_t N>
bitset_iterator<N, true> end(const std::bitset<N>& b) {
  return bitset_iterator<N, true>(b, N);
}

template <std::size_t N>
bitset_iterator<N, false> begin(std::bitset<N>& b) {
  return bitset_iterator<N, false>(b, 0);
}

template <std::size_t N>
bitset_iterator<N, false> end(std::bitset<N>& b) {
  return bitset_iterator<N, false>(b, N);
}
- End of listing -