The New C++: The Group of Seven - Extensions Under Consideration for the C++ Standard Library
Herb Sutter
Copyright © 2002 Herb Sutter
Last time [1], I gave an overview of the past, present, and likely future directions for the C++ Standard, who the major players are, and how they interact and affect you. This time, as promised, I'll give a survey of the first batch of suggested library extensions that were considered at the October 2001 WG21 / J16 meeting in Redmond, Washington, USA.
Ground Rules
When you read the proposal summaries in the next section, please remember four important things:
No final decisions are being made right now. This first group of proposals is primarily intended to give the library working group something concrete to chew on. Looking at actual proposals has let us better learn what we want in a proposal, and what kinds of questions we want to be able to ask.
None of these proposals is a shoo-in. Many of these proposals happen to come from Boost [2], but Boost is not being played as a favorite here. The door is not closed to alternatives to these same proposals. These are samples that have received initial consideration only, and the library working group knows that most of these proposals have competing designs or implementations. For example, one of the proposals was for Boost's regular expression facility, but Microsoft Research has also recently made available a competing regular expression facility of their own that may be better at some things. In such cases, the library working group may choose to adopt one of the competing proposals, or aspects of both, or possibly even none at all if it decides that the Standard does not need to include a given kind of facility. (For example, even if there are multiple proposals for a facility to automatically convert an integer to base 42 modulo the current phase of the moon, I doubt we'd accept any of them.)
Compatibility with standards is important. That means compatibility with C++98, as proposals generally ought to be implementable in current Standard C++. It also means compatibility with other standards, notably C99 (for example, adoption of new C99 facilities such as its header
Implementability is important. All of the proposals we look at should have reference implementations that are available for inspection by the library working group members. This follows the tradition of the original HP STL, which in 1995 was provided with a working implementation and in a form the committee could free
C99 compatibility. Where possible, we would like to adopt C99 facilities so as to promote better compatibility between the two languages. Again, C99's header
Filling in gaps. We would like to add things that fill gaps and omissions in the current C++ Standard library. One example is hash-based containers. Another is a wider choice of smart pointers in addition to the current auto_ptr, which unfortunately seems to be .
Useful facilities. Now, just because a facility is useful does not mean it has to be standardized. But some facilities, such as strings, are so widely used that it would be embarrassing to fail to have them in a standard. We do in fact have strings in C++98 (unlike pre-standard C++) for just this reason; what we do not have are things like standard support for regular expression matching and tokenization, both of which are common tasks we want to perform on strings in particular and on iterator ranges and streams in general. In this category, I also include features that facilitate systems programming and generic programming tasks, such as the thread and type traits features discussed below.
As you'll now see, this set of proposals includes representatives from all three of these categories.
The Proposals
Here is the group of seven proposals.
Header
The C99 Standard added several new facilities to the C Standard library. In particular, the C99
Exact-width integers (optional in C99): the types intNN_t and uintNN_t where NN can be 8, 16, 32, or 64 (e.g., int32_t) are signed and unsigned integers of exactly NN bits.
Minimum-width integers: the types int_leastNN_t and uint_leastNN_t where NN can be 8, 16, 32, or 64 (e.g., int_least32_t) are signed and unsigned integers of at least NN bits.
Fastest minimum-width integers: the types int_fastNN_t and uint_fastNN_t where NN can be 8, 16, 32, or 64 (e.g., int_fast32_t) are signed and unsigned integers of at least NN bits that are usually the fastest for most kinds of integer operations.
Greatest-width integers: the types intmax_t and uintmax_t are signed and unsigned integers able to hold any value that can be held in any other signed or unsigned type, respectively.
This is useful because in most commercial C++ projects today that have to target multiple platforms, we already have to define our own versions of these facilities for better portability. You probably have an OUR_INT32 typedef or macro in your project's common system header already. Facilities like these help to keep us from reinventing too many basic wheels and are especially useful as we prepare for the shift to 64-bit computing, if we're not there already.
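To make the four families of integer types concrete, here is a minimal sketch. It assumes a compiler that ships `<cstdint>`, the C++ counterpart of C99's `<stdint.h>` that was standardized later in C++11; the `OUR_INT32` alias and the `bits_in` helper are illustrative names only.

```cpp
#include <climits>   // CHAR_BIT
#include <cstdint>   // std::int32_t, std::int_least16_t, std::int_fast32_t, std::intmax_t
#include <limits>

// A project-wide alias built on the standard typedef rather than a hand-rolled one.
// OUR_INT32 echoes the kind of name mentioned in the text; it is illustrative only.
typedef std::int32_t OUR_INT32;

// Hypothetical helper: the number of bits occupied by an integer type.
template<typename T>
constexpr int bits_in() {
    return static_cast<int>(sizeof(T) * CHAR_BIT);
}

// Exact-width types have no padding bits, so the storage size equals the width.
static_assert(bits_in<std::int32_t>() == 32, "int32_t is exactly 32 bits");

// Minimum-width and fastest types promise only a lower bound.
static_assert(bits_in<std::int_least16_t>() >= 16, "at least 16 bits");
static_assert(bits_in<std::int_fast32_t>() >= 32, "at least 32 bits");

// The greatest-width type can hold any value of any other signed integer type.
static_assert(std::numeric_limits<std::intmax_t>::max() >=
              std::numeric_limits<long long>::max(),
              "intmax_t covers long long");
```

The checks run at compile time, so a platform that violated one of these guarantees would fail to build rather than misbehave at run time.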
Boost's header
Type Traits [4]
The second facility submitted was Boost's type traits. If you have any doubt about how important it is to know things about types when doing generic programming, reread Alexandrescu's Modern C++ Design [5]. That book's Loki library includes similar facilities to the Boost facility, although details differ and each has advantages the other does not.
Say you're writing a template that has a template parameter T:
template<typename T>
void f(T t) { /* whatever */ }
Inside your function template, do you want to know if the type T is really a class (instead of, say, an int or a function)? Just ask is_class
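As a rough illustration of asking such a question about T, the sketch below uses `std::is_class` from `<type_traits>`, the form in which a type traits facility was later standardized in C++11. It is an assumption that this closely mirrors the Boost facility discussed here; `Point` and `is_class_type` are hypothetical names introduced only for the example.

```cpp
#include <type_traits>

struct Point { int x; int y; };  // hypothetical class type used for illustration

// Reports whether the deduced type T is a class (or struct) type.
template<typename T>
bool is_class_type(T) {
    return std::is_class<T>::value;
}

// The same questions can be asked entirely at compile time.
static_assert(std::is_class<Point>::value, "Point is a class type");
static_assert(!std::is_class<int>::value, "int is not a class type");
```

Calling `is_class_type(Point{1, 2})` yields true and `is_class_type(42)` yields false; because the answer is a compile-time constant, real code would more often use it to select between implementations than to branch at run time.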
That a type traits facility was among the first submissions to be considered for the next C++ Standard library is an indication of how important such a facility is and how often it is already handcrafted today. Just as there were a lot of strings before the Standard had its basic_string template, today a lot of people are rolling their own type traits facilities. Regardless of which proposal is eventually accepted, having this kind of facility will provide what we now realize is an essential service for certain kinds of generic programming.
Regular Expressions [6]
Regular expression parsing and matching is another one of those things that many projects do every day. Languages like Perl provide this capability right out of the box. Boost's regular expression matching library provides powerful tools that are deliberately similar to and compatible with those in the Perl, POSIX, and other popular regular expression libraries. If you know Perl's tools, you should be able to use Boost's without breaking a sweat.
Here's a simple example from the library's own documentation, showing how to check if a normal C++ string happens to hold a human-readable credit-card number:
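#include <string>
#include <boost/regex.hpp>

// True if s is four groups of four digits, each separated by '-' or ' '.
bool validate_card_format(const std::string& s)
{
   static const boost::regex e("(\\d{4}[- ]){3}\\d{4}");
   return regex_match(s, e);
}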
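For readers without Boost installed, the same check can be sketched with `std::regex` from `<regex>`, the regular expression facility that was later standardized in C++11. Treating it as a stand-in for the Boost library is an assumption, and `validate_card_format_std` is a hypothetical name chosen to avoid clashing with the example above.

```cpp
#include <regex>
#include <string>

// Same pattern as the Boost example: three groups of four digits, each followed
// by '-' or ' ', then a final group of four digits.
bool validate_card_format_std(const std::string& s)
{
    static const std::regex e("(\\d{4}[- ]){3}\\d{4}");
    return std::regex_match(s, e);  // the whole string must match
}
```

With this pattern, "1234-5678-9012-3456" and "1234 5678 9012 3456" are accepted, while a string with no separators is rejected. Note that this validates only the format, not whether the number is a real card number.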
As I've pointed out above, Microsoft Research has also made available a competing regular expression facility that claims significant performance advantages over Boost's. Other competing facilities may also appear. I personally think it's likely that one (or some combination) of them will be adopted into the C++ Standard library, but at this point the field is wide open. If you have a good regex library sitting around that you think is superior to these, let us know by posting to the newsgroup comp.std.c++. Now's the time.
Smart Pointers [7]
As noted above, it's a real shame that auto_ptr is the only standard smart pointer. That did not need to happen. Indeed, during the first round of C++ standardization, Greg Colvin in particular was several times encouraged to submit, and did submit, smart pointer variants, and then the committee accepted only auto_ptr, and that in a, well, er, let us politely say "modified" form. In particular, Colvin's counted_ptr did not make it into the Standard. But never fear, for it is here, in Boost: counted_ptr is now called boost::shared_ptr, and there's a parallel shared_array. There's also a scoped_ptr, which is arguably what auto_ptr should have been (that is, limited to uses as an "auto" object that deallocates its pointee when it goes out of scope), and a complementary scoped_array.
While we were discussing this proposal, Andrei Alexandrescu was able to attend the meeting and offered comments on his own Loki SmartPtr [5] that uses policy-based design. SmartPtr provides a superset of the functionality of the four Boost pointers. It remains to be seen just which of these or other proposals will finally be adopted, but these alternatives are important.
If you know nothing else about Boost, know about shared_ptr. It's especially valuable if you ever want to have a container of pointers, because you just can't put auto_ptrs into containers (doing that should not and had better not compile, by design, and if it does compile you're left walking naked in a minefield whether you know it or not [8]). What you almost always really .
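The container-of-pointers case can be sketched as follows. The example uses `std::shared_ptr` from `<memory>`, the reference-counted pointer that was later standardized in C++11, as an assumed stand-in for boost::shared_ptr; `Widget` and `make_widgets` are hypothetical names used only for illustration.

```cpp
#include <memory>
#include <vector>

// Hypothetical payload type for the example.
struct Widget {
    int id;
    explicit Widget(int i) : id(i) {}
};

// Builds a container of shared-ownership pointers. Copying the container (or any
// element) just bumps a reference count; each Widget is destroyed when the last
// pointer to it goes away.
std::vector<std::shared_ptr<Widget>> make_widgets(int n) {
    std::vector<std::shared_ptr<Widget>> widgets;
    for (int i = 0; i < n; ++i) {
        widgets.push_back(std::make_shared<Widget>(i));
    }
    return widgets;
}
```

Because copies of a shared pointer are equivalent to one another, the copying that containers and algorithms perform internally is safe, which is exactly what auto_ptr's transfer-of-ownership copy cannot offer.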
Random Numbers [9]
Because of my own interest in cryptography, I have a soft spot in my heart for a good RNG (random number generator). RNGs are used all the time for all sorts of things, from unimportant things like generating die rolls in a board game, to important modeling applications like generating random input for stock market simulations, to vital and crucial and easy-to-get-wrong security applications like creating unguessable input for cryptographically secure secret key generation. Each of those kinds of random number generation has different requirements; for example, some require flat distributions (you generally want your dice to have a 1/6 chance of each result, instead of deliberately loaded dice), and some require non-flat distributions (such as normal or Poisson distributions).
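The difference between flat and non-flat distributions can be sketched with the `<random>` header that was later standardized in C++11. Using it here is an assumption made for illustration rather than a description of the proposal's interface, and `roll_die` and `sample_normal` are hypothetical helper names.

```cpp
#include <random>

// A flat (uniform) distribution: each face 1..6 is equally likely.
int roll_die(std::mt19937& engine) {
    std::uniform_int_distribution<int> die(1, 6);
    return die(engine);
}

// A non-flat distribution: values cluster around the mean.
double sample_normal(std::mt19937& engine, double mean, double stddev) {
    std::normal_distribution<double> dist(mean, stddev);
    return dist(engine);
}
```

Separating the engine (the source of random bits) from the distribution (the shape of the results) lets one engine serve many purposes. The Mersenne Twister engine used here is fine for games and simulations but is not cryptographically secure, so it is not suitable for the key-generation case.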
I personally think it's important to have decent RNG facilities in the Standard so that people will be less inclined to roll their own and get them wrong. In particular, because C provides rand in its standard library, people are far too quick to rely on it when they should not, which is most of the time. In C++, we provide rand because we support the C Standard library, and I personally feel we have a responsibility to do better. What's there now is too often misleading and more often than not gives people a false sense of security.
Rational Numbers [10]
Fractions, anyone? Here's a standardizable facility that provides capabilities like rational
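To show the kind of arithmetic a fraction facility handles, here is a minimal sketch. `Fraction` and `gcd_of` are hypothetical names invented for this example and do not represent the interface of the proposed library; the sketch also ignores overflow and zero denominators.

```cpp
// Greatest common divisor of |a| and |b| (Euclid's algorithm).
inline long gcd_of(long a, long b) {
    if (a < 0) a = -a;
    if (b < 0) b = -b;
    while (b != 0) {
        long t = a % b;
        a = b;
        b = t;
    }
    return a;
}

// Hypothetical minimal fraction type, always stored in lowest terms
// with a positive denominator.
struct Fraction {
    long num;
    long den;

    Fraction(long n, long d) : num(n), den(d) {
        if (den < 0) { num = -num; den = -den; }
        long g = gcd_of(num, den);
        if (g != 0) { num /= g; den /= g; }
    }
};

// a/b + c/d = (a*d + c*b) / (b*d), then normalized by the constructor.
inline Fraction operator+(const Fraction& a, const Fraction& b) {
    return Fraction(a.num * b.den + b.num * a.den, a.den * b.den);
}

inline bool operator==(const Fraction& a, const Fraction& b) {
    return a.num == b.num && a.den == b.den;
}
```

Unlike floating point, this representation adds 1/2 and 1/3 to give exactly 5/6, which is the main appeal of a rational number type.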
Threads [11]
"Why doesn't C++ have threads?" is a commonly heard refrain. Many of us write multithreaded C++ programs every day of the week, but it's true that the C++ Standard is silent on the subject of threads and provides no facilities for handling them.
It's virtually a given that the next revision of the C++ Standard will include thread support. What exact form that takes, and how much change is required in the standard library as opposed to in the core language itself, remains to be seen. But the interest in this area and its pent-up demand make the Boost thread library perhaps the most interesting submission of the bunch. This thread library has been implemented using POSIX threads on Unix and Windows and also using the native Win32 threads on Windows.
Summary
No final decisions have been made on any of these facilities, and competing versions of many of them do exist. The committee welcomes those and other future submissions too. In the meantime, we are already seeing concrete and useful proposals in the areas of C99 compatibility (header
Next time, a closer look at one of the above proposed facilities. After that, it will be time to include news from the upcoming April 2002 standards meeting. Stay tuned.
References
[1] H. Sutter. "The New C++," C/C++ Users Journal Experts Forum, February 2002,
[2]
[3]
[4]
[5] A. Alexandrescu. Modern C++ Design (Addison-Wesley, 2001).
[6]
[7]
[8] H. Sutter. Exceptional C++, Item 37 (Addison-Wesley, 2000).
[9]
[10]
[11]
About the Author
Herb Sutter is an independent consultant and secretary of the ISO/ANSI C++ standards committee. He is also one of the instructors of The C++ Seminar (www.gotw.ca/cpp_seminar). Herb can be reached at hsutter@acm.org.