Boost 1.33.0 is about to be released. Regex has been rewritten, and the following were added:
* Support for Unicode
* Support for ATL/MFC CString
***********
I can't wait, let's take a look.
Source code download:
=========
Boost address:
cvs -d:pserver:anonymous@cvs.sourceforge.net:/cvsroot/boost login
cvs -z9 -d:pserver:anonymous@cvs.sourceforge.net:/cvsroot/boost co -P boost
ICU address: (the Unicode support in Boost 1.33.0 Regex is built on ICU, IBM's Unicode library)
http://www.ibm.com/software/globalization/icu/
Source code compilation:
=============
The build environment is VC7.1 with the C++ STL that ships with VC7.1. Change to BOOST_ROOT/libs/regex/build and run:
bjam -sICU_PATH=d:/ic 332 -sTOOLS=vc-7_1 stage
Unicode support test:
================
After looking at the ICU DLLs, I found that a dynamically linked Boost Regex pulls in three DLLs totalling about 10 MB. That put me off, so I abandoned the test.
ATL/MFC support:
================
In VC7.1, create a Win32 console project and add the following code:
/*
 *
 * Copyright (c) 2004
 * John Maddock
 *
 * Use, modification and distribution are subject to the
 * Boost Software License, Version 1.0. (See accompanying file
 * LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
 *
 */

/*
 *   LOCATION:    see http://www.boost.org for most recent version.
 *   FILE         mfc_example.cpp
 *   VERSION      see
 */
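#define TEST_MFC

#ifdef TEST_MFC

#include <boost/regex/mfc.hpp>   // tregex, tmatch, make_regex and the CString overloads
#include <atlstr.h>              // CString
#include <assert.h>
#include <tchar.h>               // __T
#include <iostream>
#include <stdexcept>             // std::runtime_error

#ifdef _UNICODE
#define cout wcout
#endif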

//
// Find out if *password* meets our password requirements,
// as defined by the regular expression *requirements*.
//
bool is_valid_password(const CString& password, const CString& requirements)
{
   return boost::regex_match(password, boost::make_regex(requirements));
}

//
// Extract filename part of a path from a CString and return the result
// as another CString:
//
CString get_filename(const CString& path)
{
   boost::tregex r(__T("(?:\\A|.*\\\\)([^\\\\]+)"));
   boost::tmatch what;
   if(boost::regex_match(path, what, r))
   {
      // extract $1 as a CString:
      return CString(what[1].first, what.length(1));
   }
   else
   {
      throw std::runtime_error("Invalid pathname");
   }
}

CString extract_postcode(const CString& address)
{
   // searches through address for a UK postcode and returns the result,
   // the expression used is by Phil A. on www.regxlib.com:
   boost::tregex r(__T("^(([A-Z]{1,2}[0-9]{1,2})|([A-Z]{1,2}[0-9][A-Z]))\\s?([0-9][A-Z]{2})$"));
   boost::tmatch what;
   if(boost::regex_search(address, what, r))
   {
      // extract $0 as a CString:
      return CString(what[0].first, what.length());
   }
   else
   {
      throw std::runtime_error("No postcode found");
   }
}

//
// Take a credit card number as a string of digits,
// and reformat it as a human readable string with "-"
// separating each group of four digits:
//
const boost::tregex e(__T("\\A(\\d{3,4})[- ]?(\\d{4})[- ]?(\\d{4})[- ]?(\\d{4})\\z"));
const CString human_format = __T("$1-$2-$3-$4");

CString human_readable_card_number(const CString& s)
{
   return boost::regex_replace(s, e, human_format);
}

int main()
{
   // password checks using regex_match:
   CString pwd = "abcDEF---";
   CString pwd_check = "(?=.*[[:lower:]])(?=.*[[:upper:]])(?=.*[[:punct:]]).{6,}";
   bool b = is_valid_password(pwd, pwd_check);
   assert(b);
   pwd = "abcD-";
   b = is_valid_password(pwd, pwd_check);
   assert(!b);

   // filename extraction with regex_match:
   CString file = "abc.hpp";
   file = get_filename(file);
   assert(file == "abc.hpp");
   file = "c:\\a\\b\\c\\d.h";
   file = get_filename(file);
   assert(file == "d.h");

   // postcode extraction with regex_search:
   CString address = "Joe Bloke, 001 Somestreet, Somewhere,\nPL2 8AB";
   CString postcode = extract_postcode(address);
   assert(postcode == "PL2 8AB");

   // HTML link extraction with regex_iterator:
   CString text = "

   CString credit_card_number = "1234567887654321";
   credit_card_number = human_readable_card_number(credit_card_number);
   assert(credit_card_number == "1234-5678-8765-4321");
   return 0;
}

#else

#include <iostream>

int main() { std::cout <<
#endif
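
The listing above only builds with ATL/MFC available. As a rough, portable sketch of the same password check and card-number formatting, the following uses std::regex and std::string from the C++ standard library as a stand-in for boost::tregex and CString. The function names ending in _std are hypothetical, and the anchors differ from the Boost version because the default std::regex grammar does not provide \A and \z.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Same idea as is_valid_password in the listing, with the requirements
// pattern fixed inside the function. The three lookaheads demand a lowercase
// letter, an uppercase letter and a punctuation character; ".{6,}" demands
// at least six characters overall.
bool is_valid_password_std(const std::string& password)
{
    static const std::regex requirements(
        "(?=.*[[:lower:]])(?=.*[[:upper:]])(?=.*[[:punct:]]).{6,}");
    return std::regex_match(password, requirements);
}

// Same idea as human_readable_card_number in the listing. The pattern is
// anchored with ^ and $ instead of the Boost/Perl anchors. Input that does
// not match the pattern is returned unchanged.
std::string human_readable_card_number_std(const std::string& digits)
{
    static const std::regex e(
        "^(\\d{3,4})[- ]?(\\d{4})[- ]?(\\d{4})[- ]?(\\d{4})$");
    return std::regex_replace(digits, e, "$1-$2-$3-$4");
}
```

Calling is_valid_password_std("abcDEF---") mirrors the first assertion in main() above, and human_readable_card_number_std("1234567887654321") mirrors the last one.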
Set up the build environment:
==================
Add $(BOOST_ROOT);$(ICU_PATH)/include to the include directories, after all of the VC7.1 include directories.
Set compilation properties:
============
* Use the Unicode character set
* Use /Zc:wchar_t (Note: by default Boost is built with wchar_t treated as a built-in type, so if you want Unicode support rather than MBCS, compile your project with this option)
* Use the multi-threaded debug DLL runtime, /MDd (please don't use anything else unless you understand what that implies)
* Define the macro BOOST_REGEX_DYN_LINK (by default regex is linked statically; define this macro if you want to link dynamically)
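
Why /Zc:wchar_t matters can be seen in a small sketch. When wchar_t is a built-in type, it is distinct from unsigned short, so functions taking wchar_t get different signatures than they would if wchar_t were a typedef. A project and a library built with different settings would then probably fail to link. The function name which_overload is hypothetical and only serves the illustration.

```cpp
#include <cassert>
#include <string>

// Two overloads that are only distinct when wchar_t is a built-in type.
// If a compiler treated wchar_t as a typedef for unsigned short (the VC7.1
// behaviour without /Zc:wchar_t), both lines would declare the same function
// and the second definition would be rejected as a redefinition.
inline std::string which_overload(unsigned short) { return "unsigned short"; }
inline std::string which_overload(wchar_t)        { return "wchar_t"; }
```

With the option enabled, which_overload(L'a') selects the wchar_t overload, which is the behaviour the Boost binaries described above expect.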
Compiling and linking then go smoothly.
The compiler command line is:
/Od /D "WIN32" /D "_DEBUG" /D "_CONSOLE" /D "BOOST_REGEX_DYN_LINK" /D "_UNICODE" /D "UNICODE" /Gm /EHsc /RTC1 /MDd /Zc:wchar_t /Yu"stdafx.h" /Fp"Debug/capture.pch" /Fo"Debug/" /Fd"Debug/vc70.pdb" /W3 /nologo /c /Wp64 /Zi /TP
The linker command line is:
/OUT:"Debug/capture.exe" /INCREMENTAL /NOLOGO /DEBUG /PDB:"Debug/capture.pdb" /SUBSYSTEM:CONSOLE /MACHINE:X86 kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib
Boost 1.33.0 Regex Changelog
=====================
Boost 1.33.0.
* Completely rewritten expression parsing code, and traits class support; now conforms to the standardization proposal.
* Added support for (?imsx-imsx) constructs.
* Added support for lookbehind expressions.
* Fixed bug in partial matches of bounded repeats of '.'.

Boost 1.31.0.
* Completely rewritten pattern matching code: it is now up to 10 times faster than before.
* Reorganized documentation.
* Deprecated all interfaces that are not part of the regular expression standardization proposal.
* Added regex_iterator and regex_token_iterator.
* Added support for Perl style independent sub-expressions.
* Added non-member operators to the sub_match class, so that you can compare sub_match's with strings, or add them to a string to produce a new string.
* Added experimental support for extended capture information.
* Changed the match flags so that they are a distinct type (not an integer).
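
Two of the 1.31.0 additions, regex_iterator and the sub_match comparison operators, can be sketched briefly. The example below uses the standard library equivalents (std::sregex_iterator and std::sub_match) as a stand-in for the Boost classes, so it illustrates the idea only. The function names words and first_capture_is are hypothetical.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Walk over every match of \w+ in `text` with a regex iterator and
// collect the matched words in order.
std::vector<std::string> words(const std::string& text)
{
    static const std::regex word("\\w+");
    std::vector<std::string> out;
    for (std::sregex_iterator it(text.begin(), text.end(), word), end; it != end; ++it)
        out.push_back(it->str());
    return out;
}

// Compare a sub_match directly with a std::string, without first copying
// the capture into a string of its own.
bool first_capture_is(const std::string& text, const std::string& expected)
{
    static const std::regex key_value("(\\w+)=(\\w+)");
    std::smatch what;
    return std::regex_search(text, what, key_value) && what[1] == expected;
}
```

The regex objects are kept static so that they outlive the iterators that refer to them.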