Turnmissile's Blog http://blog.9cbs.net/turnmissile/
Microsoft documents the rules of its regular expression syntax in MSDN, and interested readers can study them there (ms-help://MS.MSDNQTR.2003OCT.1033/cpgenref/html/cpconRegularExpressionsLanguageElements.htm). Below are the syntax elements I found there, for everyone to study.
Escape character table
Escaped character    Description
Ordinary characters    Characters other than . $ ^ { [ ( | ) * + ? \ match themselves.
\a    Matches a bell (alarm) \u0007.
\b    Matches a backspace \u0008 if in a [] character class; otherwise, see the note following this table.
\t    Matches a tab \u0009.
\r    Matches a carriage return \u000D.
\v    Matches a vertical tab \u000B.
\f    Matches a form feed \u000C.
\n    Matches a new line \u000A.
\e    Matches an escape \u001B.
\040    Matches an ASCII character as octal (up to three digits); numbers with no leading zero are backreferences if they have only one digit or if they correspond to a capturing group number. (For more information, see Backreferences.) For example, the character \040 represents a space.
\x20    Matches an ASCII character using hexadecimal representation (exactly two digits).
\cC    Matches an ASCII control character; for example, \cC is control-C.
\u0020    Matches a Unicode character using hexadecimal representation (exactly four digits).
\    When followed by a character that is not recognized as an escaped character, matches that character. For example, \* is the same as \x2A.
Note: The escaped character \b is a special case. In a regular expression, \b denotes a word boundary (between \w and \W characters) except within a [] character class, where \b refers to the backspace character. In a replacement pattern, \b always denotes a backspace.
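A few rows of the escape table can be checked with a short sketch. It uses Python's re module as a stand-in, on the assumption that these particular escapes (\040, \x20, \u0020, \* and \b) behave the same way there as in the .NET Regex class the table describes; \e and \cC are left out because Python's re does not accept them. The sample strings are made up for illustration.

```python
import re

# Octal, hexadecimal and Unicode escapes that all denote a space character.
space_patterns = [r"\040", r"\x20", r"\u0020"]
matches_space = [re.fullmatch(p, " ") is not None for p in space_patterns]

# An escaped metacharacter matches itself: \* is the same as \x2A.
stars_escaped = re.findall(r"\*", "2*3*4")
stars_hex = re.findall(r"\x2A", "2*3*4")

# \b outside a character class is a zero-width word boundary ...
boundary_positions = [m.start() for m in re.finditer(r"\b", "ab cd")]

# ... but inside [] it is the backspace character (\u0008).
text_with_backspace = "ab\x08cd"
backspace_positions = [m.start() for m in re.finditer(r"[\b]", text_with_backspace)]

print(matches_space, stars_escaped, stars_hex)
print(boundary_positions, backspace_positions)
```

The last two results show the special case from the note: the same two characters, \b, match four positions between word and nonword characters in one pattern and a single literal backspace in the other.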
Character classes
A character class is a set of characters that will find a match if any one of the characters included in the set matches. The following table summarizes character matching syntax.
Character class    Description
.    Matches any character except \n. If modified by the Singleline option, a period character matches any character. For more information, see Regular Expression Options.
[aeiou]    Matches any single character included in the specified set of characters.
[^aeiou]    Matches any single character not in the specified set of characters.
[0-9a-fA-F]    Use of a hyphen (-) allows specification of contiguous character ranges.
\p{name}    Matches any character in the named character class specified by {name}. Supported names are Unicode groups and block ranges. For example, Ll, Nd, Z, IsGreek, IsBoxDrawing.
\P{name}    Matches text not included in groups and block ranges specified in {name}.
\w    Matches any word character. Equivalent to the Unicode character categories [\p{Ll}\p{Lu}\p{Lt}\p{Lo}\p{Nd}\p{Pc}]. If ECMAScript-compliant behavior is specified with the ECMAScript option, \w is equivalent to [a-zA-Z_0-9].
\W    Matches any nonword character. Equivalent to the Unicode categories [^\p{Ll}\p{Lu}\p{Lt}\p{Lo}\p{Nd}\p{Pc}]. If ECMAScript-compliant behavior is specified with the ECMAScript option, \W is equivalent to [^a-zA-Z_0-9].
\s    Matches any white-space character. Equivalent to the Unicode character categories [\f\n\r\t\v\x85\p{Z}]. If ECMAScript-compliant behavior is specified with the ECMAScript option, \s is equivalent to [ \f\n\r\t\v].
\S    Matches any non-white-space character. Equivalent to the Unicode character categories [^\f\n\r\t\v\x85\p{Z}]. If ECMAScript-compliant behavior is specified with the ECMAScript option, \S is equivalent to [^ \f\n\r\t\v].
\d    Matches any decimal digit. Equivalent to \p{Nd} for Unicode and [0-9] for non-Unicode, ECMAScript behavior.
\D    Matches any nondigit. Equivalent to \P{Nd} for Unicode and [^0-9] for non-Unicode, ECMAScript behavior.
You can find the Unicode category a character belongs to with the method [...]
[...] and ECMAScript are not allowed inline.
RegexOptions member    Inline character    Description
None    N/A    Specifies that no options are set.
IgnoreCase    i    Specifies case-insensitive matching.
Multiline    m    Specifies multiline mode. Changes the meaning of ^ and $ so that they match at the beginning and end, respectively, of any line, not just the beginning and end of the whole string.
ExplicitCapture    n    Specifies that the only valid captures are explicitly named or numbered groups of the form ( [...]
[...] ) Lookbehind constructs provide something similar that can be used as a subexpression. RightToLeft changes the search direction only. It does not reverse the substring that is searched for. The lookahead and lookbehind assertions do not change: lookahead looks to the right; lookbehind looks to the left.
ECMAScript    N/A    Specifies that ECMAScript-compliant behavior is enabled for the expression. This option can be used only in conjunction with the IgnoreCase and Multiline flags. Use of ECMAScript with any other flags results in an exception.
CultureInvariant    N/A    Specifies that cultural differences in language are ignored. See Performing Culture-Insensitive Operations in the RegularExpressions Namespace for more information.
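The effect of the IgnoreCase and Multiline options can be sketched as follows. The sketch uses Python's re module, whose re.IGNORECASE and re.MULTILINE flags are assumed here to correspond to the .NET options of the same names; the sample text is invented for the example.

```python
import re

text = "first line\nSecond line\nthird line"

# Without Multiline, ^ matches only at the start of the whole string.
first_words_default = re.findall(r"^\w+", text)

# With Multiline, ^ also matches at the start of every line.
first_words_multiline = re.findall(r"^\w+", text, re.MULTILINE)

# Likewise, $ matches at the end of every line in multiline mode.
line_ends_multiline = re.findall(r"line$", text, re.MULTILINE)

# IgnoreCase makes the literal "second" match "Second".
case_sensitive = re.findall(r"second", text)
case_insensitive = re.findall(r"second", text, re.IGNORECASE)

print(first_words_default, first_words_multiline)
print(line_ends_multiline, case_sensitive, case_insensitive)
```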
Atomic zero-width assertions
Assertion    Description
^    Specifies that the match must occur at the beginning of the string or the beginning of the line. For more information, see the Multiline option in Regular Expression Options.
$    Specifies that the match must occur at the end of the string, before \n at the end of the string, or at the end of the line. For more information, see the Multiline option in Regular Expression Options.
\A    Specifies that the match must occur at the beginning of the string (ignores the Multiline option).
\Z    Specifies that the match must occur at the end of the string or before \n at the end of the string (ignores the Multiline option).
\z    Specifies that the match must occur at the end of the string (ignores the Multiline option).
\G    Specifies that the match must occur at the point where the previous match ended. When used with Match.NextMatch(), this ensures that matches are all contiguous.
\b    Specifies that the match must occur on a boundary between \w (alphanumeric) and \W (nonalphanumeric) characters. The match must occur on word boundaries, that is, at the first or last characters in words separated by any nonalphanumeric characters.
\B    Specifies that the match must not occur on a \b boundary.
Quantifiers
Quantifier    Description
*    Specifies zero or more matches; for example, \w* or (abc)*. Equivalent to {0,}.
+    Specifies one or more matches; for example, \w+ or (abc)+. Equivalent to {1,}.
?    Specifies zero or one matches; for example, \w? or (abc)?. Equivalent to {0,1}.
{n}    Specifies exactly n matches; for example, (pizza){2}.
{n,}    Specifies at least n matches; for example, (abc){2,}.
{n,m}    Specifies at least n, but no more than m, matches.
*?    Specifies the first match that consumes as few repeats as possible (equivalent to lazy *).
+?    Specifies as few repeats as possible, but at least one (equivalent to lazy +).
??    Specifies zero repeats if possible, or one (lazy ?).
{n}?    Equivalent to {n} (lazy {n}).
{n,}?    Specifies as few repeats as possible, but at least n (lazy {n,}).
{n,m}?    Specifies as few repeats as possible between n and m (lazy {n,m}).
Grouping constructs
Grouping constructs allow you to capture groups of subexpressions and to increase the efficiency of regular expressions with noncapturing lookahead and lookbehind modifiers. The following table describes the regular expression grouping constructs.
Grouping construct    Description
( )    Captures the matched substring (or noncapturing group; for more information, see the ExplicitCapture option in Regular Expression Options). Captures using ( ) are numbered automatically based on the order of the opening parenthesis, starting from one. The first capture, capture element number zero, is the text matched by the whole regular expression pattern.
(? [...] \d) matches a word followed by a digit, without matching the digit. This construct does not backtrack.
(?! )    Zero-width negative lookahead assertion. Continues match only if the subexpression does not match at this position on the right. For example, \b(?!un)\w+\b matches words that do not begin with un.
(?<= )    Zero-width positive lookbehind assertion. Continues match only if the subexpression matches at this position on the left. For example, (?<=19)99 matches instances of 99 that follow 19. This construct does not backtrack.
(?> )    Nonbacktracking subexpression (also known as a "greedy" subexpression). The subexpression is fully matched once, and then does not participate piecemeal in backtracking. (That is, the subexpression matches only strings that would be matched by the subexpression alone.)
Named captures are numbered sequentially, based on the left-to-right order of the opening parenthesis (like unnamed captures), but numbering of named captures starts after all unnamed captures have been counted. For instance, the pattern ((? [...]
Number    Name    Pattern
0    0    ((? [...]
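The lookahead and lookbehind examples from the grouping table can be tried out with the sketch below. It uses Python's re module as a stand-in and assumes these three assertions behave the same way in .NET. The first pattern, \w+(?=\d), is a reconstruction from the description "a word followed by a digit", since the source shows only its tail; the input strings are invented. Named captures are deliberately left out, because Python numbers them differently from the scheme described above.

```python
import re

# Positive lookahead (reconstructed pattern): the digit is required
# but is not part of the match.
word_before_digit = re.findall(r"\w+(?=\d)", "abc1 def")

# Negative lookahead: words that do not begin with "un".
not_un_words = re.findall(r"\b(?!un)\w+\b", "unsure sure unity fun")

# Positive lookbehind: instances of 99 that follow 19.
starts_of_99_after_19 = [
    m.start() for m in re.finditer(r"(?<=19)99", "1999 2099 199")
]

print(word_before_digit, not_un_words, starts_of_99_after_19)
```

Only the 99 at offset 2 (inside 1999) is found: the one in 2099 follows 20, and the one in 199 has just a single 1 in front of it.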
Backreference constructs
The following table lists optional parameters that add backreference modifiers to a regular expression.
Backreference construct    Definition
\number    Backreference. For example, (\w)\1 finds doubled word characters.
\k [...]
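The numbered backreference example from the table, (\w)\1, can be sketched like this. Python's re module is used as a stand-in on the assumption that \1 works the same way as in .NET; the sample text is invented. The named form is not shown, because the source entry is truncated and Python spells named backreferences differently.

```python
import re

# (\w) captures one word character; \1 requires the same character again.
doubled = [m.group(0) for m in re.finditer(r"(\w)\1", "balloon committee")]

print(doubled)
```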
Other constructs
The following table lists subexpressions that modify a regular expression.
Construct    Definition
(?imnsx-imnsx)    Sets or disables options such as case insensitivity to be turned on or off in the middle of a pattern. For information on specific options, see Regular Expression Options. Option changes are effective until the end of the enclosing group. See also the information on the grouping construct (?imnsx-imnsx: ), which is a cleaner form.
(?# )    Inline comment inserted within a regular expression. The comment terminates at the first closing parenthesis character.
# [to end of line]    X-mode comment. The comment begins at an unescaped # and continues to the end of the line. (Note that the x option or the RegexOptions.IgnorePatternWhitespace enumerated option must be activated for this kind of comment to be recognized.)
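The three constructs in this last table can be sketched as below, again with Python's re module standing in for .NET. Two assumptions apply: Python supports the scoped form (?i: ) and the (?# ) comment with the same meaning, and its (?x) verbose flag is taken as the counterpart of the x option and RegexOptions.IgnorePatternWhitespace. Python has no n option, so only i and x appear. The phone-number pattern and sample strings are made up for the example.

```python
import re

# Scoped inline option: only "ab" is matched case-insensitively,
# the trailing "c" stays case-sensitive.
scoped = re.findall(r"(?i:ab)c", "abc ABc ABC")

# Inline comment: (?# ... ) is ignored by the matcher.
with_comment = re.findall(r"a(?#inline comment)b", "ab")

# X-mode comment: with the x option on, whitespace in the pattern is
# ignored and an unescaped # starts a comment that runs to end of line.
verbose_pattern = r"""(?x)
    \d{3}   # first three digits
    -       # literal separator
    \d{4}   # last four digits
"""
verbose = re.findall(verbose_pattern, "call 555-1234")

print(scoped, with_comment, verbose)
```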