Microsoft's Regular Expression Tutorial (2): Regular Expression Syntax and Order of Precedence

zhaozj 2021-02-16

Regular expression syntax

A regular expression is a pattern of text composed of ordinary characters (such as the letters a through z) and special characters (called metacharacters). The pattern describes one or more strings to match when searching a body of text. The regular expression acts as a template, matching a character pattern against the string being searched.

Here are some regular expressions you may encounter:

JScript               VBScript              Matches
/^[ \t]*$/            "^[ \t]*$"            A blank line.
/\d{2}-\d{5}/         "\d{2}-\d{5}"         An ID number made up of 2 digits, a hyphen, and 5 more digits.
/<(.*)>.*<\/\1>/      "<(.*)>.*<\/\1>"      An HTML tag.
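The sample patterns can be tried out directly. The following is a minimal sketch in JavaScript, whose regular expression syntax closely follows the JScript syntax described here; the variable names and sample strings are assumptions chosen only for illustration.

```javascript
// The three sample patterns, written as JavaScript regular expression literals.
const blankLine = /^[ \t]*$/;        // only spaces and tabs between start and end
const idNumber = /\d{2}-\d{5}/;      // 2 digits, a hyphen, 5 digits
const htmlTag = /<(.*)>.*<\/\1>/;    // \1 must repeat the captured tag name

// Hypothetical sample inputs, chosen only for illustration.
const results = {
  blankOk: blankLine.test(" \t "),          // whitespace only
  blankBad: blankLine.test("  x "),         // contains a non-blank character
  idOk: idNumber.test("12-34567"),
  idBad: idNumber.test("1-234567"),         // only one digit before the hyphen
  tagOk: htmlTag.test("<b>bold</b>"),
  tagBad: htmlTag.test("<b>bold</i>"),      // closing tag differs from opening tag
};

console.log(results);
```

Note that the ID pattern is not anchored, so it also succeeds when the ID appears inside a longer string. To validate an entire field, the pattern would presumably be wrapped in ^ and $.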

The following table lists all the metacharacters and their behavior in the context of regular expressions:

Character     Description
\             Marks the next character as a special character, a literal, a backreference, or an octal escape. For example, 'n' matches the character "n", while '\n' matches a newline. The sequence '\\' matches "\" and '\(' matches "(".
^             Matches the position at the start of the input string. If the Multiline property of the RegExp object is set, ^ also matches the position after '\n' or '\r'.
$             Matches the position at the end of the input string. If the Multiline property of the RegExp object is set, $ also matches the position before '\n' or '\r'.
*             Matches the preceding subexpression zero or more times. For example, 'zo*' matches "z" and "zoo". * is equivalent to {0,}.
+             Matches the preceding subexpression one or more times. For example, 'zo+' matches "zo" and "zoo", but not "z". + is equivalent to {1,}.
?             Matches the preceding subexpression zero or one time. For example, 'do(es)?' matches "do" and "does". ? is equivalent to {0,1}.
{n}           n is a nonnegative integer. Matches exactly n times. For example, 'o{2}' does not match the 'o' in "Bob", but matches the two o's in "food".
{n,}          n is a nonnegative integer. Matches at least n times. For example, 'o{2,}' does not match the 'o' in "Bob", but matches all the o's in "fooood". 'o{1,}' is equivalent to 'o+'. 'o{0,}' is equivalent to 'o*'.
{n,m}         m and n are nonnegative integers, where n <= m. Matches at least n and at most m times. For example, 'o{1,3}' matches the first three o's in "foooood". 'o{0,1}' is equivalent to 'o?'. Note that there must be no space between the comma and the numbers.
?             When this character immediately follows any other quantifier (*, +, ?, {n}, {n,}, {n,m}), the matching pattern is non-greedy. A non-greedy pattern matches as little of the searched string as possible, whereas the default greedy pattern matches as much of the searched string as possible. For example, in the string "oooo", 'o+?' matches a single "o", while 'o+' matches all the o's.
.             Matches any single character except "\n". To match any character including '\n', use a pattern such as '[\s\S]'.
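The quantifier and anchor behavior described so far can be checked with a short script. This is a minimal JavaScript sketch, assuming that the JavaScript m flag plays the role of the Multiline property; the sample strings and variable names are made up for illustration.

```javascript
// Greedy versus non-greedy: the same input, two different match lengths.
const greedy = "oooo".match(/o+/)[0];            // "oooo"
const lazy = "oooo".match(/o+?/)[0];             // "o"

// A bounded quantifier takes at most three o's, even though more are available.
const bounded = "foooood".match(/o{1,3}/)[0];    // "ooo"

// ^ and $ match only at the ends of the input unless the multiline flag is set.
const text = "first\nsecond";
const withoutM = text.match(/^\w+$/g);           // null
const withM = text.match(/^\w+$/gm);             // ["first", "second"]

// The dot skips "\n"; the class [\s\S] accepts every character, newline included.
const dotMatch = "a\nb".match(/a.b/);            // null
const anyMatch = "a\nb".match(/a[\s\S]b/);       // matches the whole string

console.log(greedy, lazy, bounded, withoutM, withM, dotMatch, anyMatch !== null);
```

The class [\s\S] is used here because a dot placed inside square brackets is treated as a literal dot, so it cannot stand for "any character" there.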
Character     Description (continued)
(pattern)     Matches pattern and captures the match. The captured match can be retrieved from the resulting Matches collection, using the SubMatches collection in VBScript or the $0...$9 properties in JScript. To match parenthesis characters, use '\(' or '\)'.
(?:pattern)   Matches pattern but does not capture the match, that is, it is a non-capturing match that is not stored for possible later use. This is useful for combining parts of a pattern with the "or" character (|). For example, 'industr(?:y|ies)' is a more economical expression than 'industry|industries'.
(?=pattern)   Positive lookahead. Matches the search string at any point where a string matching pattern begins. This is a non-capturing match, that is, the match is not captured for possible later use. For example, 'Windows (?=95|98|NT|2000)' matches "Windows" in "Windows 2000", but not "Windows" in "Windows 3.1". Lookaheads do not consume characters, that is, after a match occurs, the search for the next match begins immediately after the last match, not after the characters that made up the lookahead.
(?!pattern)   Negative lookahead. Matches the search string at any point where a string not matching pattern begins. This is a non-capturing match, that is, the match is not captured for possible later use. For example, 'Windows (?!95|98|NT|2000)' matches "Windows" in "Windows 3.1", but not "Windows" in "Windows 2000". Lookaheads do not consume characters, that is, after a match occurs, the search for the next match begins immediately after the last match, not after the characters that made up the lookahead.
x|y           Matches either x or y. For example, 'z|food' matches "z" or "food". '(z|f)ood' matches "zood" or "food".
[xyz]         A character set. Matches any one of the enclosed characters. For example, '[abc]' matches the 'a' in "plain".
[^xyz]        A negative character set. Matches any character not enclosed. For example, '[^abc]' matches the 'p' in "plain".
[a-z]         A range of characters. Matches any character in the specified range. For example, '[a-z]' matches any lowercase alphabetic character in the range 'a' through 'z'.
[^a-z]        A negative range of characters. Matches any character not in the specified range. For example, '[^a-z]' matches any character not in the range 'a' through 'z'.
\b            Matches a word boundary, that is, the position between a word and a space. For example, 'er\b' matches the 'er' in "never", but not the 'er' in "verb".
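Grouping, lookahead, character sets, and word boundaries can be compared side by side. The sketch below uses JavaScript and reuses the examples from the descriptions above; the variable names are assumptions made for illustration, and the result arrays are those of JavaScript's String.prototype.match rather than the VBScript Matches collection.

```javascript
// Capturing versus non-capturing groups: both match, only one stores the submatch.
const captured = "industries".match(/industr(y|ies)/);
const uncaptured = "industries".match(/industr(?:y|ies)/);
const capturedSuffix = captured[1];           // "ies"
const uncapturedLength = uncaptured.length;   // 1: the full match and nothing else

// Lookahead checks what follows without consuming it.
const positive = /Windows (?=95|98|NT|2000)/;
const negative = /Windows (?!95|98|NT|2000)/;
const positiveText = positive.exec("Windows 2000")[0];  // "Windows " (no "2000")
const positiveOn31 = positive.test("Windows 3.1");      // false
const negativeOn31 = negative.test("Windows 3.1");      // true
const negativeOn2000 = negative.test("Windows 2000");   // false

// Character sets pick the first character that qualifies.
const inSet = "plain".match(/[abc]/)[0];      // "a"
const notInSet = "plain".match(/[^abc]/)[0];  // "p"

// \b requires the "er" to sit at the end of a word.
const boundaryNever = /er\b/.test("never");   // true
const boundaryVerb = /er\b/.test("verb");     // false

console.log(capturedSuffix, uncapturedLength, positiveText, inSet, notInSet);
```

Because the lookahead pattern contains the space after "Windows", the matched text is "Windows " with a trailing space, while the version number itself is left for the next match attempt.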
Character     Description (continued)
\B            Matches a nonword boundary. 'er\B' matches the 'er' in "verb", but not the 'er' in "never".
\cx           Matches the control character indicated by x. For example, \cM matches a Control-M or carriage return character. The value of x must be in the range A-Z or a-z. Otherwise, c is treated as a literal 'c' character.
\d            Matches a digit character. Equivalent to [0-9].
\D            Matches a nondigit character. Equivalent to [^0-9].
\f            Matches a form-feed character. Equivalent to \x0c and \cL.
\n            Matches a newline character. Equivalent to \x0a and \cJ.
\r            Matches a carriage return character. Equivalent to \x0d and \cM.
\s            Matches any white space character, including space, tab, form-feed, and so on. Equivalent to [ \f\n\r\t\v].
\S            Matches any non-white space character. Equivalent to [^ \f\n\r\t\v].
\t            Matches a tab character. Equivalent to \x09 and \cI.
\v            Matches a vertical tab character. Equivalent to \x0b and \cK.
\w            Matches any word character, including underscore. Equivalent to '[A-Za-z0-9_]'.
\W            Matches any nonword character. Equivalent to '[^A-Za-z0-9_]'.
\xn           Matches n, where n is a hexadecimal escape value. Hexadecimal escape values must be exactly two digits long. For example, '\x41' matches "A". '\x041' is equivalent to '\x04' & "1". This allows ASCII codes to be used in regular expressions.
\num          Matches num, where num is a positive integer. A reference back to captured matches. For example, '(.)\1' matches two consecutive identical characters.
\n            Identifies either an octal escape value or a backreference. If \n is preceded by at least n captured subexpressions, n is a backreference. Otherwise, n is an octal escape value if n is an octal digit (0-7).
\nm           Identifies either an octal escape value or a backreference. If \nm is preceded by at least nm captured subexpressions, nm is a backreference. If \nm is preceded by at least n captures, n is a backreference followed by the literal m. If neither of the preceding conditions holds, \nm matches the octal escape value nm when n and m are octal digits (0-7).
\nml          Matches the octal escape value nml when n is an octal digit (0-3) and m and l are octal digits (0-7).
\un           Matches n, where n is a Unicode character expressed as four hexadecimal digits. For example, \u00A9 matches the copyright symbol (©).

Order of precedence

Once you have constructed a regular expression, it is evaluated much like an arithmetic expression, that is, from left to right and following an order of precedence.

The following table lists the regular expression operators in order of precedence, from highest to lowest:

Operator(s)                      Description
\                                Escape
(), (?:), (?=), []               Parentheses and brackets
*, +, ?, {n}, {n,}, {n,m}        Quantifiers
^, $, \anymetacharacter          Anchors and sequences
|                                Alternation ("or")
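The practical effect of this ordering shows up most often with quantifiers and alternation. The following JavaScript sketch illustrates it under the assumption that JavaScript follows the same precedence; the sample strings and variable names are hypothetical.

```javascript
// Quantifiers bind tighter than sequencing: ab+ means a(b+), not (ab)+.
const tight = "ababab".match(/ab+/)[0];          // "ab"
const grouped = "ababab".match(/(?:ab)+/)[0];    // "ababab"

// Alternation has the lowest precedence: ^ab|cd$ means (^ab)|(cd$).
const loose = /^ab|cd$/;
const strict = /^(?:ab|cd)$/;
const looseOnLong = loose.test("abXYZ");         // true: "^ab" alone is enough
const strictOnLong = strict.test("abXYZ");       // false: whole string must be ab or cd
const strictOnCd = strict.test("cd");            // true

// Escape has the highest precedence: \+ is a literal plus, not a quantifier.
const literalPlus = /a\+/.test("a+");            // true
const noPlus = /a\+/.test("aa");                 // false

console.log(tight, grouped, looseOnLong, strictOnLong, strictOnCd, literalPlus, noPlus);
```

When anchors are meant to apply to every alternative, wrapping the alternatives in a group, as in the strict pattern above, is one way to make that intent explicit.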
