Microsoft's Regular Expression Tutorial (3): Character Matching

xiaoxiao2021-03-06  55

Ordinary characters

Ordinary characters are all printing and non-printing characters that are not explicitly designated as metacharacters. This includes all uppercase and lowercase letters, all digits, all punctuation marks, and some symbols.

The simplest regular expression is a single ordinary character that matches itself in the searched string. For example, the single-character pattern 'a' matches the letter 'a' wherever it appears in the searched string. Here are some single-character regular expression patterns:

/a/ /7/ /m/

The equivalent VBScript single-character regular expressions are:

"a" "7" "m"

You can combine several single characters to form a larger expression. For example, the following JScript regular expression is nothing more than the single-character expressions 'a', '7', and 'm' combined.

/a7m/

The equivalent VBScript expression is:

"a7m"

Note that there is no concatenation operator. All you need to do is place one character after another.
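The behavior described above can be tried with a short JavaScript sketch. It assumes a modern JavaScript engine rather than the original JScript host, and the sample strings are made up for illustration.

```javascript
// Ordinary characters match themselves, wherever they occur in the input.
const single = /a/;
const combined = /a7m/; // 'a', then '7', then 'm', with no operator between them

console.log(single.test("banana"));      // true: 'a' occurs in the string
console.log(combined.test("xxa7mxx"));   // true: the sequence a7m occurs
console.log(combined.test("a7 m"));      // false: the space breaks the sequence
console.log("xxa7mxx".search(combined)); // 2: index where the match starts
```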

Special characters

A number of metacharacters need special treatment when you try to match them. To match one of these special characters, you must first escape it, that is, precede it with a backslash (\). The following table lists these special characters and their meanings:

Special character    Description

$    Matches the position at the end of the input string. If the Multiline property of the RegExp object is set, $ also matches the position before '\n' or '\r'. To match the $ character itself, use \$.

( )    Mark the beginning and end of a subexpression. Subexpressions can be captured for later use. To match these characters, use \( and \).

*    Matches the preceding subexpression zero or more times. To match the * character, use \*.

+    Matches the preceding subexpression one or more times. To match the + character, use \+.

.    Matches any single character except the newline character \n. To match ., use \.

[    Marks the beginning of a bracket expression. To match [, use \[.

?    Matches the preceding subexpression zero or one time, or indicates a non-greedy quantifier. To match the ? character, use \?.

\    Marks the next character as a special character, a literal character, a backreference, or an octal escape. For example, 'n' matches the character 'n', and '\n' matches a newline. The sequence '\\' matches "\", and '\(' matches "(".

^    Matches the position at the beginning of the input string, except when used inside a bracket expression, where it negates the character set. To match the ^ character itself, use \^.

{    Marks the beginning of a quantifier expression. To match {, use \{.

|    Indicates a choice between two items. To match |, use \|.
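As a rough illustration of escaping, the JavaScript sketch below compares escaped and unescaped patterns. The helper escapeSpecial is hypothetical (it is not part of JScript or VBScript) and simply puts a backslash in front of each special character listed in the table; the sample strings are invented.

```javascript
// Escaped: '$' and '.' are matched literally.
const price = /\$5\.00/;
console.log(price.test("cost: $5.00"));    // true
// Unescaped: '$' means end of input, so this cannot match here.
console.log(/$5.00/.test("cost: $5.00"));  // false

// Hypothetical helper: backslash-escape every special character
// from the table so that arbitrary text is matched literally.
function escapeSpecial(text) {
  return text.replace(/[$()*+.?\[\\^{|]/g, "\\$&");
}

const escaped = escapeSpecial("1+1=2?");
console.log(escaped);                                 // 1\+1=2\?
console.log(new RegExp(escaped).test("is 1+1=2?"));   // true
console.log(new RegExp("1+1=2?").test("is 1+1=2?"));  // false: + and ? act as quantifiers
```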

Non-printing characters

There are a number of useful non-printing characters that must occasionally be used. The following table shows the escape sequences used to represent them:

Character    Meaning

\cx    Matches the control character indicated by x. For example, \cM matches a Control-M, or carriage return, character. The value of x must be in the range A-Z or a-z. Otherwise, c is treated as a literal 'c' character.

\f    Matches a form feed. Equivalent to \x0c and \cL.

\n    Matches a newline. Equivalent to \x0a and \cJ.

\r    Matches a carriage return. Equivalent to \x0d and \cM.

\s    Matches any whitespace character, including space, tab, form feed, and so on. Equivalent to [ \f\n\r\t\v].

\S    Matches any non-whitespace character. Equivalent to [^ \f\n\r\t\v].

\t    Matches a tab. Equivalent to \x09 and \cI.

\v    Matches a vertical tab. Equivalent to \x0b and \cK.
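The equivalences in the table can be checked with a small JavaScript sketch. One caveat, stated as an assumption about the runtime: in current JavaScript engines \s also matches further Unicode whitespace characters, so it is a superset of the set given above.

```javascript
// Each escape on the left should match the character named by its equivalents.
const checks = [
  /\cM/.test("\r"),  // Control-M is a carriage return
  /\cm/.test("\r"),  // x may be upper- or lowercase
  /\x0d/.test("\r"),
  /\cJ/.test("\n"),
  /\x0c/.test("\f"),
  /\cI/.test("\t"),
  /\x0b/.test("\v"),
];
console.log(checks.every(Boolean));  // true

// \s matches each whitespace character listed in the table; \S matches none of them.
const blanks = " \f\n\r\t\v";
console.log(/^\s+$/.test(blanks));   // true
console.log(/\S/.test(blanks));      // false
```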

Character matching

The period (.) matches any single printing or non-printing character in a string, except a newline (\n). The following JScript regular expression matches 'aac', 'abc', 'acc', 'adc', and so on, as well as 'a1c', 'a2c', 'a-c', and 'a#c':

/a.c/

The equivalent VBScript regular expression is:

"a.c"

If you are trying to match a string that contains a file name, where the period (.) is part of the input string, precede the period in the regular expression with a backslash (\). For example, the following JScript regular expression matches 'filename.ext':

/filename\.ext/

For VBScript, the equivalent expression is as follows:

"filename\.ext"
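The difference between the bare period and the escaped period can be seen in the following JavaScript sketch. The test strings are illustrative only. Note that when a pattern is built from a JavaScript string, as with the RegExp constructor, the backslash itself has to be doubled.

```javascript
const anyChar = /a.c/;
console.log(["aac", "a1c", "a-c", "a#c"].every((s) => anyChar.test(s))); // true
console.log(anyChar.test("a\nc")); // false: the period does not match a newline
console.log(anyChar.test("ac"));   // false: exactly one character is required

const literalDot = /filename\.ext/;
console.log(literalDot.test("filename.ext"));      // true
console.log(literalDot.test("filenameXext"));      // false: only a real period matches
console.log(/filename.ext/.test("filenameXext"));  // true: unescaped period matches 'X'

// Same pattern built from a string: "\\." in source text yields \. in the pattern.
const fromString = new RegExp("filename\\.ext");
console.log(fromString.source === literalDot.source); // true
```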

These expressions are still quite limited. They only let you match any single character. Often it is useful to match specific characters from a list. For example, if the input text contains chapter headings expressed numerically as Chapter 1, Chapter 2, and so on, you may want to find those chapter headings.

Bracket expressions

You can create a list of characters to match by placing one or more individual characters within square brackets ([ and ]). When characters are enclosed in brackets, the list is called a bracket expression. Within brackets, as anywhere else, ordinary characters represent themselves, that is, they match an occurrence of themselves in the input text. Most special characters lose their meaning when they occur inside a bracket expression. There are some exceptions:

The ']' character ends a list if it is not the first item. To match the ']' character in a list, place it first, immediately after the opening '['.

The '\' character remains the escape character. To match the '\' character, use '\\'.

The characters enclosed in a bracket expression match only a single character, at the position in the regular expression where the bracket expression appears. The following JScript regular expression matches 'Chapter 1', 'Chapter 2', 'Chapter 3', 'Chapter 4', and 'Chapter 5':

/Chapter [12345]/

To match the same chapter headings in VBScript, use the following expression:

"Chapter [12345]"
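A brief JavaScript sketch of the bracket expression follows. The input strings are made up for illustration; the last check shows that the brackets account for exactly one character of the match.

```javascript
const chapter = /Chapter [12345]/;

console.log(chapter.test("Chapter 3")); // true
console.log(chapter.test("Chapter 7")); // false: '7' is not in the list

// With the g flag, match() returns every heading that fits the pattern.
const found = "See Chapter 2 and Chapter 5, not Chapter 8".match(/Chapter [12345]/g);
console.log(found); // [ 'Chapter 2', 'Chapter 5' ]

// Only one character is consumed by the brackets: the match stops after '1'.
console.log("Chapter 12".match(chapter)[0]); // Chapter 1
```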

Note that the word 'Chapter' and the space that follows it are fixed in position relative to the characters within the brackets. The bracket expression therefore specifies only the set of characters that can occupy the single character position immediately after the word 'Chapter' and a space. Here, that is the ninth character position.

If you want to express the matching characters as a range instead of listing the characters themselves, separate the starting and ending characters of the range with a hyphen (-). The character value of each character determines its relative order within a range. The following JScript regular expression contains a range expression that is equivalent to the bracketed list shown above.

/Chapter [1-5]/

The equivalent expression in VBScript is:

"Chapter [1-5]"

When a range is specified in this manner, both the starting and ending values are included in the range. Note that the starting value must precede the ending value in Unicode sort order.
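The following JavaScript sketch compares the list form with the range form and shows what a current JavaScript engine does with a reversed range. The reversed pattern [5-1] is an illustrative example, not one taken from the original tutorial.

```javascript
const list = /Chapter [12345]/;
const range = /Chapter [1-5]/;

// Compare the two patterns on "Chapter 0" through "Chapter 9".
const agree = [];
for (let n = 0; n <= 9; n++) {
  const title = "Chapter " + n;
  agree.push(list.test(title) === range.test(title));
}
console.log(agree.every(Boolean)); // true

// A range whose start sorts after its end is rejected when the pattern is compiled.
let rangeError = null;
try {
  new RegExp("[5-1]");
} catch (e) {
  rangeError = e;
}
console.log(rangeError instanceof SyntaxError); // true
```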

If you want to include the hyphen character in a bracket expression, you must do one of the following:

Escape it with a backslash: [\-]

Put the hyphen at the beginning or the end of the bracketed list. The following expressions match all lowercase letters and the hyphen: [-a-z] [a-z-]

Create a range in which the starting character value is lower than the hyphen and the ending character value is equal to or greater than the hyphen. Both of the following regular expressions satisfy this requirement: [!--] [!-~]

You can also find all the characters that are not in the list or range by placing a caret (^) at the beginning of the list. If the caret appears in any other position within the list, it matches itself and has no special meaning. The following JScript regular expression matches chapter headings whose ninth character is anything other than 1 through 5, such as chapter numbers greater than 5:

/Chapter [^12345]/

In VBScript:

"Chapter [^12345]"

In the examples shown above, the expression matches any character in the ninth position other than 1, 2, 3, 4, or 5 (the match is not restricted to digits). Thus 'Chapter 7' is a match, and so is 'Chapter 9'.
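The negated list can be exercised with a short JavaScript sketch. The inputs are invented for illustration. The check with 'Chapter x' shows that a negated list accepts any character that is not listed, not only digits.

```javascript
const notOneToFive = /Chapter [^12345]/;

console.log(notOneToFive.test("Chapter 7")); // true
console.log(notOneToFive.test("Chapter 9")); // true
console.log(notOneToFive.test("Chapter 3")); // false: '3' is excluded
console.log(notOneToFive.test("Chapter x")); // true: any other character qualifies
console.log(notOneToFive.test("Chapter "));  // false: there is no ninth character

// A caret that is not the first item in the list is an ordinary character.
const caretLiteral = /[1^]/;
console.log(caretLiteral.test("^")); // true
console.log(caretLiteral.test("2")); // false
```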

The same expressions can be written using a hyphen (-) range. For JScript:

/Chapter [^1-5]/

Or, for VBScript:

"Chapter [^1-5]"
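Because the hyphen has a special role inside brackets, as in the ranges above, it may help to check the ways of including a literal hyphen that were listed earlier. The JavaScript sketch below uses illustrative test characters and assumes a current JavaScript engine.

```javascript
// Escaped, first in the list, last in the list, and as a range endpoint or member.
const hyphenForms = [/[\-a-z]/, /[-a-z]/, /[a-z-]/, /[!--]/, /[!-~]/];
console.log(hyphenForms.every((re) => re.test("-"))); // true

// [a-z-] still matches lowercase letters but not uppercase ones.
console.log(/[a-z-]/.test("q")); // true
console.log(/[a-z-]/.test("Q")); // false

// [!--] is the range from '!' (0x21) through '-' (0x2D).
console.log(/[!--]/.test("%")); // true: 0x25 lies inside the range
console.log(/[!--]/.test(".")); // false: 0x2E lies just past the hyphen
```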

A typical use of a bracket expression is to match any uppercase or lowercase letter or any digit. The following JScript expression specifies such a match:

/[A-Za-z0-9]/

The equivalent VBScript expression is:

"[A-Za-z0-9]"
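As a final sketch, the alphanumeric bracket expression can be used in JavaScript to test single characters or to pull the letters and digits out of a string. The sample input is made up for illustration.

```javascript
const alnum = /[A-Za-z0-9]/;

console.log(alnum.test("x")); // true
console.log(alnum.test("7")); // true
console.log(alnum.test("_")); // false: underscore is neither a letter nor a digit

// With the g flag, match() collects every alphanumeric character in the input.
const kept = "R2-D2!".match(/[A-Za-z0-9]/g).join("");
console.log(kept); // R2D2
```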

When reposting, please cite the original address: https://www.9cbs.com/read-113801.html
