Quantifiers in regular expressions

xiaoxiao, 2021-03-06

Sometimes you do not know how many characters you need to match. To accommodate this uncertainty, regular expressions support the concept of quantifiers. A quantifier specifies how many times a given component of the expression must occur for a match to succeed.

The following table describes each quantifier and its meaning:

Character    Description

*        Matches the preceding subexpression zero or more times. For example, 'zo*' matches "z" and "zoo". '*' is equivalent to {0,}.

+        Matches the preceding subexpression one or more times. For example, 'zo+' matches "zo" and "zoo" but not "z". '+' is equivalent to {1,}.

?        Matches the preceding subexpression zero or one time. For example, 'do(es)?' matches the "do" in "do" or "does". '?' is equivalent to {0,1}.

{n}      n is a non-negative integer. Matches exactly n times. For example, 'o{2}' does not match the 'o' in "Bob", but matches the two o's in "food".

{n,}     n is a non-negative integer. Matches at least n times. For example, 'o{2,}' does not match the 'o' in "Bob", but matches all the o's in "foooood". 'o{1,}' is equivalent to 'o+'. 'o{0,}' is equivalent to 'o*'.

{n,m}    m and n are non-negative integers, where n <= m. Matches at least n and at most m times. For example, 'o{1,3}' matches the first three o's in "foooood". 'o{0,1}' is equivalent to 'o?'. Note that there must be no space between the comma and the numbers.
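The table entries can be checked with a short sketch. It assumes a modern JavaScript engine such as Node.js rather than the JScript host discussed in this article, and the helper name firstMatch is made up for illustration.

```javascript
// Sketch: try each quantifier from the table on the table's own examples.
// `firstMatch` is a hypothetical helper, not part of any library.
function firstMatch(pattern, text) {
  const m = text.match(pattern);
  return m ? m[0] : null; // null means "no match"
}

const quantifierResults = {
  star: firstMatch(/zo*/, "z"),               // '*' allows zero o's
  starBraces: firstMatch(/zo{0,}/, "zoo"),    // {0,} behaves like '*'
  plus: firstMatch(/zo+/, "z"),               // '+' needs at least one 'o'
  optional: firstMatch(/do(es)?/, "does"),    // '?' takes "es" when present
  exactlyTwo: firstMatch(/o{2}/, "Bob"),      // only one 'o' in "Bob"
  atLeastTwo: firstMatch(/o{2,}/, "foooood"), // takes every consecutive 'o'
  oneToThree: firstMatch(/o{1,3}/, "foooood") // stops after three o's
};

console.log(quantifierResults);
```

Here plus and exactlyTwo come back as null, atLeastTwo is "ooooo", and oneToThree is "ooo".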

In a large input document, the number of chapters can easily exceed nine, so you need a way to handle two-digit or three-digit chapter numbers. Quantifiers give you that capability. The following JScript regular expression matches chapter headings with any number of digits:

/Chapter [1-9][0-9]*/

The following VBScript regular expression performs the same match:

"Chapter [1-9][0-9]*"

Note that the quantifier appears after the range expression. It therefore applies to the entire range expression, which in this case specifies only the digits 0 through 9.

The '+' quantifier is not used here, because a digit is not required in the second or any subsequent position. The '?' character is not used either, because it would limit chapter numbers to two digits. At least one digit is required after 'Chapter' and the space character.
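The behavior described above can be tried out with a small sketch. It assumes a modern JavaScript engine (the regex literal syntax is the same as in JScript), and the function name matchChapter is invented for this example.

```javascript
// Sketch: one leading digit 1-9, followed by zero or more digits 0-9.
const chapterPattern = /Chapter [1-9][0-9]*/;

// Hypothetical helper: returns the matched text, or null when nothing matches.
function matchChapter(text) {
  const m = chapterPattern.exec(text);
  return m ? m[0] : null;
}

console.log(matchChapter("Chapter 7"));                   // "Chapter 7"
console.log(matchChapter("See Chapter 142 for details")); // "Chapter 142"
console.log(matchChapter("Chapter 0"));                   // null: first digit must be 1-9
```

Because '*' allows zero repetitions, a single-digit chapter still matches, while any number of further digits is accepted.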

If you know that chapter numbers never exceed 99, you can use the following JScript expression to require at least one digit but no more than two.

/Chapter [0-9]{1,2}/
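To see exactly what this pattern accepts, here is a minimal sketch, assuming a modern JavaScript engine. The helper name matchUpToTwoDigits and the sample inputs are made up for illustration.

```javascript
// Sketch: any digit 0-9, repeated one or two times.
const upToTwoDigits = /Chapter [0-9]{1,2}/;

// Hypothetical helper: returns the matched text, or null when nothing matches.
function matchUpToTwoDigits(text) {
  const m = upToTwoDigits.exec(text);
  return m ? m[0] : null;
}

console.log(matchUpToTwoDigits("Chapter 5"));   // "Chapter 5"
console.log(matchUpToTwoDigits("Chapter 42"));  // "Chapter 42"
console.log(matchUpToTwoDigits("Chapter 100")); // "Chapter 10": the third digit is left out
console.log(matchUpToTwoDigits("Chapter 0"));   // "Chapter 0": zero is accepted too
```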

The following regular expression can be used for VBScript:

"Chapter [0-9]{1,2}"

The disadvantage of the expression above is that if a chapter number is greater than 99, it still matches only the first two digits. Another disadvantage is that someone could create a Chapter 0 and it would match. A better JScript expression for chapter numbers of at most two digits is the following:

/Chapter [1-9][0-9]?/

or

/Chapter [1-9][0-9]{0,1}/
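As a rough check, the sketch below runs both forms on the same inputs. It assumes a modern JavaScript engine, and the helper name matchBoth is invented for this example.

```javascript
// Sketch: the '?' form and the {0,1} form should behave identically.
const withQuestionMark = /Chapter [1-9][0-9]?/;
const withBraces = /Chapter [1-9][0-9]{0,1}/;

// Hypothetical helper: returns the matches of both patterns as a pair.
function matchBoth(text) {
  const a = withQuestionMark.exec(text);
  const b = withBraces.exec(text);
  return [a ? a[0] : null, b ? b[0] : null];
}

console.log(matchBoth("Chapter 7"));   // [ 'Chapter 7', 'Chapter 7' ]
console.log(matchBoth("Chapter 42"));  // [ 'Chapter 42', 'Chapter 42' ]
console.log(matchBoth("Chapter 0"));   // [ null, null ]: a leading zero is rejected
console.log(matchBoth("Chapter 100")); // [ 'Chapter 10', 'Chapter 10' ]
```

Note that the patterns are not anchored, so on a longer number they still match its first two digits.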

For VBScript, the following expressions are equivalent to the ones above:

"Chapter [1-9][0-9]?"

or

"Chapter [1-9][0-9]{0,1}"

The '*', '+', and '?' quantifiers are all greedy; that is, they match as much text as possible. Sometimes that is not what you want, and only a minimal match is needed.

For example, you may be searching an HTML document for a chapter title enclosed in an H1 tag. That text may appear in the document in the following form:

<H1>Chapter 1 - Introduction to Regular Expressions</H1>

The following expression matches everything from the opening less-than sign (<) to the greater-than sign (>) that ends the closing H1 tag.

/<.*>/

The VBScript regular expression is:

"<.*>"

If you want to match only the opening H1 tag, the following non-greedy expression matches only <H1>:

/<.*?>/

or, in VBScript:

"<.*?>"

By placing a '?' after a '*', '+', or '?' quantifier, the expression changes from a greedy match to a non-greedy, or minimal, match.
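The difference between the two modes can be seen in a short sketch. It assumes a modern JavaScript engine and a sample heading wrapped in H1 tags; the variable names are made up for illustration.

```javascript
// Sketch: the same input matched greedily and non-greedily.
const html = "<H1>Chapter 1 - Introduction to Regular Expressions</H1>";

// Greedy: '.*' grabs as much as it can, then backs up to the last '>'.
const greedyMatch = html.match(/<.*>/)[0];

// Non-greedy: '.*?' stops at the first '>' that completes a match.
const lazyMatch = html.match(/<.*?>/)[0];

console.log(greedyMatch === html); // true: the whole line is matched
console.log(lazyMatch);            // "<H1>"

// The same applies to '+': greedy takes every 'o', non-greedy takes one.
console.log("foooood".match(/o+/)[0]);  // "ooooo"
console.log("foooood".match(/o+?/)[0]); // "o"
```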

When reposting, please credit the original address: https://www.9cbs.com/read-54069.html
