Ask Unix users what they like most about their systems and, besides stability and the ability to start them remotely, eight or nine out of ten will mention regular expressions. Ask them what gives them the biggest headache and, besides complex process control and installation procedures, the answer is again regular expressions. So what is a regular expression? How can we really master regular expressions and use them properly? This article introduces the topic, in the hope of helping readers who are eager to understand and master regular expressions.
Getting started
Simply put, a regular expression is a powerful tool for pattern matching and replacement. Regular expressions turn up in almost all UNIX-based tools, such as the vi editor, the Perl and PHP scripting languages, and the awk and sed shell programs. Scripting languages such as JavaScript also support them. Regular expressions have thus outgrown the boundaries of any single language or system and become a widely accepted concept and feature.
A regular expression lets the user build a matching pattern from a series of special characters and then compare that pattern against a target object such as a data file, program input, or a form on a web page. Depending on whether the target object contains the pattern, the program then runs the corresponding code.
For example, one of the most common uses of regular expressions is to verify that an email address entered in an online form has the correct format. If the address passes the check, the form is processed normally; if it does not match the pattern, a prompt pops up asking the user to re-enter a correct email address. This shows that regular expressions play a pivotal role in the logic of web applications.
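The form check described above can be sketched in Python with the standard re module. The pattern is a deliberately simplified assumption (real email validation is looser and more involved), and the names EMAIL_PATTERN, is_valid_email, and handle_form are hypothetical; the individual symbols in the pattern are explained in the rest of this article.

```python
import re

# Simplified, illustrative pattern (an assumption, not a full email grammar):
# some characters, an "@", a domain name, a literal dot, and a suffix.
EMAIL_PATTERN = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def is_valid_email(address):
    """Return True if the whole input looks like name@domain.suffix."""
    return EMAIL_PATTERN.fullmatch(address) is not None


def handle_form(address):
    """Process the form when the address matches, otherwise ask again."""
    if is_valid_email(address):
        return "form processed"
    return "please re-enter a valid email address"


print(handle_form("reader@example.com"))
print(handle_form("reader.example.com"))
```

Using fullmatch rather than search means the entire input has to fit the pattern, which is usually what a form check needs.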
Basic syntax
Now that we have a preliminary understanding of what regular expressions do, let's look at their syntax.
The form of regular expressions is generally as follows:
/love/
The part between the "/" delimiters is the pattern to be matched in the target object. The user simply places the pattern to search for between the "/" delimiters. To let users build patterns more flexibly, regular expressions provide special "metacharacters". A metacharacter is a character with a special meaning in a regular expression that specifies how its leading character (the character immediately before the metacharacter) may appear in the target object.
The more commonly used metacharacters include "+", "*", and "?". The "+" metacharacter specifies that its leading character must appear one or more times in a row in the target object, the "*" metacharacter specifies that its leading character must appear zero or more times in a row, and the "?" metacharacter specifies that its leading character must appear zero times or once.
Let's look at some concrete uses of these metacharacters.
/fo+/
Because this regular expression contains the "+" metacharacter, it matches strings such as "fool", "fo", or "football" in the target object, in which the letter f is followed by one or more of the letter o.
/eg*/
Because this regular expression contains the "*" metacharacter, it matches strings such as "easy", "ego", or "egg" in the target object, in which the letter e is followed by zero or more of the letter g.
/Wil?/
Because this regular expression contains the "?" metacharacter, it matches strings such as "Win" or "Wilson" in the target object, in which the letter i is followed by zero or one letter l.
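The three quantifier examples above can be checked with a short Python sketch. The word list and the helper name matching are illustrative assumptions; re.search reports a match anywhere inside the candidate string.

```python
import re

# Candidate words taken from the examples above, plus "f" as a non-match.
WORDS = ["fool", "fo", "football", "f", "easy", "ego", "egg", "Win", "Wilson"]


def matching(pattern, candidates):
    """Return the candidates that contain a match for the pattern."""
    return [word for word in candidates if re.search(pattern, word)]


print(matching(r"fo+", WORDS))   # f followed by one or more o
print(matching(r"eg*", WORDS))   # e followed by zero or more g
print(matching(r"Wil?", WORDS))  # Wi followed by zero or one l

# The matched text itself shows how far each quantifier reaches.
print(re.search(r"fo+", "football").group())
print(re.search(r"Wil?", "Wilson").group())
```

Note that "f" on its own is rejected by "fo+" because "+" requires at least one o, while "easy" is accepted by "eg*" because "*" also allows zero occurrences.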
In addition to metacharacters, the user can specify exactly how many times a pattern may appear in the matched object. For example,
/jim{2,6}/
The above regular expression specifies that the character m may appear 2 to 6 times in a row in the matched object, so it matches strings such as "jimmy" or "jimmmmmy". Now that we have a preliminary understanding of how to use regular expressions, let's look at several other important metacharacters.
\s: matches a single whitespace character, including tab and newline;
\S: matches any single character other than whitespace;
\d: matches a digit from 0 to 9;
\w: matches a letter, digit, or underscore character;
\W: matches any character that \w does not match;
. : matches any character except the newline.
(Note: we can regard \s and \S, and \w and \W, as inverses of each other.)
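As a rough illustration, the counted repetition and the shorthand metacharacters listed above behave as follows in Python's re module. The sample text and the helper name has_jim are made up for this sketch.

```python
import re


def has_jim(text):
    """Counted repetition: after "ji", m must occur 2 to 6 times in a row."""
    return re.search(r"jim{2,6}", text) is not None


# Hypothetical sample text containing a space, a tab, and a newline.
SAMPLE = "Room 42,\tfloor_3\n"

digits = re.findall(r"\d", SAMPLE)       # single digits
whitespace = re.findall(r"\s", SAMPLE)   # space, tab, newline
words = re.findall(r"\w+", SAMPLE)       # runs of letters, digits, underscores
non_word = re.findall(r"\W", SAMPLE)     # every character that \w rejects
non_space = re.findall(r"\S+", SAMPLE)   # runs of non-whitespace characters
any_char = re.findall(r".", "a\nb")      # "." skips the newline by default

print(has_jim("jimmy"), has_jim("jim"))
print(digits, words, non_space)
```

Every character that \w accepts is rejected by \W and the other way round, which is what the note about inverses means in practice.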
Below, we look at how to use the above metacharacters in regular expressions.
/\s+/
The above regular expression matches one or more whitespace characters in the target object.
/\d000/
If we have a complex financial statement in hand, the above regular expression makes it easy to find every amount in the thousands, that is, a digit followed by three zeros.
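Here is a small sketch of how these two patterns might be put to work in Python. The function names and the report text are hypothetical.

```python
import re


def collapse_whitespace(text):
    """Replace every run of whitespace with a single space."""
    return re.sub(r"\s+", " ", text)


def thousand_amounts(report):
    """Find every digit that is followed by three zeros."""
    return re.findall(r"\d000", report)


# Hypothetical line from a financial statement.
REPORT = "rent 3000, travel 450, equipment 12000, fees 7000"

print(collapse_whitespace("total \t due\n\nnow"))
print(thousand_amounts(REPORT))
```

One thing to keep in mind: the pattern has no notion of where a number starts, so in "12000" it picks out "2000". A stricter search would need the locators introduced later in this article.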
In addition to the metacharacters described above, regular expressions have another special kind of character: the locator (often called an anchor). A locator specifies where the matching pattern must appear in the target object.
The more commonly used locators include "^", "$", "\b", and "\B". The "^" locator specifies that the pattern must appear at the beginning of the target string; the "$" locator specifies that the pattern must appear at the end of the target string; the "\b" locator specifies that the pattern must appear at a word boundary, that is, at the beginning or the end of a word in the target string; and the "\B" locator specifies that the pattern must appear away from a word boundary, so the match can be neither the start nor the end of a word. We can likewise regard "^" and "$", and "\b" and "\B", as two pairs of locators with opposite meanings. For example:
/^hell/
Because this regular expression contains the "^" locator, it matches strings in the target object that begin with "hell", "hello", or "hellhound".
/ar$/
Because this regular expression contains the "$" locator, it matches strings in the target object that end with "car", "bar", or "ar".
/\bbom/
Because this regular expression begins with the "\b" locator, it matches strings in the target object that begin with "bomb" or "bom".
/man\b/
Because this regular expression ends with the "\b" locator, it matches strings in the target object that end with "human", "woman", or "man".
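The locator examples above can be tried out with the following Python sketch. The helper name found and the test strings are illustrative assumptions.

```python
import re


def found(pattern, text):
    """Return True if the pattern matches somewhere in the text."""
    return re.search(pattern, text) is not None


# "^" ties the pattern to the start of the string.
print(found(r"^hell", "hello world"), found(r"^hell", "say hello"))

# "$" ties the pattern to the end of the string.
print(found(r"ar$", "my car"), found(r"ar$", "car park"))

# "\b" requires a word boundary next to the pattern.
print(found(r"\bbom", "a bomb"), found(r"\bbom", "carbomb"))
print(found(r"man\b", "woman"), found(r"man\b", "manual"))

# "\B" is the opposite: the pattern must sit inside a word.
print(found(r"\Bar\B", "party"), found(r"\Bar\B", "ar"))
```

Each line prints True for the first string and False for the second, since only the first string has the pattern in the required position.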
To let the user set patterns more flexibly, regular expressions allow a range of characters to be specified in the pattern instead of one specific character. For example:
/[A-Z]/
The above regular expression matches any uppercase letter from A to Z.
/[a-z]/
The above regular expression matches any lowercase letter from a to z.
/[0-9]/
The above regular expression matches any digit from 0 to 9.
/([A-Z][A-Z][0-9])/
The above regular expression matches a string made up of two uppercase letters followed by a digit, such as "AB0". Note the use of "()" to group characters in a regular expression: everything inside "()" must appear together in the target object. The above regular expression therefore does not match a string such as "ABC", because the last character of "ABC" is a letter rather than a digit.
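A brief Python sketch of the ranges and the grouped pattern above. The sample strings and the name CODE are assumptions made for illustration.

```python
import re

# Hypothetical sample text with uppercase, lowercase, and digit characters.
SAMPLE = "Regex 101"

upper = re.findall(r"[A-Z]", SAMPLE)   # uppercase letters only
lower = re.findall(r"[a-z]", SAMPLE)   # lowercase letters only
digit = re.findall(r"[0-9]", SAMPLE)   # digits only

# Two uppercase letters followed by a digit, captured as one group.
CODE = re.compile(r"([A-Z][A-Z][0-9])")

hit = CODE.search("order AB0 shipped")
miss = CODE.search("ABC")

print(upper, lower, digit)
print(hit.group(1), miss)
```

The parentheses also capture the matched text, so group(1) returns the three-character code on its own.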
If we want an "or" operation in a regular expression, that is, a match on any one of several different patterns, we can use the pipe character "|". For example:
/to|too|2/
The above regular expression matches "to", "too", or "2" in the target object. Another common operator in regular expressions is the negation "[^]". Unlike the "^" locator described earlier, the negation "[^]" specifies that the target object must not contain the characters listed in the pattern. For example:
/[^A-C]/
The above regular expression matches any character in the target object other than A, B, and C. In general, when "^" appears inside "[]", it is treated as the negation operator; when "^" appears outside "[]", or there is no "[]" at all, it is treated as a locator.
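Alternation and negation as described above can be sketched in Python as follows. The sample strings are made up. One detail is specific to engines such as Python's re: alternatives are tried from left to right, so in "to|too|2" the shorter "to" wins before "too" is ever tried.

```python
import re

# Hypothetical sample text for the alternation example.
TEXT = "too 2 to"

# Alternatives are tried left to right, so "to" is found inside "too".
as_written = re.findall(r"to|too|2", TEXT)

# Listing the longer alternative first lets "too" match as a whole.
longest_first = re.findall(r"too|to|2", TEXT)

# Inside brackets, "^" negates: every character except A, B, and C.
not_abc = re.findall(r"[^A-C]", "ABCDE")

# Outside brackets, "^" is a locator; the two uses can be combined.
starts_ok = re.search(r"^[^A-C]", "Delta") is not None
starts_bad = re.search(r"^[^A-C]", "Alpha") is not None

print(as_written, longest_first)
print(not_abc, starts_ok, starts_bad)
```

The last two checks combine both meanings of "^": the first "^" pins the match to the start of the string, and the second one, inside the brackets, rejects A, B, and C as the first character.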
Finally, when the user needs to include a metacharacter in a pattern and have it matched as an ordinary character, the escape character "\" can be used. For example:
/Th\*/
The above regular expression matches "Th*" in the target object rather than "The" or the like.
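A short Python sketch of the escaping rule above. The sample strings are illustrative; re.escape is a standard-library helper that adds the backslashes automatically.

```python
import re

# Escaped: the asterisk is matched as a literal character.
literal_hit = re.search(r"Th\*", "Th* end")
literal_miss = re.search(r"Th\*", "The end")

# Unescaped: "*" is a quantifier again, so "h" may repeat zero or more times.
quantified = re.search(r"Th*", "The end")

# re.escape builds the escaped pattern from plain text.
escaped = re.escape("Th*")

print(literal_hit.group(), literal_miss)
print(quantified.group(), escaped)
```

Building patterns with re.escape is a convenient choice when the literal text comes from user input and may contain metacharacters.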