Regular Expressions in C# Explained

xiaoxiao, 2021-03-06

Many programming languages and tools have supported regular expressions for years. The .NET Framework provides a namespace and a set of classes that bring the full power of regular expressions to C#, and they are designed to be compatible with Perl 5 regular expressions. The regexp classes also offer some additional features, such as right-to-left matching and expression compilation. In this article I briefly introduce the classes and methods in System.Text.RegularExpressions, show some string matching and replacement examples, walk through a grouping structure in detail, and finish with some common expressions you may find useful.

Knowledge you should already have

Regular expressions seem to be one of those topics that many programmers learn and then forget. This article assumes that you have used regular expressions before, in particular the Perl 5 flavor. The .NET regexp classes are a superset of the Perl 5 functionality, so Perl 5 is a good conceptual starting point. It also assumes basic knowledge of C# syntax and the .NET Framework. If you are new to regular expressions, I suggest starting with the Perl 5 syntax. The authoritative book on the subject is Mastering Regular Expressions by Jeffrey E. F. Friedl, which I strongly recommend to readers who want to understand regular expressions in depth.

The RegularExpressions assembly

The regexp classes are contained in the System.Text.RegularExpressions.dll assembly, which you must reference when compiling your application, for example:

    csc /r:System.Text.RegularExpressions.dll foo.cs

This command builds foo.exe with a reference to the System.Text.RegularExpressions assembly.

A short overview of the namespace

The namespace contains only six classes and one delegate definition:

Capture: the result of a single capture.
CaptureCollection: a sequence of Capture objects.
Group: the result of a single group capture; inherits from Capture.
Match: the result of a single expression match; inherits from Group.
MatchCollection: a sequence of Match objects.
MatchEvaluator: a delegate used during replace operations.
Regex: an instance of a compiled regular expression.
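
To make the relationships among these classes concrete, here is a minimal sketch that walks from a MatchCollection down to the individual Capture objects. The class name CaptureDemo and the method CapturesOfGroup1 are illustrative names chosen for this example, not part of the library.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class CaptureDemo
{
    // Returns "value@index" for every capture recorded by group 1,
    // across all matches of the pattern in the input.
    public static List<string> CapturesOfGroup1(string input, string pattern)
    {
        var found = new List<string>();

        // MatchCollection: a sequence of Match objects.
        MatchCollection matches = Regex.Matches(input, pattern);
        foreach (Match m in matches)
        {
            // A Match exposes its groups; group 0 is the whole match.
            Group g = m.Groups[1];

            // CaptureCollection: every capture the group recorded,
            // not only the last one.
            foreach (Capture c in g.Captures)
            {
                found.Add(c.Value + "@" + c.Index);
            }
        }
        return found;
    }
}
```

For example, CaptureDemo.CapturesOfGroup1("ab cd", @"(\w)+") yields a@0, b@1, c@3 and d@4: each Match holds its groups, and each Group keeps every Capture it recorded.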

The Regex class also has several static methods:

Escape: escapes the regex metacharacters in a string.
IsMatch: returns a Boolean indicating whether the expression matches within the string.
Match: returns a Match instance.
Matches: returns a MatchCollection containing all of the matches.
Replace: replaces the text matched by the expression with a replacement string.
Split: returns an array of strings, split at the positions determined by the expression.
Unescape: converts the escaped characters in a string back to their unescaped form.

Simple matches

Let's start with simple expressions using the Regex and Match classes:

    Match m = Regex.Match("abracadabra", "(a|b|r)+");

We now have a Match instance that can be tested for success, for example:

    if (m.Success) ...

To use the matched substring, convert it to a string:

    Console.WriteLine("Match=" + m.ToString());

This produces the output Match=abra, which is the part of the string that was matched.

Replacing strings is just as straightforward. For example, the statement

    string s = Regex.Replace("abracadabra", "abra", "zzzz");

returns the string zzzzcadzzzz, in which every occurrence of the pattern has been replaced with zzzz.
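
The statements above can be collected into a small helper class. This is only a sketch: SimpleMatchDemo, TryFirstMatch and ReplaceAll are hypothetical names chosen for illustration, while Regex.Match, Match.Success and Regex.Replace are the library calls discussed above.

```csharp
using System.Text.RegularExpressions;

public static class SimpleMatchDemo
{
    // Tries to find the first match of the pattern in the input.
    // On success, value receives the matched substring.
    public static bool TryFirstMatch(string input, string pattern, out string value)
    {
        Match m = Regex.Match(input, pattern);
        value = m.Success ? m.Value : string.Empty;
        return m.Success;
    }

    // Replaces every occurrence of the pattern with the replacement string.
    public static string ReplaceAll(string input, string pattern, string replacement)
    {
        return Regex.Replace(input, pattern, replacement);
    }
}
```

Calling TryFirstMatch("abracadabra", "(a|b|r)+", out value) returns true with value set to abra, and ReplaceAll("abracadabra", "abra", "zzzz") returns zzzzcadzzzz.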

Now let's look at a more complex replacement:

    string s = Regex.Replace("  abra  ", @"^\s*(.*?)\s*$", "$1");

This statement returns the string abra, with its leading and trailing spaces removed. The pattern is useful for trimming leading and trailing whitespace from any string. It is written as a C# verbatim string literal: inside @"...", the compiler does not treat the backslash as an escape character, which is very convenient when a pattern contains metacharacters escaped with \. Also worth noting is the $1 in the replacement string. The only special constructs allowed in a replacement string are substitutions, which refer to capture groups in the expression.

Details of the matching engine

Now let's work through a slightly more complex example that uses a grouping structure.

Consider the following example:

    string text = "abracadabra1abracadabra2abracadabra3";
    string pat = @"
        (        # start of the first group
          abra   # match the literal string abra
          (      # start of the second group
            cad  # match the literal string cad
          )?     # end of the second group (optional)
        )        # end of the first group
        +        # match one or more times
    ";
    // Ignore whitespace and comments in the pattern (the x modifier)
    Regex r = new Regex(pat, RegexOptions.IgnorePatternWhitespace);
    // Get the list of group numbers
    int[] gnums = r.GetGroupNumbers();
    // Get the first match
    Match m = r.Match(text);
    while (m.Success) {
        // Start from group 1
        for (int i = 1; i ...
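
The listing above breaks off inside the loop. As a rough sketch of how such a walk-through can be completed (not necessarily the original code), the following collects every capture of every group for each match. GroupWalk and Describe are hypothetical names, and RegexOptions.IgnorePatternWhitespace is assumed here as the equivalent of the x modifier.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class GroupWalk
{
    // Returns one line per match and one line per capture of each group.
    public static List<string> Describe(string text)
    {
        string pat = @"
            (        # start of group 1
              abra   # literal abra
              (      # start of group 2
                cad  # literal cad
              )?     # end of group 2 (optional)
            )        # end of group 1
            +        # one or more repetitions of group 1
        ";
        var r = new Regex(pat, RegexOptions.IgnorePatternWhitespace);
        int[] gnums = r.GetGroupNumbers();
        var lines = new List<string>();

        // NextMatch continues the search where the previous match ended.
        for (Match m = r.Match(text); m.Success; m = m.NextMatch())
        {
            lines.Add("Match=[" + m.Value + "]");

            // Group 0 is the whole match, so start at index 1.
            for (int i = 1; i < gnums.Length; i++)
            {
                Group g = m.Groups[gnums[i]];
                for (int j = 0; j < g.Captures.Count; j++)
                {
                    Capture c = g.Captures[j];
                    lines.Add("Group" + gnums[i] + " Capture" + j +
                              "=[" + c.Value + "] Index=" + c.Index +
                              " Length=" + c.Length);
                }
            }
        }
        return lines;
    }
}
```

For the text abracadabra1abracadabra2abracadabra3 this reports three matches of abracadabra. In each one, group 1 records two captures (abracad and abra) and group 2 records one (cad), because the optional inner group takes part only in the first repetition.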

Another way to achieve what the previous example does is to use a MatchEvaluator. The new code looks like this:

    static string CapText(Match m) {
        // Get the matched string
        string x = m.ToString();
        // If the first character is lowercase
        if (char.IsLower(x[0]))
            // convert it to uppercase
            return char.ToUpper(x[0]) + x.Substring(1, x.Length - 1);
        return x;
    }

    static void Main() {
        string text = "the quick red fox jumped over the lazy brown dog.";
        System.Console.WriteLine("text=[" + text + "]");
        string pattern = @"\w+";
        string result = Regex.Replace(text, pattern, new MatchEvaluator(Test.CapText));
        System.Console.WriteLine("result=[" + result + "]");
    }

Note also that because only the words need to be modified, and not the non-word characters between them, the pattern is very simple.

Common expressions

To give a better feel for how regular expressions are used in C#, here are some expressions that may be useful to you. They have been used in other environments as well, and I hope you find them helpful.

Roman numerals

    string p1 = "^m*(d?c{0,3}|c[dm])" + "(l?x{0,3}|x[lc])(v?i{0,3}|i[vx])$";
    string t1 = "vii";
    Match m1 = Regex.Match(t1, p1);

Swapping the first two words

    string t2 = "the quick brown fox";
    string p2 = @"(\S+)(\s+)(\S+)";
    Regex x2 = new Regex(p2);
    string r2 = x2.Replace(t2, "$3$2$1", 1);

Keyword = value

    string t3 = "myval = 3";
    string p3 = @"(\w+)\s*=\s*(.*)\s*$";
    Match m3 = Regex.Match(t3, p3);

A line of at least 80 characters

    string t4 = "********************" + "******************************" + "******************************";
    string p4 = ".{80,}";
    Match m4 = Regex.Match(t4, p4);

Month/day/year hours:minutes:seconds

    string t5 = "01/01/01 16:10:01";
    string p5 = @"(\d+)/(\d+)/(\d+) (\d+):(\d+):(\d+)";
    Match m5 = Regex.Match(t5, p5);

Changing directories (Windows only)

    string t6 = @"C:\Documents and Settings\user1\Desktop\";
    // Backslashes are literal in a replacement string, so they are not doubled there
    string r6 = Regex.Replace(t6, @"\\user1\\", @"\user2\");

Expanding %nn hex escapes

    string t7 = "%41"; // capital A
    string p7 = "%([0-9A-Fa-f][0-9A-Fa-f])";
    string r7 = Regex.Replace(t7, p7, HexConvert);

Removing C comments (imperfectly)

    string t8 = @"
    /*
     * traditional comment
     */
    ";
    string p8 = @"
        /\*   # match the opening delimiter of the comment
        .*?   # match the comment text, as few characters as possible
        \*/   # match the closing delimiter of the comment
    ";
    string r8 = Regex.Replace(t8, p8, "", RegexOptions.IgnorePatternWhitespace | RegexOptions.Singleline);

Removing leading and trailing whitespace

    string t9a = "     leading";
    string p9a = @"^\s+";
    string r9a = Regex.Replace(t9a, p9a, "");
    string t9b = "trailing  ";
    string p9b = @"\s+$";
    string r9b = Regex.Replace(t9b, p9b, "");

Turning a backslash followed by n into a real newline

    string t10 = @"\ntest\n";
    string r10 = Regex.Replace(t10, @"\\n", "\n");

Detecting IP addresses

    string t11 = "55.54.53.52";
    string p11 = "^" +
        @"([01]?\d\d|2[0-4]\d|25[0-5])\." +
        @"([01]?\d\d|2[0-4]\d|25[0-5])\." +
        @"([01]?\d\d|2[0-4]\d|25[0-5])\." +
        @"([01]?\d\d|2[0-4]\d|25[0-5])" +
        "$";
    Match m11 = Regex.Match(t11, p11);

Removing the leading path from a file name

    string t12 = @"c:\file.txt";
    string p12 = @"^.*\\";
    string r12 = Regex.Replace(t12, p12, "");

Joining lines in multi-line strings

    string t13 = @"this is
    a split line";
    string p13 = @"\s*\r?\n\s*";
    string r13 = Regex.Replace(t13, p13, " ");

Extracting all numbers from a string

    string t14 = @"
    test 1
    test 2.3
    test 47
    ";
    string p14 = @"(\d+\.?\d*|\.\d+)";
    MatchCollection mc14 = Regex.Matches(t14, p14);

Finding all-uppercase words

    string t15 = "This IS a Test OF ALL Caps";
    string p15 = @"(\b[^\Wa-z0-9_]+\b)";
    MatchCollection mc15 = Regex.Matches(t15, p15);

Finding lowercase words

    string t16 = "This is A Test of lowercase";
    string p16 = @"(\b[^\WA-Z0-9_]+\b)";
    MatchCollection mc16 = Regex.Matches(t16, p16);

Finding words with an initial capital

    string t17 = "This is A Test of Initial Caps";
    string p17 = @"(\b[^\Wa-z0-9_][^\WA-Z0-9_]*\b)";
    MatchCollection mc17 = Regex.Matches(t17, p17);

Finding links in simple HTML

    string t18 = @ ...

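The hex-escape recipe above passes a HexConvert evaluator that is not shown. One possible shape for it is sketched below, assuming it is a MatchEvaluator-compatible method that converts the two captured hex digits into a character. The class HexEscapes and the method Expand are hypothetical names added for illustration.

```csharp
using System.Globalization;
using System.Text.RegularExpressions;

public static class HexEscapes
{
    // Assumed implementation: parses the two hex digits captured in
    // group 1 and returns the character with that code.
    public static string HexConvert(Match m)
    {
        int code = int.Parse(m.Groups[1].Value,
                             NumberStyles.HexNumber,
                             CultureInfo.InvariantCulture);
        return ((char)code).ToString();
    }

    // Replaces every %nn escape in the input with the decoded character.
    public static string Expand(string input)
    {
        return Regex.Replace(input,
                             "%([0-9A-Fa-f][0-9A-Fa-f])",
                             new MatchEvaluator(HexConvert));
    }
}
```

With this sketch, HexEscapes.Expand("%41") returns A, and any text outside the %nn escapes is left untouched.
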
When reposting, please cite the original address: https://www.9cbs.com/read-109210.html
