Regular Expressions in C# Explained (reposted; author: Liu Yanqing)



Article category: C# applications. Date: Thursday, 2004-4-1

Author: Liu Yanqing

Over the years, many programming languages and tools have added support for regular expressions. The .NET base class library contains a namespace and a set of classes that bring the full power of regular expressions to .NET, and they are compatible with the regular expressions of Perl 5. In addition, the regexp classes provide some extra functionality, such as right-to-left pattern matching and expression compilation.

In this article I briefly introduce the classes and methods in System.Text.RegularExpressions, show some string matching and replacement examples, look at group structure in detail, and finish with some common expressions you may find useful.

Basic knowledge you should have

Regular expressions may be one of those things many programmers "keep forgetting". This article assumes that you already know how to use regular expressions, especially those of Perl 5. The .NET regexp classes are a superset of the Perl 5 expressions, so in theory Perl 5 is a good starting point. We also assume that you have basic knowledge of C# syntax and the .NET architecture.

If you have no background in regular expressions, I suggest you start with the Perl 5 syntax. The authoritative book on regular expressions is Mastering Regular Expressions by Jeffrey Friedl, which we strongly recommend to readers who want to understand expressions in depth.

The RegularExpressions assembly

The regexp classes are contained in the System.Text.RegularExpressions.dll file, and you must reference this file when compiling your application. For example, the command

    csc /r:System.Text.RegularExpressions.dll foo.cs

creates the file foo.exe, which references the System.Text.RegularExpressions assembly.
Namespace overview

The namespace contains only six classes and one delegate:

    Capture: the result of a single match;
    CaptureCollection: a sequence of Capture objects;
    Group: the result of a group capture, derived from Capture;
    Match: the result of matching the whole expression, derived from Group;
    MatchCollection: a sequence of Match objects;
    MatchEvaluator: the delegate used when performing a replace operation;
    Regex: a compiled expression instance.

The Regex class also includes some static methods:

    Escape: escapes the regex metacharacters in a string;
    IsMatch: returns a Boolean that is true if the expression matches in the string;
    Match: returns a Match instance;
    Matches: returns a collection of matches;
    Replace: replaces the text matched by the expression with a replacement string;
    Split: returns an array of strings split at the positions determined by the expression;
    Unescape: removes the escaping from escaped characters in a string.

Simple match

We start with a simple expression that uses the Regex and Match classes.
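As a quick warm-up before the first Match example, here is a minimal sketch that exercises a few of the static methods listed above. The class and method names (StaticRegexDemo, HasDigits, SplitOnCommas) and the sample strings are illustrative assumptions, not part of the original article.

```csharp
using System;
using System.Text.RegularExpressions;

static class StaticRegexDemo
{
    // True when the input contains at least one run of digits.
    public static bool HasDigits(string input)
    {
        return Regex.IsMatch(input, @"\d+");
    }

    // Splits on commas, swallowing any whitespace around each comma.
    public static string[] SplitOnCommas(string input)
    {
        return Regex.Split(input, @"\s*,\s*");
    }

    static void Main()
    {
        Console.WriteLine(HasDigits("abc123"));                         // True
        Console.WriteLine(string.Join("|", SplitOnCommas("a , b,c")));  // a|b|c
        Console.WriteLine(Regex.Escape("1+1=2"));                       // 1\+1=2
        Console.WriteLine(Regex.Unescape(@"1\+1=2"));                   // 1+1=2
    }
}
```

Escape is mostly useful when user-supplied text has to be embedded in a pattern as a literal.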

The following statement matches a pattern against a string:

    Match m = Regex.Match("abracadabra", "(a|b|r)+");

We now have an instance of the Match class that can be tested, for example:

    if (m.Success) ...

To use the matched text, convert it to a string:

    Console.WriteLine("match=" + m.ToString());

This example produces the output match=abra, which is the matched string.

Replacing strings is just as intuitive. For example, the statement

    string s = Regex.Replace("abracadabra", "abra", "zzzz");

returns the string zzzzcadzzzz: every occurrence of the matched string is replaced with zzzz.

Now let's look at a more complex replacement:

    string s = Regex.Replace("  abra  ", @"^\s*(.*?)\s*$", "$1");

This statement returns the string abra, with the leading and trailing whitespace removed. The pattern is useful for trimming leading and trailing whitespace from any string.

In C# we often use verbatim string literals. In a verbatim string the compiler does not treat the character "\" as an escape character, so @"..." is very convenient when a pattern uses "\" to introduce escapes. Also note the $1 used in the replacement string: it is a substitution that refers to the text captured by group 1 of the pattern.

The details of the matching engine

We now use group structure to work through a slightly more complex example.

Consider the following example:

    string text = "abracadabra1abracadabra2abracadabra3";
    string pat = @"
        (        # start of the first group
          abra   # match the string abra
          (      # start of the second group
            cad  # match the string cad
          )?     # end of the second group (optional)
        )        # end of the first group
        +        # match one or more times
        ";
    // Ignore whitespace and comments in the pattern (the x modifier)
    Regex r = new Regex(pat, RegexOptions.IgnorePatternWhitespace);
    // Get the list of group numbers
    int[] gnums = r.GetGroupNumbers();
    // First match
    Match m = r.Match(text);
    while (m.Success)
    {
        // Start from group 1
        for (int i = 1; i ...
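The listing above breaks off inside its loop in this copy. As a hedged sketch of how the captures of each group can be enumerated, the code below walks Match.Groups and Group.Captures for the same pattern and text. The class name GroupDemo, the method DescribeCaptures, and the "group:index:value" output format are assumptions made for illustration; this is not a reconstruction of the author's missing code.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class GroupDemo
{
    // Returns one entry per capture, formatted as "group:index:value".
    public static List<string> DescribeCaptures(string text)
    {
        string pat = @"
            (          # start of group 1
              abra     # match the literal abra
              (        # start of group 2
                cad    # match the literal cad
              )?       # end of group 2 (optional)
            )          # end of group 1
            +          # repeat group 1 one or more times
            ";
        Regex r = new Regex(pat, RegexOptions.IgnorePatternWhitespace);
        List<string> lines = new List<string>();
        for (Match m = r.Match(text); m.Success; m = m.NextMatch())
        {
            // Group 0 is the whole match, so start from group 1.
            for (int i = 1; i < m.Groups.Count; i++)
            {
                // A repeated group records one Capture per iteration.
                foreach (Capture c in m.Groups[i].Captures)
                {
                    lines.Add(i + ":" + c.Index + ":" + c.Value);
                }
            }
        }
        return lines;
    }

    static void Main()
    {
        string text = "abracadabra1abracadabra2abracadabra3";
        foreach (string line in DescribeCaptures(text))
        {
            Console.WriteLine(line);
        }
    }
}
```

For the first match (abracadabra at index 0) this yields 1:0:abracad, 1:7:abra and 2:4:cad: group 1 is captured twice because of the + quantifier, while the optional group 2 is captured only on the first pass.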

The output of this word-capitalizing example is shown below:

    text=[the quick red fox jumped over the lazy brown dog.]
    result=[The Quick Red Fox Jumped Over The Lazy Brown Dog.]

Another way to do the same job with an expression-based pattern is to pass a MatchEvaluator. The new code looks like this:

    using System;
    using System.Text.RegularExpressions;

    class Test
    {
        static string CapText(Match m)
        {
            // Get the matched string
            string x = m.ToString();
            // If the first character is lowercase
            if (char.IsLower(x[0]))
                // convert it to uppercase
                return char.ToUpper(x[0]) + x.Substring(1, x.Length - 1);
            return x;
        }

        static void Main()
        {
            string text = "the quick red fox jumped over the lazy brown dog.";
            Console.WriteLine("text=[" + text + "]");
            string pattern = @"\w+";
            string result = Regex.Replace(text, pattern, new MatchEvaluator(Test.CapText));
            Console.WriteLine("result=[" + result + "]");
        }
    }

Note that the pattern is very simple here, because only the words need to be modified and the non-word characters are left untouched.

Common expressions

To help you see how regular expressions are used in the C# environment, I have written down some expressions that may be useful to you. They have also been used in other environments, and I hope you find them helpful.

Roman numerals

    string p1 = "^m*(d?c{0,3}|c[dm])" + "(l?x{0,3}|x[lc])(v?i{0,3}|i[vx])$";
    string t1 = "vii";
    Match m1 = Regex.Match(t1, p1);

Swapping the first two words

    string t2 = "the quick brown fox";
    string p2 = @"(\S+)(\s+)(\S+)";
    Regex x2 = new Regex(p2);
    string r2 = x2.Replace(t2, "$3$2$1", 1);

Keyword = value

    string t3 = "myval = 3";
    string p3 = @"(\w+)\s*=\s*(.*)\s*$";
    Match m3 = Regex.Match(t3, p3);

Lines of at least 80 characters

    string t4 = "********************"
              + "******************************"
              + "******************************";
    string p4 = ".{80,}";
    Match m4 = Regex.Match(t4, p4);

Month/day/year hours:minutes:seconds

    string t5 = "01/01/01 16:10:01";
    string p5 = @"(\d+)/(\d+)/(\d+) (\d+):(\d+):(\d+)";
    Match m5 = Regex.Match(t5, p5);

Changing directories (Windows only)

    string t6 = @"C:\Documents and Settings\user1\Desktop\";
    // A backslash is literal in a replacement string, so it is not doubled there
    string r6 = Regex.Replace(t6, @"\\user1\\", @"\user2\");

Expanding hex (%nn) escapes

    string t7 = "%41"; // capital A
    string p7 = "%([0-9A-Fa-f][0-9A-Fa-f])";
    // HexConvert is a MatchEvaluator method that is not shown here
    string r7 = Regex.Replace(t7, p7, HexConvert);

Deleting C comments (imperfect)

    string t8 = @"
    /*
     * traditional comment
     */
    ";
    string p8 = @"
        /\*   # match the opening delimiter of the comment
        .*?   # match the comment body
        \*/   # match the closing delimiter of the comment
        ";
    string r8 = Regex.Replace(t8, p8, "",
        RegexOptions.IgnorePatternWhitespace | RegexOptions.Singleline);

Removing leading and trailing whitespace

    string t9a = "     leading";
    string p9a = @"^\s+";
    string r9a = Regex.Replace(t9a, p9a, "");
    string t9b = "trailing     ";
    string p9b = @"\s+$";
    string r9b = Regex.Replace(t9b, p9b, "");

Turning a backslash followed by n into a real newline

    string t10 = @"\ntest\n";
    string r10 = Regex.Replace(t10, @"\\n", "\n");

IP address

    string t11 = "55.54.53.52";
    string p11 = "^"
               + @"([01]?\d\d?|2[0-4]\d|25[0-5])\."
               + @"([01]?\d\d?|2[0-4]\d|25[0-5])\."
               + @"([01]?\d\d?|2[0-4]\d|25[0-5])\."
               + @"([01]?\d\d?|2[0-4]\d|25[0-5])"
               + "$";
    Match m11 = Regex.Match(t11, p11);

Removing the leading path from a file name

    string t12 = @"c:\file.txt";
    string p12 = @"^.*\\";
    string r12 = Regex.Replace(t12, p12, "");

Joining the lines of a multiline string

    string t13 = @"this is
    a split line";
    string p13 = @"\s*\r?\n\s*";
    string r13 = Regex.Replace(t13, p13, " ");

Extracting all numbers from a string

    string t14 = @"test 1 test 2.3 test 47";
    string p14 = @"(\d+\.?\d*|\.\d+)";
    MatchCollection mc14 = Regex.Matches(t14, p14);

Finding all-caps words

    string t15 = "This IS a Test OF ALL Caps";
    string p15 = @"(\b[^\Wa-z0-9_]+\b)";
    MatchCollection mc15 = Regex.Matches(t15, p15);

Finding lowercase words

    string t16 = "This is A Test of lowercase";
    string p16 = @"(\b[^\WA-Z0-9_]+\b)";

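The hex-escape entry above passes a HexConvert evaluator that the article never shows. A minimal sketch of what such an evaluator might look like follows, assuming it should turn the two hex digits captured in group 1 into the character with that code. The body of HexConvert, the wrapper method Expand, and the class name HexEscapeDemo are hypothetical and are not taken from the original.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

static class HexEscapeDemo
{
    // Hypothetical evaluator: converts the hex digits in group 1
    // into the single character that has that code.
    static string HexConvert(Match m)
    {
        int code = int.Parse(m.Groups[1].Value,
                             NumberStyles.HexNumber,
                             CultureInfo.InvariantCulture);
        return ((char)code).ToString();
    }

    // Expands every %nn escape in the input.
    public static string Expand(string input)
    {
        string pattern = "%([0-9A-Fa-f][0-9A-Fa-f])";
        return Regex.Replace(input, pattern, new MatchEvaluator(HexConvert));
    }

    static void Main()
    {
        Console.WriteLine(Expand("%41"));        // A
        Console.WriteLine(Expand("a%20b%2Fc"));  // a b/c
    }
}
```

This sketch handles only single-byte escapes; multi-byte encoded characters would need a real URL decoder.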
