Getting Started 3

xiaoxiao2021-03-06  15

When I first learned regular expressions, I attacked the constructs one by one. The summary of regular-expression constructs in the Java documentation is fewer than 6 sheets of paper; I spent a day or a day and a half testing them and had learned almost all of it. Unfortunately, YQJ2065 does not have a good memory: three days later I remembered only a few of the simplest things. Awful.

The regular expression documentation in MSDN is a complete textbook. YQJ2065 studied it once (for JScript), and had forgotten most of it ten days later.

Now I am trying a different approach: if I were designing regex myself, could I cover it in 5 points? Remember the 5 core points:

1. Basic regular expressions: a single character a (where a is in the alphabet σ of ordinary characters), or a metacharacter. The languages are: L(a) = {a}; L(ε) = {ε}; L(∅) = {}.

In the following, r and s are regular expressions:

2. The expression r|s: L(r|s) = L(r) ∪ L(s).

3. The expression rs: L(rs) = L(r)L(s).

4. The expression r*: L(r*) = L(r)*.

5. The expression (r): L((r)) = L(r). Parentheses do not change the language; they only adjust operator precedence. Precedence: closure > concatenation > alternation.

I don't need to rack my brain; I can count the five points on my fingers. Whichever finger I have forgotten, I just look up that point. Single characters, alternation |, concatenation, closure *, parentheses ().
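The five points can be tried out directly in Java. The following is only a minimal sketch: the class name CorePointsDemo and the helper inLanguage are made up for illustration, and Pattern.matches is used because it tests whether the whole input belongs to the language of the regex.

```java
import java.util.regex.Pattern;

class CorePointsDemo {
    // Hypothetical helper: true when the whole input is in L(regex).
    static boolean inLanguage(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(inLanguage("a", "a"));        // single character
        System.out.println(inLanguage("a|b", "b"));      // alternation
        System.out.println(inLanguage("ab", "ab"));      // concatenation
        System.out.println(inLanguage("a*", "aaa"));     // closure
        System.out.println(inLanguage("a*", ""));        // closure also contains the empty string
        System.out.println(inLanguage("(a|b)c", "bc"));  // parentheses regroup the operators
        System.out.println(inLanguage("a|bc", "ac"));    // false: a|bc means a or bc
    }
}
```

Only the last line prints false, which is the precedence rule at work: concatenation binds tighter than alternation.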

§3 The alphabet σ

Single characters depend on the alphabet σ. Java supports Unicode, so σ = Unicode. For example:

With String str = "Single characters depend on alphabet" and a single word from it as the regex, the match works well. I wanted to use EditPad Pro for convenience, but it does not support Unicode. Besides, I am not ready to study Unicode now (leave that for later), so first I take ASCII, the American Standard Code for Information Interchange. I used to look it up often when writing assembly language, but it is a bit hazy now.
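As a small sketch of σ = Unicode, a single non-ASCII character is a perfectly good basic regex in Java. The sample text and the class name UnicodeCharDemo are assumptions for illustration; the characters are written as \u escapes so the source file stays pure ASCII.

```java
import java.util.regex.Pattern;

class UnicodeCharDemo {
    // Hypothetical helper: true when regex matches somewhere in text.
    static boolean contains(String regex, String text) {
        return Pattern.compile(regex).matcher(text).find();
    }

    public static void main(String[] args) {
        // Three Chinese characters (U+5B57, U+6BCD, U+8868), chosen only as sample data.
        String str = "\u5b57\u6bcd\u8868";
        System.out.println(contains("\u6bcd", str)); // true: one Unicode character as the regex
        System.out.println(contains("z", str));      // false: z does not occur in the text
    }
}
```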

ASCII is roughly divided into two sections: printable and non-printable characters.

• Non-printable characters: Java has a few non-printable characters that can be placed in regular expressions. These are a bit strange.

• Printable characters: all uppercase and lowercase letters, all digits, all punctuation marks, and some other symbols. The key point is that some printable characters have special meanings and become metacharacters.

Of all the ASCII characters, our predecessors picked one special character, the backslash \, as the escape switch. It is the same as Java's own escape character, but I still wonder why they did not choose #. The backslash is easy to confuse with the division slash /, and directories on M$ operating systems are often written like E:\regex\1\2. I have heard that in C programming the regex needs extra backslashes to match a path like E:\regex\1; I don't know how it works in the Java source code.

Note on core point 1: single characters such as a, x, #, ', and " are all basic regexes. A single metacharacter is not a basic regex.
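A rough sketch of this note: ordinary characters match themselves, while a metacharacter such as * has to be escaped before it can be matched literally. The sample string and the names LiteralDemo and found are hypothetical.

```java
import java.util.regex.Pattern;

class LiteralDemo {
    // Hypothetical helper: true when regex matches somewhere in text.
    static boolean found(String regex, String text) {
        return Pattern.compile(regex).matcher(text).find();
    }

    public static void main(String[] args) {
        String str = "b!c@d#e$f*g'h\"i";          // sample text, made up for this sketch
        System.out.println(found("#", str));       // true: # is an ordinary character
        System.out.println(found("'", str));       // true
        System.out.println(found("\"", str));      // true: the engine receives a single "
        System.out.println(found("\\*", str));     // true: the metacharacter * is escaped
        System.out.println(found("x", str));       // false: x does not occur
    }
}
```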

Exercise: str = "b! C @ d # e $ f% g ^ h & j * l" "(m) n_o p | q / r = ST / UV, W? X> y

1. regex = "#" (or \", \', r, \\, >, n). Note: in Java source code a string literal must use the escape \" for a double quote, so we write regex = "\"", but the regular expression itself is just ". For example, the program:

import java.util.regex.*;

class Regex1 {
    public static void main(String args[]) {
        // In Java source, \" stands for one double quote inside a string literal.
        String str = "for my \"money\"";
        String regex = "\""; // the regex engine receives just "
        Pattern p = Pattern.compile(regex);
        Matcher m = p.matcher(str);
        boolean result = m.find(); // true: str contains a double quote
        System.out.println(result);
    }
}

The point here is that the Java compiler first processes escapes such as \" in the source code, and only then does the regular expression engine receive the regexp ". This is also why C needs four backslashes to match one literal \ character. We have to distinguish between the Java language and regular expressions.

Be sure to pay attention to this when doing the various regular expression exercises.
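The two layers of escaping can be made visible by printing string lengths. This is a sketch only; the class name BackslashDemo, the helper countMatches, and the sample path are assumptions for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BackslashDemo {
    // Hypothetical helper: counts how many times regex matches inside text.
    static int countMatches(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        String path = "E:\\regex\\1"; // the text itself is E:\regex\1
        String regex = "\\\\";        // four backslashes in source, two reach the regex engine
        System.out.println(path.length());             // 10 characters after compilation
        System.out.println(regex.length());            // 2 characters after compilation
        System.out.println(countMatches(regex, path)); // 2 literal backslashes found
    }
}
```

The compiler halves the backslashes once, and the regex engine halves them again, which is where the count of four comes from.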

§4

Core point 2 is a|b. The documentation calls this a "character class", also known as a "character set". It is hard to translate well: "character set" is easily confused with the ASCII character set, and "character class" is neither fish nor fowl. Maybe "class" is still better than "set".

1. How do we match for and far in the text fororarafar? Remember the precedence of the operators. f(o|a)r is the correct answer; of course f(a|o)r works too, since the order of the alternatives does not matter. However, fo|ar will match fo and ar.

2. For a regex such as f(a|s|d|f|g|h|j|l)r, people like brevity and write f[asdfghjl]r instead. This adds two more metacharacters, [ and ].

3. For a regex such as f[123456]r, people write the shorter f[1-6]r. Note that the hyphen - is not a metacharacter; it has special semantics only inside [ ] and between two characters in order, as in [1-5a-yA-F]. Outside [ ], the hyphen - is an ordinary character.

4. What if we want to match any character at all? There is a metacharacter (the dot .) that is very easy to use and often misused. We will explain it later.

5. What if we want the complement? For example, every f⊙r in which ⊙ is anything other than a, b, or c, expressed in set notation as σ − {a, b, c}. We write [^abc]. Note: the caret ^ has this meaning only when it comes immediately after [. For the text f^rfarfbrffhrf^^r, f[^abc]r matches f^r and fhr, while f[a^bc]r can match far, f^r, fbr, and fcr. The caret ^ differs from - in that it is a metacharacter, but as a metacharacter ^ represents a different concept; in this position ^ is similar to - and can be understood as a non-metacharacter.

6. Shorthands and their negations: [0-9] has the shorthand \d. This reminds us of assembly-language mnemonics, but \d also adds to the burden on our memory. \D is equivalent to [^0-9]. In Java, \w is equivalent to [0-9a-z_A-Z]; for example, when f\wr is matched against the string farf^rfhrf1rf price rf_r (there is an underscore inside), the matches are far, fhr, f1r, and f_r. \W (look carefully, it is an uppercase W) is equivalent to [^\w] and therefore to [^0-9a-z_A-Z]. There are also an uppercase S and a lowercase s: \s is equivalent to the whitespace and non-printable characters [ \t\r\n\f\x0B], and uppercase \S is the negation of lowercase \s.

7. Take a rest ...
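Points 1 to 6 above can be checked with a small program. This is a sketch rather than the original exercise: the class name CharClassDemo and the helper findAll are made up, and the shorter sample strings are assumptions except where they repeat the text above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CharClassDemo {
    // Hypothetical helper: collects every match of regex in text, left to right.
    static List<String> findAll(String regex, String text) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(text);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        // Point 1: parentheses versus operator precedence
        System.out.println(findAll("f(o|a)r", "fororarafar"));        // [for, far]
        System.out.println(findAll("fo|ar", "fororarafar"));          // [fo, ar, ar]
        // Points 2 and 3: character class and range
        System.out.println(findAll("f[asdfghjl]r", "farfsrfxr"));     // [far, fsr]
        System.out.println(findAll("f[1-6]r", "f1rf7rf6rf-r"));       // [f1r, f6r]
        // Point 5: the caret negates only right after [
        System.out.println(findAll("f[^abc]r", "f^rfarfbrffhrf^^r")); // [f^r, fhr]
        System.out.println(findAll("f[a^bc]r", "f^rfarfbrffhrf^^r")); // [f^r, far, fbr]
        // Point 6: shorthands and their negations
        System.out.println(findAll("f\\wr", "farf^rfhrf1rf_r"));      // [far, fhr, f1r, f_r]
        System.out.println(findAll("f\\Wr", "farf^rfhrf1rf_r"));      // [f^r]
        System.out.println(findAll("\\d", "a1b22"));                  // [1, 2, 2]
    }
}
```

Note that each shorthand is written with a doubled backslash in the Java source, for the reason given in §3.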

Exercise:

Regex (in Java Source Codes) String "I say: /" i love you, java, are you love me? / "/ or // 【patternsyntaxexception】 i say //i // love // ​​javai Say // i //love//java [ketball "RFAFARFA | ORFORFORFARFOAR

Program 2.2-1:

import java.util.regex.*;

class Regex1 {
    public static void main(String args[]) {
        String regex = "y"; // also try a backslash regex such as "\\\\"
        // In Java source, each \\ stands for one backslash in the text.
        String str = "I say \\ I \\ love \\ Java";
        System.out.println(str);
        Pattern p = Pattern.compile(regex);
        Matcher m = p.matcher(str);
        String s = m.replaceAll("⊙"); // use ("") to delete the matches instead
        System.out.println(s);
    }
}
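The PatternSyntaxException mentioned in the exercise can be reproduced in isolation. The following sketch assumes the exercise is about a regex made of a lone backslash; the names BackslashReplaceDemo, compiles, and replace are hypothetical, and "+" is used as the replacement only to keep the output easy to read.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class BackslashReplaceDemo {
    // Hypothetical helper: true when the regex compiles without a syntax error.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    // Hypothetical helper: replaces every match of regex in text.
    static String replace(String regex, String text, String replacement) {
        return Pattern.compile(regex).matcher(text).replaceAll(replacement);
    }

    public static void main(String[] args) {
        String str = "I say \\ I \\ love \\ Java";         // text: I say \ I \ love \ Java
        System.out.println(compiles("\\"));                // false: the engine sees a lone backslash
        System.out.println(compiles("\\\\"));              // true: the engine sees an escaped backslash
        System.out.println(replace("\\\\", str, "+"));     // I say + I + love + Java
    }
}
```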

Program 2.2-2:

When reposting, please credit the original address: https://www.9cbs.com/read-47371.html
