Sed by example, Part 1


Common threads: Sed by example, Part 1

Reprinted from: IBM DeveloperWorks China website

In this article, Daniel Robbins shows you how to use the powerful (but often forgotten) UNIX stream editor, sed. Sed is an ideal tool for batch-editing files and for writing shell scripts that modify existing files in powerful ways.

Pick an editor

In the UNIX world, we have a lot of text editors to choose from. Think about it: vi, emacs, and jed come to mind, along with many others. We all have a favorite editor (and favorite key bindings) that we have come to know and love. With a trusty editor, we can easily handle any number of UNIX administration or programming tasks.

Interactive editors are great, but they have limits. Their interactive nature can be a strength, and it can also be a weakness. Consider a situation where you need to make similar changes to a group of files. You could fire up your favorite editor and perform a series of tedious, repetitive, and time-consuming edits by hand. However, there is a better way.

Enter sed

It would be nice if we could automate the process of editing files, so that we could edit them in "batch" mode or even write scripts that make sophisticated changes to existing files. Fortunately, for these situations there is a better way, and it is called sed.

Sed is a lightweight stream editor that is included with nearly all UNIX flavors, including Linux. Sed has a lot of nice features. First, it is quite small, typically many times smaller than your favorite scripting language. Second, because sed is a stream editor, it can edit data it receives from standard input, such as from a pipeline. So you don't need to have the data to be edited stored in a file on disk. Because data can just as easily be piped to sed, it is easy to use sed as one stage of a long pipeline in a powerful shell script. Try doing that with your favorite editor.

GNU sed

Fortunately for Linux users, one of the nicest versions of sed happens to be GNU sed, which was at version 3.02 when this article was written. Every Linux distribution has (or at least should have) GNU sed. GNU sed is popular not only because its source code is freely distributable, but also because it has a lot of handy, time-saving extensions. In addition, GNU sed does not suffer from many of the limitations of earlier and proprietary versions of sed, such as a limited line length; GNU sed handles lines of any length with ease.

The newest GNU sed

While researching this article, I noticed that several online sed enthusiasts referred to GNU sed 3.02a. Strangely, I couldn't find sed 3.02a on ftp.gnu.org (see Resources for these links), so I had to look elsewhere. I found it on alpha.gnu.org, in /pub/sed. I happily downloaded, compiled, and installed it, only to find a few minutes later that the most recent version of sed was 3.02.80, whose source code sits right next to the 3.02a sources on alpha.gnu.org. Once GNU sed 3.02.80 was installed, I was finally ready to go.

alpha.gnu.org

alpha.gnu.org (see Resources) is the home of new and experimental GNU source code. However, you will also find a lot of nice, stable source code there. For some reason, many GNU developers either forget to move stable sources to ftp.gnu.org or have unusually long "beta" periods (two years!). For example, sed 3.02a was two years old and even 3.02.80 was one year old, yet neither was available on ftp.gnu.org when this article was written in August 2000.

The right sed

This series uses GNU sed 3.02.80. A few of the most advanced examples in the later articles will not work with GNU sed 3.02 or 3.02a. If you are using a non-GNU sed, your results may vary. Why not take some time to install GNU sed 3.02.80 now? Then you will not only be ready for the rest of the series, you will also be using one of the best versions of sed available.

Sed examples

Sed works by performing any number of user-specified editing operations ("commands") on the input data. Sed is line-based, so the commands are performed on each line in order. Sed writes its results to standard output (stdout); it does not modify any input files.

Let's look at some examples. The first few are a bit odd, because I am using them to illustrate how sed works rather than to perform any useful task. However, if you are new to sed, it is very important that you understand them. Here is the first example:

    $ sed -e 'd' /etc/services

If you type this command, you will get no output at all. So what happened? In this example, sed was called with one editing command, 'd'. Sed opened the /etc/services file, read a line into its pattern buffer, performed the editing command ("delete line"), and was left with nothing to print. It then repeated these steps for each following line. This produced no output, because the 'd' command deleted every line in the pattern buffer.

There are a few things to notice in this example. First, /etc/services was not modified at all. This is because sed only reads from the file named on the command line and uses it as input; it does not try to modify the file. Second, sed is line-oriented. The 'd' command did not simply tell sed to delete all incoming data in one go. Instead, sed read each line of /etc/services, one at a time, into its internal buffer, called the pattern buffer. Once a line was in the pattern buffer, sed performed the 'd' command, which left nothing in the buffer to print. Later, I will show you how to use address ranges to control which lines a command applies to; if you don't use an address, the command applies to all lines.

The third thing to notice is the single quotes around the 'd' command. It is a good idea to get into the habit of enclosing sed commands in single quotes, because this disables shell expansion.

Another sed example

The next example uses sed to remove the first line of the /etc/services file from the output stream.
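Before moving on, the behaviour of the bare 'd' command described above can be checked without touching /etc/services. The following is a minimal sketch, assuming a POSIX shell with mktemp available; the scratch file, its contents, and the variable names are hypothetical and chosen only for illustration.

```shell
#!/bin/sh
# Hypothetical scratch file standing in for /etc/services.
scratch=$(mktemp)
printf 'line one\nline two\nline three\n' > "$scratch"

# A bare 'd' deletes every line, so sed writes nothing to stdout.
deleted_output=$(sed -e 'd' "$scratch")
echo "output of sed -e 'd': [$deleted_output]"

# sed only read the file; it still contains all of its lines.
lines_after=$(wc -l < "$scratch" | tr -d '[:space:]')
echo "lines still in the scratch file: $lines_after"

rm -f "$scratch"
```

The empty brackets in the first message and the unchanged line count in the second correspond to the two observations above: nothing is printed, and the input file is left alone.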

To delete the first line, run:

    $ sed -e '1d' /etc/services | more

As you can see, this command is very similar to the first 'd' command, except that it is preceded by a '1'. If you guessed that the '1' refers to line number one, you are right. In the first example we used 'd' by itself; this time the 'd' is preceded by an optional numeric address. By using addresses, you can tell sed to edit only a particular line or lines.

Address ranges

Now let's look at how to specify an address range. In this example, sed deletes lines 1 through 10 of the output:

    $ sed -e '1,10d' /etc/services | more

When two addresses are separated by a comma, sed applies the command that follows to the range that starts at the first address and ends at the second. In this example, the 'd' command is applied to lines 1 through 10, inclusive. All other lines are left alone.

Addresses with regular expressions

Now for a more useful example. Suppose you want to view the contents of your /etc/services file, but you are not interested in the comments it contains. As you may know, a comment in /etc/services is a line that begins with the '#' character. To skip the comments, we want sed to delete every line that starts with '#'.
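Before turning to regular expression addresses, the numeric addresses described above ('1d' and '1,10d') can be tried on generated input. This is a small sketch under stated assumptions: a POSIX shell, a temporary file created with mktemp, and twelve made-up lines in place of /etc/services. The variable names are hypothetical.

```shell
#!/bin/sh
# Generate twelve numbered lines as hypothetical input.
input=$(mktemp)
i=1
while [ "$i" -le 12 ]; do
    echo "line $i"
    i=$((i + 1))
done > "$input"

# '1d' deletes only line 1, so the output now starts at "line 2".
first_after_1d=$(sed -e '1d' "$input" | head -n 1)

# '1,10d' deletes lines 1 through 10 inclusive, leaving lines 11 and 12.
after_range=$(sed -e '1,10d' "$input")

echo "first line after 1d: $first_after_1d"
echo "remaining after 1,10d:"
echo "$after_range"

rm -f "$input"
```

Because the range is inclusive at both ends, only the last two of the twelve lines survive the second command.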

The following command deletes the comment lines:

    $ sed -e '/^#/d' /etc/services | more

Try this example and see what happens. You will notice that sed does the job as expected. Now let's work out what happened. To understand the '/^#/d' command, we first need to take it apart. First, set aside the 'd': it is the same delete-line command we used earlier. The new part is '/^#/', which is a new kind of address, a regular expression address. Regular expression addresses are always surrounded by slashes. They specify a pattern, and the command that immediately follows a regular expression address is applied only to lines that match that pattern.

So '/^#/' is a regular expression. But what does it do? Clearly, this is a good time for a regular expression refresher.

Regular expression refresher

You can use regular expressions to express patterns that may be found in text. Have you ever used the '*' character on the shell command line? That usage is similar to regular expressions, but it is not the same thing.
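Before going further, the '/^#/d' command analysed above can be checked against a few lines written in the style of /etc/services. This is a minimal sketch, assuming a POSIX shell and mktemp; the sample entries and variable names are invented for illustration and are not taken from a real services file.

```shell
#!/bin/sh
# Hypothetical excerpt in the style of /etc/services.
services=$(mktemp)
printf '%s\n' \
    '# Network services, Internet style' \
    'ftp 21/tcp' \
    '# secure shell' \
    'ssh 22/tcp # trailing comment stays' \
    > "$services"

# Delete only the lines whose first character is '#'.
without_comments=$(sed -e '/^#/d' "$services")
echo "$without_comments"

rm -f "$services"
```

The '#' in the middle of the ssh line does not trigger the delete, because '^' anchors the match to the beginning of the line.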

These are the special characters that you can use in regular expressions:

    Character   Description
    ^           Matches the beginning of the line
    $           Matches the end of the line
    .           Matches any single character
    *           Matches zero or more occurrences of the previous character
    [ ]         Matches any one of the characters inside the [ ]

Probably the best way to get a feel for regular expressions is to see a few examples. All of these examples are accepted by sed as valid addresses, which appear on the left side of a command. Here are a few:

    Regular expression   Description
    /./                  Matches any line that contains at least one character
    /^#/                 Matches any line that begins with '#'
    /^$/                 Matches all blank lines
    /}$/                 Matches any line that ends with '}' (no trailing spaces)
    /} *$/               Matches any line that ends with '}' followed by zero or more spaces
    /[abc]/              Matches any line that contains a lowercase 'a', 'b', or 'c'
    /^[abc]/             Matches any line that begins with 'a', 'b', or 'c'

I encourage you to try several of these examples. Take some time to get familiar with regular expressions, and then try a few regular expressions of your own. The commands that follow give you a simple way to test a regexp.
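Several of the expressions in the table can be checked against a small sample file using the 'd' command that has already been introduced. The sketch below assumes a POSIX shell and mktemp; the five-line sample and the helper function count_after are hypothetical, written only for this illustration. It counts how many of the five lines survive each delete.

```shell
#!/bin/sh
# Hypothetical five-line sample: a word, a blank line, a bare '}',
# a '}' followed by trailing spaces, and another word.
sample=$(mktemp)
printf 'alpha\n\n}\n}   \nbeta\n' > "$sample"

# Hypothetical helper: delete lines matching the address in $1
# and report how many lines are left.
count_after() {
    sed -e "$1" "$sample" | wc -l | tr -d '[:space:]'
}

any_char_removed=$(count_after '/./d')        # only the blank line is left
blank_removed=$(count_after '/^$/d')          # the blank line is gone
brace_removed=$(count_after '/}$/d')          # only the bare '}' is gone
brace_spaces_removed=$(count_after '/} *$/d') # both '}' lines are gone
abc_start_removed=$(count_after '/^[abc]/d')  # alpha and beta are gone

echo "/./d leaves        $any_char_removed"
echo "/^\$/d leaves       $blank_removed"
echo "/}\$/d leaves       $brace_removed"
echo "/} *\$/d leaves     $brace_spaces_removed"
echo "/^[abc]/d leaves   $abc_start_removed"

rm -f "$sample"
```

The difference between '/}$/' and '/} *$/' shows up on the line with trailing spaces: only the second expression matches it.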

To try out a regexp, use it as the address of a 'd' command:

    $ sed -e '/regexp/d' /path/to/my/test/file | more

This causes sed to delete any matching lines. However, it may be easier to get familiar with regular expressions by telling sed to print the lines that match the regexp and delete those that don't, rather than the other way around. You can do that with the following command:

    $ sed -n -e '/regexp/p' /path/to/my/test/file | more

Note the new '-n' option, which tells sed not to print the pattern space unless explicitly asked to. You will also notice that the 'd' command has been replaced with the 'p' command, which, as you might guess, explicitly asks sed to print the pattern space. This way, only the matching lines are printed.

More on addresses

So far, we have looked at line addresses, line range addresses, and regexp addresses. But there are even more possibilities. We can specify two regular expressions separated by a comma, and sed will match all lines starting from the first line that matches the first regular expression, up to and including the line that matches the second regular expression. The next example prints a block of text that starts with a line containing "begin" and ends with a line containing "end".
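Before looking at that, the difference between deleting matches with 'd' and printing matches with '-n' and 'p' can be seen side by side. This is a minimal sketch, assuming a POSIX shell and mktemp; the list of fruit names, the regexp /^a/, and the variable names are hypothetical examples.

```shell
#!/bin/sh
# Hypothetical test file with four lines, two of which start with 'a'.
fruits=$(mktemp)
printf 'apple\nbanana\napricot\ncherry\n' > "$fruits"

# 'd' removes the matching lines and prints everything else.
non_matches=$(sed -e '/^a/d' "$fruits")

# '-n' suppresses automatic printing, so 'p' prints only the matches.
matches=$(sed -n -e '/^a/p' "$fruits")

# Without '-n', every line is printed once automatically and each
# matching line is printed a second time by 'p'.
lines_without_n=$(sed -e '/^a/p' "$fruits" | wc -l | tr -d '[:space:]')

echo "kept by d:"
echo "$non_matches"
echo "printed by -n p:"
echo "$matches"
echo "line count for p without -n: $lines_without_n"

rm -f "$fruits"
```

The last count is why '-n' matters here: leaving it out does not filter the input, it only duplicates the matching lines.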

The command is:

    $ sed -n -e '/begin/,/end/p' /my/test/file | more

If "begin" is not found, no data is printed. If "begin" is found but no "end" appears on any line after it, all of the following lines are printed. This happens because of sed's stream-oriented nature: it does not know whether an "end" will ever appear.

C source example

If you want to print only the main() function in a C source file, you can use a regular expression range as well.
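The three cases described above for the '/begin/,/end/p' range (a complete block, a block with no "end", and input with no "begin") can be reproduced on made-up input. This is a sketch under stated assumptions: a POSIX shell, and invented input lines fed to sed through a pipe. The variable names are hypothetical.

```shell
#!/bin/sh
# Case 1: both markers present. Only the block, markers included, is printed.
complete=$(printf 'header\nbegin\npayload\nend\nfooter\n' |
    sed -n -e '/begin/,/end/p')

# Case 2: "begin" without a later "end". sed cannot know that no "end"
# is coming, so it prints everything from "begin" to the last line.
unterminated=$(printf 'header\nbegin\npayload\nfooter\n' |
    sed -n -e '/begin/,/end/p')

# Case 3: no "begin" at all. The range never starts, so nothing is printed.
never_started=$(printf 'header\nend\nfooter\n' |
    sed -n -e '/begin/,/end/p')

echo "complete block:"
echo "$complete"
echo "unterminated block:"
echo "$unterminated"
echo "no begin: [$never_started]"
```

Feeding the input through a pipe rather than a file also illustrates the stream-editor point made earlier: sed processes whatever arrives on standard input, one line at a time.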

To print main(), type:

    $ sed -n -e '/main[[:space:]]*(/,/^}/p' sourcefile.c | more

This command has two regular expressions, '/main[[:space:]]*(/' and '/^}/', and one command, 'p'. The first regular expression matches the string "main" followed by any number of spaces or tabs, followed by an opening parenthesis. This should match the start of a typical ANSI C main() declaration.

In this particular regular expression, we meet the '[[:space:]]' character class. This is simply a special keyword that tells sed to match whitespace such as a tab or a space. If you wanted to, instead of typing '[[:space:]]' you could type '[', then a literal space, then Control-V, then a literal tab, and then ']'. The Control-V tells bash that you want to insert a "real" tab rather than perform command expansion. Using the '[[:space:]]' character class is clearer, especially in scripts.

Now on to the second regexp. '/^}/' matches a '}' character that appears at the beginning of a line. If your code is nicely formatted, this will match the closing brace of your main() function. If it is not, the expression will not match correctly, which is one of the tricky things about pattern matching.

The 'p' command does what it always does: it explicitly tells sed to print the line, since we are in '-n' quiet mode. Try running the command on a C source file. It should output the entire main() { } block, including the initial "main()" and the closing '}'.

Next up

Now that we have covered the basics, we will pick up the pace in the next two articles. If you are looking forward to some meatier sed material, be patient; it is coming soon. In the meantime, you may want to check out the following sed and regular expression resources.

Resources

About sed:

Read Daniel's other sed articles on developerWorks: Common threads: Sed by example, Part 2 and Part 3.
Check out Eric Pement's sed FAQ.
You can find the sources to sed 3.02 at ftp://ftp.gnu.org/pub/gnu/sed.
You will find the nice, new sed 3.02.80 at alpha.gnu.org.
Eric Pement also has some handy sed one-liners that any aspiring sed guru should take a look at.
If you would like a good book, O'Reilly's sed & awk, 2nd Edition would be an excellent choice.
You may also want to read 7th edition UNIX's sed man page (circa 1978!).
Read Felix von Leitner's short sed tutorial.
Read David Mertz's "Text processing in Python" on developerWorks.

About regular expressions:

Brush up on using regular expressions to find and modify patterns in text in the free, developerWorks-exclusive tutorial "Using regular expressions".
Check out the regular expressions HOWTO document at python.org.
See the overview of regular expressions from the University of Kentucky.
