Pick an editor
In the UNIX world, there are a lot of text editors to choose from. Think about it: vi, emacs, and jed come to mind, along with many others. We all have a favorite editor (and a favorite set of key bindings) that we have come to know and love. With a trusty editor, we can easily handle any number of UNIX-related administration or programming tasks.
Interactive editors are great, but they have their limits. Their interactive nature can be a strength, but it can also be a weakness. Consider a situation where you need to make similar changes to a group of files. You could fire up your favorite editor and perform a series of tedious, repetitive, and time-consuming edits by hand. However, there is a better way.
Enter sed
It would be nice if we could automate the process of editing files, so that we could edit them in "batch" mode, or even write scripts that make sophisticated changes to existing files. Fortunately, for these situations there is a better way, and it is called "sed".
sed is a lightweight stream editor that is included with nearly all UNIX platforms (including Linux). sed has a lot of nice features. First of all, it is quite small, typically many times smaller than your favorite scripting language. Second, because sed is a stream editor, it can edit data that it receives from standard input, such as from a pipeline. Therefore, the data to be edited does not need to be stored in a file on disk. Because data can just as easily be piped to sed, it is easy to use sed as one stage of a long pipeline in a powerful shell script. Try doing that with your favorite editor.
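To make the stream-editing point concrete, here is a minimal sketch of sed sitting in the middle of a pipeline, editing data that never touches a file on disk. The sample words and the variable name `result` are made up for illustration, and the '1d' command (delete line 1) is one of the commands explained later in this article.

```shell
# Hypothetical sample data: three words generated by printf, never stored in a file.
# sort orders them, sed drops the first sorted line, and tr upper-cases what is left.
result=$(printf 'pear\napple\nfig\n' | sort | sed -e '1d' | tr 'a-z' 'A-Z')

# Show what came out of the far end of the pipeline.
printf '%s\n' "$result"
```

sed reads from the pipe on standard input and writes to the pipe on standard output, so it can be dropped between any two other commands.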
GNU sed
Fortunately for Linux users, one of the best versions of sed happens to be GNU sed, whose current version is 3.02. Every Linux distribution has GNU sed (or at least should). GNU sed is popular not only because its source code is freely distributable, but also because it happens to have a lot of handy, time-saving extensions. In addition, GNU sed does not suffer from many of the limitations of earlier and proprietary versions of sed, such as a limit on line length; GNU sed easily handles lines of any length.
The newest GNU sed
While researching this article, I noticed that several online sed enthusiasts mentioned GNU sed 3.02a. Strangely, I could not find sed 3.02a on ftp.gnu.org (see Resources for these links), so I had to look for it elsewhere. I found it at alpha.gnu.org, in /pub/sed. I happily downloaded and compiled it, only to find a few minutes later that the latest version of sed was 3.02.80; its source code sits right next to the 3.02a source code on alpha.gnu.org. After installing GNU sed 3.02.80, I was finally ready to go.
alpha.gnu.org
alpha.gnu.org (see Resources) is the home of new and experimental GNU source code. However, you will also find a lot of excellent, stable source code there. For some reason, many GNU developers either forget to move stable source code over to ftp.gnu.org, or have unusually long (2 years!) "beta" periods. For example, sed 3.02a is two years old, and even 3.02.80 is a year old, but neither was available on ftp.gnu.org when this article was written in August 2000.
The right sed
This series uses GNU sed 3.02.80. Some (but very few) of the most advanced examples in the follow-on articles will not work with GNU sed 3.02 or 3.02a. If you are using a non-GNU sed, your results may vary. Why not take some time to install GNU sed 3.02.80 now? Then you will not only be ready for the rest of this series, but will also be using arguably the best sed available.
Sed examples
sed works by performing any number of user-specified editing operations ("commands") on the input data. sed is line-based, so the commands are performed on each line in order. sed then writes its results to standard output (stdout); it does not modify any input files.
Let us look at some examples. The first few are a bit odd, because I am using them to illustrate how sed works rather than to perform any useful task. However, if you are new to sed, it is very important that you understand them. Here is the first example:
$ sed -e 'd' /etc/services
If you type this command, you will get no output at all. So what happened? In this example, sed was called with one editing command, 'd'. sed opened the /etc/services file, read a line into its pattern buffer, performed the editing command ("delete line"), and then printed the pattern buffer (which was empty). It then repeated these steps for each successive line. This produced no output, because the 'd' command deleted every line in the pattern buffer!
There are a few things to notice in this example. First, /etc/services was not modified at all. Again, this is because sed only reads from the file you specify on the command line, using it as input; it does not try to modify the file. The second thing to notice is that sed is line-oriented. The 'd' command did not simply tell sed to delete all of the input data at once. Instead, sed read each line of /etc/services, one at a time, into its internal buffer, called the pattern buffer. Once a line was read into the pattern buffer, sed executed the 'd' command and then printed the contents of the pattern buffer (nothing, in this example). Later, I will show you how to use address ranges to control which lines a command applies to; in the absence of addresses, a command is applied to all lines.
The third thing to notice is the use of single quotes to surround the 'd' command. It is a good idea to get into the habit of enclosing your sed commands in single quotes, which disables shell expansion.
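The points above can be checked without touching a real system file. The following is a minimal sketch that assumes a throwaway file created with mktemp; the file contents and the variable names are made up for illustration.

```shell
# Work on a throwaway file instead of /etc/services; its contents are invented.
sample=$(mktemp)
printf 'first line\nsecond line\nthird line\n' > "$sample"

# Record the file's contents before running sed.
before=$(cat "$sample")

# 'd' empties the pattern buffer for every line, so nothing is printed.
# The single quotes keep the shell from expanding anything inside the command.
output=$(sed -e 'd' "$sample")

# Read the file again: sed only used it as input.
after=$(cat "$sample")

printf 'sed printed %d characters\n' "${#output}"
[ "$before" = "$after" ] && echo "input file unchanged"

rm -f "$sample"
```

Comparing the contents before and after is a simple way to confirm that sed wrote only to standard output.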
Another sed example
Here is an example of how to use sed to remove the first line of the /etc/services file from the output stream:
$ sed -e '1d' /etc/services | more
As you can see, this command is very similar to the first 'd' command, except that it is preceded by a '1'. If you guessed that the '1' refers to line number one, you guessed right. Whereas the first example used 'd' by itself, this time the 'd' command is preceded by an optional numeric address. By using addresses, you can tell sed to perform edits only on a particular line or lines.
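A numeric address is easy to try on a few lines of inline text. This sketch uses made-up input and variable names, and simply changes the number in front of 'd' to show that only the addressed line disappears.

```shell
# Made-up three-line input, fed to sed on standard input.
# Address 1: only the first line is deleted.
without_first=$(printf 'one\ntwo\nthree\n' | sed -e '1d')

# Address 3: only the third line is deleted.
without_third=$(printf 'one\ntwo\nthree\n' | sed -e '3d')

printf 'after 1d:\n%s\n' "$without_first"
printf 'after 3d:\n%s\n' "$without_third"
```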
Address ranges
Now, let us look at how to specify an address range. In this example, sed will delete lines 1 through 10 of the output:
$ sed -e '1,10d' /etc/services | more
When two addresses are separated by a comma, sed applies the command that follows to the range that starts at the first address and ends at the second address. In this example, the 'd' command was applied to lines 1 through 10, inclusive. All other lines were left alone by the command and passed through to the output.
Addresses with regular expressions
Now it is time for a more useful example. Suppose you want to view the contents of your /etc/services file, but you are not interested in the comments it contains. As you know, a comment is placed in the /etc/services file by starting the line with the '#' character. To skip the comments, we would like sed to delete the lines that start with '#'. Here is how to do it:
$ sed -e '/^#/d' /etc/services | more
Try this example and see what happens. You will notice that sed completes the task successfully. Now, let us figure out what happened.
To understand the '/^#/d' command, we first need to dissect it. First, set aside the 'd'; it is the same delete-line command that we used earlier. The new addition is the '/^#/' part, which is a new kind of address: a regular expression address. Regular expression addresses are always surrounded by slashes. They specify a pattern, and the command that immediately follows a regular expression address is applied to a line only if that line matches the pattern.
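To see a regular expression address select lines, the same '/^#/d' command can be run against a short excerpt written in the style of /etc/services. The excerpt below is invented for this sketch and is not the real file.

```shell
# Made-up excerpt in the style of /etc/services.
sample='# Network services, Internet style
ftp 21/tcp
# secure shell
ssh 22/tcp
smtp 25/tcp # mail'

# /^#/ selects lines whose first character is '#'; 'd' deletes only those lines.
result=$(printf '%s\n' "$sample" | sed -e '/^#/d')
printf '%s\n' "$result"
```

The smtp line survives even though it contains a '#', because the pattern only matches a '#' at the beginning of a line.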
So, '/^#/' is a regular expression. But what does it do? Clearly, this is a good time for a regular expression refresher.
Regular expression refresher
You can use regular expressions to express patterns that may be found in text. Have you ever used the '*' character on the shell command line? That usage is similar to regular expressions, but not identical. Here are the special characters that can be used in regular expressions:
Character   Description
^           Matches the beginning of the line
$           Matches the end of the line
.           Matches any single character
*           Matches zero or more occurrences of the previous character
[ ]         Matches any one of the characters inside the [ ]
The best way to get a feel for regular expressions is probably to look at a few examples. All of these examples will be accepted by sed as valid addresses, which appear on the left side of a command. Here are a few:
Regular expression   Description
/./                  Matches any line that contains at least one character
/../                 Matches any line that contains at least two characters
/^#/                 Matches any line that begins with '#'
/^$/                 Matches all blank lines
/}$/                 Matches any line that ends with '}' (no trailing spaces)
/} *$/               Matches any line that ends with '}' followed by zero or more spaces
/[abc]/              Matches any line that contains a lowercase 'a', 'b', or 'c'
/^[abc]/             Matches any line that begins with 'a', 'b', or 'c'
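A few of these patterns can be tried against a small sample to see which lines each one selects. In this sketch the sample text and variable names are made up, and the 'd' command is used so that the lines left in the output are the ones the pattern did not match.

```shell
# Made-up sample: line 2 is '}' followed by two spaces, line 3 is blank.
sample=$(printf 'apple\n}  \n\nbanana }\ncherry\ndog\n')

# /^$/ matches only the blank line, so five lines remain.
no_blank=$(printf '%s\n' "$sample" | sed -e '/^$/d')

# /} *$/ matches '}' at the end of a line, with or without trailing spaces.
no_brace_end=$(printf '%s\n' "$sample" | sed -e '/} *$/d')

# /^[abc]/ matches lines that begin with 'a', 'b', or 'c'.
no_abc_start=$(printf '%s\n' "$sample" | sed -e '/^[abc]/d')

printf '%s\n' "blank lines removed:" "$no_blank"
printf '%s\n' "lines ending in a brace removed:" "$no_brace_end"
printf '%s\n' "lines starting with a, b, or c removed:" "$no_abc_start"
```

Swapping '/} *$/' for '/}$/' in the second command would leave line 2 in place, because its '}' is followed by spaces.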
I encourage you to try several of these examples. Take some time to get familiar with regular expressions, and then try a few regular expressions of your own. You can use a regexp like this:
$ sed -e '/regexp/d' /path/to/my/test/file | more
This causes sed to delete any matching lines. However, it may be easier to get familiar with regular expressions by telling sed to print the lines that match the regexp and delete the ones that do not, rather than the other way around. You can do this with the following command:
$ sed -n -e '/regexp/p' /path/to/my/test/file | more
Note the new '-n' option, which tells sed not to print the pattern space unless explicitly told to do so. You will also notice that we replaced the 'd' command with the 'p' command, which, as you might guess, explicitly tells sed to print the pattern space. This way, only the matching lines are printed.
More on addresses
So far, we have looked at line addresses, line range addresses, and regexp addresses. However, there are more possibilities. We can specify two regular expressions separated by a comma, and sed will match all lines starting from the first line that matches the first regular expression, up to and including the line that matches the second regular expression. For example, the following command prints a block of text that begins with a line containing "begin" and ends with a line containing "end":
$ sed -n -e '/begin/,/end/p' /my/test/file | more
If "begin" is not found, no data is printed. If "begin" is found but "end" is not found on any line after it, all subsequent lines are printed. This happens because of sed's stream-oriented nature: it does not know whether an "end" will appear.
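The different kinds of range can be compared side by side on a short sample. The following sketch uses made-up text and variable names; it prints a numeric range, a regexp range, and a regexp range whose closing pattern never appears.

```shell
# Made-up sample text containing one begin/end block.
sample='header
begin block
payload
end block
footer'

# -n turns off automatic printing; 'p' prints only the addressed lines.
# Numeric range: lines 2 through 3.
by_number=$(printf '%s\n' "$sample" | sed -n -e '2,3p')

# Regexp range: from the first line matching /begin/ through the next line matching /end/.
by_regexp=$(printf '%s\n' "$sample" | sed -n -e '/begin/,/end/p')

# The closing pattern never matches, so the range runs to the end of the input.
unterminated=$(printf '%s\n' "$sample" | sed -n -e '/payload/,/no such text/p')

printf '%s\n' "lines 2,3:" "$by_number"
printf '%s\n' "begin to end:" "$by_regexp"
printf '%s\n' "unterminated range:" "$unterminated"
```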
C source example
If you want to print only the main() function in a C source file, you could type:
$ sed -n -e '/main[[:space:]]*(/,/^}/p' sourcefile.c | more
This command has two regular expressions, '/main[[:space:]]*(/' and '/^}/', and one command, 'p'. The first regular expression matches the string "main" followed by any number of spaces or tabs, followed by an open parenthesis. This should match the start of a typical ANSI C main() declaration.
In this particular regular expression, the '[[:space:]]' character class appears. This is simply a special keyword that tells sed to match a whitespace character, such as a tab or a space. If you prefer, instead of typing '[[:space:]]', you could type '[', then a literal space, then Control-V, then a literal tab, and then ']'; Control-V tells bash that you want to insert a "real" tab rather than perform command expansion. It is clearer to use the '[[:space:]]' character class, especially in scripts.
OK, now on to the second regexp. '/^}/' matches a '}' character that appears at the beginning of a line. If your code is nicely formatted, this will match the closing brace of your main() function. If it is not, it will not match correctly; this is one of the tricky things about pattern matching.
The 'p' command does what it always does: it explicitly tells sed to print the line, since we are in '-n' quiet mode. Try running the command on a C source file; it should output the entire main() { } block, including the initial "main()" and the closing '}'.
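If you do not have a C source file handy, the command can be tried on a small one generated on the spot. The C program below is invented for this sketch, and the temporary file name comes from mktemp rather than from the article.

```shell
# Write a small, made-up C file to a temporary location.
src=$(mktemp)
cat > "$src" <<'EOF'
#include <stdio.h>

static int helper(void)
{
    return 42;
}

int main (int argc, char **argv)
{
    if (argc > 1) {
        printf("%d\n", helper());
    }
    return 0;
}

/* trailing comment */
EOF

# Print from the line matching "main", optional whitespace, and "("
# through the next line that begins with '}'.
main_block=$(sed -n -e '/main[[:space:]]*(/,/^}/p' "$src")
printf '%s\n' "$main_block"

rm -f "$src"
```

The indented closing brace of the if block does not end the range, because '/^}/' only matches a '}' in the first column. That is the formatting assumption the text above describes.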
Next article
Now that we have touched on the basics, we will pick up the pace in the next two articles. If you are in the mood for some meatier sed material, be patient: it is coming! In the meantime, you may want to check out the following sed and regular expression resources.
Resources
About sed:
Read Daniel's other sed articles on developerWorks: Common threads: Sed by example, Part 2 and Part 3.
Check out Eric Pement's sed FAQ.
The sources for sed 3.02 can be found at ftp://ftp.gnu.org/pub/gnu/sed.
The nice, new sed 3.02.80 can be found at alpha.gnu.org.
Eric Pement also has a handy list of sed one-liners that any aspiring sed guru should take a look at.
If you would like a good old-fashioned book, O'Reilly's sed & awk, 2nd Edition would be an excellent choice.
You may want to read 7th edition UNIX's sed man page (circa 1978!).
Read Felix von Leitner's short sed tutorial.
Read David Mertz's "Text processing in Python" on developerWorks.
About regular expressions:
Brush up on using regular expressions to find and modify patterns in text in the free, dW-exclusive tutorial "Using regular expressions".
See the regular expressions how-to document at python.org.
Refer to the overview of regular expressions from the University of Kentucky.
About the author
Daniel Robbins lives in Albuquerque, New Mexico. He is the founder, President, and CEO of Gentoo Technologies, and the creator of Gentoo Linux, an advanced Linux for the PC, and of the Portage system, a next-generation ports system for Linux. He has also served as a contributing author for the Macmillan books Caldera OpenLinux Unleashed, SuSE Linux Unleashed, and Samba Unleashed. Daniel has been involved with computers since the second grade, when he was first exposed to the Logo programming language and became hooked on the Pac-Man game. This may explain why he has since served as a Lead Graphic Artist at Sony Electronic Publishing/Psygnosis. Daniel enjoys spending time with his wife, Mary, and his newborn daughter, Hadassah. You can contact Daniel at drobbins@gentoo.org.