Diffstat (limited to 'doc/sed.info-1')
-rw-r--r-- doc/sed.info-1 1353
1 files changed, 1353 insertions, 0 deletions
diff --git a/doc/sed.info-1 b/doc/sed.info-1
new file mode 100644
index 0000000..dce6cf1
--- /dev/null
+++ b/doc/sed.info-1
@@ -0,0 +1,1353 @@
+This is ../../doc/sed.info, produced by makeinfo version 4.5 from
+../../doc/sed.texi.
+
+INFO-DIR-SECTION Text creation and manipulation
+START-INFO-DIR-ENTRY
+* sed: (sed). Stream EDitor.
+
+END-INFO-DIR-ENTRY
+
+This file documents version 4.1.5 of GNU `sed', a stream editor.
+
+ Copyright (C) 1998, 1999, 2001, 2002, 2003, 2004 Free Software
+Foundation, Inc.
+
+ This document is released under the terms of the GNU Free
+Documentation License as published by the Free Software Foundation;
+either version 1.1, or (at your option) any later version.
+
+ You should have received a copy of the GNU Free Documentation
+License along with GNU `sed'; see the file `COPYING.DOC'. If not,
+write to the Free Software Foundation, 59 Temple Place - Suite 330,
+Boston, MA 02110-1301, USA.
+
+ There are no Cover Texts and no Invariant Sections; this text, along
+with its equivalent in the printed manual, constitutes the Title Page.
+
+File: sed.info, Node: Top, Next: Introduction, Up: (dir)
+
+
+
+* Menu:
+
+* Introduction:: Introduction
+* Invoking sed:: Invocation
+* sed Programs:: `sed' programs
+* Examples:: Some sample scripts
+* Limitations:: Limitations and (non-)limitations of GNU `sed'
+* Other Resources:: Other resources for learning about `sed'
+* Reporting Bugs:: Reporting bugs
+
+* Extended regexps:: `egrep'-style regular expressions
+
+* Concept Index:: A menu with all the topics in this manual.
+* Command and Option Index:: A menu with all `sed' commands and
+ command-line options.
+
+--- The detailed node listing ---
+
+sed Programs:
+* Execution Cycle:: How `sed' works
+* Addresses:: Selecting lines with `sed'
+* Regular Expressions:: Overview of regular expression syntax
+* Common Commands:: Often used commands
+* The "s" Command:: `sed''s Swiss Army Knife
+* Other Commands:: Less frequently used commands
+* Programming Commands:: Commands for `sed' gurus
+* Extended Commands:: Commands specific to GNU `sed'
+* Escapes:: Specifying special characters
+
+Examples:
+* Centering lines::
+* Increment a number::
+* Rename files to lower case::
+* Print bash environment::
+* Reverse chars of lines::
+* tac:: Reverse lines of files
+* cat -n:: Numbering lines
+* cat -b:: Numbering non-blank lines
+* wc -c:: Counting chars
+* wc -w:: Counting words
+* wc -l:: Counting lines
+* head:: Printing the first lines
+* tail:: Printing the last lines
+* uniq:: Make duplicate lines unique
+* uniq -d:: Print duplicated lines of input
+* uniq -u:: Remove all duplicated lines
+* cat -s:: Squeezing blank lines
+
+
+File: sed.info, Node: Introduction, Next: Invoking sed, Prev: Top, Up: Top
+
+Introduction
+************
+
+ `sed' is a stream editor. A stream editor is used to perform basic
+text transformations on an input stream (a file or input from a
+pipeline). While in some ways similar to an editor which permits
+scripted edits (such as `ed'), `sed' works by making only one pass over
+the input(s), and is consequently more efficient. But it is `sed''s
+ability to filter text in a pipeline which particularly distinguishes
+it from other types of editors.
+
+
+File: sed.info, Node: Invoking sed, Next: sed Programs, Prev: Introduction, Up: Top
+
+Invocation
+**********
+
+ Normally `sed' is invoked like this:
+
+ sed SCRIPT INPUTFILE...
+
+ The full format for invoking `sed' is:
+
+ sed OPTIONS... [SCRIPT] [INPUTFILE...]
+
+ If you do not specify INPUTFILE, or if INPUTFILE is `-', `sed'
+filters the contents of the standard input. The SCRIPT is actually the
+first non-option parameter, which `sed' specially considers a script
+and not an input file if (and only if) none of the other OPTIONS
+specifies a script to be executed, that is if neither of the `-e' and
+`-f' options is specified.
+
+ `sed' may be invoked with the following command-line options:
+
+`--version'
+ Print out the version of `sed' that is being run and a copyright
+ notice, then exit.
+
+`--help'
+ Print a usage message briefly summarizing these command-line
+ options and the bug-reporting address, then exit.
+
+`-n'
+`--quiet'
+`--silent'
+ By default, `sed' prints out the pattern space at the end of each
+ cycle through the script. These options disable this automatic
+ printing, and `sed' only produces output when explicitly told to
+ via the `p' command.
+
+`-i[SUFFIX]'
+`--in-place[=SUFFIX]'
+ This option specifies that files are to be edited in-place. GNU
+ `sed' does this by creating a temporary file and sending output to
+ this file rather than to the standard output.(1)
+
+ This option implies `-s'.
+
+ When the end of the file is reached, the temporary file is renamed
+ to the output file's original name. The extension, if supplied,
+ is used to modify the name of the old file before renaming the
+ temporary file, thereby making a backup copy.(2)
+
+ This rule is followed: if the extension doesn't contain a `*',
+ then it is appended to the end of the current filename as a
+ suffix; if the extension does contain one or more `*' characters,
+ then _each_ asterisk is replaced with the current filename. This
+ allows you to add a prefix to the backup file, instead of (or in
+ addition to) a suffix, or even to place backup copies of the
+ original files into another directory (provided the directory
+ already exists).
+
+ If no extension is supplied, the original file is overwritten
+ without making a backup.
+
+`-l N'
+`--line-length=N'
+ Specify the default line-wrap length for the `l' command. A
+ length of 0 (zero) means to never wrap long lines. If not
+ specified, it is taken to be 70.
+
+`--posix'
+ GNU `sed' includes several extensions to POSIX sed. In order to
+ simplify writing portable scripts, this option disables all the
+ extensions that this manual documents, including additional
+ commands. Most of the extensions accept `sed' programs that are
+ outside the syntax mandated by POSIX, but some of them (such as
+ the behavior of the `N' command described in *note Reporting
+ Bugs::) actually violate the standard. If you want to disable
+ only the latter kind of extension, you can set the
+ `POSIXLY_CORRECT' variable to a non-empty value.
+
+`-r'
+`--regexp-extended'
+ Use extended regular expressions rather than basic regular
+ expressions. Extended regexps are those that `egrep' accepts;
+ they can be clearer because they usually have fewer backslashes,
+ but are a GNU extension and hence scripts that use them are not
+ portable. *Note Extended regular expressions: Extended regexps.
+
+`-s'
+`--separate'
+ By default, `sed' will consider the files specified on the command
+ line as a single continuous long stream. This GNU `sed' extension
+ allows the user to consider them as separate files: range
+ addresses (such as `/abc/,/def/') are not allowed to span several
+ files, line numbers are relative to the start of each file, `$'
+ refers to the last line of each file, and files invoked from the
+ `R' commands are rewound at the start of each file.
+
+`-u'
+`--unbuffered'
+ Buffer both input and output as minimally as practical. (This is
+ particularly useful if the input is coming from the likes of `tail
+ -f', and you wish to see the transformed output as soon as
+ possible.)
+
+`-e SCRIPT'
+`--expression=SCRIPT'
+ Add the commands in SCRIPT to the set of commands to be run while
+ processing the input.
+
+`-f SCRIPT-FILE'
+`--file=SCRIPT-FILE'
+ Add the commands contained in the file SCRIPT-FILE to the set of
+ commands to be run while processing the input.
+
+
+ If no `-e', `-f', `--expression', or `--file' options are given on
+the command-line, then the first non-option argument on the command
+line is taken to be the SCRIPT to be executed.
+
+ If any command-line parameters remain after processing the above,
+these parameters are interpreted as the names of input files to be
+processed. A file name of `-' refers to the standard input stream.
+The standard input will be processed if no file names are specified.
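As a rough illustration of how the script and input arguments are
interpreted, the following sketch feeds a made-up three-line input to
`sed' on its standard input. The variable names and sample text are
assumptions chosen for this example only.

```shell
# Hypothetical three-line input used only for this illustration.
input='alpha
beta
gamma'

# No -e or -f given: the first non-option argument is the script.
first=$(printf '%s\n' "$input" | sed 's/a/A/')

# -n disables auto-print, so only lines hit by the p command appear.
only_beta=$(printf '%s\n' "$input" | sed -n '/beta/p')

# Several -e options are combined, in order, into a single script.
both=$(printf '%s\n' "$input" | sed -e 's/alpha/one/' -e 's/gamma/three/')

printf '%s\n' "$first" "$only_beta" "$both"
```

Since no INPUTFILE is named in any of these calls, `sed' filters the
standard input in each case.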
+
+ ---------- Footnotes ----------
+
+ (1) This applies to commands such as `=', `a', `c', `i', `l', `p'.
+You can still write to the standard output by using the `w' or `W'
+commands together with the `/dev/stdout' special file.
+
+ (2) Note that GNU `sed' creates the backup file whether or not
+any output is actually changed.
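The backup naming rule for `-i' can be sketched as follows, assuming
GNU `sed' and a scratch directory. All file and directory names here
(`a.txt', `b.txt', `bak') are hypothetical.

```shell
# Work in a throwaway directory; every file name here is illustrative.
dir=$(mktemp -d)
cd "$dir" || exit 1
printf 'hello world\n' > a.txt
printf 'hello again\n' > b.txt
mkdir bak

# Suffix without an asterisk: the original is kept as a.txt.orig.
sed -i.orig 's/hello/goodbye/' a.txt

# Suffix with an asterisk: each '*' becomes the file name, so the
# original is kept as bak/b.txt (the directory must already exist).
sed -i'bak/*' 's/hello/goodbye/' b.txt

cat a.txt a.txt.orig b.txt bak/b.txt
```

The suffix is quoted in the second call so that the shell does not try
to expand the asterisk itself.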
+
+
+File: sed.info, Node: sed Programs, Next: Examples, Prev: Invoking sed, Up: Top
+
+`sed' Programs
+**************
+
+ A `sed' program consists of one or more `sed' commands, passed in by
+one or more of the `-e', `-f', `--expression', and `--file' options, or
+the first non-option argument if zero of these options are used. This
+document will refer to "the" `sed' script; this is understood to mean
+the in-order catenation of all of the SCRIPTs and SCRIPT-FILEs passed
+in.
+
+ Each `sed' command consists of an optional address or address range,
+followed by a one-character command name and any additional
+command-specific code.
+
+* Menu:
+
+* Execution Cycle:: How `sed' works
+* Addresses:: Selecting lines with `sed'
+* Regular Expressions:: Overview of regular expression syntax
+* Common Commands:: Often used commands
+* The "s" Command:: `sed''s Swiss Army Knife
+* Other Commands:: Less frequently used commands
+* Programming Commands:: Commands for `sed' gurus
+* Extended Commands:: Commands specific to GNU `sed'
+* Escapes:: Specifying special characters
+
+
+File: sed.info, Node: Execution Cycle, Next: Addresses, Up: sed Programs
+
+How `sed' Works
+===============
+
+ `sed' maintains two data buffers: the active _pattern_ space, and
+the auxiliary _hold_ space. Both are initially empty.
+
+ `sed' operates by performing the following cycle on each line of
+input: first, `sed' reads one line from the input stream, removes any
+trailing newline, and places it in the pattern space. Then commands
+are executed; each command can have an address associated with it:
+addresses are a kind of condition code, and a command is only executed
+if the condition holds just before the command is to be executed.
+
+ When the end of the script is reached, unless the `-n' option is in
+use, the contents of pattern space are printed out to the output
+stream, adding back the trailing newline if it was removed.(1) Then the
+next cycle starts for the next input line.
+
+ Unless special commands (like `D') are used, the pattern space is
+deleted between two cycles. The hold space, on the other hand, keeps
+its data between cycles (see commands `h', `H', `x', `g', `G' to move
+data between both buffers).
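A small sketch of the two buffers at work: because the hold space
survives from one cycle to the next, it can accumulate lines in reverse
order. The input text is made up for this example.

```shell
# 1!G : on every line but the first, append the hold space to the
#       pattern space (separated by a newline).
# h   : copy the pattern space back into the hold space.
# $p  : on the last line, print what has been accumulated.
reversed=$(printf 'one\ntwo\nthree\n' | sed -n '1!G;h;$p')
printf '%s\n' "$reversed"
```

With `-n' in effect nothing is printed at the end of each cycle, so the
only output comes from the explicit `p' on the last line.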
+
+ ---------- Footnotes ----------
+
+ (1) Actually, if `sed' prints a line without the terminating
+newline, it will nevertheless print the missing newline as soon as
+more text is sent to the same output stream, which gives the "least
+expected surprise" even though it does not make commands like `sed -n
+p' exactly identical to `cat'.
+
+
+File: sed.info, Node: Addresses, Next: Regular Expressions, Prev: Execution Cycle, Up: sed Programs
+
+Selecting lines with `sed'
+==========================
+
+ Addresses in a `sed' script can be in any of the following forms:
+`NUMBER'
+ Specifying a line number will match only that line in the input.
+ (Note that `sed' counts lines continuously across all input files
+ unless `-i' or `-s' options are specified.)
+
+`FIRST~STEP'
+ This GNU extension matches every STEPth line starting with line
+ FIRST. In particular, lines will be selected when there exists a
+ non-negative N such that the current line-number equals FIRST + (N
+ * STEP). Thus, to select the odd-numbered lines, one would use
+ `1~2'; to pick every third line starting with the second, `2~3'
+ would be used; to pick every fifth line starting with the tenth,
+ use `10~5'; and `50~0' is just an obscure way of saying `50'.
+
+`$'
+ This address matches the last line of the last file of input, or
+ the last line of each file when the `-i' or `-s' options are
+ specified.
+
+`/REGEXP/'
+ This will select any line which matches the regular expression
+ REGEXP. If REGEXP itself includes any `/' characters, each must
+ be escaped by a backslash (`\').
+
+ The empty regular expression `//' repeats the last regular
+ expression match (the same holds if the empty regular expression is
+ passed to the `s' command). Note that modifiers to regular
+ expressions are evaluated when the regular expression is compiled,
+ thus it is invalid to specify them together with the empty regular
+ expression.
+
+`\%REGEXP%'
+ (The `%' may be replaced by any other single character.)
+
+ This also matches the regular expression REGEXP, but allows one to
+ use a different delimiter than `/'. This is particularly useful
+ if the REGEXP itself contains a lot of slashes, since it avoids
+ the tedious escaping of every `/'. If REGEXP itself includes any
+ delimiter characters, each must be escaped by a backslash (`\').
+
+`/REGEXP/I'
+`\%REGEXP%I'
+ The `I' modifier to regular-expression matching is a GNU extension
+ which causes the REGEXP to be matched in a case-insensitive manner.
+
+`/REGEXP/M'
+`\%REGEXP%M'
+ The `M' modifier to regular-expression matching is a GNU `sed'
+ extension which causes `^' and `$' to match respectively (in
+ addition to the normal behavior) the empty string after a newline,
+ and the empty string before a newline. There are special character
+ sequences (`\`' and `\'') which always match the beginning or the
+ end of the buffer. `M' stands for `multi-line'.
+
+
+ If no addresses are given, then all lines are matched; if one
+address is given, then only lines matching that address are matched.
+
+ An address range can be specified by specifying two addresses
+separated by a comma (`,'). An address range matches lines starting
+from where the first address matches, and continues until the second
+address matches (inclusively).
+
+ If the second address is a REGEXP, then checking for the ending
+match will start with the line _following_ the line which matched the
+first address: a range will always span at least two lines (except of
+course if the input stream ends).
+
+ If the second address is a NUMBER less than (or equal to) the line
+matching the first address, then only the one line is matched.
+
+ GNU `sed' also supports some special two-address forms; all these
+are GNU extensions:
+`0,/REGEXP/'
+ A line number of `0' can be used in an address specification like
+ `0,/REGEXP/' so that `sed' will try to match REGEXP in the first
+ input line too. In other words, `0,/REGEXP/' is similar to
+ `1,/REGEXP/', except that if ADDR2 matches the very first line of
+ input the `0,/REGEXP/' form will consider it to end the range,
+ whereas the `1,/REGEXP/' form will match the beginning of its
+ range and hence make the range span up to the _second_ occurrence
+ of the regular expression.
+
+ Note that this is the only place where the `0' address makes
+ sense; there is no 0-th line and commands which are given the `0'
+ address in any other way will give an error.
+
+`ADDR1,+N'
+ Matches ADDR1 and the N lines following ADDR1.
+
+`ADDR1,~N'
+ Matches ADDR1 and the lines following ADDR1 until the next line
+ whose input line number is a multiple of N.
+
+ Appending the `!' character to the end of an address specification
+negates the sense of the match. That is, if the `!' character follows
+an address range, then only lines which do _not_ match the address range
+will be selected. This also works for singleton addresses, and,
+perhaps perversely, for the null address.
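The address forms above can be tried out with a sketch like the
following. It assumes GNU `sed', since several of the forms are GNU
extensions, and uses made-up input; the variable names are
illustrative.

```shell
# Hypothetical input: the numbers 1 to 10, one per line.
nums=$(printf '%s\n' 1 2 3 4 5 6 7 8 9 10)

odd=$(printf '%s\n' "$nums" | sed -n '1~2p')     # FIRST~STEP: 1 3 5 7 9
plus=$(printf '%s\n' "$nums" | sed -n '3,+2p')   # ADDR1,+N:   3 4 5
upto=$(printf '%s\n' "$nums" | sed -n '2,~4p')   # ADDR1,~N:   2 3 4
head2=$(printf '%s\n' "$nums" | sed -n '3,$!p')  # negation:   1 2

# 0,/x/ may end on the very first line; 1,/x/ only starts looking
# for the end of the range on line 2.
zero=$(printf 'x\ny\nx\nz\n' | sed '0,/x/d')
one=$(printf 'x\ny\nx\nz\n' | sed '1,/x/d')

printf '%s\n' "$odd" "$plus" "$upto" "$head2" "$zero" "$one"
```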
+
+
+File: sed.info, Node: Regular Expressions, Next: Common Commands, Prev: Addresses, Up: sed Programs
+
+Overview of Regular Expression Syntax
+=====================================
+
+ To know how to use `sed', people should understand regular
+expressions ("regexp" for short). A regular expression is a pattern
+that is matched against a subject string from left to right. Most
+characters are "ordinary": they stand for themselves in a pattern, and
+match the corresponding characters in the subject. As a trivial
+example, the pattern
+
+ The quick brown fox
+
+matches a portion of a subject string that is identical to itself. The
+power of regular expressions comes from the ability to include
+alternatives and repetitions in the pattern. These are encoded in the
+pattern by the use of "special characters", which do not stand for
+themselves but instead are interpreted in some special way. Here is a
+brief description of regular expression syntax as used in `sed'.
+
+`CHAR'
+ A single ordinary character matches itself.
+
+`*'
+ Matches a sequence of zero or more instances of matches for the
+ preceding regular expression, which must be an ordinary character,
+ a special character preceded by `\', a `.', a grouped regexp (see
+ below), or a bracket expression. As a GNU extension, a postfixed
+ regular expression can also be followed by `*'; for example, `a**'
+ is equivalent to `a*'. POSIX 1003.1-2001 says that `*' stands for
+ itself when it appears at the start of a regular expression or
+ subexpression, but many non-GNU implementations do not support this
+ and portable scripts should instead use `\*' in these contexts.
+
+`\+'
+ As `*', but matches one or more. It is a GNU extension.
+
+`\?'
+ As `*', but only matches zero or one. It is a GNU extension.
+
+`\{I\}'
+ As `*', but matches exactly I sequences (I is a decimal integer;
+ for portability, keep it between 0 and 255 inclusive).
+
+`\{I,J\}'
+ As `*', but matches between I and J sequences, inclusive.
+
+`\{I,\}'
+ As `*', but matches I or more sequences.
+
+`\(REGEXP\)'
+ Groups the inner REGEXP as a whole; this is used to:
+
+ * Apply postfix operators, like `\(abcd\)*': this will search
+ for zero or more whole sequences of `abcd', while `abcd*'
+ would search for `abc' followed by zero or more occurrences
+ of `d'. Note that support for `\(abcd\)*' is required by
+ POSIX 1003.1-2001, but many non-GNU implementations do not
+ support it and hence it is not universally portable.
+
+ * Use back references (see below).
+
+`.'
+ Matches any character, including newline.
+
+`^'
+ Matches the null string at beginning of line, i.e. what appears
+ after the circumflex must appear at the beginning of line.
+ `^#include' will match only lines where `#include' is the first
+ thing on line--if there are spaces before, for example, the match
+ fails. `^' acts as a special character only at the beginning of
+ the regular expression or subexpression (that is, after `\(' or
+ `\|'). Portable scripts should avoid `^' at the beginning of a
+ subexpression, though, as POSIX allows implementations that treat
+ `^' as an ordinary character in that context.
+
+`$'
+ It is the same as `^', but refers to end of line. `$' also acts
+ as a special character only at the end of the regular expression
+ or subexpression (that is, before `\)' or `\|'), and its use at
+ the end of a subexpression is not portable.
+
+`[LIST]'
+`[^LIST]'
+ Matches any single character in LIST: for example, `[aeiou]'
+ matches all vowels. A list may include sequences like
+ `CHAR1-CHAR2', which matches any character between (inclusive)
+ CHAR1 and CHAR2.
+
+ A leading `^' reverses the meaning of LIST, so that it matches any
+ single character _not_ in LIST. To include `]' in the list, make
+ it the first character (after the `^' if needed); to include `-'
+ in the list, make it the first or last; to include `^' put it
+ after the first character.
+
+ The characters `$', `*', `.', `[', and `\' are normally not
+ special within LIST. For example, `[\*]' matches either `\' or
+ `*', because the `\' is not special here. However, strings like
+ `[.ch.]', `[=a=]', and `[:space:]' are special within LIST and
+ represent collating symbols, equivalence classes, and character
+ classes, respectively, and `[' is therefore special within LIST
+ when it is followed by `.', `=', or `:'. Also, when not in
+ `POSIXLY_CORRECT' mode, special escapes like `\n' and `\t' are
+ recognized within LIST. *Note Escapes::.
+
+`REGEXP1\|REGEXP2'
+ Matches either REGEXP1 or REGEXP2. Use parentheses to use complex
+ alternative regular expressions. The matching process tries each
+ alternative in turn, from left to right, and the first one that
+ succeeds is used. It is a GNU extension.
+
+`REGEXP1REGEXP2'
+ Matches the concatenation of REGEXP1 and REGEXP2. Concatenation
+ binds more tightly than `\|', `^', and `$', but less tightly than
+ the other regular expression operators.
+
+`\DIGIT'
+ Matches the DIGIT-th `\(...\)' parenthesized subexpression in the
+ regular expression. This is called a "back reference".
+ Subexpressions are implicitly numbered by counting occurrences of
+ `\(' left-to-right.
+
+`\n'
+ Matches the newline character.
+
+`\CHAR'
+ Matches CHAR, where CHAR is one of `$', `*', `.', `[', `\', or `^'.
+ Note that the only C-like backslash sequences that you can
+ portably assume to be interpreted are `\n' and `\\'; in particular
+ `\t' is not portable, and matches a `t' under most implementations
+ of `sed', rather than a tab character.
+
+
+ Note that the regular expression matcher is greedy, i.e., matches
+are attempted from left to right and, if two or more matches are
+possible starting at the same character, it selects the longest.
+
+Examples:
+`abcdef'
+ Matches `abcdef'.
+
+`a*b'
+ Matches zero or more `a's followed by a single `b'. For example,
+ `b' or `aaaaab'.
+
+`a\?b'
+ Matches `b' or `ab'.
+
+`a\+b\+'
+ Matches one or more `a's followed by one or more `b's: `ab' is the
+ shortest possible match, but other examples are `aaaab' or
+ `abbbbb' or `aaaaaabbbbbbb'.
+
+`.*'
+`.\+'
+ These two both match all the characters in a string; however, the
+ first matches every string (including the empty string), while the
+ second matches only strings containing at least one character.
+
+`^main.*(.*)'
+ This matches a string starting with `main', followed by an opening
+ and closing parenthesis. The `n', `(' and `)' need not be
+ adjacent.
+
+`^#'
+ This matches a string beginning with `#'.
+
+`\\$'
+ This matches a string ending with a single backslash. The regexp
+ contains two backslashes for escaping.
+
+`\$'
+ Instead, this matches a string consisting of a single dollar sign,
+ because it is escaped.
+
+`[a-zA-Z0-9]'
+ In the C locale, this matches any ASCII letters or digits.
+
+`[^ tab]\+'
+ (Here `tab' stands for a single tab character.) This matches a
+ string of one or more characters, none of which is a space or a
+ tab. Usually this means a word.
+
+`^\(.*\)\n\1$'
+ This matches a string consisting of two equal substrings separated
+ by a newline.
+
+`.\{9\}A$'
+ This matches nine characters followed by an `A'.
+
+`^.\{15\}A'
+ This matches the start of a string that contains 16 characters,
+ the last of which is an `A'.
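A few of the constructs above can be checked from the shell. This is
only a sketch, assuming GNU `sed' and invented sample strings; the
variable names are illustrative.

```shell
# Greedy matching: '.*X' takes the longest match, up to the last X.
greedy=$(printf 'aXbXc\n' | sed 's/.*X/-/')

# Grouping: the postfix '*' applies to the whole of 'abcd'.
grouped=$(printf 'abcdabcd\n' | sed 's/^\(abcd\)*$/whole/')

# Without grouping, '*' applies only to the final 'd'.
ungrouped=$(printf 'abcddd\n' | sed 's/^abcd*$/tail/')

# Back reference: N joins two input lines with a newline, then \1
# must repeat whatever the first group matched.
twins=$(printf 'dup\ndup\nodd\neven\n' | sed -n 'N;/^\(.*\)\n\1$/p')

# Interval: nine characters of any kind followed by a final A.
interval=$(printf '123456789A\nshortA\n' | sed -n '/.\{9\}A$/p')

printf '%s\n' "$greedy" "$grouped" "$ungrouped" "$twins" "$interval"
```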
+
+
+
+File: sed.info, Node: Common Commands, Next: The "s" Command, Prev: Regular Expressions, Up: sed Programs
+
+Often-Used Commands
+===================
+
+ If you use `sed' at all, you will quite likely want to know these
+commands.
+
+`#'
+ [No addresses allowed.]
+
+ The `#' character begins a comment; the comment continues until
+ the next newline.
+
+ If you are concerned about portability, be aware that some
+ implementations of `sed' (which are not POSIX conformant) may only
+ support a single one-line comment, and then only when the very
+ first character of the script is a `#'.
+
+ Warning: if the first two characters of the `sed' script are `#n',
+ then the `-n' (no-autoprint) option is forced. If you want to put
+ a comment in the first line of your script and that comment begins
+ with the letter `n' and you do not want this behavior, then be
+ sure to either use a capital `N', or place at least one space
+ before the `n'.
+
+`q [EXIT-CODE]'
+ This command only accepts a single address.
+
+ Exit `sed' without processing any more commands or input. Note
+ that the current pattern space is printed if auto-print is not
+ disabled with the `-n' option. The ability to return an exit
+ code from the `sed' script is a GNU `sed' extension.
+
+`d'
+ Delete the pattern space; immediately start next cycle.
+
+`p'
+ Print out the pattern space (to the standard output). This
+ command is usually only used in conjunction with the `-n'
+ command-line option.
+
+`n'
+ If auto-print is not disabled, print the pattern space, then,
+ regardless, replace the pattern space with the next line of input.
+ If there is no more input then `sed' exits without processing any
+ more commands.
+
+`{ COMMANDS }'
+ A group of commands may be enclosed between `{' and `}' characters.
+ This is particularly useful when you want a group of commands to
+ be triggered by a single address (or address-range) match.
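The commands in this section can be compared side by side with a short
sketch. The five-line input and the variable names are assumptions made
for this example.

```shell
# Hypothetical input: the numbers 1 to 5, one per line.
lines=$(printf '%s\n' 1 2 3 4 5)

# q: line 2 is still auto-printed, then sed exits.
quit=$(printf '%s\n' "$lines" | sed '2q')

# d: delete lines 2 through 4 and start the next cycle each time.
deleted=$(printf '%s\n' "$lines" | sed '2,4d')

# n then d: print the current line, fetch the next one, delete it.
skipped=$(printf '%s\n' "$lines" | sed 'n;d')

# Grouping: run two commands for every line in the range 2,3.
marked=$(printf '%s\n' "$lines" | sed -n '2,3{s/^/> /;p;}')

printf '%s\n' "$quit" "$deleted" "$skipped" "$marked"
```

In the `n;d' case the last line is still printed: there is no further
input for `n' to read, so `sed' exits before `d' runs.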
+
+
+
+File: sed.info, Node: The "s" Command, Next: Other Commands, Prev: Common Commands, Up: sed Programs
+
+The `s' Command
+===============
+
+ The syntax of the `s' (as in substitute) command is
+`s/REGEXP/REPLACEMENT/FLAGS'. The `/' characters may be uniformly
+replaced by any other single character within any given `s' command.
+The `/' character (or whatever other character is used in its stead)
+can appear in the REGEXP or REPLACEMENT only if it is preceded by a `\'
+character.
+
+ The `s' command is probably the most important in `sed' and has a
+lot of different options. Its basic concept is simple: the `s' command
+attempts to match the pattern space against the supplied REGEXP; if the
+match is successful, then that portion of the pattern space which was
+matched is replaced with REPLACEMENT.
+
+ The REPLACEMENT can contain `\N' (N being a number from 1 to 9,
+inclusive) references, which refer to the portion of the match which is
+contained between the Nth `\(' and its matching `\)'. Also, the
+REPLACEMENT can contain unescaped `&' characters which reference the
+whole matched portion of the pattern space. Finally, as a GNU `sed'
+extension, you can include a special sequence made of a backslash and
+one of the letters `L', `l', `U', `u', or `E'. The meaning is as
+follows:
+
+`\L'
+ Turn the replacement to lowercase until a `\U' or `\E' is found,
+
+`\l'
+ Turn the next character to lowercase,
+
+`\U'
+ Turn the replacement to uppercase until a `\L' or `\E' is found,
+
+`\u'
+ Turn the next character to uppercase,
+
+`\E'
+ Stop case conversion started by `\L' or `\U'.
+
+ To include a literal `\', `&', or newline in the final replacement,
+be sure to precede the desired `\', `&', or newline in the REPLACEMENT
+with a `\'.
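The replacement syntax described above might be exercised as in this
sketch. The case-conversion sequences are GNU extensions, so GNU `sed'
is assumed; the sample strings and variable names are invented.

```shell
# \1 and \2 refer to the first and second \(...\) groups.
swapped=$(printf 'hello world\n' | sed 's/\(hello\) \(world\)/\2 \1/')

# An unescaped & stands for the whole matched text.
wrapped=$(printf 'hello world\n' | sed 's/world/[&]/')

# An escaped \& is a literal ampersand.
literal=$(printf 'a and b\n' | sed 's/and/\&/')

# \U ... \E uppercases a stretch; \u uppercases one character.
cased=$(printf 'hello world\n' | sed 's/\(hello\) \(world\)/\U\1\E \u\2/')

printf '%s\n' "$swapped" "$wrapped" "$literal" "$cased"
```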
+
+ The `s' command can be followed by zero or more of the following
+FLAGS:
+
+`g'
+ Apply the replacement to _all_ matches to the REGEXP, not just the
+ first.
+
+`NUMBER'
+ Only replace the NUMBERth match of the REGEXP.
+
+ Note: the POSIX standard does not specify what should happen when
+ you mix the `g' and NUMBER modifiers, and currently there is no
+ widely agreed upon meaning across `sed' implementations. For GNU
+ `sed', the interaction is defined to be: ignore matches before the
+ NUMBERth, and then match and replace all matches from the NUMBERth
+ on.
+
+`p'
+ If the substitution was made, then print the new pattern space.
+
+ Note: when both the `p' and `e' options are specified, the
+ relative ordering of the two produces very different results. In
+ general, `ep' (evaluate then print) is what you want, but
+ operating the other way round can be useful for debugging. For
+ this reason, the current version of GNU `sed' interprets specially
+ the presence of `p' options both before and after `e', printing
+ the pattern space before and after evaluation, while in general
+ flags for the `s' command show their effect just once. This
+ behavior, although documented, might change in future versions.
+
+`w FILE-NAME'
+ If the substitution was made, then write out the result to the
+ named file. As a GNU `sed' extension, two special values of
+ FILE-NAME are supported: `/dev/stderr', which writes the result to
+ the standard error, and `/dev/stdout', which writes to the standard
+ output.(1)
+
+`e'
+ This command allows one to pipe input from a shell command into
+ pattern space. If a substitution was made, the command that is
+ found in pattern space is executed and pattern space is replaced
+ with its output. A trailing newline is suppressed; results are
+ undefined if the command to be executed contains a NUL character.
+ This is a GNU `sed' extension.
+
+`I'
+`i'
+ The `I' modifier to regular-expression matching is a GNU extension
+ which makes `sed' match REGEXP in a case-insensitive manner.
+
+`M'
+`m'
+ The `M' modifier to regular-expression matching is a GNU `sed'
+ extension which causes `^' and `$' to match respectively (in
+ addition to the normal behavior) the empty string after a newline,
+ and the empty string before a newline. There are special character
+ sequences (`\`' and `\'') which always match the beginning or the
+ end of the buffer. `M' stands for `multi-line'.
+
+
+ ---------- Footnotes ----------
+
+ (1) This is equivalent to `p' unless the `-i' option is being used.
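The effect of the most common flags can be seen in a sketch like this
one, which assumes GNU `sed' (in particular for the combined `2g' form
and the `I' flag). The input strings and variable names are made up.

```shell
src='a-a-a-a'   # hypothetical input with four matches of 'a'

first=$(printf '%s\n' "$src" | sed 's/a/X/')     # first match only
all=$(printf '%s\n' "$src" | sed 's/a/X/g')      # every match
third=$(printf '%s\n' "$src" | sed 's/a/X/3')    # third match only
from2=$(printf '%s\n' "$src" | sed 's/a/X/2g')   # second match onward

# I: match without regard to case.
nocase=$(printf 'Hello\n' | sed 's/hello/bye/I')

# p with -n: print only the lines where a substitution was made.
changed=$(printf 'x\ny\n' | sed -n 's/x/z/p')

printf '%s\n' "$first" "$all" "$third" "$from2" "$nocase" "$changed"
```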
+
+
+File: sed.info, Node: Other Commands, Next: Programming Commands, Prev: The "s" Command, Up: sed Programs
+
+Less Frequently-Used Commands
+=============================
+
+ Though perhaps less frequently used than those in the previous
+section, some very small yet useful `sed' scripts can be built with
+these commands.
+
+`y/SOURCE-CHARS/DEST-CHARS/'
+ (The `/' characters may be uniformly replaced by any other single
+ character within any given `y' command.)
+
+ Transliterate any characters in the pattern space which match any
+ of the SOURCE-CHARS with the corresponding character in DEST-CHARS.
+
+ Instances of the `/' (or whatever other character is used in its
+ stead), `\', or newlines can appear in the SOURCE-CHARS or
+ DEST-CHARS lists, provided that each instance is escaped by a `\'.
+ The SOURCE-CHARS and DEST-CHARS lists _must_ contain the same
+ number of characters (after de-escaping).
+
+`a\'
+`TEXT'
+ As a GNU extension, this command accepts two addresses.
+
+ Queue the lines of text which follow this command (each but the
+ last ending with a `\', which are removed from the output) to be
+ output at the end of the current cycle, or when the next input
+ line is read.
+
+ Escape sequences in TEXT are processed, so you should use `\\' in
+ TEXT to print a single backslash.
+
+ As a GNU extension, if between the `a' and the newline there is
+ other than a whitespace-`\' sequence, then the text of this line,
+ starting at the first non-whitespace character after the `a', is
+ taken as the first line of the TEXT block. (This enables a
+ simplification in scripting a one-line add.) This extension also
+ works with the `i' and `c' commands.
+
+`i\'
+`TEXT'
+ As a GNU extension, this command accepts two addresses.
+
+ Immediately output the lines of text which follow this command
+ (each but the last ending with a `\', which are removed from the
+ output).
+
+`c\'
+`TEXT'
+ Delete the lines matching the address or address-range, and output
+ the lines of text which follow this command (each but the last
+ ending with a `\', which are removed from the output) in place of
+ the last line (or in place of each line, if no addresses were
+ specified). A new cycle is started after this command is done,
+ since the pattern space will have been deleted.
+
+`='
+ As a GNU extension, this command accepts two addresses.
+
+ Print out the current input line number (with a trailing newline).
+
+`l N'
+ Print the pattern space in an unambiguous form: non-printable
+ characters (and the `\' character) are printed in C-style escaped
+ form; long lines are split, with a trailing `\' character to
+ indicate the split; the end of each line is marked with a `$'.
+
+ N specifies the desired line-wrap length; a length of 0 (zero)
+ means to never wrap long lines. If omitted, the default as
+ specified on the command line is used. The N parameter is a GNU
+ `sed' extension.
+
+`r FILENAME'
+ As a GNU extension, this command accepts two addresses.
+
+ Queue the contents of FILENAME to be read and inserted into the
+ output stream at the end of the current cycle, or when the next
+ input line is read. Note that if FILENAME cannot be read, it is
+ treated as if it were an empty file, without any error indication.
+
+ As a GNU `sed' extension, the special value `/dev/stdin' is
+ supported for the file name, which reads the contents of the
+ standard input.
+
+`w FILENAME'
+ Write the pattern space to FILENAME. As a GNU `sed' extension,
+ two special values of FILENAME are supported: `/dev/stderr',
+ which writes the result to the standard error, and `/dev/stdout',
+ which writes to the standard output.(1)
+
+ The file will be created (or truncated) before the first input
+ line is read; all `w' commands (including instances of the `w' flag on
+ successful `s' commands) which refer to the same FILENAME are
+ output without closing and reopening the file.
+
+`D'
+ Delete text in the pattern space up to the first newline. If any
+ text is left, restart the cycle with the resultant pattern space
+ (without reading a new line of input), otherwise start a normal
+ new cycle.
+
+`N'
+ Add a newline to the pattern space, then append the next line of
+ input to the pattern space. If there is no more input then `sed'
+ exits without processing any more commands.
+
+`P'
+ Print out the portion of the pattern space up to the first newline.
+
+`h'
+ Replace the contents of the hold space with the contents of the
+ pattern space.
+
+`H'
+ Append a newline to the contents of the hold space, and then
+ append the contents of the pattern space to that of the hold space.
+
+`g'
+ Replace the contents of the pattern space with the contents of the
+ hold space.
+
+`G'
+ Append a newline to the contents of the pattern space, and then
+ append the contents of the hold space to that of the pattern space.
+
+`x'
+ Exchange the contents of the hold and pattern spaces.
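+
+ The commands above can be tried out with a few one-liners. This is
+only a rough sketch: it assumes GNU `sed' is installed as `sed' and is
+run from a POSIX shell, and the variable names and sample input are
+purely illustrative. The one-line forms of `i', `a' and `c' rely on
+the GNU extension described under `a'.
+
```shell
# y: transliterate a, b, c to upper case; other characters pass through.
y_out=$(printf 'aabbcc cab\n' | sed 'y/abc/ABC/')
printf '%s\n' "$y_out"       # AABBCC CAB

# i, a, c in their GNU one-line forms.
aic_out=$(printf 'one\ntwo\nthree\n' |
  sed -e '2i before two' -e '2a after two' -e '3c THREE')
printf '%s\n' "$aic_out"     # one, before two, two, after two, THREE (one per line)

# l: show the pattern space unambiguously (tab as \t, backslash doubled,
# end of line marked with $).
l_out=$(printf 'a\tb\\c\n' | sed -n 'l')
printf '%s\n' "$l_out"       # a\tb\\c$

# N: join each pair of input lines with a space.
pair_out=$(printf '1\n2\n3\n4\n' | sed 'N;s/\n/ /')
printf '%s\n' "$pair_out"    # "1 2" then "3 4"

# G, h: accumulate the input in reverse order in the hold space (like tac).
tac_out=$(printf 'a\nb\nc\n' | sed -n '1!G;h;$p')
printf '%s\n' "$tac_out"     # c, b, a (one per line)
```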
+
+
+ ---------- Footnotes ----------
+
+ (1) This is equivalent to `p' unless the `-i' option is being used.
+
+
+File: sed.info, Node: Programming Commands, Next: Extended Commands, Prev: Other Commands, Up: sed Programs
+
+Commands for `sed' gurus
+========================
+
+ In most cases, use of these commands indicates that you are probably
+better off programming in something like `awk' or Perl. But
+occasionally one is committed to sticking with `sed', and these
+commands can enable one to write quite convoluted scripts.
+
+`: LABEL'
+ [No addresses allowed.]
+
+ Specify the location of LABEL for branch commands. In all other
+ respects, a no-op.
+
+`b LABEL'
+ Unconditionally branch to LABEL. The LABEL may be omitted, in
+ which case the next cycle is started.
+
+`t LABEL'
+ Branch to LABEL only if there has been a successful `s'ubstitution
+ since the last input line was read or conditional branch was taken.
+ The LABEL may be omitted, in which case the next cycle is started.
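+
+ As a minimal sketch of how labels and branches combine into loops
+(assuming GNU `sed' run from a POSIX shell; the label and variable
+names are made up for illustration):
+
```shell
# b: keep appending lines with N until the last one, then join them all
# with commas.
join_out=$(printf 'a\nb\nc\n' |
  sed -e ':again' -e 'N' -e '$!b again' -e 's/\n/,/g')
printf '%s\n' "$join_out"     # a,b,c

# t: repeat a substitution until it no longer matches.  A single s///g
# pass would leave "a()b", because the outer pair only becomes adjacent
# after the inner pair has been removed.
paren_out=$(printf 'a(())b\n' |
  sed -e ':strip' -e 's/()//' -e 't strip')
printf '%s\n' "$paren_out"    # ab
```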
+
+
+
+File: sed.info, Node: Extended Commands, Next: Escapes, Prev: Programming Commands, Up: sed Programs
+
+Commands Specific to GNU `sed'
+==============================
+
+ These commands are specific to GNU `sed', so you must use them with
+care and only when you are sure that sacrificing portability is acceptable.
+They allow you to check for GNU `sed' extensions or to do tasks that
+are required quite often, yet are unsupported by standard `sed's.
+
+`e [COMMAND]'
+ This command allows one to pipe input from a shell command into
+ pattern space. Without parameters, the `e' command executes the
+ command that is found in pattern space and replaces the pattern
+ space with the output; a trailing newline is suppressed.
+
+ If a parameter is specified, instead, the `e' command interprets
+ it as a command and sends its output to the output stream (like
+ `r' does). The command can run across multiple lines, all but the
+ last ending with a back-slash.
+
+ In both cases, the results are undefined if the command to be
+ executed contains a NUL character.
+
+`L N'
+ This GNU `sed' extension fills and joins lines in pattern space to
+ produce output lines of (at most) N characters, like `fmt' does;
+ if N is omitted, the default as specified on the command line is
+ used. This command is considered a failed experiment and, unless
+ there is enough demand (which seems unlikely), will be removed in
+ future versions.
+
+`Q [EXIT-CODE]'
+ This command only accepts a single address.
+
+ This command is the same as `q', but will not print the contents
+ of pattern space. Like `q', it provides the ability to return an
+ exit code to the caller.
+
+ This command can be useful because the only alternative ways to
+ accomplish this apparently trivial function are to use the `-n'
+ option (which can unnecessarily complicate your script) or to
+ resort to the following snippet, which wastes time by reading
+ the whole file without any visible effect:
+
+ :eat
+ $d Quit silently on the last line
+ N Read another line, silently
+ g Overwrite pattern space each time to save memory
+ b eat
+
+`R FILENAME'
+ Queue a line of FILENAME to be read and inserted into the output
+ stream at the end of the current cycle, or when the next input
+ line is read. Note that if FILENAME cannot be read, or if its end
+ is reached, no line is appended, without any error indication.
+
+ As with the `r' command, the special value `/dev/stdin' is
+ supported for the file name, which reads a line from the standard
+ input.
+
+`T LABEL'
+ Branch to LABEL only if there have been no successful
+ `s'ubstitutions since the last input line was read or conditional
+ branch was taken. The LABEL may be omitted, in which case the next
+ cycle is started.
+
+`v VERSION'
+ This command does nothing, but makes `sed' fail if GNU `sed'
+ extensions are not supported, simply because other versions of
+ `sed' do not implement it. In addition, you can specify the
+ version of `sed' that your script requires, such as `4.0.5'. The
+ default is `4.0' because that is the first version that
+ implemented this command.
+
+ This command enables all GNU extensions even if `POSIXLY_CORRECT'
+ is set in the environment.
+
+`W FILENAME'
+ Write to the given filename the portion of the pattern space up to
+ the first newline. Everything said under the `w' command about
+ file handling holds here too.
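+
+ A short sketch of `Q' and `T' in use, assuming GNU `sed' (these
+commands are not available in other implementations) and a POSIX
+shell; the sample input and variable names are illustrative only:
+
```shell
# q prints the matching line before quitting; Q quits without printing it.
q_out=$(printf 'a\nb\nstop\nc\n' | sed '/stop/q')
Q_out=$(printf 'a\nb\nstop\nc\n' | sed '/stop/Q')
printf '%s\n' "$q_out"        # a, b, stop (one per line)
printf '%s\n' "$Q_out"        # a, b (one per line)

# Q can also hand an exit code back to the caller.
Q_status=0
printf 'stop\n' | sed -n '/stop/Q 3' || Q_status=$?
printf '%s\n' "$Q_status"     # 3

# T: start the next cycle when the preceding s command did NOT match,
# so only lines that began with "x=" reach the later commands.
T_out=$(printf 'x=1\ny\nx=2\n' |
  sed -n -e 's/^x=//' -e 'T' -e 's/$/ (was x)/' -e 'p')
printf '%s\n' "$T_out"        # "1 (was x)" then "2 (was x)"
```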
+
+
+File: sed.info, Node: Escapes, Prev: Extended Commands, Up: sed Programs
+
+GNU Extensions for Escapes in Regular Expressions
+=================================================
+
+ Until this chapter, we have only encountered escapes of the form
+`\^', which tell `sed' not to interpret the circumflex as a special
+character, but rather to take it literally. For example, `\*' matches
+a single asterisk rather than zero or more backslashes.
+
+ This chapter introduces another kind of escape(1)--that is, escapes
+that are applied to a character or sequence of characters that
+ordinarily are taken literally, and that `sed' replaces with a special
+character. This provides a way of encoding non-printable characters in
+patterns in a visible manner. There is no restriction on the
+appearance of non-printing characters in a `sed' script but when a
+script is being prepared in the shell or by text editing, it is usually
+easier to use one of the following escape sequences than the binary
+character it represents.
+
+ The list of these escapes is:
+
+`\a'
+ Produces or matches a BEL character, that is, an "alert" (ASCII 7).
+
+`\f'
+ Produces or matches a form feed (ASCII 12).
+
+`\n'
+ Produces or matches a newline (ASCII 10).
+
+`\r'
+ Produces or matches a carriage return (ASCII 13).
+
+`\t'
+ Produces or matches a horizontal tab (ASCII 9).
+
+`\v'
+ Produces or matches a so-called "vertical tab" (ASCII 11).
+
+`\cX'
+ Produces or matches `CONTROL-X', where X is any character. The
+ precise effect of `\cX' is as follows: if X is a lower case
+ letter, it is converted to upper case. Then bit 6 of the
+ character (hex 40) is inverted. Thus `\cz' becomes hex 1A, but
+ `\c{' becomes hex 3B, while `\c;' becomes hex 7B.
+
+`\dXXX'
+ Produces or matches a character whose decimal ASCII value is XXX.
+
+`\oXXX'
+ Produces or matches a character whose octal ASCII value is XXX.
+
+`\xXX'
+ Produces or matches a character whose hexadecimal ASCII value is
+ XX.
+
+ `\b' (backspace) was omitted because of the conflict with the
+existing "word boundary" meaning.
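+
+ A few of these character escapes in use, as a sketch that assumes
+GNU `sed' (other implementations may not understand them) and a POSIX
+shell; the variable names are illustrative only:
+
```shell
# \t in the regular expression matches a tab character.
tab_out=$(printf 'a\tb\n' | sed 's/\t/ -> /')
printf '%s\n' "$tab_out"      # a -> b

# \x2d is the character with hexadecimal value 2D, a hyphen.
hex_out=$(printf 'A-B\n' | sed 's/\x2d/+/')
printf '%s\n' "$hex_out"      # A+B

# \n in the replacement produces a newline.
nl_out=$(printf 'a,b,c\n' | sed 's/,/\n/g')
printf '%s\n' "$nl_out"       # a, b, c (one per line)
```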
+
+ Other escapes match a particular character class and are valid only
+in regular expressions:
+
+`\w'
+ Matches any "word" character. A "word" character is any letter or
+ digit or the underscore character.
+
+`\W'
+ Matches any "non-word" character.
+
+`\b'
+ Matches a word boundary; that is, it matches if the character to
+ the left is a "word" character and the character to the right is a
+ "non-word" character, or vice-versa.
+
+`\B'
+ Matches everywhere but on a word boundary; that is, it matches if
+ the character to the left and the character to the right are
+ either both "word" characters or both "non-word" characters.
+
+`\`'
+ Matches only at the start of pattern space. This is different
+ from `^' in multi-line mode.
+
+`\''
+ Matches only at the end of pattern space. This is different from
+ `$' in multi-line mode.
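+
+ The class and anchor escapes can be sketched as follows, assuming
+GNU `sed' and a POSIX shell. The `M' flag of the `s' command (a GNU
+extension) selects multi-line mode; the variable names are illustrative
+only:
+
```shell
# \b anchors a match at word boundaries, so "concat" is left alone.
word_out=$(printf 'cat concat cat\n' | sed 's/\bcat\b/dog/g')
printf '%s\n' "$word_out"     # dog concat dog

# \W matches any non-word character; here a run of them becomes "_".
nonword_out=$(printf 'foo, bar\n' | sed 's/\W\+/_/g')
printf '%s\n' "$nonword_out"  # foo_bar

# In multi-line mode ^ matches after every embedded newline, whereas
# \` still matches only at the very start of the pattern space.
caret_out=$(printf 'a\nb\n' | sed 'N;s/^./X/Mg')
start_out=$(printf 'a\nb\n' | sed 'N;s/\`./X/Mg')
printf '%s\n' "$caret_out"    # X then X
printf '%s\n' "$start_out"    # X then b
```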
+
+
+ ---------- Footnotes ----------
+
+ (1) All the escapes introduced here are GNU extensions, with the
+exception of `\n'. In basic regular expression mode, setting
+`POSIXLY_CORRECT' disables them inside bracket expressions.
+
+
+File: sed.info, Node: Examples, Next: Limitations, Prev: sed Programs, Up: Top
+
+Some Sample Scripts
+*******************
+
+ Here are some `sed' scripts to guide you in the art of mastering
+`sed'.
+
+* Menu:
+
+Some exotic examples:
+* Centering lines::
+* Increment a number::
+* Rename files to lower case::
+* Print bash environment::
+* Reverse chars of lines::
+
+Emulating standard utilities:
+* tac:: Reverse lines of files
+* cat -n:: Numbering lines
+* cat -b:: Numbering non-blank lines
+* wc -c:: Counting chars
+* wc -w:: Counting words
+* wc -l:: Counting lines
+* head:: Printing the first lines
+* tail:: Printing the last lines
+* uniq:: Make duplicate lines unique
+* uniq -d:: Print duplicated lines of input
+* uniq -u:: Remove all duplicated lines
+* cat -s:: Squeezing blank lines
+
+
+File: sed.info, Node: Centering lines, Next: Increment a number, Up: Examples
+
+Centering Lines
+===============
+
+ This script centers all lines of a file within a width of 80 columns. To
+change that width, the number in `\{...\}' must be replaced, and the
+number of added spaces also must be changed.
+
+ Note how the buffer commands are used to separate parts in the
+regular expressions to be matched--this is a common technique.
+
+ #!/usr/bin/sed -f
+
+ # Put 80 spaces in the buffer (10 spaces, then repeated 8 times)
+ 1 {
+ x
+ s/^$/          /
+ s/^.*$/&&&&&&&&/
+ x
+ }
+
+ # del leading and trailing spaces
+ # (`tab' below stands for a literal tab character)
+ y/tab/ /
+ s/^ *//
+ s/ *$//
+
+ # add a newline and 80 spaces to end of line
+ G
+
+ # keep first 81 chars (80 + a newline)
+ s/^\(.\{81\}\).*$/\1/
+
+ # \2 matches half of the spaces, which are moved to the beginning
+ s/^\(.*\)\n\(.*\)\2/\2\1/
+
+
+File: sed.info, Node: Increment a number, Next: Rename files to lower case, Prev: Centering lines, Up: Examples
+
+Increment a Number
+==================
+
+ This script is one of a few that demonstrate how to do arithmetic in
+`sed'. This is indeed possible,(1) but must be done manually.
+
+ To increment a number you just add 1 to its last digit, replacing it
+by the following digit. There is one exception: when the digit is a
+nine, the preceding digits must also be incremented until you reach
+one that is not a nine.
+
+ This solution by Bruno Haible is very clever because it
+uses a single buffer; if you don't have this limitation, the algorithm
+used in *Note Numbering lines: cat -n, is faster. It works by
+replacing trailing nines with an underscore, then using multiple `s'
+commands to increment the last digit, and then again substituting
+underscores with zeros.
+
+ #!/usr/bin/sed -f
+
+ /[^0-9]/ d
+
+ # replace all trailing 9s by _ (any other character except digits could
+ # be used)
+ :d
+ s/9\(_*\)$/_\1/
+ td
+
+ # incr last digit only. The first line adds a most-significant
+ # digit of 1 if we have to add a digit.
+ #
+ # The `tn' commands are not necessary, but make the thing
+ # faster
+
+ s/^\(_*\)$/1\1/; tn
+ s/8\(_*\)$/9\1/; tn
+ s/7\(_*\)$/8\1/; tn
+ s/6\(_*\)$/7\1/; tn
+ s/5\(_*\)$/6\1/; tn
+ s/4\(_*\)$/5\1/; tn
+ s/3\(_*\)$/4\1/; tn
+ s/2\(_*\)$/3\1/; tn
+ s/1\(_*\)$/2\1/; tn
+ s/0\(_*\)$/1\1/; tn
+
+ :n
+ y/_/0/
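+
+ To see the script at work, it can be saved to a file and fed a few
+numbers. The sketch below assumes a POSIX shell with `mktemp' and a
+`sed' found on the `PATH' (so the `#!' line is left out); the temporary
+file and variable names are illustrative only:
+
```shell
# Write the increment script to a temporary file.
incr_script=$(mktemp)
cat > "$incr_script" <<'EOF'
/[^0-9]/ d
:d
s/9\(_*\)$/_\1/
td
s/^\(_*\)$/1\1/; tn
s/8\(_*\)$/9\1/; tn
s/7\(_*\)$/8\1/; tn
s/6\(_*\)$/7\1/; tn
s/5\(_*\)$/6\1/; tn
s/4\(_*\)$/5\1/; tn
s/3\(_*\)$/4\1/; tn
s/2\(_*\)$/3\1/; tn
s/1\(_*\)$/2\1/; tn
s/0\(_*\)$/1\1/; tn
:n
y/_/0/
EOF

# 41 needs no carry, 199 carries twice, 999 gains a leading digit.
incr_out=$(printf '41\n199\n999\n' | sed -f "$incr_script")
rm -f "$incr_script"
printf '%s\n' "$incr_out"     # 42, 200, 1000 (one per line)
```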
+
+ ---------- Footnotes ----------
+
+ (1) `sed' guru Greg Ubben wrote an implementation of the `dc' RPN
+calculator! It is distributed together with sed.
+
+
+File: sed.info, Node: Rename files to lower case, Next: Print bash environment, Prev: Increment a number, Up: Examples
+
+Rename Files to Lower Case
+==========================
+
+ This is a pretty strange use of `sed'. We transform text into shell
+commands, then just feed them to the shell. Don't worry, even worse
+hacks are done when using `sed'; I have seen a script converting the
+output of `date' into a `bc' program!
+
+ The main body of this is the `sed' script, which remaps the name
+from lower to upper (or vice-versa) and even checks whether the remapped
+name is the same as the original name. Note how the script is
+parameterized using shell variables and proper quoting.
+
+ #! /bin/sh
+ # rename files to lower/upper case...
+ #
+ # usage:
+ # move-to-lower *
+ # move-to-upper *
+ # or
+ # move-to-lower -R .
+ # move-to-upper -R .
+ #
+
+ help()
+ {
+ cat << eof
+ Usage: $0 [-n] [-R] [-h] files...
+
+ -n do nothing, only see what would be done
+ -R recursive (use find)
+ -h this message
+ files files to remap to lower case
+
+ Examples:
+ $0 -n * (see if everything is ok, then...)
+ $0 *
+
+ $0 -R .
+
+ eof
+ }
+
+ apply_cmd='sh'
+ finder='echo "$@" | tr " " "\n"'
+ files_only=
+
+ while :
+ do
+ case "$1" in
+ -n) apply_cmd='cat' ;;
+ -R) finder='find "$@" -type f';;
+ -h) help ; exit 1 ;;
+ *) break ;;
+ esac
+ shift
+ done
+
+ if [ -z "$1" ]; then
+ echo Usage: $0 [-h] [-n] [-R] files...
+ exit 1
+ fi
+
+ LOWER='abcdefghijklmnopqrstuvwxyz'
+ UPPER='ABCDEFGHIJKLMNOPQRSTUVWXYZ'
+
+ case `basename $0` in
+ *upper*) TO=$UPPER; FROM=$LOWER ;;
+ *) FROM=$UPPER; TO=$LOWER ;;
+ esac
+
+ eval $finder | sed -n '
+
+ # remove all trailing slashes
+ s/\/*$//
+
+ # add ./ if there is no path, only a filename
+ /\//! s/^/.\//
+
+ # save path+filename
+ h
+
+ # remove path
+ s/.*\///
+
+ # do conversion only on filename
+ y/'$FROM'/'$TO'/
+
+ # now line contains original path+file, while
+ # hold space contains the new filename
+ x
+
+ # add converted file name to line, which now contains
+ # path/file-name\nconverted-file-name
+ G
+
+ # check if converted file name is equal to original file name,
+ # if it is, do not print anything
+ /^.*\/\(.*\)\n\1/b
+
+ # now, transform path/fromfile\n, into
+ # mv path/fromfile path/tofile and print it
+ s/^\(.*\/\)\(.*\)\n\(.*\)$/mv "\1\2" "\1\3"/p
+
+ ' | $apply_cmd
+