Sed: Difference between revisions
From charlesreid1
Revision as of 07:18, 27 April 2011
Sed (short for "stream editor") is a utility that ships with nearly every *nix system. It reads text line by line, applies editing commands to each line, and writes out the result, which can turn a whole lot of typing into a few short commands. It's ugly, but once you start to use it you'll wonder how you ever lived without it.
Sed introduction and tutorial: http://www.grymoire.com/Unix/Sed.html
Editing Files In-Place
Sed can be used to edit files in-place using the -i flag.
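As a minimal sketch of what -i does, the following writes a one-line file, edits it in place, and prints the result. The scratch directory and the file name notes.txt are made up for illustration. Attaching a suffix to the flag, as in -i.bak, asks sed to keep a copy of the original, which is a cheap safety net while experimenting:

```shell
# Work in a throwaway directory so nothing real is touched.
tmpdir=$(mktemp -d)
printf 'peanut butter sandwich\n' > "$tmpdir/notes.txt"

# Edit in place, keeping the original as notes.txt.bak.
sed -i.bak -e 's/peanut butter/jelly/' "$tmpdir/notes.txt"

edited=$(cat "$tmpdir/notes.txt")
backup=$(cat "$tmpdir/notes.txt.bak")
echo "$edited"   # jelly sandwich
echo "$backup"   # peanut butter sandwich

rm -r "$tmpdir"
```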
Find and Replace
You can find and replace instances of a string in a file using:
$ sed -i -e 's/peanut butter/jelly/g' file{1,2,3}.txt
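This replaces every occurrence of peanut butter with jelly in file1.txt, file2.txt, and file3.txt (the trailing g means "all matches on each line", not just the first). To replace more than one thing, use one -e expression per substitution: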
$ sed -i -e 's/peanut butter/jelly/g' \
-e 's/green eggs/ham/g' \
-e 's/water/wine/g' \
file{1,2,3}.txt
or, more succinctly:
$ sed -i -e 's/peanut butter/jelly/g;s/green eggs/ham/g;s/water/wine/g' \
file{1,2,3}.txt
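A quick way to convince yourself that the two spellings behave the same is to run both on standard input, without -i, and compare the output. This is only a sketch; the sample sentence is invented for the test:

```shell
input='peanut butter, green eggs, water'

# One -e per substitution.
with_e=$(echo "$input" | sed -e 's/peanut butter/jelly/g' \
                             -e 's/green eggs/ham/g' \
                             -e 's/water/wine/g')

# The same three substitutions joined with semicolons.
with_semicolons=$(echo "$input" | sed -e 's/peanut butter/jelly/g;s/green eggs/ham/g;s/water/wine/g')

echo "$with_e"            # jelly, ham, wine
echo "$with_semicolons"   # jelly, ham, wine
```

Trying a command on a pipe like this before adding -i is a sensible habit, since an in-place edit overwrites the file.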
Special/Escape Characters
NOTE: This section is specific to GNU sed; other versions of sed may behave differently.
Sometimes you want to look for generic patterns, like "four digits in a row", rather than something specific, like "5555". This can be done using special/escape characters.
Numerical Characters
To match any single digit between 0 and 9, use [0-9], like this:
$ echo "5" | sed -e 's/[0-9]/replacement/'
replacement
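Note that [0-9] matches one digit, and a substitution without the g flag stops after the first match on a line. A small sketch of the difference, using a made-up input string:

```shell
# Without the g flag, only the first digit on the line is replaced.
first_only=$(echo "a1b2c3" | sed -e 's/[0-9]/N/')

# With g, every digit on the line is replaced.
every_digit=$(echo "a1b2c3" | sed -e 's/[0-9]/N/g')

echo "$first_only"    # aNb2c3
echo "$every_digit"   # aNbNcN
```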
To match exactly N consecutive digits, follow [0-9] with \{N\}, like this:
$ echo "5678" | sed -e 's/[0-9]\{4\}/replacement/'
replacement
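One thing worth knowing is that \{4\} only says "four digits here"; it does not stop the match from sitting inside a longer run of digits. The sketch below shows that, and then uses the anchors ^ and $ (not covered above, used here only as an illustration) to require that the whole line be exactly four digits:

```shell
# The first four digits of a five-digit run still match.
inside_run=$(echo "56789" | sed -e 's/[0-9]\{4\}/X/')

# Anchored: the line must consist of exactly four digits.
anchored_long=$(echo "56789" | sed -e 's/^[0-9]\{4\}$/X/')
anchored_exact=$(echo "5678" | sed -e 's/^[0-9]\{4\}$/X/')

echo "$inside_run"       # X9
echo "$anchored_long"    # 56789 (no match, line unchanged)
echo "$anchored_exact"   # X
```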
If you want to match a run of digits and know it will be somewhere between M and N digits long, you can use the syntax \{M,N\}. For example, to replace a number between 2 and 4 digits long:
$ echo "56" | sed -e 's/[0-9]\{2,4\}/replacement/'
replacement
$ echo "5234678" | sed -e 's/[0-9]\{2,4\}/replacement/'
replacement678
$ echo "5" | sed -e 's/[0-9]\{2,4\}/replacement/'
5
Note that in the last command, no replacement happens because the longest run of digits is only 1 digit long, which is below the minimum of 2. In the second command, sed matches as many digits as the range allows (the first four, 5234) and leaves the remaining 678 untouched.
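To see exactly which digits a range like \{2,4\} picks up, it can help to wrap each match in brackets instead of replacing it. The sketch below uses & in the replacement, which stands for the matched text, together with the g flag so that matching continues along the line:

```shell
# 5234 is matched first (as many digits as allowed), then 678 matches too.
seven_digits=$(echo "5234678" | sed -e 's/[0-9]\{2,4\}/<&>/g')

# 1234 is matched first; the leftover 5 is too short to match.
five_digits=$(echo "12345" | sed -e 's/[0-9]\{2,4\}/<&>/g')

echo "$seven_digits"   # <5234><678>
echo "$five_digits"    # <1234>5
```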
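Since \{M,N\} is ugly and burdensome to type, you can pass sed the -r (or --regexp-extended) flag, which switches to extended regular expressions and eliminates the need for those backslashes (GNU sed also accepts -E for the same thing):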
$ echo "5234678" | sed -e 's/[0-9]\{2,4\}/replacement/'
replacement678
$ echo "5234678" | sed -re 's/[0-9]{2,4}/replacement/'
replacement678
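The savings from -r grow once a pattern has more than one escaped construct. As an illustration only (grouping with parentheses and the back-references \1, \2, \3 are not covered above), here is the same date rewrite in both syntaxes. The # delimiter is used instead of / so the slashes in the replacement need no escaping:

```shell
# Basic syntax: braces and parentheses all need backslashes.
basic=$(echo "2011-04-27" | sed -e 's#\([0-9]\{4\}\)-\([0-9]\{2\}\)-\([0-9]\{2\}\)#\3/\2/\1#')

# Extended syntax with -r: the same pattern without the backslashes.
extended=$(echo "2011-04-27" | sed -re 's#([0-9]{4})-([0-9]{2})-([0-9]{2})#\3/\2/\1#')

echo "$basic"      # 27/04/2011
echo "$extended"   # 27/04/2011
```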
To leave the upper bound unspecified, so that any run of N or more digits matches, use \{N,\} (written {N,} when using -r, as in these examples):
$ echo "52" | sed -re 's/[0-9]{2,}/replacement/'
replacement
$ echo "5234678" | sed -re 's/[0-9]{2,}/replacement/'
replacement
$ echo "5223902949082309448792387234" | sed -re 's/[0-9]{2,}/replacement/'
replacement
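With -r, the common case of "one or more" has a shorter spelling: + means the same as {1,}. A minimal sketch comparing the two on a made-up string:

```shell
# {1,} means "one or more of the preceding item".
with_interval=$(echo "abc123def" | sed -re 's/[0-9]{1,}/N/')

# In extended syntax, + is shorthand for {1,}.
with_plus=$(echo "abc123def" | sed -re 's/[0-9]+/N/')

echo "$with_interval"   # abcNdef
echo "$with_plus"       # abcNdef
```

In both cases the whole run 123 is consumed by a single match, which is why only one N appears.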
This page has more information on special/escape characters: http://sed.sourceforge.net/sedfaq6.html