UNIX II: grep, Awk, Sed: October 30, 2017
2017-10-27T18:39:28.100Z,36.4921,-98.7233,6.404,2.7,ml,,50,,0.41,us,us1000axpz,2017-10-28T02:02:23.625Z,"33km NW of Fairview, Oklahoma",earthquake,1.3,2.6,,,reviewed,tul,tul
2017-10-27T10:00:07.430Z,36.2851,-97.506,5,2.8,mb_lg,,25,0.216,0.19,us,us1000axgi,2017-10-27T19:39:37.296Z,"19km W of Perry, Oklahoma",earthquake,0.7,1.8,0.071,52,reviewed,us,us
2017-10-25T15:17:48.200Z,36.2824,-97.504,7.408,3.1,ml,,25,,0.23,us,us1000awq6,2017-10-25T21:38:59.678Z,"19km W of Perry, Oklahoma",earthquake,1.1,5,,,reviewed,tul,tul
2017-10-25T11:05:21.940Z,35.4134,-97.0133,5,2.5,mb_lg,,157,0.152,0.31,us,us1000awms,2017-10-27T21:37:47.660Z,"7km ESE of McLoud, Oklahoma",earthquake,1.7,2,0.117,19,reviewed,us,us
2017-10-25T01:50:53.100Z,36.9748,-99.4244,8.115,2.9,ml,,197,,0.64,us,us1000awir,2017-10-26T00:52:01.343Z,"23km NE of Buffalo, Oklahoma",earthquake,2,7.6,,,reviewed,tul,tul
2017-10-24T23:18:09.000Z,35.3787,-98.0931,7.72,2.7,ml,,91,,0.49,us,us1000awhe,2017-10-26T00:47:37.010Z,"13km W of Union City, Oklahoma",earthquake,2.4,5.7,,,reviewed,tul,tul
What is a regular expression?
Regular Expression
• Set of characters that specify a pattern
• Makes changing and searching for text easy just from the
command line.
• It’s all about syntax…. (and because it’s UNIX, it’s a little
cryptic)
• http://www.regular-expressions.info/quickstart.html
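A minimal sketch of a pattern in action, assuming three made-up records (the variable names and the simplified four-column layout are hypothetical, not the real catalog format). The first search matches literal text anywhere on a line; the second uses ^ to tie the match to the start of the line.

```shell
# Three hypothetical records: date, magnitude, magnitude type, place
records='2017-10-27,2.8,mb_lg,Perry
2017-10-25,3.1,ml,Perry
2017-10-25,2.5,mb_lg,McLoud'

# Count lines containing the literal text "Perry"
perry=$(printf '%s\n' "$records" | grep -c 'Perry')

# Count lines that start with the date 2017-10-25
# (^ anchors the match to the beginning of the line)
oct25=$(printf '%s\n' "$records" | grep -c '^2017-10-25')

echo "Perry: $perry, Oct 25: $oct25"
```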
Simple Regular Expression Symbols
It is generally a good idea to surround a regular expression with single quotes on the
command line to protect it from being interpreted by the shell.
• Non-printable characters:
• \t : for a tab character
• \r : for carriage return
• \n : for new line
• \s : for a white space
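A short sketch of why the single quotes matter, using made-up place names (the variable names are hypothetical). Without quotes, the shell would split the first pattern at the space and would be free to expand the * in the second one before grep ever saw it.

```shell
# Hypothetical place descriptions, one per line
places='19km W of Perry, Oklahoma
33km NW of Fairview, Oklahoma
7km ESE of McLoud, Oklahoma'

# Single quotes deliver the pattern, space included, to grep as one argument
quoted=$(printf '%s\n' "$places" | grep -c 'of Perry')

# Single quotes also keep the shell from touching ^, [ ] and * in the pattern
km_lines=$(printf '%s\n' "$places" | grep -c '^[0-9]*km')

echo "of Perry: $quoted, lines starting with a distance: $km_lines"
```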
Sed – stream editor
• Command line tool for editing files line by line,
largely used for substitution
• Like grep for searching, but can replace found
pattern with something else
• Want to change mb to ml in my file?
>>> sed 's/mb/ml/' filename
Sed
>>> sed 's/mb/ml/' filename
• If you want to search and replace globally (every match in the file), put “g”
after the final slash. Otherwise sed replaces only the first instance found on
each line
>>> sed 's/\/usr\/local\/bin/\/usr\/bin/g' file
• Sed uses regular expressions, same as grep
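The difference between the plain substitution and the global one can be sketched on a single made-up line (the sample text and variable names are hypothetical). The last command is an alternative to escaping every slash: sed accepts another delimiter character, here |, in place of /.

```shell
# A hypothetical line in which "mb" appears twice
line='mb_lg,mb,us'

# Without g: only the first match on the line is replaced
first_only=$(printf '%s\n' "$line" | sed 's/mb/ml/')

# With g: every match on the line is replaced
all=$(printf '%s\n' "$line" | sed 's/mb/ml/g')

# Using | as the delimiter avoids escaping the slashes in a path
path_fixed=$(printf '%s\n' '/usr/local/bin/gmt' | sed 's|/usr/local/bin|/usr/bin|g')

echo "$first_only"
echo "$all"
echo "$path_fixed"
```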
awk
• Programming language available on most Unix-like OS
• Developed in the 1970s (name comes from the first letters of
the last names of its developers)
• Useful for manipulating text files
• One of the most useful Unix tools you can learn
• Also able to do floating point math
• Structured as a sequence of patterns and the
actions to perform when those patterns are found
• Used on text files: columns = fields; lines = records
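A minimal sketch of the pattern-and-action structure, assuming a simplified whitespace-separated layout (date, latitude, longitude, magnitude) rather than the real comma-separated catalog; the variable names are hypothetical. Fields are addressed as $1, $2, and so on, and NF holds the number of fields in the current record.

```shell
# Hypothetical records: date latitude longitude magnitude
catalog='2017-10-27 36.4921 -98.7233 2.7
2017-10-27 36.2851 -97.506 2.8
2017-10-25 36.2824 -97.504 3.1'

# pattern { action }: for records whose 4th field is at least 2.8,
# print the 1st and 4th fields
strong=$(printf '%s\n' "$catalog" | awk '$4 >= 2.8 { print $1, $4 }')

# NR is the record (line) number; NF is the field (column) count
first_nf=$(printf '%s\n' "$catalog" | awk 'NR == 1 { print NF }')

echo "$strong"
echo "fields in first record: $first_nf"
```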
awk vs nawk vs gawk
• Different versions exist
• awk – original
• nawk – “new awk”, version used on Macs as “awk”
• gawk – GNU awk, standard on linux, compatible
with awk and nawk. Can access this on Macs as
well – use “gawk” or set an alias for it
latitude: -1.698
longitude: 98.298
awk and if
• If statements are very useful in awk:
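One possible shape for an if statement inside an awk action, sketched with made-up magnitudes; the 3.0 threshold and the "felt"/"minor" labels are hypothetical and chosen only for illustration.

```shell
# Label each hypothetical magnitude depending on a threshold
labels=$(printf '%s\n' 2.7 3.1 2.5 | awk '{
    if ($1 >= 3.0)
        print $1, "felt"
    else
        print $1, "minor"
}')

echo "$labels"
```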
awk and math
• Big advantage – it does floating point math (remember bash does not)
• It stores all variables as strings, but when math operators are applied, it converts the
strings to floating point numbers if the strings look like numbers
• Basic arithmetic operators are left-to-right associative, except exponentiation (^), which is right-to-left
• + : addition
• - : subtraction
• * : multiplication
• / : division
• % : remainder or modulus
• ^ : exponent
• other standard C programming operators
• Assignment operators
• = : set variable equal to value on right
• += : set variable equal to itself plus the value on right
• -= : set variable equal to itself minus the value on right
• *= : set variable equal to itself times the value on right
• /= : set variable equal to itself divided by value on right
• %= : set variable equal to the remainder of itself divided by the value on the right
• ^= : set variable equal to itself raised to the power of the value on the right
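The operators above can be tried directly in a BEGIN block, which runs without any input file. This is a small sketch with made-up numbers and hypothetical variable names; the shell's own integer arithmetic is included for contrast.

```shell
# Shell arithmetic is integer-only: the fraction is dropped
shell_quotient=$((7 / 2))

# awk does the same division in floating point
quotient=$(awk 'BEGIN { print 7 / 2 }')
remainder=$(awk 'BEGIN { print 7 % 2 }')

# Exponentiation groups right to left: 2 ^ (3 ^ 2) = 2 ^ 9
power=$(awk 'BEGIN { print 2 ^ 3 ^ 2 }')

# Assignment operators applied one after another to the same variable
chain=$(awk 'BEGIN { x = 10; x += 5; x -= 3; x *= 2; x /= 4; x %= 4; x ^= 3; print x }')

# += used to accumulate a column of hypothetical magnitudes
total=$(printf '%s\n' 2.7 2.8 3.1 | awk '{ sum += $1 } END { print sum }')

echo "$shell_quotient $quotient $remainder $power $chain $total"
```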
awk relational operators
• Returns 1 if true and 0 if false
• All relational operators are left to right associative
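A brief sketch of the 1 and 0 results, with hypothetical variable names and values. The comparisons are wrapped in parentheses and assigned first, because inside a print statement a bare > is read as output redirection to a file.

```shell
# Store the result of each comparison, then print the stored values
truth=$(awk 'BEGIN { a = (3 > 2); b = (3 < 2); c = (2.70 == 2.7); print a, b, c }')

# A relational expression can also serve as the pattern:
# records for which it is true are printed by default
above=$(printf '%s\n' 2.7 3.1 | awk '$1 > 2.7')

echo "$truth"
echo "$above"
```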
Or