Unit 4 Advanced Shell Programming
In this chapter, we will study the simple filter commands. A filter is a system
command that accepts input from standard input, processes it and sends the output to
the standard output stream. By default, a filter receives its input from the
keyboard. The input of a filter can also be obtained from a file. By default, a filter
sends its output to the monitor. The output of a filter can also be
redirected to a file or to another output device. Filters therefore work well with
redirection and piping. Some filters work on fields, some on lines and some on
characters. In this chapter we will discuss various types of filters.
Splitting Files
Unix supports filter commands to split a file. A file can be split horizontally,
vertically or in any combination of the two. Filter commands such as head, tail
and cut are used to split a file.
1.head
It displays the first n lines of the specified text file(s). If the number of lines is not
specified, it prints the first 10 lines by default.
Syntax:
head [option] [filename(s)]
Options:
1.-Nc (number of characters): It prints the first N characters of the file(s). Here N is a
positive integer.
✓ Consider three files f1,f2 and f3 as follow:
$cat f1
Computer science
Unix and shell programming
$cat f2
SASCMA English Medium & STERS BCA & BBA College By- Charmi Chauhan 1|Page
Unix & Shell Programming Unit-4
hello world
hello surat
$cat f3
HELLO WORLD
$
✓ If you issue a command as follow:
$head -5c f1 f2 f3
==> f1 <==
Compu
==> f2 <==
hello
==> f3 <==
HELLO
$
The result shows that it displays the first five characters of each file.
3.-q (quiet): It never prints the file name as a header, even if more than one file is given.
$head -1q f1 f2 f3
Computer science
hello world
HELLO WORLD
$
A user can use the head command for different purposes as follows:
✓ You can display the name of the last modified file in the current directory like
this:
ls -t | head -1
✓ You can display the largest file present in the current directory like this:
ls -S | head -1
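The head usages described above can be tried with a short sketch. The file names and contents mirror the f1 and f2 examples, the scratch directory is an assumption made only to keep the experiment isolated, and -q is assumed to be available (it is a GNU extension).

```shell
# Work in a scratch directory so that no real files are touched.
cd "$(mktemp -d)"
printf 'Computer science\nUnix and shell programming\n' > f1
printf 'hello world\nhello surat\n' > f2

# First line of each file, without the "==> name <==" headers.
first_lines=$(head -q -n 1 f1 f2)
echo "$first_lines"

# First five characters of f1.
first_chars=$(head -c 5 f1)
echo "$first_chars"

# Name of the most recently modified entry in the current directory.
newest=$(ls -t | head -n 1)
echo "$newest"
```

The form head -n 1 is the current spelling of head -1; both request a single line.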
2.tail: It displays the last part of a file. The general syntax of the tail command is as
follows:
Syntax:
tail [option] [filename(s)]
✓ With more than one filename, it displays the last 10 lines of each file, preceded by a
header containing the file name.
$tail file1 file2 file3<enter>
==> file1 <==
This is a line of file1
==> file2 <==
This is a line of file2
This is a end of file2
==> file3 <==
This is file3
$
Options:
1.-Nc (number of characters): It displays the last N characters of the file(s). Consider the
file f1 as follows:
$cat f1
hello world
hello surat
$tail -5c f1
urat
$
The new-line character at the end of the file counts as one of the five characters.
4.+n (n is a number): It displays lines from the nth line to the last line of the input file.
$tail +5 f1
Note: tail +n does not work with more than one file. On current systems the same request is written as tail -n +5 f1.
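The tail options above can be checked with a small sketch. The six-line file f1 used here is a made-up sample, not the f1 shown earlier.

```shell
# Work in a scratch directory with a hypothetical six-line file.
cd "$(mktemp -d)"
printf 'line1\nline2\nline3\nline4\nline5\nline6\n' > f1

# Last two lines of the file.
last_two=$(tail -n 2 f1)
echo "$last_two"

# Everything from the 5th line to the end (the modern spelling of tail +5).
from_fifth=$(tail -n +5 f1)
echo "$from_fifth"

# Last six bytes: the five characters of "line6" plus the final new-line.
last_bytes=$(tail -c 6 f1)
echo "$last_bytes"
```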
3.cut: The cut command is useful for selecting specific columns of a file. It
cuts specific sections of each line by byte position, character or field and writes
them to standard output. It is necessary to pass an option with it; otherwise,
it displays an error message.
Syntax:
cut option [file(s)]
The syntax shows that the option is compulsory. Without an option, the cut command will
not work. Some of the options used with the cut command are as follows:
1.-c: The '-c' option is used to cut a specific section by character position. The
argument can be a single number, a range of numbers or a list of
comma-separated numbers.
$cat f1
Unix and shell programming
Read hat linux
Hello surat
Advanced C and DS
$cut -c1 f1
U
R
H
A
$
$cat>marks
Alex-50
Alen-70
Jon-75
Carry-84
Celena-98
3.-f (field): It is used to select specific fields. It also prints any line that
does not contain the delimiter character, unless the -s option is specified.
✓ To display 2nd and 3rd words of each line of input file, the command is:
$cut -d" " -f2,3 f1<enter>
and shell
hat linux
surat
C and
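The field and character options can be combined with a delimiter, as in this short sketch. It reuses the first three lines of the marks file shown above, where the name and the mark are separated by a hyphen; the variable names are only illustrative.

```shell
# Recreate part of the marks file in a scratch directory.
cd "$(mktemp -d)"
printf 'Alex-50\nAlen-70\nJon-75\n' > marks

# Field 1 (the name) and field 2 (the mark), using '-' as the delimiter.
names=$(cut -d '-' -f 1 marks)
scores=$(cut -d '-' -f 2 marks)
echo "$names"
echo "$scores"

# Characters 1 to 2 of every line.
initials=$(cut -c 1-2 marks)
echo "$initials"
```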
4.split: It splits a file into fixed-size pieces. The general syntax of this
command is as follows:
Syntax:
split [OPTION] [file [PREFIX]]
By default, the split command writes 1000 lines of the input file into each output file.
The name of each output file consists of the specified PREFIX ('x' by default) followed by a
group of letters 'aa', 'ab', and so on up to 'az' and then again from 'ba', 'bb'
and so on up to 'bz'. In this way, there can be 676 output files (26*26), the last
one having the suffix 'zz'.
It creates five files named xaa, xab, xac, xad and xae in the user's current
directory. If a user wants to give the prefix 'my' instead of the default 'x' then
the command is:
$split -10 alluser my
It then creates five files named myaa, myab, myac, myad and myae in the user's
current directory.
✓ When the file name is a hyphen (-) or is omitted, it takes input from the standard
input and creates the output files.
$split -2 -
hello
hello world
bye
<ctrl+d>
$
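The naming scheme can be observed with a small sketch. The 25-line file alluser built here is a stand-in for real data, and -l 10 is the current spelling of the -10 form used above.

```shell
# Build a hypothetical 25-line input file in a scratch directory.
cd "$(mktemp -d)"
i=1
while [ "$i" -le 25 ]; do
  echo "user$i"
  i=$((i + 1))
done > alluser

# Split into pieces of 10 lines each, using the prefix "my".
split -l 10 alluser my
ls my*            # myaa, myab and myac

# Joining the pieces in name order reproduces the original file.
if cat my* | cmp -s - alluser; then rejoined="same"; else rejoined="different"; fi
echo "$rejoined"
```

With 25 input lines the last piece, myac, holds the remaining 5 lines.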
Comparing
Sometimes a user wants to know whether two files are identical or not, that
is, whether the contents of the two files are the same or different. Unix provides three
commands that compare the contents of files. In this section, we
discuss the file comparison commands cmp, comm and diff.
1. cmp (compare):
It compares two files to find out whether they are identical. The two files
are compared character by character, and the position of the first mismatch is displayed.
The cmp utility compares two files of any type and writes the results to the
standard output.
✓ By default, cmp is silent if the files are the same; if they differ, the character
(byte) and line number at which the first difference occurs are reported. Bytes and
lines are numbered beginning with one.
Consider the files f1, f2 and f3 as follows:
$cat f1
cpp programming
linux OS
Unix OS
$cat f2
c++ programming
linux OS
Unix OS
$cat f3
c programming
linux OS
$
✓ To compare the contents of files f1 and f2, we can write the command like this:
$cmp f1 f2
f1 f2 differ: byte 2, line 1
$
The output shows that the files first differ at the 2nd character of the 1st line.
✓ If you apply the command as follows:
$cmp f1 - #second file is standard input
c programming<enter>
<ctrl+d>
f1 - differ: byte 2, line 1
$
It reads standard input until the user presses <ctrl+d> and then displays the position of
the first difference.
(2) -c (character): It displays the differing byte number, the line number and the differing
characters along with their octal values for both files.
f1 f2 differ: byte 2, line 1 is 160 p 53 +
$cmp -lc file1 file2 #display all differing characters
2 160 p 53 +
3 160 p 53 +
$
(5) -s (silent): It prints nothing for differing files; it returns an exit status only.
The value of the exit status is shown in table-(c.5):
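The silent mode is mostly useful inside scripts, where the exit status drives a decision. The sketch below assumes the standard cmp convention (0 when the files are identical, 1 when they differ, a larger value on error) and uses shortened copies of f1 and f2.

```shell
# Two sample files that differ at byte 2 of line 1.
cd "$(mktemp -d)"
printf 'cpp programming\nlinux OS\n' > f1
printf 'c++ programming\nlinux OS\n' > f2

# Without -s, cmp reports the first differing byte and line.
cmp f1 f2 || true

# With -s nothing is printed; only the exit status is inspected.
if cmp -s f1 f2; then verdict="identical"; else verdict="different"; fi
echo "f1 and f2 are $verdict"

# Capture the raw exit status for both cases.
cmp -s f1 f1 && same_status=0 || same_status=$?
cmp -s f1 f2 && diff_status=0 || diff_status=$?
echo "same: $same_status, differing: $diff_status"
```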
(2) diff
It compares files line by line and displays the differences using special symbols, together with
instructions describing the changes needed to make the two files identical. The
general syntax of the diff command is as follows:
Syntax:
diff [option] from-file to-file
In the simplest case, diff compares the contents of the two files from-file and
to-file. A file name given as a hyphen (i.e. -) stands for text read from the standard
input.
✓ If from-file and to-file are regular files then diff compares their contents.
Consider the files f1 and f2 as follows:
$cat f1        $cat f2
hello          abc
unix           hello
linux          unix
ds             linux OS
c++            ds
$              $
$ diff -i f1 f2
$
It displays nothing, which means that the two files are identical when case is ignored.
(2)-r (recursive): It recursively compares files with the same name in subdirectories.
It displays nothing if they are identical; otherwise it displays the differences needed to make
the files identical.
$diff -r dir1 dir2
(3)-s: It is used to report when two files are the same.
$diff -s f1 f1 #display message if files are same
Files f1 and f1 are identical
$
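The effect of the diff options can be seen in a brief sketch. The two three-line files are made up for the illustration and differ only in the case of one line; -i and -s are assumed to be supported, as they are in GNU and BSD diff.

```shell
# Two hypothetical files that differ only in letter case on line 2.
cd "$(mktemp -d)"
printf 'hello\nunix\nlinux\n' > f1
printf 'hello\nUNIX\nlinux\n' > f2

# Plain diff reports that line 2 must be changed (exit status 1, hence "|| true").
changes=$(diff f1 f2 || true)
echo "$changes"

# With -i the case difference is ignored, so nothing is printed.
ignoring_case=$(diff -i f1 f2)

# With -s identical files are reported explicitly.
report=$(diff -s f1 f1)
echo "$report"
```

In the plain diff output, 2c2 means that line 2 of the first file must be changed to match line 2 of the second.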
(3)comm:
It compares two sorted files line by line. It outputs three columns:
1. Lines that are only in the first file.
2. Lines that are only in the second file.
3. Lines that are in both files.
Syntax:
comm [option] file1 file2
✓ Both files must be sorted before using the comm command. If the files are not
sorted, the output may not be accurate.
$cat file1
apple
banana
cherry
$cat file2
banana
cherry
date
If you run the command:
comm file1 file2
options:
(1).-1: Suppress the first column (lines unique to the first file).
$comm -1 file1 file2
(2)-2: Suppress the second column (lines unique to the second file).
$comm -2 file1 file2
(3)-3: Suppress the third column (lines common to both files).
$comm -3 file1 file2
✓ A user can also combine options to display only those lines that are common to both files:
$comm -12 file1 file2
It shows only those lines present in both the files.
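The column suppression options can be verified with the file1 and file2 contents shown above. The variable names in this sketch are only illustrative.

```shell
# Recreate the two sorted sample files.
cd "$(mktemp -d)"
printf 'apple\nbanana\ncherry\n' > file1
printf 'banana\ncherry\ndate\n' > file2

# Full three-column output (columns are indented with tabs).
comm file1 file2

# Suppress columns 1 and 2: lines common to both files.
common=$(comm -12 file1 file2)
# Suppress columns 2 and 3: lines only in file1.
only_first=$(comm -23 file1 file2)
# Suppress columns 1 and 3: lines only in file2.
only_second=$(comm -13 file1 file2)
echo "common: $common"
echo "only in file1: $only_first"
echo "only in file2: $only_second"
```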
Translating characters
Unix supports a tool for translating characters. This tool works on individual
characters rather than on lines. The tr command provides this facility.
tr (Translating characters):
It works on individual characters in a file. It translates, squeezes and/or deletes
characters from standard input and writes the result to standard output.
Syntax:
tr [option(s)] expression1 [expression2] [standard-input]
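The three operations named above (translate, squeeze and delete) can be sketched as follows. The input strings are arbitrary samples, and -s and -d are the standard tr options for squeezing and deleting.

```shell
# Translate: replace every lowercase letter with its uppercase form.
upper=$(echo 'hello surat' | tr 'a-z' 'A-Z')
echo "$upper"

# Squeeze: collapse each run of repeated spaces into a single space.
squeezed=$(echo 'hello    world' | tr -s ' ')
echo "$squeezed"

# Delete: remove every digit from the input.
no_digits=$(echo 'user123' | tr -d '0-9')
echo "$no_digits"
```

Because tr reads only standard input, a file is supplied with redirection, for example tr 'a-z' 'A-Z' < f1.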
Options:
(1).sort:
The sort command is used to sort the lines of a text file or standard input in
various ways.
Syntax:
sort [option] [file(s)]
Options
(2).-u: It removes duplicate lines from the sorted output, i.e. the output contains
only unique lines.
$sort -u f1<enter>
(3).-m: It merges a list of files that have already been sorted.
$sort -m f1 f2 <enter> #files f1 and f2 must be sorted
(4).-t: It sorts the file on any field delimited (separated) by the given sep character
(by default, fields are separated by blanks).
e.g Suppose you have a file named names.txt with the following content:
John
Alice
Bob
David
$sort -r names.txt
o/p: John
David
Bob
Alice
$sort -n numbers.txt
o/p: 1
2
10
25
33
(7).-c: It checks whether the file is already sorted. If it is, it does nothing.
$sort -c myfile
$
✓ Suppose you have a file named names with the following content:
Alice
Bob
John
David
Here the file names is not sorted. If you write the command like this:
$sort -c names
sort: names:4: disorder: David
$
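The sort options discussed in this section can be compared side by side in a short sketch. The sample files are illustrative (names.txt has one duplicate added so that -u has something to remove), and the -k option used to choose the sort field is a standard sort option that is not covered above.

```shell
# Hypothetical sample data in a scratch directory.
cd "$(mktemp -d)"
printf 'John\nAlice\nBob\nDavid\nBob\n' > names.txt
printf '10\n2\n33\n1\n25\n' > numbers.txt
printf 'Jon-75\nAlex-50\nAlen-70\n' > marks

reverse=$(sort -r names.txt)      # descending order
unique=$(sort -u names.txt)       # sorted, duplicates removed
numeric=$(sort -n numbers.txt)    # numeric rather than character order

# -t sets the field separator; -k 2,2n sorts numerically on field 2 only.
by_score=$(sort -t '-' -k 2,2n marks)

# -c only checks: the exit status is non-zero when the file is out of order.
if sort -c names.txt 2>/dev/null; then order="sorted"; else order="not sorted"; fi

echo "$unique"
echo "$numeric"
echo "$by_score"
echo "names.txt is $order"
```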
(2).paste:
The paste command in Unix/Linux is used to merge lines of files side by side. It
is often used to combine columns of data from multiple files or to create a
table-like structure.
Syntax:
paste [options] file1 file2 ...
Consider the file names, which contains the names of the students, and another file
numbers, which contains the numbers of the students, as follows:
e.g. $cat names<enter>
Kush
Nirva
Vidhi
Kavya
Jenil
$cat numbers <enter>
6851792
2460995
9225806
2771592
6876318
✓ Now, to merge these files side by side, the command is:
$paste names numbers<enter>
Options:
(1).-d: Specifies a delimiter to be used between columns (default is a tab).
$paste -d"|" names numbers<enter>
Kush|6851792
Nirva|2460995
Vidhi|9225806
Kavya|2771592
Jenil|6876318
(2).-s (serial): It pastes one file at a time instead of in parallel, i.e. all the lines of
each file are joined into a single output line.
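The difference between parallel and serial pasting is easiest to see on a small input. This sketch uses the first two entries of the names and numbers files; the variable names are only illustrative.

```shell
# Shortened copies of the two sample files.
cd "$(mktemp -d)"
printf 'Kush\nNirva\n' > names
printf '6851792\n2460995\n' > numbers

# Default: corresponding lines are joined with a tab.
side_by_side=$(paste names numbers)
echo "$side_by_side"

# -d chooses another delimiter.
piped=$(paste -d '|' names numbers)
echo "$piped"

# -s joins all the lines of one file into a single line.
serial=$(paste -s names)
echo "$serial"
```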
Syntax:
uniq [OPTION]... [input-file [output-file]]
hello surat
hello surat
red hat linux
unix os
$
✓ If you supply both an input file and an output file to the uniq command then it writes
the unique lines into the output file.
$uniq f1 f1.out
$
It creates the output file f1.out, which contains the unique lines of the input file.
$cat f1.out
C++ language
Cpp language
hello surat
red hat linux
unix os
options:
hello surat
$
(3).-D(all-repeated):It prints all duplicate lines.
$uniq -D f1
hello surat
hello surat
$
(4).-fN (skip-fields=N): It skips the first N fields of each line during comparison.
$uniq -f1 f1
C++ language
hello surat
red hat linux
unix os
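The uniq behaviour described above can be reproduced with a sketch. The contents of f1 are inferred from the outputs shown in this section and should be treated as an assumption; -D is a GNU extension.

```shell
# Assumed contents of f1: adjacent duplicates plus two lines that differ only in field 1.
cd "$(mktemp -d)"
printf 'C++ language\nCpp language\nhello surat\nhello surat\nred hat linux\nunix os\n' > f1

# Default: adjacent duplicate lines collapse into one.
collapsed=$(uniq f1)
echo "$collapsed"

# -D prints every line that belongs to a group of duplicates.
repeated=$(uniq -D f1)
echo "$repeated"

# -f 1 ignores the first field, so "C++ language" and "Cpp language" compare equal.
skip_first=$(uniq -f 1 f1)
echo "$skip_first"
```

uniq compares only adjacent lines, so unsorted input is usually passed through sort first.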
(2).wc
The wc (word count) command in Unix/Linux is used to count the number of
lines, words, and characters in a file or from standard input. It provides a
simple way to gather basic statistics about the content of files.
Syntax:
wc [options] [file ...]
Without any argument, it reads standard input until the user presses <ctrl+d> and
displays the number of lines, words and characters on the screen. The character count
includes the new-line character.
$wc<enter>
This is wc command<enter>
<ctrl+d>
1 4 19
$
Options:
(1).-l (lines): It prints the count of new-line characters.
✓ For example, to count the number of new-line characters the command is:
$wc -l f1
2 f1
$
(2).-L(max-line-length): It prints the length of the longest line.
✓ If you write command like this:
$wc -L f1
26 f1
$
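The individual counts can be requested one at a time, as in this sketch. It recreates the two-line f1 used earlier; -w and -c are the standard word and byte count options, and -L is a GNU extension.

```shell
# The two-line sample file f1.
cd "$(mktemp -d)"
printf 'Computer science\nUnix and shell programming\n' > f1

# Reading from standard input (< f1) keeps the file name out of the output.
lines=$(wc -l < f1)      # new-line characters
words=$(wc -w < f1)      # words
chars=$(wc -c < f1)      # bytes, including the new-lines
longest=$(wc -L < f1)    # length of the longest line

echo "lines=$lines words=$words chars=$chars longest=$longest"
```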
(i) -a (append): It does not overwrite the given file. The output is appended to the
file.
If the user wishes to preserve the list of users in a file called alluser and also
display the list of logged-in users on the screen then the command is:
$who | tee alluser<enter>
✓ To display both the list of logged-in users and their count on the screen,
the command is:
$who | tee /dev/tty | wc -l
bca pts/0 2013-09-02 09:35
bca63 pts/1 2013-09-02 13:29
bharat pts/2 2013-09-02 13:29
3
$
✓ It is also useful for creating a new file like this:
$ tee t1
unix and shell programming<enter>
unix and shell programming
red hat linux<enter>
red hat linux
<ctrl+d>
$cat t1
unix and shell programming
red hat linux
Here, whatever you enter on standard input is displayed on the standard
output and simultaneously written to the file t1 until the user presses
<ctrl+d>.
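The saving and appending behaviour of tee can be sketched without depending on who is logged in. Here printf stands in for the who command, and the user names are made up for the illustration.

```shell
# Scratch directory; printf plays the role of "who".
cd "$(mktemp -d)"

# tee writes its input to alluser and passes it on to wc -l.
count=$(printf 'bca\nbca63\nbharat\n' | tee alluser | wc -l)
echo "users: $count"

# With -a the new line is appended instead of overwriting alluser.
printf 'kush\n' | tee -a alluser > /dev/null
cat alluser
```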
Introduction
Till now, we have discussed different filter utilities. In this chapter, we will learn
about a very powerful filter utility known as grep. The name grep stands for globally
search for a regular expression and print it. It is also known as the pattern matching
utility. It is used to search a file for a particular pattern of characters and
display all records/lines that contain the pattern. The pattern that is searched for in
the file is referred to as the regular expression.
Pattern matching utility: grep
It is a filter utility that performs various tasks as follows:
✓ It scans a file for the occurrences of a pattern and displays the lines in which
the pattern is found.
✓ It scans a file for the occurrences of a pattern and displays the lines in which
the pattern is not found.
✓ It scans files for the occurrences of a pattern and displays the names of the files which
contain the pattern.
✓ It also displays the count of lines which contain the pattern.
The general syntax for the grep command is as follows:
syntax:
grep [options] pattern [filename(s)]
It is used to select and extract lines from a file and print only those lines that
match a given pattern. In the above syntax, square brackets indicate an optional
part. The filename(s) and options are optional and the pattern is compulsory in
the grep command. Here, a pattern is either a simple string or a more complex one that
contains metacharacters, i.e. characters with a special meaning in pattern matching. A pattern
is also known as a regular expression.
Without a filename, grep expects standard input. As each line is input, grep
searches for the regular expression in the line and displays the line if it
contains that regular expression. Execution stops when the user indicates the end
of input by pressing <ctrl+d>.
✓ For example, a user supplies a command at the shell prompt as follows:
$ grep 'unix'
unix and shell programming<enter>
unix and shell programming
red hat linux<enter>
unix OS<enter>
unix OS
<ctrl+d>
$
grep requires an expression to represent the pattern to be searched for,
followed by one or more filenames.
The first argument is always treated as the expression, and the other
arguments are considered as filenames.
Specifying regular expression:
A regular expression is a pattern that describes a set of strings. Regular
expressions are constructed analogously to arithmetic expressions, by using
various operators to combine smaller expressions.
A regular expression is formed with some special and ordinary characters. It is
expanded by the command, and not by the shell, to match more than one
string. A regular expression is always quoted to prevent its interpretation by
the shell. Regular expressions can specify anything from very simple patterns of
characters to highly complex ones. Some very simple patterns are shown in
table-(a.12):
Table-(a.12): Example of simple regular expression
Regular expression    Meaning
A                     It displays all lines that contain the character "A".
"Unix"                It displays all lines that contain the pattern "Unix".
$ cat f1
sco Unix
The red hat linux
user name user1
$
If a user wishes to display the lines of file f1 which contain the pattern 'Unix' then the
command is as follows:
$grep Unix f1
sco Unix
$
✓ If you want to locate the lines of file f1 which contain the character 'x' then the
command is as follows:
$ grep x f1
sco Unix
The red hat linux
$
More complex regular expressions can be specified using grep's metacharacters, which are
always written within quotes, as shown in table-(b.12).
Table-(b.12): grep metacharacters
Character       Use
[...]           It matches any single character listed within the square brackets.
^pattern        It matches a pattern at the beginning of a line.
pattern$        It matches a pattern at the end of a line.
. (dot)         It matches any single character except the new-line character.
\ (backslash)   It indicates that grep should ignore the special meaning of the
                character following it in the regular expression.
\<pattern       It matches a pattern at the beginning of any word in a line.
pattern\>       It matches a pattern at the end of any word in a line.
ch*             It matches zero or more occurrences of the character ch.
ch\{m\}         It matches exactly m occurrences of the preceding character ch.
ch\{m,\}        It matches at least m occurrences of the preceding character ch.
ch\{m,n\}       It matches between m and n occurrences of the preceding character ch.
Unix OS
vb.net
program and process
$
Here, file f2 contains blank and non-blank lines. If you wish to remove the blank lines
from the output then the command is:
$grep '.' f2
Unix and shell programming
red hat linux
Unix OS
vb.net
program and process
$
It displays all lines which contain at least one character, i.e. every line except the blank
lines (which contain only the new-line character).
✓ You can remove the special meaning of a grep metacharacter using a backslash. For
example, if a user wishes to display the lines which contain the '.' character anywhere in the
line then the command is as follows:
$ grep '\.' f2
vb.net
$
It displays the lines which contain '.'.
✓ To display the lines of file f2 which contain the pattern 'program' the command
is as follow:
$ grep 'program' f2
Unix and shell programming
program and process
$
But if you want to display the lines of file f2 which contain the word 'program', meaning that
it is not part of a longer string, then the command is as follows:
$grep '\<program\>' f2
program and process
$
✓ The * (asterisk) refers to the immediately preceding character. It matches
zero or more occurrences of the previous character. The pattern a* matches the
null string, a single character a and any number of a's,
i.e. (nothing) a aa aaa aaaa …..
✓ To locate the lines in which a character occurs one or more
times, the command is:
$grep 'mm*' f2
Unix and shell programming
program and process
$
It locates the lines in which the character 'm' is repeated one or more times.
✓ You can display the lines of file f1 which contain exactly 8 characters with the
following command:
$grep '^.\{8\}$' f1
sco Unix
$
✓ To display the lines of the input file which contain between 5 and 15 characters,
the command is like this:
$grep '^.\{5,15\}$' f2
red hat linux
Unix OS
vb.net
It displays the lines of file f2 that contain between 5 and 15 characters.
✓ To display the lines in which the pattern at the beginning of the line occurs again
anywhere in the same line,
you can use the save operator with a back reference as follows:
$grep '^\(.\).*\1' f2
program and process
It displays the lines of file f2 in which the first character of the line
also occurs later in the same line. The output shows that the 1st character 'p'
occurs again in the same line; therefore we get this output.
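The metacharacters from table-(b.12) can be exercised together in one sketch. The full contents of f2 are an assumption reconstructed from the outputs shown in this section (five text lines and one blank line), and the word anchors \< and \> are a GNU grep extension.

```shell
# Assumed contents of f2, including one blank line.
cd "$(mktemp -d)"
printf 'Unix and shell programming\n\nred hat linux\nUnix OS\nvb.net\nprogram and process\n' > f2

non_blank=$(grep -c '.' f2)              # lines holding at least one character
literal_dot=$(grep '\.' f2)              # backslash removes the special meaning of the dot
whole_word=$(grep '\<program\>' f2)      # "program" as a complete word only
short_lines=$(grep '^.\{5,15\}$' f2)     # lines of 5 to 15 characters
repeat_first=$(grep '^\(.\).*\1' f2)     # first character occurs again later in the line

echo "$non_blank"
echo "$literal_dot"
echo "$whole_word"
echo "$short_lines"
echo "$repeat_first"
```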
✓ grep is silent and simply returns the prompt when a pattern is not found in a
file.
$grep hello f2 #No hello found
It displays nothing, which means the pattern hello is not present in file f2.
✓ grep also accepts the output of another command. For example, if a user wants to
display the files of the working directory that give read and write permission to the
owner, group and other users then the command is like this:
$ ls -l | grep '^-rw-rw-rw-'
-rw-rw-rw- 1 bharat bharat 43 Apr 3 17:25 f1
-rw-rw-rw- 2 bcal tybcasems 77 Apr 4 11:02 f1.In
-rw-rw-rw- 2 bcal tybcasem5 77 Apr 4 11:02 12
-rw-rw-rw- 1 bhrat bhrat 34 Jul 18 2013 f3
✓ When grep is used with a series of strings, it interprets the first argument as
the pattern and the rest as filenames. For example,
consider the following command:
$grep red hat linux
Here the argument red is treated as the pattern and the other arguments,
hat and linux, are treated as filenames.
✓ Quotes are compulsory when a pattern contains more than one word. For
example, consider the following command:
$grep -cv '^$' f2
5
$
It counts all non-empty lines of file f2, i.e. it leaves out the lines that contain only the
new-line character.
✓ Consider another command as follow:
$grep -v 'Unix' f1
The red hat linux
user name user1
$
It displays the lines of file f1 which do not contain the pattern Unix.
(5) -i (ignore): It ignores case in pattern matching.
✓ For example, if you want to print the lines that contain the pattern unix in any case
then the command is as follows:
$grep -i 'unix' f1
sco Unix
$
It displays lines of file f1 having unix pattern in any case.
(6) -h (hide): It omits filenames when handling multiple files.
✓ For example, consider the following command:
$grep -h 'Unix' f1 f2
sco Unix
Unix and shell programming
Unix OS
$
It displays the lines of files f1 and f2 which contain the pattern Unix. It does not display
the filename before each matched line, i.e. it hides the name of the file.
(7) -e RegExp: You can specify a regular expression with this option. You can use this
option multiple times.
✓ For example, if you want to locate the lines of a file which contain either the pattern
Unix or the pattern linux then the command is as follows:
$ grep -e 'Unix' -e 'linux' f2
Unix and shell programming
red hat linux
Unix OS
$
(8) -f fname: The list of strings to be matched is stored in the file fname.
✓ For example, consider a patfile as follows:
$ cat patfile
Unix
linux
$
It contains a list of patterns on separate lines. Now, to locate the lines of file f1
that contain any of the patterns given in the file patfile, the command is as follows:
$ grep -f patfile f1
sco Unix
The red hat linux
$
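The options covered in this section can be compared on the same three-line f1. The variable names in this sketch are only illustrative.

```shell
# The three-line sample file f1 and a two-line pattern file.
cd "$(mktemp -d)"
printf 'sco Unix\nThe red hat linux\nuser name user1\n' > f1
printf 'Unix\nlinux\n' > patfile

inverted=$(grep -v 'Unix' f1)              # lines that do NOT contain Unix
not_unix_count=$(grep -c -v 'Unix' f1)     # how many such lines there are
any_case=$(grep -i 'unix' f1)              # case-insensitive match
either=$(grep -e 'Unix' -e 'linux' f1)     # two patterns given on the command line
from_file=$(grep -f patfile f1)            # the same two patterns read from patfile

echo "$inverted"
echo "$not_unix_count"
echo "$any_case"
echo "$either"
```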
Grep family
There is a small family of grep utility which includes egrep and fgrep. These two
utilities operate in a similar way to grep but each has its own particular usage, and
there are small differences in the way that each work.
Both utilities search for specific pattern in either the standard input stream or a
series of input files supplied at command-line.
egrep
egrep stands for extended grep. It was invented by Alfred Aho. It extends grep's
pattern-matching capabilities in two major ways.
✓ It admits alternatives.
✓ It enables regular expressions to be bracketed/grouped using a pair of
parentheses (i.e. (...)), also known as factoring.
It offers all the options and regular expression metacharacters of grep, but its most
useful feature is the facility to specify more than one pattern for a search. While grep
uses some characters that are not recognized by egrep, egrep includes some
additional extended metacharacters not used by either the grep or sed utilities; these are
given in table-(c.12).
Expression    Meaning
ch+           It matches one or more occurrences of the character ch.
ch?           It matches zero or one occurrence of the character ch.
exp1|exp2     It matches expression exp1 or exp2.
(x1|x2)x3     It matches expression x1x3 or x2x3.
$
It displays lines of file f1 which contains pattern either Unix or linux.
✓ If you want to display the lines which contain either software or
hardware then the command is as follows:
$egrep "(soft|hard)ware" f1
NOTE: In grep, if a pattern contains special characters then it must be quoted.
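The extended metacharacters can be tried with grep -E, which accepts the same extended regular expressions as egrep and is the spelling recommended on current systems. The sample lines are made up for the illustration.

```shell
# Hypothetical sample file for the extended patterns.
cd "$(mktemp -d)"
printf 'sco Unix\nThe red hat linux\nsoftware update\nhardware fault\nfirmware\n' > f1

# Alternation: lines containing Unix or linux.
either=$(grep -E 'Unix|linux' f1)

# Grouping: software or hardware, but not firmware.
wares=$(grep -E '(soft|hard)ware' f1)

# + means one or more b's; ? means the u is optional.
plus=$(printf 'ac\nabc\nabbc\n' | grep -E 'ab+c')
optional=$(printf 'color\ncolour\ncolouur\n' | grep -E 'colou?r')

echo "$either"
echo "$wares"
echo "$plus"
echo "$optional"
```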
-f option: Storing patterns in a file
egrep provides a facility to take patterns from a file. If there are a number of patterns
that you have to match, egrep offers the -f (file) option to take such patterns from
a file. For example, a file patfile contains patterns in which each pattern is
delimited by '|' as follows:
$ cat patfile
Unix|linux
$
Now, you can execute egrep with the -f option in this way:
$egrep -f patfile f1
sco Unix
The red hat linux
$
Here, the command takes the pattern/expression from file patfile and display
matched lines of file f1.
fgrep
fgrep stands for fixed/fast grep. The fgrep utility can normally only search for fixed
strings i.e. character string without embedded metacharacters. However, some
implementations of the fgrep utility allow it to be used with a few metacharacters -
check your version to make sure. fgrep accepts multiple patterns, both from the
command line and a file, but unlike grep and egrep, does not accept regular
expressions. So, if the pattern to be searched for is a simple string, or a group of them,
fgrep is recommended. It is arguably faster than grep and egrep, and should be
used when searching for fixed strings.
Alternative patterns in fgrep are specified by separating one pattern from another
using the new-line character. This is unlike in egrep, which uses the '|’ to delimit
two expressions. You may either specify these patterns in the command line itself,
or store them in a file.
✓ For example consider a file patfile which contains list of pattern delimited by
new-line character as follow:
$ cat patfile
Unix
linux
$
We can use this file using -f option as follow:
$fgrep -f patfile f1
sco Unix
The red hat linux
$
✓ You can achieve the same output without the file patfile by supplying the patterns
on the command line as follows:
$fgrep 'Unix<enter>
> linux' f1<enter>
sco Unix
The red hat linux
$
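The difference between a regular expression and a fixed string shows up as soon as the pattern contains a metacharacter. This sketch uses grep -F, which behaves like fgrep; the sample lines, including the made-up entry vbxnet, are assumptions.

```shell
# Hypothetical sample file; vbxnet exists only to show the difference.
cd "$(mktemp -d)"
printf 'vb.net\nvbxnet\nsco Unix\nThe red hat linux\n' > f1
printf 'Unix\nlinux\n' > patfile

# As a regular expression the dot matches any character, so both lines match.
regex_hits=$(grep 'vb.net' f1)

# As a fixed string the dot is literal, so only vb.net matches.
fixed_hits=$(grep -F 'vb.net' f1)

# Several fixed strings, one per line, read from a file.
listed=$(grep -F -f patfile f1)

echo "$regex_hits"
echo "$fixed_hits"
echo "$listed"
```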
✓ The disadvantage of the grep family is that none of its members has a separate facility
to identify fields. This limitation is overcome by the awk utility.
Limitation of grep family:
The grep family has the following limitations.
As shown in table-(d.12), both the grep and egrep utilities allow all the atoms in a
regular expression whereas the fgrep utility supports only the character atom.
Similarly, table-(e.12) shows the operators used in regular expressions by the grep
family:
Table-(e.12): Operators used by grep family
Operators      grep   fgrep   egrep
Sequence       ✓      ✓       ✓
Repetition     ✓      X       ✓
Alternation    X      X       ✓
Group          X      X       ✓
Save           ✓      X       ✓
Table-(e.12) indicates that the grep utility supports the sequence, repetition and save
operators, the egrep utility supports all operators and the fgrep utility supports only
the sequence operator.
Sed
So far, we have discussed many filter utilities. In this chapter, we will discuss a multi-
purpose filter utility known as sed. The term sed stands for stream editor; it was
designed by Lee McMahon. The stream editor sed is derived from ed, which is known as the line
editor. Everything in sed is an instruction. The general form of this utility is:
Syntax:
sed [options] instruction [filename(s)]
An instruction consists of two components an address and a command which are
enclosed within quotes. The address selects/searches the line to be processed or
not processed by the command. The command indicates the action that sed is to
apply to each input line that matches the address.
Addresses
An address may be line number(s), pattern(s) or a combination of them. The sed utility
supports two types of addresses.
✓ Line number.
✓ A pattern.
A line address can be a single line, a set of lines or a range of lines. A single line address
can be defined by a line number or a dollar ($). The dollar is a special symbol that
specifies the last line of the input file. Some examples of single line addresses
are given in table-(a.13):
Table-(a.13): Example of single line address
Address       Meaning
5command      It applies command on the 5th line.
$command      It applies command on the last line.
15command     It applies command on the 15th line.
A set of line addresses allows you to select several lines, which may be
consecutive or scattered in the input file. A set of line addresses can be defined by a
pattern or regular expression. When you use a pattern or regular expression as a line
address, it must be enclosed within forward slashes. Specifying a
pattern within the slashes is known as a context address. Table-(b.13) gives
examples of sets of line addresses defined by a pattern.
Table-(b.13): Example of set of line address
/pattern/command It applies command on lines that contains pattern.
/^pattern/command It applies command on lines which begins with pattern.
/pattern$/command It applies command on lines which ends with pattern.
If a user wants to access a set of consecutive lines of the input file then a range of addresses
is given. You can define a range of addresses by a start address followed by a comma, with
no space, followed by an end address. The start address and end address may each be a line
number or a pattern, in any combination. Table-(c.13) shows examples of range
addresses:
Table-(c.13): Examples of range addresses
Address                          Meaning
5,10command                      It applies command to lines 5 through 10 of the input file.
5,$command                       It applies command from line 5 through the last line of the input file.
2,/pattern/command               It applies command from the 2nd line up to the first following line that contains pattern.
/pattern/,10command              It applies command from the first line that contains pattern up to the 10th line of the input file.
/pattern1/,/pattern2/command     It applies command to the lines from a line that contains pattern1 up to the next line that contains pattern2.
Note: sed does not change the input file. All modified output is written to
standard output; to save it, redirect the output to a file.
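The address forms in the tables above can be tried out with a short script. This is only a minimal sketch: it assumes the five-line sample file f1 used in this unit, writes it to a temporary file, and uses the p command with -n (both described under Commands) purely to show which lines each address selects. The variable names are illustrative.

```shell
# Sample input: the f1 listing used in this unit, written to a temporary file.
f1=$(mktemp)
cat > "$f1" <<'EOF'
unix and shell programming
red hat linux
linux and shell programming
unix operating system
linux is open source
EOF

single=$(sed -n '2p' "$f1")          # single line address: line 2
last=$(sed -n '$p' "$f1")            # $ addresses the last line
context=$(sed -n '/^unix/p' "$f1")   # context address: lines beginning with unix
range=$(sed -n '2,/unix/p' "$f1")    # line 2 up to the next line containing unix

printf '%s\n' "single:  $single" "last:    $last"
printf 'context:\n%s\n' "$context"
printf 'range:\n%s\n' "$range"
rm -f "$f1"
```

Note that the range 2,/unix/ ends at line 4: the word linux on line 3 does not contain the string unix.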
Commands
The sed utility supports several commands. A command applies an action to the
addressed lines. The commands are categorized as follows:
✓ Print
✓ Quit
✓ Line number
✓ Modify
✓ Files
✓ Substitute
print(p) command:
It is denoted by the character p. It prints the selected lines on the standard
output. Consider an input file f1 as follows:
$cat f1
unix and shell programming
red hat linux
linux and shell programming
unix operating system
linux is open source
$
✓ If a user wants to print the top 3 lines of the input file, the command is:
$ sed '1,3p' f1
unix and shell programming
unix and shell programming
red hat linux
red hat linux
linux and shell programming
linux and shell programming
unix operating system
linux is open source
$
The output shows that, by default, sed prints every line it reads on the
standard output, and the p command prints the addressed lines once more.
In other words, the addressed lines are printed twice. To suppress the default
output and avoid the duplicate lines, use the -n option whenever you use the p
command. Therefore the above command is rewritten as follows:
$ sed -n '1,3p' f1
unix and shell programming
red hat linux
linux and shell programming
$
✓ The special character dollar ($) is used to print the last line of an input
file. For example, to print the last line of file f1, the command is:
$ sed -n '$p' f1
linux is open source
$
✓ The command p without any address displays all lines of the input file.
$ sed -n 'p' f1
It displays all lines of input file f1.
✓ Reversing line selection criteria (!): A user can use the negation operator (!)
with any command of sed. Selecting the first 3 lines is the same as not
selecting the lines from the fourth line to the last line of the input file.
Therefore the command is:
$ sed -n '4,$!p' file1
OR
$ sed -n '1,3p' file1
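The equivalence of the two commands can be checked directly. The following is a minimal sketch that assumes the five-line sample file f1 from this unit (recreated here in a temporary file); the variable names direct and negated are illustrative.

```shell
# Recreate the sample file f1 in a temporary location.
f1=$(mktemp)
printf '%s\n' 'unix and shell programming' 'red hat linux' \
    'linux and shell programming' 'unix operating system' \
    'linux is open source' > "$f1"

direct=$(sed -n '1,3p' "$f1")     # select lines 1 to 3
negated=$(sed -n '4,$!p' "$f1")   # print every line NOT in the range 4 to last

if [ "$direct" = "$negated" ]; then
    echo "both commands select the same lines"
fi
rm -f "$f1"
```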
✓ To select non-contiguous groups of lines of the input file, the command is as
follows:
$ sed -n '1,3p
> 7,9p
> $p' f1
It displays lines 1 to 3, lines 7 to 9 and the last line of input file f1.
✓ In addition, a user can get the same output by giving each inline
script/expression with the -e option. This option allows the user to pass
multiple instructions to sed. Therefore, the above command is rewritten in a
single line as follows:
$ sed -n -e '1,3p' -e '7,9p' -e '$p' f1
✓ To insert a blank line before each line of an input file, the command is:
$ sed 'i\<enter>
> <enter>
> ' f1
It inserts a blank line before each line of input file f1.
b) Append command (a):
It is denoted by the character a. It is similar to the insert command except
that it writes the text to the output after the specified line.
✓ To add 2 lines at the end of file f1, the command is:
$ sed '$a\<enter>
> unix is portable operating system\<enter>
> It is designed to facilitate programming, text processing and comm.
> ' f1
✓ If a user wants to redirect the output of the command to another file, the
command is:
$ sed '$a\<enter>
> unix is portable operating system\<enter>
> It is designed to facilitate programming, text processing and comm.
> ' f1 > f1.out
It creates an output file f1.out.
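The effect of appending with redirection can be verified with a short script. This is a minimal sketch under stated assumptions: the input is a two-line stand-in for f1, the second appended line is placeholder text, and the temporary file names are illustrative. It also confirms that the input file itself is left unchanged.

```shell
f1=$(mktemp)    # stand-in for the input file f1
out=$(mktemp)   # stand-in for the output file f1.out
printf '%s\n' 'unix operating system' 'linux is open source' > "$f1"
before=$(cat "$f1")

# Append two lines after the last line ($); every text line except the
# final one ends with a backslash.
sed '$a\
unix is portable operating system\
second appended line (placeholder text)' "$f1" > "$out"

after=$(cat "$f1")
lines_out=$(wc -l < "$out" | tr -d ' ')
last_out=$(sed -n '$p' "$out")
echo "lines in output: $lines_out"
[ "$before" = "$after" ] && echo "input file unchanged"
rm -f "$f1" "$out"
```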
✓ To write selected lines of the input file to an output file f1.out, the
command is as follows:
$ sed -n '/unix/w f1.out' f1
It writes the lines of file f1 that contain the pattern unix to file f1.out.
✓ To write the top 5 lines of input file f1 to output file f2, the command is:
$ sed '1,5w f2' f1
✓ A user can create multiple output files that contain selected lines of the
input file.
$ sed -n '/linux/w lfile<enter>
> /unix/w ufile' f1
Or
$ sed -n -e '/linux/w lfile' -e '/unix/w ufile' f1
It writes the lines that contain the pattern linux to lfile and the lines that
contain the pattern unix to ufile.
Now, you can run these instructions with the -f option of sed as follows:
$ sed -n -f instr.txt f1
It creates two output files, ulist and list, which contain the lines having the
unix pattern and the linux pattern respectively.
✓ A user can also use more than one instruction file by repeating the -f option
with each instruction file, like this:
$ sed -n -f instr1.txt -f instr2.txt f1
✓ You can combine the -e and -f options as many times as you want.
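As an illustration of the -f option, the sketch below stores two w instructions in an instruction file and runs it against the sample file f1. The contents of instr.txt are an assumption made for this example (the output names lfile and ufile are borrowed from the earlier example), and all files live in a temporary directory.

```shell
dir=$(mktemp -d)
printf '%s\n' 'unix and shell programming' 'red hat linux' \
    'linux and shell programming' 'unix operating system' \
    'linux is open source' > "$dir/f1"

# Hypothetical instruction file: one sed instruction per line.
# The file name after w runs to the end of the line.
cat > "$dir/instr.txt" <<EOF
/linux/w $dir/lfile
/unix/w $dir/ufile
EOF

sed -n -f "$dir/instr.txt" "$dir/f1"

lcount=$(wc -l < "$dir/lfile" | tr -d ' ')
ucount=$(wc -l < "$dir/ufile" | tr -d ' ')
echo "lfile: $lcount lines, ufile: $ucount lines"
rm -rf "$dir"
```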
If the address is not specified, the substitution is performed on the first
occurrence of search_pattern in every line that contains it. A search_pattern
may be a regular expression or a literal string. Both search_pattern and
replace_string are delimited by a slash (/). The replace_string is a string that
consists of ordinary characters, atoms, meta-characters or a combination of
them. Only a back reference atom and the meta-characters ampersand (&) and
backslash (\) can be used in a replace_string. We will discuss these tokens
later on in this section.
✓ To replace the first occurrence of the word unix in each line by the word
linux in file f1, the command is:
$ sed 's/unix/linux/' f1
✓ To replace the first occurrence of the word unix in each line by the word
linux in the top 5 lines of file f1, the command is:
$ sed '1,5s/unix/linux/' f1
Flag (g):
We know that the following command replaces only the first occurrence of unix
with linux in each line of input file f1:
$ sed -n 's/unix/linux/p' f1
To replace all occurrences of unix with linux, a user needs to put the g
(global) flag at the end of the instruction. This is referred to as global
substitution: the g flag replaces every occurrence of search_pattern with
replace_string.
✓ For example, to replace all occurrences of unix with linux in each line of
file f1, the command is:
$ sed 's/unix/linux/g' f1
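✓ A global substitution can also be restricted by a range address. For example,
all occurrences of hello can be replaced with bye in selected lines only, where
the replacement starts at the first line that contains the pattern unix and
continues up to the next line that contains the pattern linux.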
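A range-addressed global substitution of this kind can be sketched as follows. The instruction /unix/,/linux/s/hello/bye/g is an assumption reconstructed from the description, and the five input lines are made up for the illustration.

```shell
f=$(mktemp)
printf '%s\n' 'hello before' 'unix hello hello' 'hello middle' \
    'linux hello' 'hello after' > "$f"

# Replace every hello with bye, but only from the first line containing
# unix up to the next line containing linux.
result=$(sed '/unix/,/linux/s/hello/bye/g' "$f")
printf '%s\n' "$result"
rm -f "$f"
```

Lines outside the range (the first and the last) keep the word hello.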
Remembered pattern:
Sometimes the address pattern is the same as the search_pattern; in other
words, the scanned pattern and the search pattern are the same. In that case
the search pattern can be omitted from the instruction. For example, if a user
wishes to replace the word unix with the word linux in those lines of file f1
that contain the pattern unix, the command is:
$ sed '/unix/s/unix/linux/' f1
In this example, the address pattern and the search pattern are the same. So,
if you omit the search pattern, the above command is rewritten as follows:
$ sed '/unix/s//linux/' f1
The two forward slashes (i.e. //) represent an empty or null regular
expression, which sed interprets as the most recently used pattern, here the
scanned pattern. We call it the remembered pattern.
Another way to write the above command is:
$ sed 's/unix/linux/' file1
✓ However, when the replace_string is empty (i.e. //), the matched pattern is
removed from the output.
For example, to remove all unix words from file f1, the command is:
$ sed 's/unix//g' f1
or
$ sed '/unix/s///g' f1
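The remembered pattern can be checked against the explicit form. This is a minimal sketch using the sample file f1 from this unit, recreated in a temporary file; the variable names are illustrative.

```shell
f1=$(mktemp)
printf '%s\n' 'unix and shell programming' 'red hat linux' \
    'linux and shell programming' 'unix operating system' \
    'linux is open source' > "$f1"

explicit=$(sed '/unix/s/unix/linux/' "$f1")   # search pattern written out
remembered=$(sed '/unix/s//linux/' "$f1")     # empty pattern reuses /unix/
removed=$(sed '/unix/s///g' "$f1")            # empty replace string deletes unix

[ "$explicit" = "$remembered" ] && echo "explicit and remembered forms agree"
printf '%s\n' "$removed"
rm -f "$f1"
```

After the deletion, lines 1 and 4 start with the space that followed the word unix.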
✓ Sometimes, the address pattern, the search pattern and the replace string are
all different strings.
For example, consider the following command:
$ sed -n '/The unix/s/unix/UNIX/gp' f1
It selects the lines that contain The unix (i.e. the address pattern) and
replaces each unix (i.e. the search pattern) with UNIX (i.e. the replace
string).
Repeat pattern:
There might be a situation where the search pattern has to occur again in the
replace_string. To repeat the search_pattern in the replace_string, the special
meta-character ampersand (i.e. &) is used. For example, to replace the pattern
director with in-charge director, the instruction is written as
s/director/in-charge &/, where & stands for the matched text director.
✓ To display the files of the working directory which have write permission set
for either group or others, the command is as follows:
$ ls -l | grep "^.\{5,8\}w"
BRE or IRE
Sometimes, a user wishes to print lines in which a character occurs a given
number of times, or to locate lines of a fixed length. This is possible with a
BRE or IRE. A BRE or IRE is an expression that consists of a character and a
single number or a pair of numbers enclosed within a pair of escaped curly
braces (i.e. \{ \}).
This expression is derived from ed, and takes the four forms as follows:
(i) ch\{m\}: It indicates that the character ch occurs m times.
(ii) ch\{m,n\}: It indicates that the character ch occurs between m and n times.
✓ Instead of writing 51 dots to locate lines having more than 50 characters, we
can use an IRE as follows:
$ sed -n '/.\{51\}/p' f1
It prints all lines longer than 50 characters. Here the expression .\{51\}
specifies that any character (the dot) has to occur 51 times; because the
pattern is not anchored, every line with 51 or more characters matches.
✓ To display all lines having a length between 51 and 100 characters, the
command is:
$ sed -n '/^.\{51,100\}$/p' f1
✓ To display the lines that consist of only alphabetic characters, the command
is:
$ sed -n '/^[a-zA-Z]\{1,\}$/p' f1
✓ To replace each run of consecutive spaces by a single space, use the regular
expression as follows:
$ sed -n 's/[ ]\{2,\}/ /gp' f1
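The interval expressions above can be exercised on small inputs. The sketch below is only illustrative: the input strings are made up, and the thresholds are reduced (11 instead of 51, and 4 to 8 instead of 51 to 100) so that the test lines stay short.

```shell
# Squeeze every run of two or more spaces into one space.
squeezed=$(printf 'unix   and    shell\n' | sed 's/[ ]\{2,\}/ /g')

# Lines with more than 10 characters (at least 11 of any character).
long=$(printf '%s\n' 'short' 'a line with more than ten characters' |
    sed -n '/.\{11\}/p')

# Lines whose length is between 4 and 8 characters (anchored at both ends).
between=$(printf '%s\n' 'abc' 'abcdef' 'abcdefghijkl' |
    sed -n '/^.\{4,8\}$/p')

# Lines made up of alphabetic characters only.
alpha=$(printf '%s\n' 'abc' 'abc123' 'XYZ' |
    sed -n '/^[a-zA-Z]\{1,\}$/p')

printf '%s\n' "$squeezed" "$long" "$between" "$alpha"
```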
✓ For example, a user wishes to replace the words new line by new-line. Then the
command is:
$ echo "new line" | sed 's/\(new\) \(line\)/\1-\2/'
Here, we have two tagged patterns, \(new\) and \(line\), in the search_pattern.
They are automatically reproduced in the replace_string by the back references
\1 and \2, respectively. Each escaped pattern is called a Tagged Regular
Expression (TRE).
The search pattern of the sed utility uses only a subset of the regular
expression atoms and operators. The allowable atoms are listed in Table-(d.13):
Table-(d.13): Atoms used by sed utility
Atoms             Allowed
Character         ✓
Dot               ✓
Class             ✓
Anchors           ^ and $
Back Reference    ✓
The last column of the table shows that the sed utility supports all atoms
except the two anchors \< and \>.
Operators         Allowed
Sequence          ✓
Repetition        ✓
Alternation       ✗
Group             ✗
Save              ✓
The second column of the table shows that the sed utility supports all
operators except group and alternation.
AWK
The awk command is used for text processing in Linux. The sed command is also
used for text processing, but it has some limitations, so awk becomes a handy
option when more control over the data is needed.
awk is a powerful scripting language for text processing. It can search and
replace text, and sort, validate and index data.
It is one of the most widely used tools among programmers, because a compact
but effective program can be written as a few statements that define text
patterns and the actions to perform on them.
Basic Syntax
awk 'pattern { action }' filename
If pattern is omitted, the action is applied to every line. If action is omitted, awk prints the
lines that match the pattern.
Example:
Alice 30 Engineer
Bob 25 Artist
Carol 28 Scientist
2. Basic Examples
This command prints the first and third fields of each line:
Alice Engineer
Bob Artist
Carol Scientist
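A command that produces this output can be sketched as follows. It assumes that the three sample records shown earlier are stored in a file; the file name data.txt is used later in this unit, but here a temporary file stands in for it.

```shell
data=$(mktemp)   # stand-in for data.txt
printf '%s\n' 'Alice 30 Engineer' 'Bob 25 Artist' 'Carol 28 Scientist' > "$data"

# $1 and $3 are the first and third whitespace-separated fields.
fields=$(awk '{ print $1, $3 }' "$data")
printf '%s\n' "$fields"
rm -f "$data"
```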
Alice 30 Engineer
Conditional Actions:
This command prints the names and ages of people older than 26:
Alice 30
Carol 28
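One way to write such a conditional action is to use a comparison as the pattern. This is a sketch under the assumption that the second field holds the age, as in the sample records; the temporary file stands in for the input file.

```shell
data=$(mktemp)
printf '%s\n' 'Alice 30 Engineer' 'Bob 25 Artist' 'Carol 28 Scientist' > "$data"

# The pattern $2 > 26 selects records; the action prints name and age.
older=$(awk '$2 > 26 { print $1, $2 }' "$data")
printf '%s\n' "$older"
rm -f "$data"
```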
3. Field Separator
John,Doe
Jane,Smith
John Doe
Jane Smith
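The comma-separated input and space-separated output shown above are consistent with setting the field separator through the -F option. A minimal sketch, assuming the two input lines arrive on standard input:

```shell
# -F',' makes awk split each record on commas instead of whitespace;
# print joins its arguments with a single space by default.
names=$(printf 'John,Doe\nJane,Smith\n' | awk -F',' '{ print $1, $2 }')
printf '%s\n' "$names"
```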
Output:
5. Built-in Variables
Example
awk '{ print "Line", NR, "has", NF, "fields" }' data.txt
Output:
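NR holds the number of the current record and NF the number of fields in it. Assuming data.txt contains the three sample records used earlier (an assumption; a temporary file stands in for it here), the command can be run as follows:

```shell
data=$(mktemp)   # stand-in for data.txt
printf '%s\n' 'Alice 30 Engineer' 'Bob 25 Artist' 'Carol 28 Scientist' > "$data"

report=$(awk '{ print "Line", NR, "has", NF, "fields" }' "$data")
printf '%s\n' "$report"
rm -f "$data"
```

With that input every record has three fields, so the report reads "Line 1 has 3 fields" through "Line 3 has 3 fields".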
1. User-Defined Functions
awk '
function square(x) { return x * x }
{ print $1, "squared is", square($2) }
' data.txt
This defines a function square to compute the square of a number and applies it to the second
field.
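To see the function at work, the program can be run on the sample records, assuming (as an illustration only) that data.txt holds the name, age and occupation lines shown earlier, so that the second field is numeric.

```shell
data=$(mktemp)   # stand-in for data.txt
printf '%s\n' 'Alice 30 Engineer' 'Bob 25 Artist' 'Carol 28 Scientist' > "$data"

squares=$(awk '
function square(x) { return x * x }
{ print $1, "squared is", square($2) }
' "$data")
printf '%s\n' "$squares"
rm -f "$data"
```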
Addition
It is represented by plus (+) symbol which adds two or more numbers.
The following example demonstrates this −
Example
[jerry]$ awk 'BEGIN { a = 50; b = 20; print "(a + b) = ", (a + b) }'
On executing this code, you get the following result −
Output
(a + b) = 70
Subtraction
It is represented by minus (-) symbol which subtracts two or more
numbers. The following example demonstrates this −
Example
[jerry]$ awk 'BEGIN { a = 50; b = 20; print "(a - b) = ", (a - b) }'
On executing this code, you get the following result −
Output
(a - b) = 30
Multiplication
It is represented by asterisk (*) symbol which multiplies two or more
numbers. The following example demonstrates this −
Example
[jerry]$ awk 'BEGIN { a = 50; b = 20; print "(a * b) = ", (a * b) }'
On executing this code, you get the following result −
Output
(a * b) = 1000
Division
It is represented by slash (/) symbol which divides two or more numbers.
The following example illustrates this −
Example
[jerry]$ awk 'BEGIN { a = 50; b = 20; print "(a / b) = ", (a / b) }'
Output
(a / b) = 2.5
Modulus
It is represented by percent (%) symbol which finds the Modulus division
of two or more numbers. The following example illustrates this −
Example
[jerry]$ awk 'BEGIN { a = 50; b = 20; print "(a % b) = ", (a % b) }'
On executing this code, you get the following result −
Output
(a % b) = 10
Simple Assignment
It is represented by =. The following example demonstrates this –
Example
[jerry]$ awk 'BEGIN { name = "Jerry"; print "My name is", name }'
On executing this code, you get the following result −
Output
My name is Jerry
Shorthand Addition
It is represented by +=. The following example demonstrates this −
Example
[jerry]$ awk 'BEGIN { cnt = 10; cnt += 10; print "Counter =", cnt }'
On executing this code, you get the following result −
Output
Counter = 20
Shorthand Subtraction
It is represented by -=. The following example demonstrates this –
Example
[jerry]$ awk 'BEGIN { cnt = 100; cnt -= 10; print "Counter =", cnt }'
On executing this code, you get the following result −
Output
Counter = 90
In the above example, the first statement assigns value 100 to the
variable cnt. In the next statement, the shorthand operator decrements
its value by 10.
Shorthand Multiplication
It is represented by *=. The following example demonstrates this –
Example
[jerry]$ awk 'BEGIN { cnt = 10; cnt *= 10; print "Counter =", cnt }'
On executing this code, you get the following result −
Output
Counter = 100
In the above example, the first statement assigns value 10 to the
variable cnt. In the next statement, the shorthand operator multiplies its
value by 10.
Shorthand Division
It is represented by /=. The following example demonstrates this −
Example
[jerry]$ awk 'BEGIN { cnt = 100; cnt /= 5; print "Counter =", cnt }'
On executing this code, you get the following result −
Output
Counter = 20
In the above example, the first statement assigns value 100 to the
variable cnt. In the next statement, the shorthand operator divides it by
5.
Shorthand Modulo
It is represented by %=. The following example demonstrates this −
Example
[jerry]$ awk 'BEGIN { cnt = 100; cnt %= 8; print "Counter =", cnt }'
On executing this code, you get the following result −
Output
Counter = 4
Shorthand Exponential
It is represented by ^=. The following example demonstrates this −
Example
[jerry]$ awk 'BEGIN { cnt = 2; cnt ^= 4; print "Counter =", cnt }'
On executing this code, you get the following result −
Output
Counter = 16
The above example raises the value of cnt to the power 4 (2 ^ 4 = 16).
Shorthand Exponential
It is represented by **=. The following example demonstrates this −
Example
[jerry]$ awk 'BEGIN { cnt = 2; cnt **= 4; print "Counter =", cnt }'
On executing this code, you get the following result −
Output
Counter = 16
This example also raises the value of cnt to the power 4.
Equal to
It is represented by ==. It returns true if both operands are equal,
otherwise it returns false.
Example
awk 'BEGIN { a = 10; b = 10; if (a == b) print "a == b" }'
On executing this code, you get the following result −
Output
a == b
Not Equal to
It is represented by !=. It returns true if both operands are unequal,
otherwise it returns false.
Example
[jerry]$ awk 'BEGIN { a = 10; b = 20; if (a != b) print "a != b" }'
On executing this code, you get the following result −
Output
a != b
Less Than
It is represented by <. It returns true if the left-side operand is less than
the right-side operand; otherwise it returns false.
Example
[jerry]$ awk 'BEGIN { a = 10; b = 20; if (a < b) print "a < b" }'
On executing this code, you get the following result −
Output
a < b
Greater Than
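By analogy with the less-than operator, the greater-than operator is written > and is true when the left-side operand is greater than the right-side operand. A minimal sketch, with illustrative values for a and b:

```shell
# Prints the message only when a is greater than b.
gt=$(awk 'BEGIN { a = 20; b = 10; if (a > b) print "a > b" }')
printf '%s\n' "$gt"
```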
Logical AND
It is represented by &&. Its syntax is as follows −
Syntax
expr1 && expr2
It evaluates to true if both expr1 and expr2 evaluate to true; otherwise it
returns false. expr2 is evaluated if and only if expr1 evaluates to true. For
instance, the following example checks whether the given single digit
number is in octal format or not.
Example
[jerry]$ awk 'BEGIN {
num = 5; if (num >= 0 && num <= 7) printf "%d is in octal format\n",
num
}'
On executing this code, you get the following result −
Output
5 is in octal format
Logical OR
It is represented by ||. The syntax of Logical OR is −
Syntax
expr1 || expr2
It evaluates to true if either expr1 or expr2 evaluates to true; otherwise
it returns false. expr2 is evaluated if and only if expr1 evaluates to false.
The following example demonstrates this −
Example
[jerry]$ awk 'BEGIN {
ch = "\n"; if (ch == " " || ch == "\t" || ch == "\n")
print "Current character is whitespace."
}'
On executing this code, you get the following result −
Output
Current character is whitespace.
Logical NOT
It is represented by the exclamation mark (!). Its syntax is as follows −
Syntax
! expr1
It returns the logical complement of expr1. If expr1 evaluates to true, it
returns 0; otherwise it returns 1. For instance, the following example
checks whether a string is empty or not.
Example
[jerry]$ awk 'BEGIN { name = ""; if (! length(name)) print "name is empty string." }'
On executing this code, you get the following result −
Output
name is empty string.
If statement
It simply tests the condition and performs certain actions depending
upon the condition. Given below is the syntax of if statement –
Syntax
if (condition)
action
We can also use a pair of curly braces as given below to execute multiple
actions –
Syntax
if (condition) {
action-1
action-2
.
.
action-n
}
For instance, the following example checks whether a number is even or
not −
Example
[jerry]$ awk 'BEGIN {num = 10; if (num % 2 == 0) printf "%d is even number.\n", num }'
On executing the above code, you get the following result −
Output
10 is even number.
If Else Statement
In if-else syntax, we can provide a list of actions to be performed when a
condition becomes false.
The syntax of if-else statement is as follows −
Syntax
if (condition)
action-1
else
action-2
In the above syntax, action-1 is performed when the condition evaluates
to true and action-2 is performed when the condition evaluates to false.
For instance, the following example checks whether a number is even or
not −
Example
[jerry]$ awk 'BEGIN {
num = 11; if (num % 2 == 0) printf "%d is even number.\n", num;
else printf "%d is odd number.\n", num
}'
On executing this code, you get the following result −
Output
11 is odd number.
If-Else-If Ladder
We can easily create an if-else-if ladder by using multiple if-else
statements. The following example demonstrates this −
Example
[jerry]$ awk 'BEGIN {
a = 30;
if (a==10)
print "a = 10";
else if (a == 20)
print "a = 20";
else if (a == 30)
print "a = 30";
}'
On executing this code, you get the following result −
Output
a = 30
AWK - Loops
This section explains AWK's loops with suitable examples. Loops are used
to execute a set of actions in a repeated manner. The loop execution
continues as long as the loop condition is true.
For Loop
The syntax of for loop is –
Syntax
for (initialization; condition; increment/decrement)
action
Example
[jerry]$ awk 'BEGIN { for (i = 1; i <= 5; ++i) print i }'
On executing this code, you get the following result −
Output
1
2
3
4
5
While Loop
The while loop keeps executing the action as long as a particular logical
condition evaluates to true. Here is the syntax of the while loop –
Syntax
while (condition)
action
AWK first checks the condition; if the condition is true, it executes the
action. This process repeats as long as the loop condition evaluates to
true. For instance, the following example prints 1 to 5 using while loop –
Example
[jerry]$ awk 'BEGIN {i = 1; while (i < 6) { print i; ++i } }'
On executing this code, you get the following result −
Output
1
2
3
4
5
Do-While Loop
The do-while loop is similar to the while loop, except that the test
condition is evaluated at the end of the loop. Here is the syntax of the
do-while loop –
Syntax
do
action
while (condition)
In a do-while loop, the action statement gets executed at least once
even when the condition statement evaluates to false. For instance, the
following example prints 1 to 5 numbers using do-while loop –
Example
[jerry]$ awk 'BEGIN {i = 1; do { print i; ++i } while (i < 6) }'
On executing this code, you get the following result −
Output
1
2
3
4
5
Break Statement
As its name suggests, it is used to end the loop execution. Here is an
example which ends the loop when the sum becomes greater than 50.
Example
[jerry]$ awk 'BEGIN {
sum = 0; for (i = 0; i < 20; ++i) {
sum += i; if (sum > 50) break; else print "Sum =", sum
}
}'
On executing this code, you get the following result −
Output
Sum = 0
Sum = 1
Sum = 3
Sum = 6
Sum = 10
Sum = 15
Sum = 21
Sum = 28
Sum = 36
Sum = 45
Continue Statement
The continue statement is used inside a loop to skip to the next iteration
of the loop. It is useful when you wish to skip the processing of some
data inside the loop. For instance, the following example
uses the continue statement to print the even numbers between 1 and 20.
Example
[jerry]$ awk 'BEGIN {
for (i = 1; i <= 20; ++i) {
if (i % 2 == 0) print i ; else continue
}
}'
On executing this code, you get the following result –
Output
2
4
6
8
10
12
14
16
18
20
Exit Statement
It is used to stop the execution of the script. It accepts an integer as an
argument which is the exit status code for AWK process. If no argument
is supplied, exit returns status zero. Here is an example that stops the
execution when the sum becomes greater than 50.
Example
[jerry]$ awk 'BEGIN {
sum = 0; for (i = 0; i < 20; ++i) {
sum += i; if (sum > 50) exit(10); else print "Sum =", sum
}
}'
On executing this code, you get the following result −
Output
Sum = 0
Sum = 1
Sum = 3
Sum = 6
Sum = 10
Sum = 15
Sum = 21
Sum = 28
Sum = 36
Sum = 45
Let us check the return status of the script.
Example
[jerry]$ echo $?
On executing this code, you get the following result −
Output
10