
Unit 4 Advanced Shell Programming

Unit 4 of the Unix & Shell Programming course covers advanced shell programming concepts, focusing on filter commands that process input and output data. It discusses various commands such as head, tail, cut, and split, explaining their syntax, options, and practical applications for manipulating files. Additionally, it introduces file comparison commands like cmp and diff, detailing how to determine if files are identical or to identify their differences.

Uploaded by

siddharthch1612

Unix & Shell Programming Unit-4

Unit 4. Advanced Shell Programming

In this chapter, we will study the simple filter commands. A filter is a system
command that accepts input from standard input, processes it, and sends the output to
the standard output stream. By default, a filter receives its input from the
keyboard. The input of a filter can also be obtained from a file. By default, a filter
sends its output to the monitor. The output of a filter can be
redirected to a file or to any output device. Filters work with
redirection and piping. Some filters work on fields, some on lines and some on
characters. In this chapter, we will discuss various types of filters.
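The three ways of connecting a filter described above can be sketched as follows. This is only an illustration: the file name fruits and its contents are assumed sample data, and sort stands in for any filter.

```shell
# A filter reads standard input and writes standard output,
# so it can be combined with redirection and pipes.
tmp=$(mktemp -d)                                  # scratch directory
printf 'banana\napple\ncherry\n' > "$tmp/fruits"  # assumed sample file

from_file=$(sort < "$tmp/fruits")         # input redirected from a file
sort < "$tmp/fruits" > "$tmp/sorted"      # output redirected to a file
piped=$(sort "$tmp/fruits" | head -n 1)   # output piped into another filter

echo "$from_file"
echo "$piped"
```

In each case the filter itself is unchanged; only the source of its input and the destination of its output differ.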

Splitting Files
Unix supports filter commands to split a file. A file can be split horizontally (by
lines), vertically (by columns), or in any combination of the two. Filter commands such
as head, tail and cut are used to split a file.

1.head
Displays the first n lines of the specified text files. If the number of lines is not
specified, it prints the first 10 lines by default.
Syntax:
head [option] [filename(s)]
Options:
1.-Nc(number of characters):It prints the first N characters of the file(s). Here N is a positive
integer number. On current systems this option is usually written as -c N.
✓ Consider three files f1, f2 and f3 as follows:
$cat f1
Computer science
Unix and shell programming
$cat f2
SASCMA English Medium & STERS BCA & BBA College By- Charmi Chauhan 1|Page
hello world
hello surat
$cat f3
HELLO WORLD
$
✓ If you issue a command as follows:
$head -5c f1 f2 f3
==> f1 <==
Compu
==> f2 <==
hello
==> f3 <==
HELLO
$
The result shows that it displays the first five characters of each file, each preceded by a
header containing the file name.

2.-N(number of lines):It prints the first N lines of the file(s) instead of the default 10 lines.
Here also N is a positive integer number.

3.-q(quiet):It never prints the file name as a header, even if more than one file is given.

$head -1q f1 f2 f3
Computer science
hello world
HELLO WORLD
$
A user can use the head command for different purposes, as follows:
✓ You can display the name of the most recently modified file in the current directory like
this:
$ls -t | head -1
✓ You can display the name of the largest file in the current directory like this:
$ls -S | head -1
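The head options above can be tried on small files as in the following sketch. It assumes a head that accepts the -n N, -c N and -q spellings (GNU and BSD versions do); f1 and f2 are recreated in a scratch directory with the contents shown earlier.

```shell
tmp=$(mktemp -d)
cd "$tmp" || exit 1
printf 'Computer science\nUnix and shell programming\n' > f1
printf 'hello world\nhello surat\n' > f2

first_line=$(head -n 1 f1)      # first line only
first_chars=$(head -c 5 f1)     # first five characters
quiet=$(head -q -n 1 f1 f2)     # one line per file, without ==> name <== headers

echo "$first_line"
echo "$first_chars"
echo "$quiet"
```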


2.tail:It displays the last part of a file. The general syntax of the tail command is as
follows:

Syntax:
tail [option] [filename(s)]

✓ By default, tail displays the last 10 lines of the file.

$tail file1<enter>
---last 10 lines of file1---
$

It shows the last 10 lines of file1 on standard output. If the file has 10 lines or
fewer, it displays the content of the whole file.

✓ With more than one filename, it displays the last 10 lines of each file, each preceded by a
header containing the filename.
$tail file1 file2 file3<enter>
==> file1 <==
This is a line of file1

==> file2 <==
This is a line of file2
This is a end of file2

==> file3 <==
This is file3
$

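Options:
1.-Nc(number of characters):It displays the last N characters of the file(s). The new-line
character at the end of a line is also counted. Consider the file f1 as follows:
$cat f1
hello world
hello surat
$tail -6c f1
surat
$
Here the six characters are the five letters of surat and the final new-line character.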


2.-N(number of lines):It shows the last N lines of the file(s), instead of the last 10.

$tail -1 f1
hello surat
$
3.-q(quiet/silent):It never shows the filename as a heading for the specified files. It is used
with more than one filename.
$tail -q f1 f2 f3

4.+N(N is a number):It displays lines from the Nth line to the last line of the input file.
$tail +5 f1
Note: tail +N does not work with more than one file. The portable form of this option is tail -n +N.
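A short sketch of the tail options, written with the -n and -c spellings. The file nums is a hypothetical 12-line file created only for this illustration.

```shell
tmp=$(mktemp -d)
cd "$tmp" || exit 1
printf 'hello world\nhello surat\n' > f1
printf '%s\n' 1 2 3 4 5 6 7 8 9 10 11 12 > nums   # hypothetical 12-line file

last_line=$(tail -n 1 f1)        # last line
last_chars=$(tail -c 6 f1)       # "surat" plus the trailing new-line
from_tenth=$(tail -n +10 nums)   # from line 10 to the end of the file

echo "$last_line"
echo "$last_chars"
echo "$from_tenth"
```

The command substitution removes the trailing new-line, so last_chars holds just the five letters.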

3.cut: The cut command is useful for selecting specific columns of a file. It
cuts specific sections of each line by byte position, character position, or field, and writes
them to standard output. It is
necessary to pass an option with it; otherwise, it reports an error
message.

Syntax:
cut option [file(s)]
The syntax shows that the option is compulsory. Without an option, the cut command will
not work. Some of the options used with the cut command are as follows:

1.-c: The '-c' option is used to cut a specific section by character position.
The argument can be a single number, a range of numbers, or a list of
comma-separated numbers and ranges.
$cat f1
Unix and shell programming
Red hat linux
Hello surat
Advanced C and DS
$cut -c1 f1
U
R
H
A
$

2.-d(field-delimiter):It specifies the field delimiter. cut uses <tab> as the default
field delimiter, but a different delimiter can be given with this option.
cut -d- -f(columnNumber) <fileName>
Consider the following commands:
1. cut -d- -f2 marks
2. cut -d- -f1 marks

$cat>marks
Alex-50
Alen-70
Jon-75
Carry-84
Celena-98
<ctrl+d>

$cut -d- -f2 marks
50
70
75
84
98

$cut -d- -f1 marks
Alex
Alen
Jon
Carry
Celena

3.-f(field): It is used to select specific fields. It also prints any line that
does not contain the delimiter character, unless the -s option is specified.

✓ To display the 2nd and 3rd words of each line of the input file, the command is:
$cut -d" " -f2,3 f1<enter>
and shell
hat linux
surat
C and

✓ Without a filename, cut with the -f option takes its input from standard input. Here, it
uses the default delimiter <tab>, so the words typed below are separated by <tab>.
$cut -f1
hello<tab>world
how<tab>are<tab>you
<ctrl+d>
hello
how
$
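The behaviour of -c, -d, -f and -s can be checked with a sketch like the one below. The last line of this version of marks is an assumed extra line with no "-" in it, added only to show what -s changes.

```shell
tmp=$(mktemp -d)
cd "$tmp" || exit 1
printf 'Alex-50\nAlen-70\nJon-75\nnodelimiter\n' > marks   # last line has no "-"

chars=$(cut -c1-3 marks)         # character positions 1 to 3 of every line
names=$(cut -d- -f1 marks)       # a line without "-" is printed whole
strict=$(cut -s -d- -f2 marks)   # -s drops lines that lack the delimiter

echo "$chars"
echo "$names"
echo "$strict"
```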

4.split:It splits a file into fixed-size pieces. The general syntax of this
command is as follows:

Syntax:
split [OPTION] [file [PREFIX]]

It creates output files containing consecutive sections of the input file, or of standard
input if no file name is given.

By default, the split command writes 1000 lines of the input file into each output file.
The name of each output file consists of the specified PREFIX ('x' by default) followed by a
group of letters: 'aa', 'ab', and so on up to 'az', and then 'ba', 'bb'
and so on up to 'bz'. In this way, there can be 676 output files (26*26), the last
one having the suffix 'zz'.

The following options are used with the split command:

(i) -l N (lines): It puts the specified number N of lines in each output file, where N is a
positive number.
(ii) -N: It is similar to -l N, where N is a positive number.
For example, the file alluser contains 50 lines and the user wants to split this file into
several files, each containing 10 lines; then the command is as follows:
$split -10 alluser or $split -l10 alluser

It creates five files named xaa, xab, xac, xad and xae in the user's current
directory. If a user wants to give the prefix 'my' instead of the default 'x' then
the command is:
$split -10 alluser my
Then it creates five files named myaa, myab, myac, myad and myae in the user's
current directory.
✓ With - in place of the file name, it takes input from standard input and creates
output files.
$split -2 -
hello
hello world
bye
<ctrl+d>
$
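The alluser example can be reproduced with the sketch below. The 50 lines are generated here only as stand-in content; the last step checks that concatenating the pieces in name order gives back the original file.

```shell
tmp=$(mktemp -d)
cd "$tmp" || exit 1
i=1
while [ "$i" -le 50 ]; do echo "line $i"; i=$((i + 1)); done > alluser   # stand-in 50-line file

split -l 10 alluser my                      # creates myaa, myab, ... with 10 lines each
pieces=$(ls my* | wc -l | tr -d ' ')        # number of pieces created
first_of_third=$(head -n 1 myac)            # the third piece starts at line 21
restored=0
cat my* | cmp -s - alluser || restored=$?   # 0 means the pieces rebuild the file

echo "$pieces $first_of_third $restored"
```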

Comparing

Sometimes a user wants to know whether two files are identical or not, that
is, whether the contents of the two files are the same or different. Unix provides three
commands that compare the contents of files. In this section, we
discuss the file comparison commands cmp, diff and comm.

1. cmp (compare):
It compares two files to find out whether they are identical. The two files
are compared character by character, and the position of the first mismatched
character is displayed on standard output; otherwise nothing is displayed on the screen. The
general syntax of the cmp command is as follows:
syntax:
cmp [option] file1 [file2 [n1 [n2]]]

The cmp utility compares two files of any type and writes the results to the
standard output.

✓ By default, cmp is silent if the files are the same; if they differ, the character
(byte) and line number at which the first difference occurs are reported. Bytes and
lines are numbered beginning with one.
Consider the files f1, f2 and f3 as follows:
$cat f1
cpp programming
linux OS
Unix OS

$cat f2
c++ programming
linux OS
Unix OS
$cat f3
c programming
linux OS
$
✓ To compare the contents of files f1 and f2, we can write the command like this:
$cmp f1 f2
f1 f2 differ: byte 2, line 1
$
The output shows that the files first mismatch at the 2nd character of the 1st line.
✓ If you apply the command as follows:
$cmp f1 #second file is standard input
c programming<enter>
<ctrl+d>
f1 - differ: byte 2, line 1
$
It takes standard input until the user presses <ctrl+d> and then displays the first
difference.

In the syntax of cmp, the arguments n1 and n2 indicate the number of characters to
skip in the 1st and 2nd file respectively.
By default, n1 and n2 are decimal, but they may be expressed as a hexadecimal or
octal value by preceding them with a leading '0x' or '0' respectively.
✓ To skip the first three characters in files f1 and f2, the command is:
$cmp f1 f2 3 3 #Display nothing means identical files
$
✓ A user can give different values for n1 and n2. To ignore the first three characters in
file f2 and one character in file f3, the command is like this:
$cmp f2 f3 3 1
cmp: EOF on f3
$
If you skip the initial three and one characters of files f2 and f3 respectively then
the first two lines become identical, but EOF is encountered in the shorter file f3. So
we get the message 'EOF on f3'.
The following options are used with cmp:
(1) -l (list): It prints the byte/character number in decimal and the differing
character values in octal for each character that differs between the files.
$cmp -l f1 f2
2 160 53
3 160 53
$
It displays a detailed list in three columns. The 1st column shows the positions of the
differing characters in the files, the 2nd column shows the octal value of the differing character in
file f1 and the 3rd column shows the octal value of the differing character in file f2.

(2) -c (character): It displays the differing byte number, line number and differing
character along with its octal value for both files. On current GNU systems this
option is documented as -b (--print-bytes).
$cmp -c f1 f2
f1 f2 differ: byte 2, line 1 is 160 p 53 +
$cmp -lc f1 f2 #display all differing characters
2 160 p 53 +
3 160 p 53 +
$

(3) -i N (ignore): It skips the first N characters of both input files.

$cmp -i3 f1 f2 #display nothing means identical files

(4) -i n1:n2: It skips the first n1 characters of the 1st file and the first n2 characters of the 2nd file.

$cmp -i3:3 f1 f2 #display nothing means identical files
$

(5) -s (silent): It prints nothing for differing files; it returns an exit status only.
The values of the exit status are shown in table-(c.5):

Table-(c.5): value of exit status

value   Meaning
0       It indicates that the files are identical.
1       It indicates that the files are different, or EOF was encountered in
        the shorter file.
>1      It indicates that an error occurred.

✓ A user can compare files f1 and f2 using the -s option like this:

$cmp -s f1 f2<enter>
$echo $? #$? is a special variable used
1 #to know the exit status of the last executed command
$
✓ A user can count the number of differing characters in both files like this:
$cmp -l f1 f2 | wc -l<enter>
2
$
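Because cmp -s reports only through its exit status, it is mostly used inside shell conditions. A minimal sketch, using the f1 and f2 shown above (the variable names are only illustrative):

```shell
tmp=$(mktemp -d)
cd "$tmp" || exit 1
printf 'cpp programming\nlinux OS\nUnix OS\n' > f1
printf 'c++ programming\nlinux OS\nUnix OS\n' > f2

# The exit status of cmp -s decides which branch runs.
if cmp -s f1 f2; then verdict="identical"; else verdict="different"; fi

same_status=0
cmp -s f1 f1 || same_status=$?                  # a file compared with itself gives 0

differing=$(cmp -l f1 f2 | wc -l | tr -d ' ')   # number of differing bytes

echo "$verdict $same_status $differing"
```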

2. diff
It compares files line by line and displays the differences using special symbols, with
instructions describing the changes needed to make the two files identical. The
general syntax of the diff command is as follows:
syntax:
diff [option] from-file to-file

In the simplest case, diff compares the contents of the two files from-file and
to-file. A file name of - (hyphen) stands for text read from the standard
input.
✓ If from-file and to-file are regular files then diff compares their contents.
Consider the files f1 and f2 as follows:
$cat f1        $cat f2
hello          abc
unix           hello
linux          unix
ds             linux OS
c++            ds
$              $

✓ To compare the contents of these files, the command is:

$diff f1 f2
0a1
> abc
3c4
< linux
---
> linux OS
5d5
< c++

diff takes a different approach to displaying file differences. In the above output, the
symbol > before a line indicates that the line comes from file f2 and the symbol < before
a line indicates that the line comes from file f1. Moreover, it displays instructions with
the symbol 'a' to append a line, 'd' to delete a line and 'c' to change a line, in order to
make file f1 identical to file f2.
In the result, the instruction 0a1 indicates that the 1st line of file f2 is
appended before the first line of file f1 (i.e. 0a1 and > abc). The instruction 3c4
indicates that the 3rd line of file f1 is changed to the 4th line of file f2 (i.e. 3c4 and <
linux with > linux OS), and 5d5 indicates that the 5th line of file f1 is deleted, after
which the two files are in step at line 5 of file f2 (i.e. 5d5 and < c++).
✓ If from-file is a directory and to-file is not then diff compares the file in
directory from-file whose file name is the same as that of to-file, and vice versa. For
example, directory d1 and file t1 are at the same level and directory d1 contains
a file named t1. Moreover, the contents of both files t1 and d1/t1 are the same and you
write a command like this:
$diff d1 t1
OR
$diff t1 d1
✓ It displays nothing, which means the file in directory d1 (i.e. d1/t1) and file t1 are the same. If
both from-file and to-file are directories, diff compares corresponding files in
both directories, in alphabetical order.
Options used with the diff command are as follows:
(1) -i (ignore): It ignores case differences in file contents. Consider two files f1
and f2 as follows:
$cat f1        $cat f2
HELLO WORLD    hello world
Hello          hello
$diff -i f1 f2
$
It displays nothing, which means the files are treated as identical.
(2) -r (recursive): It recursively compares files with the same name in subdirectories,
displaying nothing if they are identical and otherwise displaying the differences needed to make the files
identical.
$diff -r dir1 dir2
(3) -s: It reports when two files are the same.
$diff -s f1 f1 #display message if files are same
Files f1 and f1 are identical
$
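Like cmp, diff also sets an exit status (0 when the files are the same, 1 when they differ, and a larger value on error), which is convenient in scripts. The sketch below recreates the first f1 and f2 of this section and counts the a/c/d instruction lines; using grep for the counting is only one possible way to do it.

```shell
tmp=$(mktemp -d)
cd "$tmp" || exit 1
printf 'hello\nunix\nlinux\nds\nc++\n' > f1
printf 'abc\nhello\nunix\nlinux OS\nds\n' > f2

# Instruction lines such as 0a1, 3c4 and 5d5 start with a digit.
changes=$(diff f1 f2 | grep -c '^[0-9]')

status=0
diff f1 f2 > /dev/null || status=$?   # keep only the exit status

echo "$changes $status"
```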
3. comm:
It compares two sorted files line by line. It outputs three columns:
1. Lines that are only in the first file.
2. Lines that are only in the second file.
3. Lines that are in both files.

Syntax:
comm [option] file1 file2

✓ Both files must be sorted before using the comm command. If the files are not
sorted, the output may not be accurate.

$cat file1
apple
banana
cherry
$cat file2
banana
cherry
date
If you run the command:
$comm file1 file2
it prints apple in the first column, date in the second column, and banana and cherry
in the third column.
options:
(1) -1: Suppress the first column (lines unique to the first file).
$comm -1 file1 file2
(2) -2: Suppress the second column (lines unique to the second file).
$comm -2 file1 file2
(3) -3: Suppress the third column (lines common to both files).
$comm -3 file1 file2
✓ A user can also combine options and display only those lines that are common:
$comm -12 file1 file2
It shows only those lines present in both the files.
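The column layout of comm is easier to see when the output is captured, as in this sketch with the same file1 and file2. Each column is indented by one more <tab> than the previous one.

```shell
tmp=$(mktemp -d)
cd "$tmp" || exit 1
printf 'apple\nbanana\ncherry\n' > file1
printf 'banana\ncherry\ndate\n' > file2

all=$(comm file1 file2)              # column 2 has one leading tab, column 3 has two
common=$(comm -12 file1 file2)       # only lines present in both files
only_first=$(comm -23 file1 file2)   # only lines unique to file1

echo "$all"
echo "$common"
echo "$only_first"
```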

Translating characters
Unix supports a tool for translating characters. This tool works on individual
characters rather than on lines. The tr command provides this facility.

tr(Translating characters):
It works on individual characters in its input. It translates, squeezes, and/or deletes
characters from standard input and writes the result to standard output.

Syntax:
tr [option(s)] expression1 [expression2]

tr does not take a file name as an argument; it reads only standard input, so a file is
supplied by redirection or through a pipe.

✓ Replace all lowercase letters with uppercase:

$echo "hello world" | tr 'a-z' 'A-Z'

Options:
(1).-d(delete): It deletes the characters given in expression1. For example, to delete all digits from the input:

$echo "h3llo w0rld 2024" | tr -d '0-9'

(2).-s(squeeze/compress): It replaces each sequence of a repeated character with a single
occurrence. For example, to replace sequences of spaces with a single space:

$echo "hello    world" | tr -s ' '
(3).-c(complement): It complements the characters in expression1. This means that
all characters not in expression1 are selected. For example, to replace every character that is not a digit with a space:
$echo "hello123world456" | tr -c '0-9' ' '
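The results of the tr commands above can be captured as in the following sketch. The last line combines -c and -d, a combination not shown above, to keep only the digits.

```shell
upper=$(echo "hello world" | tr 'a-z' 'A-Z')             # lowercase to uppercase
no_digits=$(echo "h3llo w0rld 2024" | tr -d '0-9')       # digits removed, trailing space stays
squeezed=$(echo "hello     world" | tr -s ' ')           # runs of spaces become one space
digits_only=$(echo "hello123world456" | tr -cd '0-9')    # delete everything that is not a digit

echo "$upper"
echo "$no_digits"
echo "$squeezed"
echo "$digits_only"
```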

Sorting and merging files

Unix also supports sort and merge tools, which arrange file
contents in a specific order and merge files with a specified separator
respectively.

(1).sort:
The sort command is used to sort the lines of a text file or standard input in
various ways.

Syntax:
sort [option] [file(s)]

✓ Consider the following command.

$sort f1 #display sorted contents of f1

Options

(1).-o: Write the output to a file instead of standard output.

$sort -o myfile f1 f2 f3

(2).-u:It removes duplicate lines from the sorted output, i.e. the output contains
only unique lines.
$sort -u f1<enter>

(3).-m:It merges a list of files that have already been sorted.
$sort -m f1 f2<enter> #files f1 and f2 must be sorted

(4).-t:It sets the character sep that delimits (separates) the fields of each line
(by default, fields are separated by blank characters). The field to sort on is selected with the -k option.

(5).-r:It reverses the sort order.

$sort -r [options] file

e.g. Suppose you have a file named names with the following content:
John
Alice
Bob
David

$sort -r names
o/p: John
David
Bob
Alice

(6).-n:It sorts data in numeric order. By default, sorting is not numeric.

$sort -n [options] file
Example
Suppose you have a file named numbers with the following content:
10
2
33
25
1

$sort -n numbers
o/p: 1
2
10
25
33

(7).-c:It checks whether the file is already sorted. If it is, it prints nothing.
$sort -c myfile
$
✓ Suppose you have a file named names with the following content:
Alice
Bob
John
David

Here the file names is not sorted and you write a command like this:
$sort -c names
sort: names:4: disorder: David
$
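The -t option has no example above, so here is a small sketch of sorting on a delimited field. It reuses the marks data from the cut section in a different order and relies on the standard -k option to choose the field; LC_ALL=C is set only to make the ordering predictable.

```shell
export LC_ALL=C
tmp=$(mktemp -d)
cd "$tmp" || exit 1
printf 'Alex-50\nCelena-98\nJon-75\nAlen-70\n' > marks

by_name=$(sort marks)                       # whole lines in character order
by_mark=$(sort -t- -k2,2n marks)            # field 2 (after "-"), compared as numbers
top=$(sort -t- -k2,2nr marks | head -n 1)   # highest mark first
sorted_status=0
sort -c marks 2>/dev/null || sorted_status=$?   # non-zero: marks is not sorted

echo "$by_name"
echo "$by_mark"
echo "$top $sorted_status"
```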

(2).paste:
The paste command in Unix/Linux is used to merge the lines of files side by side. It
is often used to combine columns of data from multiple files or to create a
table-like structure.

Syntax:
paste [options] file1 file2 ...

Consider the file names, which contains the names of the students, and another file
numbers, which contains the numbers of the students, as follows:
e.g. $cat names<enter>
Kush
Nirva
Vidhi
Kavya
Jenil
$cat numbers<enter>
6851792
2460995
9225806
2771592
6876318
✓ Now, we want to merge these files side by side; then the command is:
$paste names numbers<enter>
Kush 6851792
Nirva 2460995
Vidhi 9225806
Kavya 2771592
Jenil 6876318

Options:
(1).-d: Specifies a delimiter to be used between columns (default is a tab).
$paste -d"|" names numbers<enter>
Kush|6851792
Nirva|2460995
Vidhi|9225806
Kavya|2771592
Jenil|6876318

(2).-s(serial): It pastes one file at a time instead of in parallel, i.e. all the lines of each
file are joined into a single line of output.

$paste -s names numbers
Kush Nirva Vidhi Kavya Jenil
6851792 2460995 9225806 2771592 6876318
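paste and cut are roughly inverse operations: paste builds a table from columns and cut takes columns back out. A sketch with shortened versions of names and numbers:

```shell
tmp=$(mktemp -d)
cd "$tmp" || exit 1
printf 'Kush\nNirva\nVidhi\n' > names
printf '6851792\n2460995\n9225806\n' > numbers

table=$(paste -d'|' names numbers)                             # two columns joined by "|"
serial=$(paste -s -d, names)                                   # one file joined onto one line
first_name=$(paste -d'|' names numbers | cut -d'|' -f1 | head -n 1)

echo "$table"
echo "$serial"
echo "$first_name"
```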

Other filter utilities

(1).uniq: The uniq command in Unix/Linux is used to filter out or report
repeated lines in a file. It compares only adjacent lines, so it is useful for removing duplicate lines or
counting occurrences of lines in a sorted file.

Syntax:
uniq [OPTION]… [input file [output file]]

✓ Consider a file f1 as follows:

$cat f1
C++ language
Cpp language

hello surat
hello surat
red hat linux
unix os
$

✓ If you give both an input and an output file to the uniq command then it writes
the lines of the input file, without adjacent duplicates, into the output file.
$uniq f1 f1.out
$
It creates the output file f1.out, which contains one copy of each line of the input file.
$cat f1.out
C++ language
Cpp language
hello surat
red hat linux
unix os

options:

(1).-c(count): It precedes each line with the number of occurrences.

$uniq -c f1
1 C++ language
1 Cpp language
2 hello surat
1 red hat linux
1 unix os
(2).-d(duplicate):It prints only duplicate lines, one copy of each.
$uniq -d f1
hello surat
$
(3).-D(all-repeated):It prints all duplicate lines.
$uniq -D f1
hello surat
hello surat
$
(4).-fN(skip-fields=N):It skips the first N fields of each line during comparison.
$uniq -f1 f1
C++ language
hello surat
red hat linux
unix os

(5).-i(ignore-case):It ignores differences in case when comparing. Consider a file f2 as follows:

$cat f2
HELLO
hello
unix os
$uniq -i f2
HELLO
unix os
$
(6).-u(unique):Sometimes a user is interested only in the unique lines of a file; then -u
is used. This option prints only the non-repeated lines of an input file.
✓ A user wants to display only unique lines; then the command is:
$uniq -u f1
C++ language
Cpp language
red hat linux
unix os
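Since uniq compares only neighbouring lines, an unsorted file is normally passed through sort first. The sketch below shows the difference; f4 is a hypothetical file in which the duplicate lines are not adjacent.

```shell
tmp=$(mktemp -d)
cd "$tmp" || exit 1
printf 'hello surat\nunix os\nhello surat\nunix os\nred hat linux\n' > f4   # hypothetical unsorted file

unsorted=$(uniq f4 | wc -l | tr -d ' ')        # duplicates are not adjacent, nothing is removed
sorted=$(sort f4 | uniq | wc -l | tr -d ' ')   # sorting first makes duplicates adjacent
repeated=$(sort f4 | uniq -d)                  # lines that occur more than once

echo "$unsorted $sorted"
echo "$repeated"
```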


(2).wc
The wc (word count) command in Unix/Linux is used to count the number of
lines, words, and characters in a file or from standard input. It provides a
simple way to gather basic statistics about the content of files.

Syntax:
wc [options] [file ...]

Without any argument, it takes standard input until the user presses <ctrl+d> and
displays the number of lines, words and characters on the screen. The new-line
character at the end of each line is included in the character count.
$wc<enter>
This is wc command<enter>
<ctrl+d>
1 4 19
$

Options:
(1).-l (lines):It prints the new-line character count.
✓ For example, to count the number of new-line characters in file f1, the command is:
$wc -l f1
2 f1
$
(2).-L(max-line-length): It prints the length of the longest line.
✓ If you write a command like this:
$wc -L f1
26 f1
$
It prints the length of the longest line in file f1.

(3).-w (words): It prints the word count.
✓ For example, to count the number of words in file f1, the command is:
$wc -w f1
7 f1
$
(4).-c (characters/bytes): It prints the character count.
✓ For example, to count the number of characters in file f1, the command is:
$wc -c f1
41 f1
$
You can combine more than one option, as lc, lw, wc, lwc and so on.
$wc -lc f1<enter>
2 41 f1
$
The result shows the number of lines and characters in file f1.
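The counts above can be reproduced with a sketch like this one. The content of f1 is an assumption: a two-line file chosen because it gives exactly the counts shown (2 lines, 7 words, 41 characters). Reading through < keeps the file name out of the output.

```shell
tmp=$(mktemp -d)
cd "$tmp" || exit 1
printf 'Unix and shell programming\nred hat linux\n' > f1   # assumed content of f1

lines=$(wc -l < f1 | tr -d ' ')   # reading from standard input suppresses the file name
words=$(wc -w < f1 | tr -d ' ')
chars=$(wc -c < f1 | tr -d ' ')
typed=$(echo "This is wc command" | wc -c | tr -d ' ')   # 18 characters plus the new-line

echo "$lines $words $chars $typed"
```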
(3).tee:
This utility reads from standard input and writes to standard output as well
as to a file. In short, it has one input and two outputs. The general syntax of the tee
command is as follows:
tee [option]... [filename]...
We know that all intermediate output in a pipe is discarded by UNIX, i.e. it is
not saved on the disk. Sometimes a user wants to pipe the standard output of
a command to another command, and also save it on disk for later use, i.e.
one copy of the output is sent as standard input to the next command and one copy is
redirected to a disk file.
The option used with the tee command is as follows:
(i) -a(append): It does not overwrite. The output is appended to the given
file.
✓ If the user wishes to preserve the list of logged-in users in a file called alluser and also
display the list on the screen then the command is:
$who | tee alluser<enter>
✓ To display both the list of logged-in users and their count on the screen,
the command is:
$who | tee /dev/tty | wc -l
bca pts/0 2013-09-02 09:35
bca63 pts/1 2013-09-02 13:29
bharat pts/2 2013-09-02 13:29
3
$
✓ It is also useful to create a new file like this:
$tee t1
unix and shell programming<enter>
unix and shell programming
red hat linux<enter>
red hat linux
<ctrl+d>
$cat t1
unix and shell programming
red hat linux
Here, whatever you enter on standard input is displayed on the standard
output and simultaneously written to the file t1 until the user presses
<ctrl+d>.
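A small sketch of tee in a pipeline follows. Here printf stands in for who, so that the result does not depend on who is logged in; the user names are taken from the example above.

```shell
tmp=$(mktemp -d)
cd "$tmp" || exit 1

# One copy of the list goes to the file alluser, the other to wc through the pipe.
count=$(printf 'bca\nbca63\nbharat\n' | tee alluser | wc -l | tr -d ' ')

printf 'guest\n' | tee -a alluser > /dev/null   # -a appends instead of overwriting
saved=$(wc -l < alluser | tr -d ' ')

echo "$count $saved"
```

Without -a the second tee would have replaced the three saved lines with the single new one.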


4.2. Filtering utilities: grep, sed etc.

Introduction
Till now, we have discussed different filter utilities. In this chapter, we will learn
about a very powerful filter utility known as grep. The name grep stands for globally
search a regular expression and print it. It is also known as a pattern matching
utility. It is used to search a file for a particular pattern of characters, and
display all records/lines that contain the pattern. The pattern that is searched for in
the file is referred to as the regular expression.
Pattern matching utility: grep
It is a filter utility that performs various tasks as follows:
✓ It scans a file for the occurrences of a pattern and displays the lines in which
the pattern is found.
✓ It scans a file for the occurrences of a pattern and displays the lines in which
the pattern is not found.
✓ It scans files for the occurrences of a pattern and displays the names of the files which
contain the pattern.
✓ It also displays the count of lines which contain the pattern.
The general syntax for the grep command is as follows:
syntax:
grep [options] pattern [filename(s)]
It is used to select and extract lines from a file and print only those lines that
match a given pattern. In the above syntax, square brackets indicate an optional
part. The filename(s) and options are optional and the pattern is compulsory in
the grep command. Here, a pattern is a simple string, or a more complex one which
contains metacharacters, i.e. special characters for pattern matching. A pattern
is also known as a regular expression.
Without a filename, grep expects standard input. As each line is input, grep
searches for the regular expression in the line and displays the line if it
contains that regular expression. Execution stops when the user indicates the end
of input by pressing <ctrl+d>.
✓ For example, a user supplies a command at the shell prompt as follows:
$grep 'unix'
unix and shell programming<enter>
unix and shell programming
red hat linux<enter>
unix OS<enter>
unix OS
<ctrl+d>
$
grep requires an expression to represent the pattern to be searched for,
optionally followed by one or more filenames.
The first argument is always treated as the expression, and the other
arguments are considered as filenames.
Specifying regular expression:
A regular expression is a pattern that describes a set of strings. Regular
expressions are constructed analogously to arithmetic expressions, by using
various operators to combine smaller expressions.
A regular expression is formed with some special and ordinary characters, and it is
expanded by the command, not by the shell, to match more than one
string. A regular expression that contains special characters should be quoted to prevent its interpretation by
the shell. Regular expressions can be used to specify anything from very simple patterns of
characters to highly complex ones. Some very simple patterns are shown in
table-(a.12):
Table-(a.12): Example of simple regular expression
Regular expression   Meaning
A                    It displays all lines that contain the character "A".
"Unix"               It displays all lines that contain the pattern "Unix".
Consider the following examples:

✓ Let us assume that the input file f1 is as follows:
$cat f1
sco Unix
The red hat linux
user name user1
$
If a user wishes to display the lines of file f1 which contain the pattern 'Unix' then the
command is as follows:
$grep Unix f1
sco Unix
$
✓ If you want to locate the lines of file f1 which contain the character 'x' then the
command is as follows:
$grep x f1
sco Unix
The red hat linux
$
More complex regular expressions can be specified with grep's metacharacters,
always written in quotes, shown in table-(b.12).
Table-(b.12): grep metacharacters
Character      Use
[…]            It matches any one single character within the square brackets.
[^…]           It matches any one single character not listed within the square brackets.
^pattern       It matches a pattern at the beginning of each line.
pattern$       It matches a pattern at the end of each line.
. (dot)        It matches any single character except the new-line character.
\ (backslash)  It indicates that grep should ignore the special meaning of the
               character following it in the regular expression.
\<pattern      It matches a pattern at the beginning of any word in a line.
pattern\>      It matches a pattern at the end of any word in a line.
ch*            It matches zero or more occurrences of character ch.
ch\{m\}        It matches exactly m occurrences of the preceding character ch.
ch\{m,\}       It matches at least m occurrences of the preceding character ch.
ch\{m,n\}      It matches between m and n occurrences of the preceding character ch.
\(exp\)        It saves the text matched by expression exp for later referencing with \1, \2…

Consider the following examples which use grep metacharacters:

✓ To display the lines of file f1 which contain the pattern user1 or user2 or user3,
the command is as follows:
$grep 'user[123]' f1
user name user1
$
It displays the lines of file f1 which contain the pattern user1.
✓ To display the lines which begin with the pattern The, the command is as
follows:
$grep '^The' f1
The red hat linux
$
It displays the lines of file f1 which start with the pattern The.
✓ Similarly, if you wish to match a pattern at the end of each line then the
command is as follows:
$grep 'Unix$' f1
sco Unix
$
It displays the lines of file f1 which end with the pattern Unix.
✓ You can use the dot (.) to match any character in a line. For example, consider a file
f2 as follows:
$cat f2
Unix and shell programming
#blank line contains only new-line character
red hat linux
Unix OS
vb.net
program and process
$
Here, file f2 contains blank and non-blank lines. If you wish to remove the blank line
from the output then the command is:
$grep '.' f2
Unix and shell programming
red hat linux
Unix OS
vb.net
program and process
$
It displays all lines which contain at least one character, i.e. every line except the blank line (which contains
only the new-line character).
✓ You can protect special meaning of grep metacharacter using back-slash. For
example, a user wish to display lines which contains'! character anywhere in a
line then the command is as follow:
$ grep ‘\.’ f2
vb.net
$
It displays lines which contains '.’ in a line.
✓ To display lines of file /2 which contains pattern 'program' then the command
is as follow:
$ grep 'program' f2
Unix and shell programming
program and process
$
But if you want to display the lines of file f2 which contain 'program' as a whole
word, i.e. not as part of a longer string, the command is as follows:
$ grep '\<program\>' f2
program and process
$
✓ The * (asterisk) refers to the immediately preceding character. It matches
zero or more occurrences of that character. The pattern a* matches a
null string, a single character a, and any number of a's,
i.e. (nothing) a aa aaa aaaa ...
✓ To locate the lines in which a character occurs one or more times, the
command is:
$ grep 'mm*' f2
Unix and shell programming
program and process
$
It locates the lines in which the character 'm' occurs one or more times.
✓ To display the lines of file f1 which contain exactly 8 characters, the
command is as follows:
$ grep '^.\{8\}$' f1
sco Unix
$
✓ To display the lines of the input file which contain between 5 and 15
characters, the command is:
$ grep '^.\{5,15\}$' f2
red hat linux
Unix OS
vb.net
It displays the lines of file f2 whose length is between 5 and 15 characters.

✓ To display the lines in which the character at the beginning of the line occurs
again anywhere in the same line, you can use the save operator with a back
reference as follows:
$ grep '^\(.\).*\1' f2
program and process
It displays the lines of file f2 in which the first character of the line also occurs
elsewhere in the same line. The 1st character 'p' occurs again in the line, so
that line is displayed.
✓ grep is silent and simply returns the prompt when a pattern is not found in a
file.
$ grep hello f2 #No hello found
It displays nothing, which means the pattern hello is not present in file f2.
✓ grep also accepts the output of another command. For example, to
display the files of the working directory that give read and write permission to
the owner, group and other users, the command is:
$ ls -l | grep '^-rw-rw-rw-'
-rw-rw-rw- 1 bharat bharat 43 Apr 3 17:25 f1
-rw-rw-rw- 2 bcal tybcasem5 77 Apr 4 11:02 f1.In
-rw-rw-rw- 2 bcal tybcasem5 77 Apr 4 11:02 f2
-rw-rw-rw- 1 bhrat bhrat 34 Jul 18 2013 f3
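
The metacharacter examples above can be tried end to end with a short script. This is only a sketch: it recreates the sample file f2 in a scratch directory, assuming five text lines and two blank lines (which is what the line counts in this section suggest), and the variable names are illustrative.

```shell
#!/bin/sh
# Work in a scratch directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Assumed contents of f2: five text lines and two blank lines.
printf '%s\n' 'Unix and shell programming' '' 'red hat linux' '' \
    'Unix OS' 'vb.net' 'program and process' > f2

starts_unix=$(grep '^Unix' f2)        # anchor: lines beginning with Unix
literal_dot=$(grep '\.' f2)           # escaped dot: a real '.' character
mid_length=$(grep '^.\{5,15\}$' f2)   # interval: lines of 5 to 15 characters
repeated=$(grep '^\(.\).*\1' f2)      # back reference: first character occurs again

printf '%s\n' "$starts_unix" "$literal_dot" "$mid_length" "$repeated"

cd / && rm -rf "$workdir"
```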

✓ When grep is given a series of strings, it interprets the first argument as
the pattern and the rest as filenames. For example,
consider the command:
$ grep red hat linux
Here the argument red is taken as the pattern and the other arguments,
hat and linux, are taken as filenames.
✓ Quotes are compulsory when a pattern contains more than one word. For
example, consider the following command:

$ grep "hello world" filename


✓ Quotes are also compulsory when a pattern contains special characters that
must be interpreted by the search utility (i.e. grep) and not by the shell. You can
generally use either single or double quotes, but if command substitution or
variable evaluation is involved, you must use double quotes.
Consider an example which uses variable substitution inside double quotes:
$ a=1
$ grep "$a" f1
It prints all lines of file f1 that contain 1. Consider another example which uses
command substitution inside double quotes:
$ grep "`echo if`" f1
It prints all lines of file f1 which contain the pattern if.
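
The difference between the two kinds of quotes is easiest to see side by side. The sketch below assumes the three-line sample file f1 used by the grep examples in this section; the pattern hat and the variable names are chosen only for illustration.

```shell
#!/bin/sh
workdir=$(mktemp -d)
cd "$workdir" || exit 1
# Sample file f1 as used by the grep examples in this section.
printf '%s\n' 'sco Unix' 'The red hat linux' 'user name user1' > f1

a=1
expanded=$(grep "$a" f1)             # double quotes: the shell expands $a to 1
unexpanded=$(grep '$a' f1)           # single quotes: grep receives the text $a itself
substituted=$(grep "`echo hat`" f1)  # command substitution supplies the pattern hat
two_words=$(grep "red hat" f1)       # quotes keep both words in one pattern

printf 'expanded:    %s\n' "$expanded"
printf 'unexpanded:  %s\n' "$unexpanded"
printf 'substituted: %s\n' "$substituted"
printf 'two words:   %s\n' "$two_words"

cd / && rm -rf "$workdir"
```

With single quotes the pattern reaches grep unchanged, so nothing in f1 matches it.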
Options:
The grep utility can be used with many options, a few of which are discussed
below:
(1) -c (count): It prints the count of matching lines for each input file.
✓ For example:
$ grep -c '.' f2
5
$
It counts all non-empty lines of file f2.
✓ Consider another command:
$ grep -c '^$' f2
2
$
It counts all empty lines (consisting of only the new-line character) of file f2.
(2) -l (list): It displays only the names of the files in which a pattern has been found.
✓ For example, consider the command:
$ grep -l '.' *
It displays the names of all files in the current directory that contain at least
one character.
✓ To print the names of all files in the current directory that contain the pattern
echo anywhere in the file, the command is as follows:
$ grep -l 'echo' *
(3) -n (number): It displays the line numbers of the lines containing the pattern,
along with the lines.
✓ To print the line number before each matched line, the command is as
follows:
$ grep -n 'Unix' f2
1:Unix and shell programming
5:Unix OS
$
It prints two-column output, with the columns delimited by a colon (:). The 1st
column holds the line number and the 2nd column holds the content of the matched line.
✓ You can give more than one filename as input:
$ grep -n 'Unix' f1 f2
f1:1:sco Unix
f2:1:Unix and shell programming
f2:5:Unix OS
$
It prints the output in three columns, each delimited by a colon (:). The 1st
column contains the name of the file, the 2nd column contains the line number and
the last column contains the content of the matched line.
(4) -v (inverse): The -v option selects all lines except those containing the pattern.
✓ Sometimes a user is interested only in the unmatched lines; the -v
option is then used as follows:
$ grep -cv '^$' f2
5
$
It counts all non-empty lines of file f2, i.e. all lines except those that contain
only the new-line character.
✓ Consider another command:
$ grep -v 'Unix' f1
The red hat linux
user name user1
$
It displays the lines of file f1 which do not contain the pattern Unix.
(5) -i (ignore): It ignores case in pattern matching.
✓ For example, to print the lines that contain the pattern unix in any case,
the command is as follows:
$ grep -i 'unix' f1
sco Unix
$
It displays the lines of file f1 that contain the pattern unix in any case.
(6) -h (hide): It omits filenames when handling multiple files.
✓ For example:
$ grep -h 'Unix' f1 f2
sco Unix
Unix and shell programming
Unix OS
$
It displays the lines of files f1 and f2 which contain the pattern Unix. It does not
display the filename before each matched line, i.e. it hides the name of the file.

(7) -e RegExp: You can specify a regular expression with this option. You can use this
option multiple times.
✓ For example, to locate the lines of a file which contain either the pattern
Unix or linux, the command is as follows:
$ grep -e 'Unix' -e 'linux' f2
Unix and shell programming
red hat linux
Unix OS
$
(8) -f fname: A list of strings to be matched is stored in the file fname.
✓ For example, consider a file patfile as follows:
$ cat patfile
Unix
linux
$
It contains a list of patterns, one per line. To locate the lines of file f1
that contain any of the patterns given in patfile, the command is as follows:
$ grep -f patfile f1
sco Unix
The red hat linux
$
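
Several of the options above can be checked in one run. The sketch below is a hedged example that rebuilds f1, f2 and patfile in a scratch directory (the file contents are assumed from the outputs shown in this section) and captures the result of each option in an illustrative variable.

```shell
#!/bin/sh
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'sco Unix' 'The red hat linux' 'user name user1' > f1
printf '%s\n' 'Unix and shell programming' '' 'red hat linux' '' \
    'Unix OS' 'vb.net' 'program and process' > f2
printf '%s\n' 'Unix' 'linux' > patfile

blank_count=$(grep -c '^$' f2)     # -c: count of matching lines
numbered=$(grep -n 'Unix' f1 f2)   # -n: file name, line number, line
not_unix=$(grep -v 'Unix' f1)      # -v: lines that do not match
any_case=$(grep -i 'unix' f1)      # -i: ignore case
from_file=$(grep -f patfile f1)    # -f: patterns read from patfile

printf '%s\n' "$blank_count" "$numbered" "$not_unix" "$any_case" "$from_file"

cd / && rm -rf "$workdir"
```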

Grep family

There is a small family of grep utilities which includes egrep and fgrep. These two
utilities operate in a similar way to grep, but each has its own particular usage, and
there are small differences in the way that each works.
Both utilities search for a specific pattern in either the standard input stream or a
series of input files supplied on the command line.

egrep
egrep stands for extended grep. It was invented by Alfred Aho. It extends grep's
pattern-matching capabilities in two major ways.
✓ It admits alternates.
✓ It enables regular expressions to be bracketed/grouped using a pair of
parentheses, i.e. (...), also known as factoring.
It offers all the options and regular expression metacharacters of grep, but its most
useful feature is the facility to specify more than one pattern for a search. While grep
uses some characters that are not recognized by egrep, egrep includes some
additional extended metacharacters, not used by either the grep or sed utilities, that
are given in table-(c.12).
Expression    Meaning
ch+           It matches one or more occurrences of character ch.
ch?           It matches zero or one occurrence of character ch.
exp1|exp2     It matches expression exp1 or exp2.
(x1|x2)x3     It matches expression x1x3 or x2x3.

Let us consider the following examples which use the extended metacharacters:

✓ To display the lines in which a character occurs one or more times,
the command is as follows:
$ egrep 'm+' f2
Unix and shell programming
program and process
$
It prints the lines of file f2 in which the character 'm' occurs one or more times.
✓ To locate the lines which contain any one of several patterns, you can
use the alternation metacharacter as follows:
$ egrep 'Unix|linux' f1
sco Unix
The red hat linux
$
It displays the lines of file f1 which contain either the pattern Unix or linux.
✓ To display the lines which contain either software or
hardware, the command is as follows:
$ egrep '(soft|hard)ware' f1
NOTE: In grep, if a pattern contains special characters then it must be quoted.
-f option: Storing patterns in a file
egrep can take patterns from a file. If there are a number of patterns
that you have to match, egrep offers the -f (file) option to take them from
a file. For example, a file patfile contains patterns which are
delimited by '|' as follows:
$ cat patfile
Unix|linux
$
Now, you can execute egrep with the -f option in this way:
$ egrep -f patfile f1
sco Unix
The red hat linux
$
Here, the command takes the pattern/expression from the file patfile and displays
the matched lines of file f1.
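
The extended metacharacters can be exercised with the sketch below. It makes two assumptions that are not in the text: it uses the spelling grep -E, which POSIX specifies as the extended-expression form of grep (some systems ship egrep only as an alias for it), and it creates a small hypothetical file named words for the grouping example.

```shell
#!/bin/sh
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'sco Unix' 'The red hat linux' 'user name user1' > f1
printf '%s\n' 'software' 'hardware' 'firmware' > words   # hypothetical sample
printf '%s\n' 'Unix|linux' > patfile

either=$(grep -E 'Unix|linux' f1)            # alternation
grouped=$(grep -E '(soft|hard)ware' words)   # grouping plus alternation
one_or_more=$(grep -E 'r+' words)            # + : one or more occurrences of r
from_file=$(grep -E -f patfile f1)           # pattern taken from patfile

printf '%s\n' "$either" "$grouped" "$one_or_more" "$from_file"

cd / && rm -rf "$workdir"
```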

fgrep
fgrep stands for fixed/fast grep. The fgrep utility can normally only search for fixed
strings i.e. character string without embedded metacharacters. However, some
implementations of the fgrep utility allow it to be used with a few metacharacters -
check your version to make sure. fgrep accepts multiple patterns, both from the
command line and a file, but unlike grep and egrep, does not accept regular
expressions. So, if the pattern to be searched is a simple string, or a group of them,
fgrep is recommended. It is arguably faster than grep and egrep, and should be
used when searching for fixed strings.
Alternative patterns in fgrep are specified by separating one pattern from another
with the new-line character. This is unlike egrep, which uses '|' to delimit
two expressions. You may either specify these patterns on the command line itself
or store them in a file.
✓ For example, consider a file patfile which contains a list of patterns delimited by
the new-line character as follows:
$ cat patfile
Unix
linux
$
We can use this file with the -f option as follows:
$ fgrep -f patfile f1
sco Unix
The red hat linux
$
✓ You can achieve the same output without the file patfile by supplying the patterns
on the command line as follows:
$ fgrep 'Unix<enter>
> linux' f1 <enter>
sco Unix
The red hat linux
$
✓ The disadvantage of the grep family is that none of its members has a separate
facility to identify fields. This limitation is overcome by the awk utility.
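
The effect of treating a pattern as a fixed string shows up as soon as the pattern contains a metacharacter. The following sketch assumes the spelling grep -F (the POSIX fixed-string form of grep, used here in place of fgrep) and a hypothetical file called names.

```shell
#!/bin/sh
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'sco Unix' 'The red hat linux' 'user name user1' > f1
printf '%s\n' 'vb.net' 'vbxnet' 'a*b' > names   # hypothetical sample

as_regex=$(grep 'vb.net' names)      # dot matches any character
as_fixed=$(grep -F 'vb.net' names)   # dot is an ordinary character
star=$(grep -F 'a*b' names)          # asterisk is an ordinary character
# Two fixed patterns separated by a new-line inside the quotes.
two_patterns=$(grep -F 'Unix
linux' f1)

printf '%s\n' "$as_regex" "$as_fixed" "$star" "$two_patterns"

cd / && rm -rf "$workdir"
```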
Limitation of grep family:
The grep family has the following limitations.

✓ It cannot be used to add, delete or change a line.
✓ It cannot be used to print only part of a line.
✓ It cannot read only part of a file.
✓ It cannot select a line based on the contents of the previous or the next line.
There is only one buffer, and it holds only the current line.
The following table-(d.12) shows the atoms used in regular expressions by the grep family:
Table-(d.12): Atoms used by grep family
Atoms            grep   fgrep   egrep
Character        ✓      ✓       ✓
Dot              ✓      X       ✓
Class            ✓      X       ✓
Anchors          ✓      X       ✓
Back Reference   ✓      X       ✓

As shown in table-(d.12), both the grep and egrep utilities allow all the atoms in
regular expressions, whereas the fgrep utility supports only the character atom.
Similarly, table-(e.12) shows the operators used in regular expressions by the grep
family:
Table-(e.12): Operators used by grep family
Operators     grep   fgrep   egrep
Sequence      ✓      ✓       ✓
Repetition    ✓      X       ✓
Alternation   X      X       ✓
Group         X      X       ✓
Save          ✓      X       ✓

Table-(e.12) indicates that the grep utility supports the sequence, repetition and save
operators, egrep supports all operators, and fgrep supports only the
sequence operator.

Sed
So far, we have discussed many filter utilities. In this chapter, we will discuss a
multi-purpose filter utility known as sed. The term sed stands for stream editor; it was
designed by Lee McMahon. The stream editor is derived from ed, the line
editor. Everything in sed is an instruction. The general form of this utility is:
Syntax:
sed [options] instruction [filename(s)]
An instruction consists of two components, an address and a command, which are
enclosed within quotes. The address selects the lines to be processed (or
not processed) by the command. The command indicates the action that sed is to
apply to each input line that matches the address.
Addresses
An address may be line number(s), pattern(s) or a combination of them. sed
supports two types of addresses.
✓ Line number.
✓ A pattern.
A line address can be a single line, a set of lines or a range of lines. A single line
address is defined by a line number or a dollar ($). The dollar is a special symbol which
specifies the last line of the input file. Some examples of a single line address
are given in table-(a.13):
Table-(a.13): Example of single line address
Address       Meaning
5command      It applies command on the 5th line.
$command      It applies command on the last line.
15command     It applies command on the 15th line.

A set of line address allows you to specify several lines, which may be
consecutive or scattered in the input file. A set of line address is defined by a
pattern or regular expression. When you use a pattern or regular expression as a line
address, it must be enclosed within forward slashes. Specifying a
pattern within slashes is known as a context address. Table-(b.13) gives
examples of set of line addresses defined by a pattern.
Table-(b.13): Example of set of line address
Address              Meaning
/pattern/command     It applies command on lines that contain pattern.
/^pattern/command    It applies command on lines which begin with pattern.
/pattern$/command    It applies command on lines which end with pattern.

If a user wants to access a set of consecutive lines of the input file, a range of
addresses is given. You define a range by a start address followed by a comma, with
no space, followed by an end address. The start address and end address may each be
a line number or a pattern. Table-(c.13) shows examples of range
addresses:
Table-(c.13): Example of range of address
Address                          Meaning
5,10command                      It applies command on lines 5 to 10
                                 of the input file.
5,$command                       It applies command from line 5 to the last
                                 line of the input file.
2,/pattern/command               It applies command from the 2nd line up to the
                                 first following line which contains pattern.
/pattern/,10command              It applies command from a line which contains
                                 pattern up to the 10th line of the input file.
/pattern1/,/pattern2/command     It applies command on the lines from one
                                 containing pattern1 up to the next one
                                 containing pattern2.

Note: sed does not change the input file. All modified output is written to standard
output and to be saved must be redirected to a file.
Commands
sed supports several commands. Commands are used to apply an action to the
specified lines. They are categorized as follows:
✓ Print
✓ Quit
✓ Line number
✓ Modify
✓ Files
✓ Substitute
(1) print (p) command:
It is denoted by the character p. It prints the selected lines on the standard output.
Consider an input file f1 as follows:
$ cat f1
unix and shell programming
red hat linux
linux and shell programming
unix operating system
linux is open source
$
✓ If a user wants to print the top 3 lines of the input file, the command is like this:
$ sed '1,3p' f1
unix and shell programming
unix and shell programming
red hat linux
red hat linux
linux and shell programming
linux and shell programming
unix operating system
linux is open source
$
The output shows that by default sed prints every line it reads on
the standard output, in addition to the lines selected by the p command.
In other words, the addressed lines are printed twice. To avoid
printing duplicate lines, use the -n option whenever you use the p command.
The above command is therefore rewritten as follows:

$ sed -n '1,3p' f1
unix and shell programming
red hat linux
linux and shell programming
$
✓ The special character dollar ($) is used to print the last line of an input file. For
example, to print the last line of file f1, the command is:
$ sed -n '$p' f1
linux is open source
$
✓ The command p without any address displays all lines of the input file by default.
$ sed -n p f1
It displays all lines of input file f1.
✓ Reversing line selection criteria (!): A user can use the negation operator (i.e. !)
with any command of the sed utility. Selecting the first 3 lines is the same as not
selecting the lines from the fourth line to the last line of the input file. Therefore the
command is:
$ sed -n '4,$!p' file1 OR
sed -n '1,3p' file1
✓ To select non-contiguous groups of lines of the input file, the command is as
follows:
$ sed -n '1,3p
> 7,9p #It selects lines 1 to 3, 7 to 9
> $p' f1 # and the last line of file f1
It displays lines 1 to 3, 7 to 9 and the last line of input file f1.
✓ In addition, a user can get the same output by including inline scripts/expressions
with the -e option. This option allows the user to give multiple instructions to the
sed utility. Therefore, the above command is rewritten on a single line as
follows:
$ sed -n -e '1,3p' -e '7,9p' -e '$p' file1

✓ A user can also use variable substitution in the instruction part. The instruction
must then be enclosed in double quotes so that the shell expands the variables.
Consider two variables a and b which contain the values 5 and 10 respectively. To
select lines 5 to 10, the command is:
$ sed -n "$a,$b p" f1
✓ You can also use both pattern and line addresses in the instruction part. For
example, consider the command:
$ sed -ne '1,3p' -e '/hello/p' f1
It prints the top 3 lines and the lines which contain the pattern hello in file f1.
✓ You can give a range of patterns instead of a range of line numbers in the
instruction part. To print all lines starting from a line which contains the pattern
unix up to a line which contains the pattern linux, the command is as follows:
$ sed -n '/unix/,/linux/p' f1
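
The address forms used with the p command can be compared in one script. The sketch below recreates the five-line sample file f1 in a scratch directory; the values given to a and b and the variable names are illustrative assumptions.

```shell
#!/bin/sh
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'unix and shell programming' 'red hat linux' \
    'linux and shell programming' 'unix operating system' \
    'linux is open source' > f1

first_three=$(sed -n '1,3p' f1)            # line-number range
negated=$(sed -n '4,$!p' f1)               # everything except line 4 to the last line
a=2
b=4
by_variable=$(sed -n "${a},${b}p" f1)      # double quotes let the shell expand a and b
by_pattern=$(sed -n '/unix/,/linux/p' f1)  # from each unix line to the next linux line
last_line=$(sed -n '$p' f1)                # $ addresses the last line

printf '%s\n' "$first_three" "$by_variable" "$by_pattern" "$last_line"

cd / && rm -rf "$workdir"
```

In this file the pattern range is entered twice (lines 1 to 2 and lines 4 to 5), so line 3 is the only line it leaves out.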

(2) quit (q) command:

✓ This command is denoted by the character q. It takes a single address, i.e. it does
not allow a range of addresses. It quits after reading up to the addressed line. For
example, to quit after the 3rd line of file f1, the command is:
$ sed '3q' f1
unix and shell programming
red hat linux
linux and shell programming
$
✓ If you do not specify any address with the q command, it quits after the 1st line
of the input file.
$ sed q f1
unix and shell programming
$
It prints the first line of file f1.
✓ It also accepts a pattern/regular expression as the address, like this:
$ sed '/unix/q' f1
unix and shell programming
$
It quits on encountering the first line of file f1 that contains the pattern unix.

(3) Line number (=) command:

✓ It is denoted by the equal sign (i.e. =). It writes the line number of the addressed
line before the line. It is similar to the -n option of the grep utility, but the
difference is that the line number is written on a separate line.
$ sed '=' f1
1
unix and shell programming
2
red hat linux
…and so on
$
✓ To print only the line numbers, the -n option is used.
$ sed -n '=' f1
It prints only the line numbers, i.e. it does not display the content of the lines.
✓ To print the line number of the last line of the input file, the command is:
$ sed -n '$=' f1
5
$
It is similar to the command wc -l < f1. In other words, it displays the number of
lines in the input file f1.
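
The comparison with wc -l can be verified directly. In the sketch below the arithmetic expansion around wc is an added precaution (some implementations pad the count with spaces), and the variable names are illustrative.

```shell
#!/bin/sh
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'unix and shell programming' 'red hat linux' \
    'linux and shell programming' 'unix operating system' \
    'linux is open source' > f1

sed_count=$(sed -n '$=' f1)          # number of the last line
wc_count=$(( $(wc -l < f1) ))        # line count, stripped of any padding
unix_lines=$(sed -n '/unix/=' f1)    # numbers of the lines that contain unix

printf 'sed: %s  wc: %s\n' "$sed_count" "$wc_count"
printf '%s\n' "$unix_lines"

cd / && rm -rf "$workdir"
```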

(4) Modify command:

There are different forms of this command. They allow you to insert,
append, change or delete lines. They do not modify just a part of a line; that
is, they work on entire lines. The modify commands are as follows:

(a) Insert command (i):

It is denoted by the character i. It inserts one or more lines directly into the
output before the addressed line.
✓ For example, to insert two lines at the beginning of an input file, the
command is as follows:
$ sed '1i\<enter>
> Unix is multi-user multi-tasking operating system\<enter>
> It provides 3-levels of security' f1<enter>

Then the output is:

Unix is multi-user multi-tasking operating system
It provides 3-levels of security
unix and shell programming
red hat linux
linux and shell programming
unix operating system
linux is open source
$

NOTE: The escape character (\) must be immediately followed by a return. If
any other character, including a space, follows it, the escape character
modifies that character and sed will return an error rather than processing
your commands.

✓ To insert a blank line before each line of an input file, the command is:
$ sed 'i\<enter>
> <enter>
> ' f1
or
$ sed 'i\<enter>
> ' f1
It inserts a blank line before each line of input file f1.

(b) Append command (a):
It is denoted by the character a. It is similar to the insert command except that
it writes the text directly to the output after the specified line.
✓ To insert 2 lines at the end of file f1, the command is:
$ sed '$a\<enter>
> unix is portable operating system\<enter>
> It is designed to facilitate programming, text processing and comm.
> ' f1
✓ To redirect the output of the command to another file, the
command is:
$ sed '$a\<enter>
> unix is portable operating system\<enter>
> It is designed to facilitate programming, text processing and comm.
> ' f1 > f1.out
It creates an output file f1.out.

(c) Change command (c):

The change command is denoted by the character c. It replaces the
addressed/matched line with new text.
✓ To change the first line of file f1, you can write the command as:
$ sed '1c\<enter>
> unix & shell programming' f1
$
It replaces the 1st line of file f1 with the text 'unix & shell programming'.

NOTE: A backslash (\) is compulsory at the end of each line (before the new-line
character) when we use the a, i or c command.
✓ To insert the text introduction before the 1st line and hello world after the 3rd
line, we can use the i and a commands like this:
$ sed -e '1i\<enter>
> introduction' -e '3a\<enter>
> hello world' file1
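
In a script file the <enter> keystrokes become real line breaks inside the quotes. The sketch below shows the i, a and c commands in that form, assuming the five-line sample file f1; the inserted texts "first added line" and "second added line" are made up for the example.

```shell
#!/bin/sh
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'unix and shell programming' 'red hat linux' \
    'linux and shell programming' 'unix operating system' \
    'linux is open source' > f1

# i and a together: text before line 1 and after line 3.
inserted=$(sed -e '1i\
introduction' -e '3a\
hello world' f1)

# c: replace line 1 with new text.
changed=$(sed '1c\
unix & shell programming' f1)

# Two lines of text: every line except the last ends with a backslash.
two_lines=$(sed '1i\
first added line\
second added line' f1)

printf '%s\n' "$inserted"

cd / && rm -rf "$workdir"
```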

(d) Delete command (d):

Using the d (delete) command, we can simulate the -v option of the grep utility to
select lines not containing a pattern.
✓ To remove the lines of file f1 that contain the pattern unix, the command is:
$ sed '/unix/d' f1 Or $ sed -n '/unix/!p' f1
red hat linux
linux and shell programming
linux is open source
$
It selects the lines that do not contain the pattern unix.
✓ A user can remove blank lines from an input file as follows:
$ sed '/^$/d' f1
It displays the non-blank lines of file f1.
✓ The following two commands are equivalent.
sed '!d' f1 Or sed -n 'p' f1

It displays all lines of file f1.
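
The equivalence between d, a negated p and grep -v can be confirmed by comparing their outputs. This is a minimal sketch on the five-line sample file f1; the file with_blanks is a hypothetical input added for the blank-line case.

```shell
#!/bin/sh
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'unix and shell programming' 'red hat linux' \
    'linux and shell programming' 'unix operating system' \
    'linux is open source' > f1
printf '%s\n' 'one' '' 'two' '' > with_blanks   # hypothetical sample

deleted=$(sed '/unix/d' f1)             # drop the lines that contain unix
not_printed=$(sed -n '/unix/!p' f1)     # print only the lines without unix
grep_inverse=$(grep -v 'unix' f1)       # the grep counterpart
no_blanks=$(sed '/^$/d' with_blanks)    # drop empty lines

printf '%s\n' "$deleted" "$no_blanks"

cd / && rm -rf "$workdir"
```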
(5) File command:
The file commands are used to read data from, or write data to, other files. There
are two types of file command: (i) read file and (ii) write file.
(a) Read file command (r fname):
It is denoted by r fname. This command is useful when a user wants to insert the
content of a file after a specified line of an input file. It reads the text from
file fname and places its content after the specified line of the input file.
✓ To append the contents of file f2 after the lines of file f1 that contain the pattern
unix, the command is:
$ sed '/unix/r f2' f1
✓ To insert the content of file f2 at the end of input file f1, the command is:
$ sed '$r f2' f1
It is similar to the cat f1 f2 command.
✓ To insert the content of file f2 after the first line of input file f1, the command
is:
$ sed '1r f2' f1
✓ To insert the contents of file f3 after each line of input file f2 except the 1st line,
the command is:
$ sed '1!r f3' f2
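
Where the r command places the text that it reads can be seen with two very small files. The file contents in this sketch are hypothetical and chosen so that the position of the inserted line is obvious.

```shell
#!/bin/sh
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'first' 'second' > f1   # hypothetical two-line input
printf '%s\n' 'extra' > f2            # hypothetical file to be read

at_end=$(sed '$r f2' f1)        # f2 placed after the last line of f1
joined=$(cat f1 f2)             # same result with cat
after_first=$(sed '1r f2' f1)   # f2 placed after line 1

printf '%s\n' "$after_first"

cd / && rm -rf "$workdir"
```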

(b) Write file command (w fname):

It is denoted by w fname. The write file command makes it possible to write the
selected lines to a separate file.

✓ To write selected lines of the input file to the output file f1.out, the command is
as follows:
$ sed -n '/unix/w f1.out' f1
It writes the lines of file f1 that contain the pattern unix to file f1.out.
✓ To write the top 5 lines of input file f1 to output file f2, the command is like this:
$ sed '1,5w f2' f1
✓ A user can create multiple output files that contain selected lines of the input file.
$ sed -n '/linux/w lfile<enter>
> /unix/w ufile' f1
Or
sed -ne '/linux/w lfile' -e '/unix/w ufile' f1
It writes the lines that contain the pattern linux to lfile and the lines that contain the
pattern unix to ufile.

Take instruction from a file: -f option

It is possible to take instructions from a file rather than from the command line. The
-f option allows us to do this. When there are numerous
editing instructions to be performed, it is better to use the -f option to
accept the instructions from a file.

✓ For example, consider an instruction file as follows:

$ cat instr.txt
/unix/w ulist
/linux/w llist
$

Now, you can use these instructions with the -f option of the sed utility as follows:
$ sed -n -f instr.txt f1

It creates two output files, ulist and llist, which contain the lines having the pattern
unix and the pattern linux respectively.

✓ A user can also use more than one instruction file by repeating the -f option with
each instruction file, like this:
$ sed -n -f instr1.txt -f instr2.txt f1

✓ You can combine the -e and -f options as many times as you want.
$ sed -n -e '/linux/p' -f instr.txt -f instr2.txt file1
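
The instruction file from the example above can be created and run as follows. This is a sketch on the five-line sample file f1; it only reads back the two output files to show what was written to each.

```shell
#!/bin/sh
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'unix and shell programming' 'red hat linux' \
    'linux and shell programming' 'unix operating system' \
    'linux is open source' > f1

# One instruction per line; each file name runs to the end of its line.
printf '%s\n' '/unix/w ulist' '/linux/w llist' > instr.txt

sed -n -f instr.txt f1   # -n: nothing on standard output, only the two files

ulist=$(cat ulist)
llist=$(cat llist)
printf 'ulist:\n%s\nllist:\n%s\n' "$ulist" "$llist"

cd / && rm -rf "$workdir"
```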

(6) Substitute command (s):

It is denoted by the character s. It scans a line for the search pattern and substitutes
it with the replacement string. This is the most powerful command in the sed utility.
It is similar to the search and replace feature found in text editors.
It allows us to add, delete or change text in one or more lines. The
format of the substitute command is as follows:

[address or scanned_pattern] s/search_pattern/replace_string/[flag(s)]

If the address is not specified, the substitution is performed on the first
occurrence of search_pattern in every line that contains it. The search_pattern may
be a regular expression or a literal string. Both search_pattern and replace_string are
delimited by a slash (/). The replace_string is a string that consists of
ordinary characters, an atom, metacharacters or a combination of them.
Only a back reference atom and the metacharacters ampersand (&) and
backslash (\) can be used in a replace_string. We will discuss these tokens later
in this section.

✓ To replace the first occurrence of the word unix in each line by the word linux in
file f1, the command is:
$ sed 's/unix/linux/' f1

✓ To replace the first occurrence of the word unix in each line by the word linux in
the top 5 lines of file f1, the command is:
$ sed '1,5s/unix/linux/' f1

Flag (g):
We know that the following command replaces the first occurrence of unix by
linux in each line of the input file f1:
$ sed -n 's/unix/linux/p' f1

To replace all occurrences of unix with linux, the user needs to use the g (global) flag
at the end of the instruction. This is referred to as global substitution. The global
(g) flag replaces all occurrences of search_pattern with replace_string.
✓ For example, to replace all occurrences of unix with linux in each line of file f1,
the command is:
$ sed 's/unix/linux/g' f1

✓ A user can also use a context address in the address part.

$ sed -n '/unix/,/linux/s/hello/bye/gp' f1

It replaces all occurrences of hello with bye in the selected lines. Here the replacement
takes place from a line which contains the pattern unix up to a line which
contains the pattern linux.
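
The effect of the g flag only shows on a line in which the pattern occurs more than once. The sketch below uses a hypothetical two-line file named sample for that purpose.

```shell
#!/bin/sh
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'unix is unix' 'red hat' > sample   # hypothetical sample

first_only=$(sed 's/unix/linux/' sample)          # first occurrence in each line
every_one=$(sed 's/unix/linux/g' sample)          # all occurrences in each line
changed_only=$(sed -n 's/unix/linux/gp' sample)   # print only the lines that changed

printf '%s\n' "$first_only" "$every_one" "$changed_only"

cd / && rm -rf "$workdir"
```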

Remembered pattern:
Sometimes the address pattern is the same as the search_pattern; in other words, the
scanned pattern and the search_pattern are the same. We can then omit the search
pattern in the instruction part. For example, to replace the word unix with the
word linux in those lines of file f1 that contain the pattern unix, the command
is:
$ sed '/unix/s/unix/linux/' f1

In this example, the address pattern and the search pattern are the same. So, if you
omit the search pattern, the above command is rewritten as follows:
$ sed '/unix/s//linux/' f1

The two forward slashes (i.e. //) represent an empty or null regular expression,
which is interpreted to mean that the search pattern is the same as the scanned pattern.
We call it the remembered pattern.
Another alternative way to write the above command is:
$ sed 's/unix/linux/' file1

✓ However, when the replace_string is left empty (i.e. //), it removes the
pattern from the output.
For example, to remove all unix words from file f1, the command is:
$ sed 's/unix//g' f1 Or $ sed '/unix/s///g' file

✓ Sometimes the address pattern, search pattern and replace string may all be
different strings.
For example, consider the following command:
$ sed -n '/The unix/s/unix/UNIX/gp' f1

It selects the lines that contain The unix (i.e. the address pattern) and replaces each
unix (i.e. the search pattern) with UNIX (i.e. the replace string).
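
That the remembered pattern gives the same result as the explicit form can be checked by comparing outputs. A minimal sketch on the five-line sample file f1, with illustrative variable names:

```shell
#!/bin/sh
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'unix and shell programming' 'red hat linux' \
    'linux and shell programming' 'unix operating system' \
    'linux is open source' > f1

explicit=$(sed '/unix/s/unix/linux/' f1)   # pattern written twice
remembered=$(sed '/unix/s//linux/' f1)     # empty pattern reuses /unix/
plain=$(sed 's/unix/linux/' f1)            # no address at all

removed=$(sed 's/unix//g' f1)              # empty replacement deletes the match
removed_again=$(sed '/unix/s///g' f1)      # same, with the remembered pattern

printf '%s\n' "$remembered"

cd / && rm -rf "$workdir"
```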

Moreover, the sed utility also uses regular expressions in the search_pattern to be
substituted. Some of them are listed below:
✓ Repeat pattern
✓ Braces Regular Expression (BRE) or Interval Regular Expression (IRE)
✓ Tagged Regular Expression (TRE)

Repeat pattern:
There might be a situation when the search pattern occurs in the replace_string. To
repeat the search_pattern in the replace_string, the special metacharacter ampersand
(i.e. &) is used. For example, to replace the pattern director with in-charge director,
we can write the command as follows:

$ sed 's/director/in-charge &/' f1

Other alternatives are as follows:

$ sed 's/director/in-charge director/' file1
$ sed '/director/s//in-charge &/' file1

In the above example, & (ampersand) is known as the repeat pattern operator; it
expands to the entire matched search_pattern.
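
The three spellings of the director example can be run against one line of input to confirm that they agree. The file staff and its single line are hypothetical.

```shell
#!/bin/sh
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'the director spoke' > staff   # hypothetical sample

with_amp=$(sed 's/director/in-charge &/' staff)               # & repeats the match
spelled_out=$(sed 's/director/in-charge director/' staff)     # match typed again
remembered=$(sed '/director/s//in-charge &/' staff)           # remembered pattern plus &

printf '%s\n' "$with_amp"

cd / && rm -rf "$workdir"
```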

✓ Display the list of files of working directory which have write permission set to
either group of others then the command is as follow:
$ls-1|grep "^.\{5,8\}w"

BRE or IRE
Sometimes, a user wishes to print those lines that containing characters that
occurs number of times in a line or locate fixed length of lines. This is possible
with BRE or IRE. So we can define BRE or IRE as it an expression that consists
of character and a single or pair of numbers enclosed within a pair of escape
curly braces (i.e. \{\}).

This expression is derived from ed, and takes the following four forms:
(i) ch\{m\}: It indicates that character ch occurs exactly m times.
(ii) ch\{m,n\}: It indicates that character ch occurs between m and n times.
(iii) ch\{m,\}: It indicates that character ch occurs at least m times.
(iv) ch\{,n\}: It indicates that character ch occurs at most n times. (GNU sed
accepts this form; the portable spelling is ch\{0,n\}.)
The character ch may be any single-character regular expression: an ordinary
character, a dot (.) or a character class. It is followed by a pair of escaped
curly braces containing either a single number m, or a pair of numbers m and n,
which determines the number of times ch can occur. The values of m and n
cannot exceed 255.

✓ Instead of writing 51 dots to locate lines having more than 50 characters, we
can use an IRE as follows:
$ sed -n '/.\{51\}/p' f1

It prints all lines longer than 50 characters. Here the expression .\{51\}
specifies that any character (i.e. the dot) has to occur 51 times.

✓ To display all lines having a length between 51 and 100 characters, the
command is:
$ sed -n '/^.\{51,100\}$/p' f1

✓ To display all lines having a length of at least 50 characters, the command is:

$ sed -n '/.\{50,\}/p' f1

✓ To display lines that consist of only alphabets, the command is:
$ sed -n '/^[a-zA-Z]\{1,\}$/p' f1

✓ To replace each run of consecutive spaces by a single space, use the regular
expression as follows:
$ sed -n 's/[ ]\{2,\}/ /gp' f1

Here only the affected lines will be displayed.

✓ If we omit the -n option and the p flag, sed displays the affected lines as well
as the unaffected lines. The command is as follows:
$ sed 's/[ ]\{2,\}/ /g' f1
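
The interval expressions can be tried on made-up input without creating a file. In this sketch the first command squeezes runs of spaces, and the second keeps only lines of at least four characters (a small number chosen to keep the sample short):

```shell
# Replace every run of two or more spaces with a single space.
squeezed=$(printf '%s\n' 'a  b     c d' | sed 's/[ ]\{2,\}/ /g')
# Print only lines that are at least 4 characters long.
long_only=$(printf '%s\n' 'abc' 'abcdef' | sed -n '/^.\{4,\}$/p')
printf '%s\n' "$squeezed" "$long_only"
```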


Tagged Regular expression (TRE):


We know that the repeat pattern (&) repeats the entire search_pattern in the
replace_string. To repeat just a part of the search_pattern in the
replace_string, a TRE is used. A TRE is an expression that groups part of a
search_pattern within a pair of escaped parentheses (i.e. \( \)) and represents
these groups in the replace_string with back references (i.e. \1 up to \9).
That means the first group is represented as \1, the second group as \2, and so forth.

✓ For example, a user wishes to replace the words new line by new-line. Then the
command is
$ echo "new line" | sed 's/\(new\) \(line\)/\1-\2/'

Here, we have two tagged patterns \(new\) and \(line\) in the search_pattern.
They are automatically reproduced in the replace_string by the back references
\1 and \2, respectively. Each escaped pattern is called a Tagged Regular
Expression (TRE).

✓ To convert a date from the format mm/dd/yy into dd-mm-yy, we can write the
command as
$ sed 's/\(..\)\/\(..\)\/\(..\)/\2-\1-\3/' f1
✓ To replace 'Unix shell programming' with 'Unix & shell programming', the & in
the replace string must be escaped, because an unescaped & expands to the
matched text. The command is:
$ sed 's/Unix shell programming/Unix \& shell programming/' f1
Or
$ sed 's/\(Unix\) \(shell programming\)/\1 \& \2/' f1
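
The date conversion above can be checked on a single made-up date. The second command is an equivalent sketch that uses | as the delimiter of the s command, which avoids escaping the slashes in the date:

```shell
# \1 = month, \2 = day, \3 = year; the replacement reorders the groups.
out1=$(printf '%s\n' '12/25/23' | sed 's/\(..\)\/\(..\)\/\(..\)/\2-\1-\3/')
# Same substitution with | as the delimiter, so / needs no backslash.
out2=$(printf '%s\n' '12/25/23' | sed 's|\(..\)/\(..\)/\(..\)|\2-\1-\3|')
printf '%s\n' "$out1" "$out2"
```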

The search pattern of the sed utility uses only a subset of the regular
expression atoms and patterns. The allowable atoms are listed in Table-(d.13):
Table-(d.13): Atoms used by sed utility

Atoms            Allowed
Character        ✓
Dot              ✓
Class            ✓
Anchors          ^ and $
Back Reference   ✓


The last column of the table shows that the sed utility supports all atoms
except the two word anchors \< and \>.

Table-(e.13): Operators used by sed utility

Operators     Allowed
Sequence      ✓
Repetition    ✓
Alternation   ✗
Group         ✗
Save          ✓

The second column of table shows that sed utility supports all operators except
group and alternation.

AWK

The awk command is used for text processing in Linux. The sed command is also
used for text processing, but it has some limitations, so awk becomes a handy
option when more control over the data is needed.
Awk is a powerful scripting language for text processing. It can search and
replace text, and sort, validate and index data.
It is one of the tools most widely used by programmers, who write small but
effective programs in the form of statements that define text patterns and the
actions to perform on matching lines.

Basic Syntax

The basic syntax for using awk is:

awk 'pattern { action }' file

● pattern: A condition that, when true, triggers the action.


● action: Commands or operations to perform when the pattern matches.
● file: The name of the file to be processed.


If pattern is omitted, the action is applied to every line. If action is omitted, awk prints the
lines that match the pattern.
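
Both defaults can be seen in a minimal sketch. The three records below are illustrative sample data piped in through printf rather than read from a file:

```shell
# Illustrative sample records (name, age, profession).
data=$(printf '%s\n' 'Alice 30 Engineer' 'Bob 25 Artist' 'Carol 28 Scientist')
# Pattern only: the default action prints each matching line.
pattern_only=$(printf '%s\n' "$data" | awk '/Bob/')
# Action only: the action runs for every line.
action_only=$(printf '%s\n' "$data" | awk '{ print $1 }')
printf '%s\n' "$pattern_only" "$action_only"
```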

Key Concepts and Features

1. Fields and Records

● Records: By default, each line of input is considered a record.


● Fields: Each record is split into fields, with fields being separated by whitespace by
default. You can change the field separator using the -F option.

Example:

Given a file data.txt with the following content:

Alice 30 Engineer
Bob 25 Artist
Carol 28 Scientist

The default field separator is whitespace, so:

● $1 refers to the first field (name).


● $2 refers to the second field (age).
● $3 refers to the third field (profession).

2. Basic Examples

Print the Entire File:

awk '{ print }' data.txt

This command prints each line of data.txt.

Print Specific Fields:

awk '{ print $1, $3 }' data.txt

This command prints the first and third fields of each line:

Alice Engineer
Bob Artist
Carol Scientist

Print Lines Matching a Pattern:

awk '/Alice/' data.txt

This command prints lines containing "Alice":

Alice 30 Engineer

Conditional Actions:

awk '$2 > 26 { print $1, $2 }' data.txt

This command prints the names and ages of people older than 26:

Alice 30
Carol 28

3. Field Separator

To use a different field separator, use the -F option:

awk -F ',' '{ print $1, $2 }' data.csv

If data.csv has the following content:

John,Doe
Jane,Smith

This command prints:

John Doe
Jane Smith
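
The -F option combines naturally with a pattern. The sketch below assumes colon-separated records laid out like /etc/passwd entries (the two records are made up) and prints the login name and shell of accounts whose third field is 1000 or more:

```shell
# Made-up records in /etc/passwd layout: name:x:uid:gid:comment:home:shell
out=$(printf '%s\n' 'root:x:0:0:root:/root:/bin/sh' \
                    'amy:x:1000:1000:Amy:/home/amy:/bin/bash' |
    awk -F ':' '$3 >= 1000 { print $1, $7 }')
printf '%s\n' "$out"
```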

4. Patterns and Actions


Begin and End Blocks:

● BEGIN block: Executes before any input is processed.


● END block: Executes after all input has been processed.

awk 'BEGIN { print "Name\tAge\tProfession" } { print $1, $2, $3 } END { print "End of data" }' data.txt

Output:

Name Age Profession
Alice 30 Engineer
Bob 25 Artist
Carol 28 Scientist
End of data
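
A common use of the END block is to report a value accumulated over all records. This sketch, run on the same three illustrative records, adds up the second field and prints the average once the input is exhausted:

```shell
# total is initialised in BEGIN, updated per record, reported in END.
out=$(printf '%s\n' 'Alice 30 Engineer' 'Bob 25 Artist' 'Carol 28 Scientist' |
    awk 'BEGIN { total = 0 }
         { total += $2 }
         END { printf "Average age: %.1f\n", total / NR }')
printf '%s\n' "$out"
```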

5. Built-in Variables

● $0: The entire current record.


● $1, $2, ...: Individual fields in the current record.
● NR: The number of records processed (i.e., the current line number).
● NF: The number of fields in the current record.

Example

awk '{ print "Line", NR, "has", NF, "fields" }' data.txt

Output:

Line 1 has 3 fields
Line 2 has 3 fields
Line 3 has 3 fields
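
Because NF holds the number of fields, $NF refers to the last field of the record, and NF > 0 is a convenient test for non-empty lines. A small sketch on made-up input:

```shell
# Skip empty lines and print the record number with the last field.
out=$(printf '%s\n' 'one two three' '' 'four five' |
    awk 'NF > 0 { print NR ": " $NF }')
printf '%s\n' "$out"
```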

6. User-Defined Functions

You can define functions in awk to perform complex operations:

awk '
function square(x) { return x * x }
{ print $1, "squared is", square($2) }
' data.txt

This defines a function square to compute the square of a number and applies it to the second
field.
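
As a further sketch, a function can take more than one parameter. The function name max and the two input lines here are hypothetical; adding 0 to each argument forces a numeric comparison:

```shell
# max() returns the larger of its two arguments, compared as numbers.
out=$(printf '%s\n' '3 9' '7 2' | awk '
function max(a, b) { return (a + 0 > b + 0) ? a : b }
{ print max($1, $2) }')
printf '%s\n' "$out"
```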


AWK - Arithmetic Operators

AWK supports the following arithmetic operators –

Addition
It is represented by plus (+) symbol which adds two or more numbers.
The following example demonstrates this −
Example
[jerry]$ awk 'BEGIN { a = 50; b = 20; print "(a + b) = ", (a + b) }'
On executing this code, you get the following result −
Output
(a + b) = 70

Subtraction
It is represented by minus (-) symbol which subtracts two or more
numbers. The following example demonstrates this −
Example
[jerry]$ awk 'BEGIN { a = 50; b = 20; print "(a - b) = ", (a - b) }'
On executing this code, you get the following result −
Output
(a - b) = 30

Multiplication
It is represented by asterisk (*) symbol which multiplies two or more
numbers. The following example demonstrates this −

Example
[jerry]$ awk 'BEGIN { a = 50; b = 20; print "(a * b) = ", (a * b) }'
On executing this code, you get the following result −

Output
(a * b) = 1000

Division
It is represented by slash (/) symbol which divides two or more numbers.
The following example illustrates this −

Example
[jerry]$ awk 'BEGIN { a = 50; b = 20; print "(a / b) = ", (a / b) }'

On executing this code, you get the following result −

Output
(a / b) = 2.5

Modulus
It is represented by percent (%) symbol which finds the remainder left after
dividing one number by another. The following example illustrates this −
Example
[jerry]$ awk 'BEGIN { a = 50; b = 20; print "(a % b) = ", (a % b) }'
On executing this code, you get the following result −
Output
(a % b) = 10
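
The division and modulus operators are often used together. As a small sketch with the same values of a and b, the built-in int() function truncates the quotient to a whole number, and % supplies the remainder:

```shell
# Prints the exact quotient, the truncated quotient and the remainder.
out=$(awk 'BEGIN { a = 50; b = 20; print a / b, int(a / b), a % b }')
printf '%s\n' "$out"
```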

AWK - Assignment Operators

AWK supports the following assignment operators –

Simple Assignment
It is represented by =. The following example demonstrates this –

Example
[jerry]$ awk 'BEGIN { name = "Jerry"; print "My name is", name }'
On executing this code, you get the following result −
Output
My name is Jerry

Shorthand Addition
It is represented by +=. The following example demonstrates this −
Example
[jerry]$ awk 'BEGIN { cnt = 10; cnt += 10; print "Counter =", cnt }'
On executing this code, you get the following result −
Output
Counter = 20

In the above example, the first statement assigns value 10 to the
variable cnt. In the next statement, the shorthand operator increments
its value by 10.

Shorthand Subtraction
It is represented by -=. The following example demonstrates this –

Example
[jerry]$ awk 'BEGIN { cnt = 100; cnt -= 10; print "Counter =", cnt }'
On executing this code, you get the following result −
Output
Counter = 90

In the above example, the first statement assigns value 100 to the
variable cnt. In the next statement, the shorthand operator decrements
its value by 10.

Shorthand Multiplication
It is represented by *=. The following example demonstrates this –

Example
[jerry]$ awk 'BEGIN { cnt = 10; cnt *= 10; print "Counter =", cnt }'
On executing this code, you get the following result −
Output
Counter = 100
In the above example, the first statement assigns value 10 to the
variable cnt. In the next statement, the shorthand operator multiplies its
value by 10.

Shorthand Division
It is represented by /=. The following example demonstrates this −
Example
[jerry]$ awk 'BEGIN { cnt = 100; cnt /= 5; print "Counter =", cnt }'
On executing this code, you get the following result −
Output
Counter = 20

In the above example, the first statement assigns value 100 to the
variable cnt. In the next statement, the shorthand operator divides it by
5.

Shorthand Modulo
It is represented by %=. The following example demonstrates this −
Example
[jerry]$ awk 'BEGIN { cnt = 100; cnt %= 8; print "Counter =", cnt }'
On executing this code, you get the following result −
Output
Counter = 4

Shorthand Exponential
It is represented by ^=. The following example demonstrates this −
Example
[jerry]$ awk 'BEGIN { cnt = 2; cnt ^= 4; print "Counter =", cnt }'
On executing this code, you get the following result −
Output
Counter = 16
The above example raises the value of cnt to the power 4.

Shorthand Exponential
It is represented by **=. The following example demonstrates this −
Example
[jerry]$ awk 'BEGIN { cnt = 2; cnt **= 4; print "Counter =", cnt }'
On executing this code, you get the following result −
Output
Counter = 16
This example also raises the value of cnt to the power 4.

AWK - Relational Operators

AWK supports the following relational operators –

Equal to

It is represented by ==. It returns true if both operands are equal,
otherwise it returns false. The following example demonstrates this –

Example
[jerry]$ awk 'BEGIN { a = 10; b = 10; if (a == b) print "a == b" }'
On executing this code, you get the following result −
Output
a == b

Not Equal to
It is represented by !=. It returns true if both operands are unequal,
otherwise it returns false.
Example
[jerry]$ awk 'BEGIN { a = 10; b = 20; if (a != b) print "a != b" }'
On executing this code, you get the following result −
Output
a != b

Less Than
It is represented by <. It returns true if the left-side operand is less than
the right-side operand; otherwise it returns false.
Example
[jerry]$ awk 'BEGIN { a = 10; b = 20; if (a < b) print "a < b" }'
On executing this code, you get the following result −
Output
a < b

Less Than or Equal to


It is represented by <=. It returns true if the left-side operand is less than
or equal to the right-side operand; otherwise it returns false.
Example
[jerry]$ awk 'BEGIN { a = 10; b = 10; if (a <= b) print "a <= b" }'
On executing this code, you get the following result −
Output
a <= b

Greater Than

It is represented by >. It returns true if the left-side operand is greater
than the right-side operand, otherwise it returns false.
Example
[jerry]$ awk 'BEGIN { a = 10; b = 20; if (b > a ) print "b > a" }'
On executing the above code, you get the following result −
Output
b > a

Greater Than or Equal to


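It is represented by >=. It returns true if the left-side operand is greater
than or equal to the right-side operand; otherwise it returns false.
Example
[jerry]$ awk 'BEGIN { a = 10; b = 20; if (b >= a) print "b >= a" }'
On executing this code, you get the following result −
Output
b >= a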

AWK - Logical Operators

AWK supports the following logical operators –

Logical AND
It is represented by &&. Its syntax is as follows −
Syntax
expr1 && expr2
It evaluates to true if both expr1 and expr2 evaluate to true; otherwise it
returns false. expr2 is evaluated if and only if expr1 evaluates to true. For
instance, the following example checks whether the given single digit
number is in octal format or not.
Example
[jerry]$ awk 'BEGIN {
num = 5; if (num >= 0 && num <= 7) printf "%d is in octal format\n",
num
}'
On executing this code, you get the following result −
Output
5 is in octal format

Logical OR
It is represented by ||. The syntax of Logical OR is −
Syntax

expr1 || expr2
It evaluates to true if either expr1 or expr2 evaluates to true; otherwise
it returns false. expr2 is evaluated if and only if expr1 evaluates to false.
The following example demonstrates this −
Example
[jerry]$ awk 'BEGIN {
ch = "\n"; if (ch == " " || ch == "\t" || ch == "\n")
print "Current character is whitespace."
}'
On executing this code, you get the following result −
Output
Current character is whitespace.

Logical NOT
It is represented by the exclamation mark (!). Its syntax is as follows −
Syntax
! expr1
It returns the logical complement of expr1. If expr1 evaluates to true, it
returns 0; otherwise it returns 1. For instance, the following example
checks whether a string is empty or not.
Example
[jerry]$ awk 'BEGIN { name = ""; if (! length(name)) print "name is empty string." }'
On executing this code, you get the following result −
Output
name is empty string.
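
The rule that expr2 is evaluated only when expr1 is true is useful as a guard. In this sketch (the variable names n and total are illustrative), the division is never attempted because the first test already fails:

```shell
# n != 0 is false, so total / n is skipped and no division by zero occurs.
out=$(awk 'BEGIN {
    n = 0; total = 50
    if (n != 0 && total / n > 10) print "high"; else print "skipped division"
}')
printf '%s\n' "$out"
```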


AWK - Control Flow

Like other programming languages, AWK provides conditional
statements to control the flow of a program. This chapter explains AWK's
control statements with suitable examples.

If statement
It simply tests the condition and performs certain actions depending
upon the condition. Given below is the syntax of if statement –

Syntax
if (condition)
action
We can also use a pair of curly braces as given below to execute multiple
actions –

Syntax
if (condition) {
action-1
action-2
.
.
action-n
}
For instance, the following example checks whether a number is even or
not −
Example
[jerry]$ awk 'BEGIN { num = 10; if (num % 2 == 0) printf "%d is even number.\n", num }'
On executing the above code, you get the following result −
Output
10 is even number.

If Else Statement
In if-else syntax, we can provide a list of actions to be performed when a
condition becomes false.
The syntax of if-else statement is as follows −
Syntax
if (condition)

action-1
else
action-2
In the above syntax, action-1 is performed when the condition evaluates
to true and action-2 is performed when the condition evaluates to false.
For instance, the following example checks whether a number is even or
not −
Example
[jerry]$ awk 'BEGIN {
num = 11; if (num % 2 == 0) printf "%d is even number.\n", num;
else printf "%d is odd number.\n", num
}'
On executing this code, you get the following result −
Output
11 is odd number.

If-Else-If Ladder
We can easily create an if-else-if ladder by using multiple if-else
statements. The following example demonstrates this −
Example
[jerry]$ awk 'BEGIN {
a = 30;

if (a==10)
print "a = 10";
else if (a == 20)
print "a = 20";
else if (a == 30)
print "a = 30";
}'
On executing this code, you get the following result −
Output
a = 30
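
The same ladder can be applied to each input record instead of a fixed value. This sketch assumes made-up records of the form "name marks" and illustrative grade boundaries of 70 and 40:

```shell
# Classify each record by the value of its second field.
out=$(printf '%s\n' 'amit 82' 'bina 55' 'chad 31' | awk '{
    if ($2 >= 70) grade = "A"
    else if ($2 >= 40) grade = "B"
    else grade = "F"
    print $1, grade
}')
printf '%s\n' "$out"
```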


AWK - Loops

This chapter explains AWK's loops with suitable examples. Loops are used
to execute a set of actions in a repeated manner. The loop execution
continues as long as the loop condition is true.

For Loop
The syntax of for loop is –

Syntax
for (initialization; condition; increment/decrement)
action

Initially, the for statement performs initialization action, then it checks
the condition. If the condition is true, it executes actions, thereafter it
performs increment or decrement operation. The loop execution
continues as long as the condition is true. For instance, the following
example prints 1 to 5 using for loop –

Example
[jerry]$ awk 'BEGIN { for (i = 1; i <= 5; ++i) print i }'

On executing this code, you get the following result −


Output
1
2
3
4
5

While Loop

The while loop keeps executing the action as long as a particular logical
condition evaluates to true. Here is the syntax of while loop –

Syntax
while (condition)
action


AWK first checks the condition; if the condition is true, it executes the
action. This process repeats as long as the loop condition evaluates to
true. For instance, the following example prints 1 to 5 using while loop –

Example
[jerry]$ awk 'BEGIN {i = 1; while (i < 6) { print i; ++i } }'
On executing this code, you get the following result −
Output
1
2
3
4
5

Do-While Loop

The do-while loop is similar to the while loop, except that the test
condition is evaluated at the end of the loop. Here is the syntax of
do-while loop –

Syntax
do
action
while (condition)
In a do-while loop, the action statement gets executed at least once
even when the condition statement evaluates to false. For instance, the
following example prints 1 to 5 numbers using do-while loop –

Example
[jerry]$ awk 'BEGIN {i = 1; do { print i; ++i } while (i < 6) }'
On executing this code, you get the following result −
Output
1
2
3
4
5
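
The difference between the two loops shows up only when the condition is false on the first test. In this sketch i starts at 10, a value chosen so that i < 6 is false from the beginning:

```shell
# while tests first, so its body never runs and nothing is printed.
while_out=$(awk 'BEGIN { i = 10; while (i < 6) { print i; ++i } }')
# do-while runs the body once before testing, so 10 is printed.
do_out=$(awk 'BEGIN { i = 10; do { print i; ++i } while (i < 6) }')
printf 'while: [%s]\ndo-while: [%s]\n' "$while_out" "$do_out"
```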


Break Statement
As its name suggests, it is used to end the loop execution. Here is an
example which ends the loop when the sum becomes greater than 50.

Example
[jerry]$ awk 'BEGIN {
sum = 0; for (i = 0; i < 20; ++i) {
sum += i; if (sum > 50) break; else print "Sum =", sum
}
}'
On executing this code, you get the following result −
Output
Sum = 0
Sum = 1
Sum = 3
Sum = 6
Sum = 10
Sum = 15
Sum = 21
Sum = 28
Sum = 36
Sum = 45

Continue Statement
The continue statement is used inside a loop to skip to the next iteration
of the loop. It is useful when you wish to skip the processing of some
data inside the loop. For instance, the following example
uses continue statement to print the even numbers between 1 and 20.

Example
[jerry]$ awk 'BEGIN {
for (i = 1; i <= 20; ++i) {
if (i % 2 == 0) print i ; else continue
}
}'
On executing this code, you get the following result –


Output
2
4
6
8
10
12
14
16
18
20
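
The continue statement is also handy when looping over the fields of a record. This sketch, run on one made-up input line, adds up the fields that consist only of digits and skips the rest:

```shell
# Sum the numeric fields of the line; non-numeric fields are skipped.
out=$(printf '%s\n' '4 x 6 y 10' | awk '{
    sum = 0
    for (i = 1; i <= NF; ++i) {
        if ($i !~ /^[0-9]+$/) continue
        sum += $i
    }
    print sum
}')
printf '%s\n' "$out"
```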

Exit Statement
It is used to stop the execution of the script. It accepts an integer as an
argument which is the exit status code for AWK process. If no argument
is supplied, exit returns status zero. Here is an example that stops the
execution when the sum becomes greater than 50.
Example
[jerry]$ awk 'BEGIN {
sum = 0; for (i = 0; i < 20; ++i) {
sum += i; if (sum > 50) exit(10); else print "Sum =", sum
}
}'

On executing this code, you get the following result −
Output
Sum = 0
Sum = 1
Sum = 3
Sum = 6
Sum = 10
Sum = 15
Sum = 21
Sum = 28
Sum = 36
Sum = 45
Let us check the return status of the script.
Example

[jerry]$ echo $?
On executing this code, you get the following result −
Output
10
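
The exit status lets a shell script react to what awk found. In this sketch (the input records and the variable name bad are illustrative), awk stops at the first record that does not have exactly two fields; an exit inside a normal rule still runs the END block, which then sets the final status:

```shell
status=0
# The second record has only one field, so awk exits with status 1.
printf '%s\n' 'Alice 30' 'Bob' 'Carol 28' |
    awk 'NF != 2 { bad = 1; exit } END { exit bad }' || status=$?
printf 'awk exit status: %s\n' "$status"
```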
