Week 7&8
Awk is a scripting language used for manipulating data and generating reports. Awk programs
require no compiling, and the language lets the user work with variables, numeric functions,
string functions, and logical operators.
1. AWK Operations:
(a) Scans a file line by line
(b) Splits each input line into fields
(c) Compares input line/fields to pattern
(d) Performs action(s) on matched lines
2. Useful For:
(a) Transforming data files
(b) Producing formatted reports
3. Programming Constructs:
(a) Format output lines
(b) Arithmetic and string operations
(c) Conditionals and loops
Syntax:
awk options 'selection_criteria {action}' input-file > output-file
Options:
-f program-file : Reads the AWK program source from the file program-file, instead of from the
first command-line argument.
-F fs : Uses fs as the input field separator.
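As a minimal sketch of these two options, the commands below assume a hypothetical colon-separated file dept.txt and a hypothetical program file second.awk. Neither file is part of the lab exercises; they exist only to show the options in action.

```shell
# Hypothetical colon-separated input (not one of the lab files)
printf 'ajay:account:45000\nvarun:sales:50000\n' > dept.txt

# -F: makes ":" the input field separator, so $1 is the name and $3 the salary
awk -F: '{print $1, $3}' dept.txt

# -f reads the awk program from a file instead of the command line
printf '{print $2}\n' > second.awk
awk -F: -f second.awk dept.txt
```

The first command prints the name and salary of each line; the second prints the department column using the program stored in second.awk.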
Sample Commands
Example:
Consider the following text file as the input file for all cases below:
$ cat > employee.txt
ajay manager account 45000
sunil clerk account 25000
varun manager sales 50000
amit manager account 47000
tarun peon sales 15000
deepak clerk sales 23000
sunil peon sales 13000
satvik director purchase 80000
1. Default behavior of Awk: When no pattern is given, the action applies to every line, so a plain
print action prints every line of data from the specified file.
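A minimal sketch of the default behavior, assuming a two-line excerpt of employee.txt saved under the hypothetical name emp_small.txt:

```shell
# Two lines taken from employee.txt (the file name emp_small.txt is illustrative)
printf 'ajay manager account 45000\nsunil clerk account 25000\n' > emp_small.txt

# No pattern is given, so the action runs for every line;
# print with no arguments prints the whole line ($0)
awk '{print}' emp_small.txt
```

The output is identical to the input file, one line per record.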
3. Splitting a Line Into Fields : For each record (that is, each line), awk splits the record on
whitespace by default and stores the pieces in the $n variables. If the line has 4 words,
they are stored in $1, $2, $3 and $4 respectively. $0 represents the whole line.
$ awk '{print $1,$4}' employee.txt
Output:
ajay 45000
sunil 25000
varun 50000
amit 47000
tarun 15000
deepak 23000
sunil 13000
satvik 80000
Built-In Variables In Awk
Awk's built-in variables include the field variables $1, $2, $3, and so on ($0 is the entire line),
which break a line of text into individual words or pieces called fields.
NR: Keeps a running count of the number of input records read so far. Remember that
records are usually lines. Awk performs the pattern/action statements once for each
record in a file.
NF: Holds the number of fields within the current input record.
FS: Contains the field separator, which is used to divide the input line into fields. The
default is white space, meaning runs of space and tab characters. FS can be
reassigned to another character (typically in BEGIN) to change the field separator.
RS: Stores the current record separator. Since, by default, an input line
is the input record, the default record separator is a newline.
OFS: Stores the output field separator, which separates the fields when Awk
prints them. The default is a single space. Whenever print has several arguments separated
by commas, it prints the value of OFS between them.
ORS: Stores the output record separator, which separates the output lines when
Awk prints them. The default is a newline character. print automatically outputs the contents
of ORS at the end of whatever it is given to print.
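The variables above can be seen working together in a short sketch. The colon-separated file staff.txt is a hypothetical input made up for this illustration; it is not one of the lab files.

```shell
# Hypothetical colon-separated input
printf 'ajay:manager:45000\nsunil:clerk:25000\n' > staff.txt

# FS is set in BEGIN, before any input is read;
# OFS is printed wherever print has a comma between arguments
awk 'BEGIN { FS = ":"; OFS = " | " } { print NR, NF, $1, $3 }' staff.txt

# ORS replaces the newline that print normally adds after each record
awk -F: 'BEGIN { ORS = ";" } { print $1 }' staff.txt
```

The first command prints the record number, the field count, the name and the salary joined by " | ". The second prints all names on one line, each followed by a semicolon.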
Examples:
Use of the NR built-in variable (display line numbers)
$ awk '{print NR,$0}' employee.txt
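Output:
1 ajay manager account 45000
2 sunil clerk account 25000
3 varun manager sales 50000
4 amit manager account 47000
5 tarun peon sales 15000
6 deepak clerk sales 23000
7 sunil peon sales 13000
8 satvik director purchase 80000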
In employee.txt, $1 holds the Name and the last field holds the Salary. Because NF is the number
of fields in the record, $NF always refers to the last field, so the Salary can be read as $NF.
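A minimal sketch of $NF, assuming a two-line excerpt of employee.txt saved under the hypothetical name emp_small.txt:

```shell
# Two lines taken from employee.txt (the file name emp_small.txt is illustrative)
printf 'ajay manager account 45000\nsunil clerk account 25000\n' > emp_small.txt

# NF is 4 for these lines, so $NF is the same as $4, the salary
awk '{print $1, $NF}' emp_small.txt
```

Each output line contains the name followed by the salary.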
More Examples
For the given text file:
$ cat > cse.txt
A B C
Tarun A12 1
Man B6 2
Praveen M42 3
1) To print the row number (NR) and the first item of each line in cse.txt, separated by "- ":
$ awk '{print NR "- " $1 }' cse.txt
Output:
1- A
2- Tarun
3- Man
4- Praveen
2) To return the second column/item from cse.txt:
$ awk '{print $2}' cse.txt
Output:
B
A12
B6
M42
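The selection criteria and programming constructs listed at the start of this section (conditionals, arithmetic, loops) can be combined into a small report. The sketch below recreates employee.txt and is only one possible way to write such a report; the array name total is an arbitrary choice.

```shell
# Recreate the sample input used throughout this section
cat > employee.txt <<'EOF'
ajay manager account 45000
sunil clerk account 25000
varun manager sales 50000
amit manager account 47000
tarun peon sales 15000
deepak clerk sales 23000
sunil peon sales 13000
satvik director purchase 80000
EOF

# Selection criteria: act only on lines whose second field is "manager"
awk '$2 == "manager" {print $1, $4}' employee.txt

# Arithmetic and a loop: add up the salaries per department,
# then print one line per department in the END block
awk '{ total[$3] += $4 } END { for (d in total) print d, total[d] }' employee.txt | sort
```

The for-in loop visits the departments in an unspecified order, which is why the result is piped through sort.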
The grep filter searches a file for a particular pattern of characters and displays all lines that contain
that pattern. The pattern that is searched for is referred to as a regular expression (the name grep comes
from the ed command g/re/p: globally search for a regular expression and print).
Options Description
-c : Prints only a count of the lines that match the pattern.
-h : Displays the matched lines, but not the file names.
-i : Ignores case when matching.
-l : Displays only the names of the files that contain a match.
-n : Displays the matched lines along with their line numbers.
-v : Prints all the lines that do not match the pattern.
-e exp : Specifies a pattern expression. Can be used multiple times.
-f file : Takes patterns from file, one per line.
-E : Treats the pattern as an extended regular expression (ERE).
-w : Matches whole words only.
-o : Prints only the matched parts of a matching line,
with each such part on a separate output line.
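Several of these options can be combined in one command. The sketch below assumes a hypothetical three-line file opts.txt, made up for this illustration.

```shell
# Hypothetical input file
printf 'unix one\nlinux two\nplain three\n' > opts.txt

# -e may be repeated to search for several patterns at once;
# -n prefixes each matching line with its line number
grep -n -e "unix" -e "linux" opts.txt

# -c prints only the number of matching lines
grep -c -e "unix" -e "linux" opts.txt
```

The first command prints lines 1 and 2 with their numbers; the second prints the count of matching lines.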
1. Case insensitive search : The -i option searches for a string case insensitively in the given
file. It matches words like "UNIX", "Unix" and "unix".
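A minimal sketch of a case-insensitive search. The contents of cse1.txt are not shown in this section, so the example uses a hypothetical file sample.txt instead.

```shell
# Hypothetical input file (stands in for cse1.txt, whose contents are not shown)
printf 'UNIX is an operating system\nUnix is multiuser\nlearn unix basics\nLinux is free\n' > sample.txt

# -i ignores case, so UNIX, Unix and unix all match; Linux does not
grep -i "unix" sample.txt
```

Three of the four lines are printed; the line about Linux is skipped.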
2. Displaying the count of matching lines : We can find the number of lines that match the
given string/pattern.
$ grep -c "unix" cse1.txt
Output:
3. Displaying the file names that match the pattern : We can display just the names of the files that
contain the given string/pattern.
$ grep -l "unix" *
or
cse1.txt
4. Checking for whole words in a file : By default, grep matches the given string/pattern even if
it is found as a substring of a longer word. The -w option makes grep match only whole words.
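The difference between substring and whole-word matching can be sketched with a hypothetical two-line file words.txt:

```shell
# Hypothetical input file
printf 'unix is powerful\nunixlike systems exist\n' > words.txt

# Without -w both lines match, because "unix" is a substring of "unixlike"
grep -c "unix" words.txt

# With -w only the line where "unix" stands alone as a word matches
grep -w "unix" words.txt
```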
5. Displaying only the matched pattern : By default, grep displays the entire line that contains the
matched string. The -o option makes grep display only the matched string.
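A minimal sketch of -o, assuming a hypothetical file only.txt in which one line contains the pattern twice:

```shell
# Hypothetical input file
printf 'unix and more unix\nno match here\n' > only.txt

# -o prints each match on its own line, so the first input line yields two output lines
grep -o "unix" only.txt
```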
7. Inverting the pattern match : You can display the lines that are not matched with the specified
search string pattern using the -v option.
$ grep -v "unix" cse1.txt
Output:
8. Matching the lines that start with a string : The ^ regular expression pattern specifies the start of
a line. This can be used in grep to match the lines which start with the given string or pattern.
$ grep "^unix" cse1.txt
Output:
9. Matching the lines that end with a string : The $ regular expression pattern specifies the end of a
line. This can be used in grep to match the lines which end with the given string or pattern.
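A minimal sketch of the $ anchor, assuming a hypothetical two-line file ends.txt:

```shell
# Hypothetical input file
printf 'learn unix\nunix is old\n' > ends.txt

# "unix$" matches only when unix is the last thing on the line
grep "unix$" ends.txt
```

Only the first line is printed, because the second line starts with unix but does not end with it.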
11. Taking patterns from a file : The -f file option takes patterns from file, one per line.
$ cat pattern.txt
Agarwal
Aggarwal
Agrawal
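A minimal sketch of how pattern.txt could be used. The file people.txt and the names in it are hypothetical and were made up for this illustration.

```shell
# pattern.txt as listed above
printf 'Agarwal\nAggarwal\nAgrawal\n' > pattern.txt

# Hypothetical file to search
printf 'Sunil Agarwal\nRavi Sharma\nMeena Agrawal\n' > people.txt

# A line is printed if it matches any one of the patterns in pattern.txt
grep -f pattern.txt people.txt
```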
Example (-A1 prints each matching line plus one line of context after it):
$ grep -A1 learn cse1.txt
Output:
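Since the contents of cse1.txt are not shown, the effect of -A1 can be sketched with a hypothetical four-line file ctx.txt:

```shell
# Hypothetical input file
printf 'intro\nlearn unix\nnext line\nlast line\n' > ctx.txt

# -A1 prints the matching line and the one line that follows it
grep -A1 "learn" ctx.txt
```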
13. Search recursively for a pattern in the directory: -R searches for the pattern in every file under
the given directory, including its subdirectories.
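A minimal sketch of a recursive search, assuming a hypothetical directory notes with one subdirectory:

```shell
# Hypothetical directory tree
mkdir -p notes/sub
printf 'unix rocks\n' > notes/a.txt
printf 'more unix\n' > notes/sub/b.txt

# -R descends into subdirectories and prefixes each match with its file name;
# sort is used only to make the order of the output predictable
grep -R "unix" notes | sort
```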
b) Develop an interactive grep script that asks for a word and a file name and then tells how many
lines contain that word.
$ cat > filename
Day by day week by end
Week by week month by end
Month by month year by end
But friendship is never end
$ vi grep.sh
echo "Enter the pattern to be searched: "
read pattern
echo "Enter the file to be used: "
read filename
echo "Searching for $pattern in file $filename"
echo "The selected records are: "
# Quote the variables so that patterns and file names containing spaces still work
grep "$pattern" "$filename"
echo "The number of lines containing the word ($pattern):"
grep -c "$pattern" "$filename"
Output :
$ sh grep.sh
Week-8:
a) Write a shell script that takes a command-line argument and reports on whether it is a
directory, a file, or something else.
Solution:
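# Use the first command-line argument; prompt only if none was given
str=$1
if [ -z "$str" ]
then
    echo "enter file"
    read str
fi
if test -f "$str"
then
    echo "file exists and it is an ordinary file"
elif test -d "$str"
then
    echo "directory file"
elif test -c "$str"
then
    echo "character device file"
elif test -e "$str"
then
    echo "exists, but is some other type of file"
else
    echo "does not exist"
fi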
Output:
[lab2@localhost ~]$ sh exp5.sh
b) Write a shell script that accepts one or more file names as arguments and converts all of them to
uppercase, provided they exist in the current directory.
# get the file name
printf "Enter File Name : "
read fileName
# make sure the file exists for reading
if [ ! -f "$fileName" ]
then
    echo "File $fileName does not exist"
    exit 1
else
    # convert lowercase to uppercase using the tr command
    tr '[:lower:]' '[:upper:]' < "$fileName"
fi
Output:
[lab2@localhost ~]$ cat textdata
Output: Enter File Name :
[lab2@localhost ~]$ sh exp6.sh
Enter File Name :
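The exercise asks for file names given as arguments, while the script above reads a single name interactively. A minimal sketch of the argument-based form follows; the function name to_upper and the files one.txt and two.txt are hypothetical and used only for illustration.

```shell
# to_upper: hypothetical helper that prints every existing file in uppercase
to_upper() {
    for f in "$@"
    do
        if [ -f "$f" ]
        then
            tr '[:lower:]' '[:upper:]' < "$f"
        else
            # report missing files on standard error and carry on with the rest
            echo "File $f does not exist" >&2
        fi
    done
}

# Hypothetical input files
printf 'hello\n' > one.txt
printf 'world\n' > two.txt

to_upper one.txt two.txt missing.txt
```

In a script file the loop body would be the same, with "$@" standing for the script's own arguments.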
c) Write a shell script that determines the period for which a specified user is working on the
System.
printf "enter the user name :"
read usr
# pick the first session of the requested user, not just the first line of who
entry=`who | tr -s " " | grep "^$usr " | head -1`
if [ -n "$entry" ]
then
    # field 4 is taken to be the login time (HH:MM), as in the original script
    tm=`echo "$entry" | cut -d " " -f4`
    uhr=`echo "$tm" | cut -d ":" -f1`
    umin=`echo "$tm" | cut -d ":" -f2`
    shr=`date "+%H"`
    smin=`date "+%M"`
    # borrow an hour when the current minutes are smaller than the login minutes
    if [ "$smin" -lt "$umin" ]
    then
        shr=`expr $shr - 1`
        smin=`expr $smin + 60`
    fi
    h=`expr $shr - $uhr`
    m=`expr $smin - $umin`
    echo "user name : $usr"
    echo "login period : $h : $m"
else
    echo "Invalid User"
fi
Output:
[lab2@localhost ~]$ sh exp8.sh
enter the user name :
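The hour and minute arithmetic in the script can also be sketched as a single awk calculation. The function name elapsed is hypothetical, and the sketch assumes that the session is shorter than 24 hours, so a negative difference means the login happened before midnight.

```shell
# elapsed: hypothetical helper; $1 is the login time and $2 the current time, both HH:MM
elapsed() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        split(a, x, ":")                       # x[1] = login hour, x[2] = login minute
        split(b, y, ":")                       # y[1] = current hour, y[2] = current minute
        d = (y[1] * 60 + y[2]) - (x[1] * 60 + x[2])
        if (d < 0) d += 1440                   # assumption: login was before midnight
        printf "%d : %d\n", int(d / 60), d % 60
    }'
}

elapsed 09:45 11:05
elapsed 23:30 00:15
```

Working in total minutes avoids the separate borrow step that the script needs when the current minutes are smaller than the login minutes.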