
Module 2

Chapter 6
Chapter 8
Chapter 13
Chapter 14
ls -l : Listing file attributes
• the ls command is used to list the contents of a directory
• ls with the -l ( long ) option gives more details about the contents of a
directory
• it displays the attributes of a file, such as its permissions, size and ownership details
• ls looks up the file’s inode to fetch its attributes

$ ls -l
total 72
-rw-r--r-- 1 kumar metal 19514 May 10 13:45 chap01
drwxr-xr-x 2 kumar metal 512 May 9 10:31 progs
• total 72 indicates that a total of 72 blocks are occupied by these files on disk,
each block consisting of 512 bytes ( 1024 in Linux)
File Type and Permission
• the first column shows the type and permission associated with each file
• the first character tells the type of the file, - means ordinary file, d means
directory
• then there is a series of characters r, w, x and -
• in UNIX files can have three types of permissions - read, write and execute
Links
• the second column indicates the number of links associated with each file
• this is the number of filenames maintained by the system for that file
• UNIX lets a file have as many names as you want it to have, even though there is
a single copy of the file on disk
Ownership
• the third column shows the owner of the file
• when you create a file or directory, you automatically become its owner
Group Ownership
• when opening a user account, the system administrator also assigns the user to
some group
• the fourth column represents the group owner of the file
File Size
• the fifth column shows the size of the file in bytes, i.e the amount of data it
contains
• it is only a character count, not a measure of disk space that it occupies
• disk space is usually larger than this count since files are written to disk in blocks
of 1024 bytes or more
Last Modification Time
• the sixth, seventh and eighth columns indicate the last modification time of the
file, which is stored to the nearest second
• the file is said to be modified only if the contents have changed in any way
• if you change only the permissions or ownership of the file, the modification time
remains unchanged
Filename
• the last column displays the filenames arranged in ASCII collating sequence
ls -d : Listing Directory attributes

• when ls is used with a directory name, it lists the files in that directory
• to force ls to list the attributes of a directory, rather than its contents, you need to
use the -d ( directory ) option
• $ ls -ld helpdir progs
• for directories, the first character of the first column will be d; for ordinary files it
will be - and for device files it will be either b or c
File Ownership

• when you create a file you are the owner of that file
• if you can’t create files in another user’s home directory, it is because that directory
is not owned by you ( and the owner has not allowed you write access )
• several users may belong to a single group
• the privileges for the group are set by the owner of the file and not by the group
members
• when the system administrator creates a user he has to assign 2 parameters to the user
• the user id ( UID ) - both its name and numeric representation
• the group id ( GID ) - both its name and numeric representation
• the /etc/passwd file contains the UID ( number and name ) and GID ( number )
• the /etc/group file contains the GID ( number and name )
$ id
uid=655537 (kumar) gid=655535(metal)
File Permissions

• UNIX has a simple and well-defined system of assigning permissions to files
• it follows a three-tiered file protection system that determines a file’s access rights
• consider the permission string rwxr-xr-- , read as three groups: rwx, r-x and r--
• each group represents a category and contains three slots representing the read, write and
execute permissions of the file, in that order
• r indicates the read permission, which means cat can display the file
• w indicates write permission; you can edit such a file
• x indicates execute permission; the file can be executed as a program
• - shows the absence of the corresponding permission
• the three categories are owner, group and others
• you can set different permissions for the three categories of users
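• as an illustration, here is how a listing maps to the three categories ( the filename
myscript.sh used here is hypothetical ):
$ ls -l myscript.sh
-rwxr-xr-- 1 kumar metal 1028 May 10 14:02 myscript.sh
# rwx : the owner ( kumar ) can read, write and execute the file
# r-x : group members ( metal ) can read and execute it
# r-- : others can only read it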
chmod : Changing File Permissions

• a file or a directory is created with a default set of permissions


• generally, the default setting write-protects a file from all except the user ( owner ),
though all may have read access
• the chmod ( change mode ) command is used to set the permissions of one or
more files for all three categories
• it can be run only by the user ( owner ) and the superuser.
• the command can be used in 2 ways
• in a relative manner by specifying the changes to the current permissions
• in an absolute manner by specifying the final permission
Relative Permissions

• when changing the permission in a relative manner, chmod only changes the
permissions specified in the command line and leaves the other permissions
unchanged
• chmod category operation permission filename(s)
• chmod takes as its argument an expression comprising some letters and symbols
that completely describe the user category and the type of permission being
assigned or removed
• the expression contains three components
• user category ( user, group, others )
• the operation to be performed ( assign or remove a permission )
• the type of permission ( read, write, execute )
Category           Operation                          Permission
u - user           + assigns permission               r - read permission
g - group          - removes permission               w - write permission
o - others         = assigns absolute permission      x - execute permission
a - all ( ugo )
• to assign execute permission to the user ( owner ) of the file xstart
$ chmod u+x xstart
• to assign execute permission to all three categories
$ chmod ugo+x xstart
$ chmod a+x xstart
$ chmod +x xstart
• chmod accepts multiple filenames in the command line
$ chmod u+x note note1 note2
• permissions are removed with the - operator
$ chmod go-r xstart
• chmod also accepts multiple expressions delimited by commas ( no space after the comma )
$ chmod a-x,go+r xstart
• chmod also accepts more than one permission to be set; u+rwx is a valid chmod
expression
$ chmod o+wx xstart
Absolute Permissions

• sometimes we may want to set the read, write and execute permissions for all three
categories explicitly
• the expression used by chmod here is a string of three octal numbers
• octal numbers use base 8, and octal digits have values 0 to 7, so a set of
three bits can represent one octal digit
• to represent the permissions we use one octal digit per category
• we have three categories and three permissions for each category, so three octal
digits can describe a file’s permissions completely
• the most significant digit represents the user and the least one represents others
Binary    Octal    Permission    Significance
000       0        ---           No permission
001       1        --x           Execute only
010       2        -w-           Write only
011       3        -wx           Write and execute
100       4        r--           Read only
101       5        r-x           Read and execute
110       6        rw-           Read and write
111       7        rwx           Read, write and execute


$ chmod 666 xstart          rw-rw-rw-
$ chmod 644 xstart          rw-r--r--
$ chmod 761 xstart          rwxrw---x
$ chmod 777 xstart          rwxrwxrwx


Using chmod Recursively ( -R )

• it is possible to make chmod descend a directory hierarchy and apply the


expression to every file and subdirectory it finds
• to do this use -R ( recursive ) option
$ chmod -R a+x shell_scripts
• the above command makes all files and subdirectories found in the tree-walk
( that commences from the shell_scripts directory ) executable by all users
• you can also provide multiple directory and filenames
Directory Permissions

• directories also have their own permissions


• read and write access to an ordinary file is also influenced by the permissions of
the directory housing it
• it is possible that a file can’t be accessed even though it has read permission and
can be removed even when it is write-protected; this is because of the directory
permissions
• the default permissions for a directory are usually rwxr-xr-x ( or 755 )
• A directory must never be writable by group and others
• if you find that your files are being tampered with even though they appear to be
protected, check the directory permissions
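• a minimal illustration ( the listing shown is hypothetical ): if a directory has accidentally
been made writable by group and others, restore the safe default with chmod
$ ls -ld progs
drwxrwxrwx 2 kumar metal 512 May 9 10:31 progs
$ chmod 755 progs
$ ls -ld progs
drwxr-xr-x 2 kumar metal 512 May 9 10:31 progs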
The SHELL’s Interpretive Cycle

• when you log on to a UNIX system, you first see a prompt. This prompt remains
there until you key in something. Even though it may appear that the system is
idling, a UNIX command is in fact running at the terminal, but this command is
special; it’s with you all the time and never terminates until you log out. This
command is the shell
• when you key in a command, it goes as input to the shell
• the shell first scans the command line for metacharacters
• metacharacters are special characters that mean nothing to the command, but
mean something special to the shell
• when the shell sees metacharacters like >, |, *, etc., it performs all the actions represented
by the symbols before the command is executed
• when all preprocessing is complete, the shell passes on the command line to the
kernel for ultimate execution
• while the command is running, the shell has to wait for notice of its termination
from the kernel
• after the command has completed execution, the shell once again issues the
prompt to take up your next command
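• a simple way to see this preprocessing at work is to make echo display what the
shell has already expanded ( assuming the chap* files used in the next section exist ):
$ echo chap*
chap chap01 chap02 chap03 chapx chapy chapz
# echo never sees the *; the shell has already replaced it with the matching filenames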
Pattern Matching - The Wild Cards

• wild-cards are set of special characters that the shell uses to match filenames
• often, you may need to enter multiple filenames in a command line
$ ls chap01 chap02 chap03 chap04 chapx chapy chapz
• if the filenames are similar, we can use the facility offered by the shell of
representing them by a single pattern or model
• the pattern chap* represents all filenames beginning with chap
• this pattern is framed with ordinary characters ( like chap ) and a metacharacter
( like * ) using well defined rules
• the pattern can then be used as an argument to the command, and the shell will
expand it suitably before the command is executed
Wild-card            Matches
*                    any number of characters including none
?                    a single character
[ijk]                a single character - either i, j or k
[x-z]                a single character that is within the ASCII range of the characters x and z
[!ijk]               a single character that is not an i, j or k ( not in C shell )
[!x-z]               a single character that is not within the ASCII range of the characters x and z ( not in C shell )
{pat1,pat2...}       pat1, pat2, etc ( not in Bourne Shell )


The * and ?

• the metacharacter * is one of the characters of the shell’s wild-card set


• it matches any number of characters ( including none )
$ ls chap*
chap chap01 chap02 chap03 chapx chapy chapz
• when the shell encounters this command line, it identifies the * as a wild card.
• it then looks in the current directory and recreates the command line as below
from the filenames that match the pattern chap*
• ls chap chap01 chap02 chap03 chapx chapy chapz
• the shell now hands over this command line to the kernel, which in turn runs the
command
• the wild card ? matches a single character
• when we use the pattern chap?, the shell matches all five-character filenames
beginning with chap
$ ls chap?
chapx chapy chapz
• the pattern chap?? matches six-character filenames beginning with chap
$ ls chap??
chap01 chap02 chap03
Matching the Dot ( . )

• the behaviour of the * and ? in relation to the dot isn’t straightforward


• there are two things that the * and ? can’t match
• they don’t match a filename begining with a dot, but they can match any number
of embedded dots, for example apache*gz matches apache_1.3.20.tar.gz
• these characters don’t match the / in a pathname; you can’t use cd /usr?local to
switch to /usr/local
• for example, if you want to list all hidden files in your directory having at least 3
characters after the dot, the dot must be matched explicitly
$ ls .???*
• however, to match a dot anywhere but not at the beginning, it need not be
matched explicitly
$ ls emp*lst
emp.lst emp1.lst emp234.lst
The character class

• the wild cards * and ? are not very restrictive; using them it’s not easy to list only
chapy and chapz
• you can frame more restrictive patterns with the character class
• the character class comprises a set of characters enclosed by the rectangular
brackets, [ and ], but it matches a single character in the class
• the pattern [abcd] is a character class and it matches a single character - a, b, c or d
• this can be combined with any string or another wild card expression
• the pattern chap0[124] will match all the filenames that start with chap0 followed
by a number 1 or 2 or 4
• range specification is also possible inside the class with a - ( hyphen); the two
characters on either side of it form the range of characters to be matched
$ ls chap0[1-4]
$ ls chap[x-z]
• a valid range specification requires that the character on the left have a lower
ASCII value than the one on the right
• the expression [a-zA-Z] matches all filenames begining with an alphabet
irrespective of case
Negating the Character Class ( ! )
• this doesn’t work in the C shell
• you can use ! as the first character in the class to negate the class
• the expression [!a-zA-Z] matches all filenames that don’t begin with an
alphabetic character
Matching Totally Dissimilar Patterns

• this feature is not available in Bourne shell


• how does one copy all the C and Java source programs from another directory?
• delimit the patterns with a comma and then put curly braces around them ( no
spaces )
$ cp $HOME/prog_sources/*.{c,java} .
• using the curly brace form, you can also access multiple directories
$ cp /home/kumar/{project,html,scripts}/* .
• the above command copies all the files from the three directories project, html and
scripts to the current directory
Rounding Up

• some of the wild-card characters have different meanings depending on where


they are placed in the pattern
• the * and ? lose their meaning when used inside the class and are matched
literally
• similarly - and ! also lose their significance when placed outside the class
Escaping and Quoting

• if the shell uses these special characters to match filenames, does it mean that filenames
themselves must not contain any of these characters?
• this is not true; filenames can contain any of these characters
$ ls chap*
chap chap* chap01 chap02
• now suppose we want to remove the file chap*, which is not easy
• trying rm chap* would be dangerous; it would also remove the other filenames
beginning with chap
• we must be able to protect all special characters so the shell is not able to
interpret them
• the shell provides two solutions to prevent its own interference
Escaping
• providing a \ ( backslash ) before the wild-card to remove ( escape ) its special
meaning
Quoting
• enclosing the wild-card or even entire pattern within quotes ( like ‘chap*’ )
Escaping

• placing a \ immediately before a metacharacter turns off its special meaning


• for example, in the pattern \*, the \ tells the shell that the asterisk has to be
matched literally instead of being interpreted as a metacharacter
• this means that we can remove the file chap* without affecting the other
filenames that begin with chap by using
• $ rm chap\*
• the \ suppresses the wild-card nature of the *, thus preventing the shell from
performing filename expansion on it; this feature is known as escaping
• suppose you have files chap01, chap02 and chap03, and you also create a file called
chap0[1-3]
• now if you want to access the file chap0[1-3], you should escape the two
rectangular brackets
• $ ls chap0\[1-3\]
Escaping the space

• apart from the metacharacters, there are other characters that are special - like the
space character
• the shell uses it to delimit command line arguments.
• so to remove the file My Document.doc which has a space embedded, use below
approach
$ rm My\ Document.doc

Escaping the \ itself

• sometimes you may need to interpret the \ itself literally; you need another \
before it
$ echo \\
\
Escaping the Newline Character

• the newline character marks the end of the command line


• some command lines that use several arguments can be long enough to overflow
to the next line
• to ensure better readability, you need to split the wrapped line into two lines; to
do so you need to input a \ before you press [Enter]
• the \ escapes the meaning of newline character generated by [Enter] and
produces a second prompt ( > or ? )
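• a minimal sketch ( assuming a directory named backup exists in the current directory ):
$ cp chap01 chap02 chap03 \
> backup
# the \ before [Enter] escapes the newline; the shell issues the second prompt ( > )
# and the two lines are treated as a single command line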
Quoting

• another way to turn off the meaning of a metacharacter is quoting


• when a command argument is enclosed in quotes, the meanings of all enclosed
special characters are turned off
• you can use either single or double quotes
$ echo ‘\’
$ rm ‘chap*’
$ rm “My Document.doc”
• escaping becomes tedious when there are too many characters to protect;
quoting is a better solution
$ echo 'The characters |, <, >, and $ are also special'
• in the previous example, we could have used escaping, but then we would need
a \ in front of each of these four metacharacters ( four in all )
• we used single quotes because they protect all special characters ( except the single
quote itself )
• double quotes are more permissive; apart from the double quote itself, they don’t
protect the $ and the ` ( backquote )
$ echo "command substitution uses `` while TERM is evaluated using $TERM"
command substitution uses while TERM is evaluated using vt100
$ echo 'command substitution uses `` while TERM is evaluated using $TERM'
command substitution uses `` while TERM is evaluated using $TERM
Redirection: The three standard files
• in the context of redirection, the terminal is a generic name that represents the
screen, display or keyboard
• we see command output and error messages on the terminal ( display), and we
sometimes provide command input through the terminal ( keyboard )
• the shell associates three files with terminal - two for the display and one for the
keyboard
• these special files are actually streams of characters which many commands see
as input and output
• a stream is simply a sequence of bytes
• when a user logs in, the shell makes available three files representing three
streams
• Standard Input
• the file representing input, which is connected to the keyboard
• Standard Output
• the file representing output, which is connected to the display
• Standard Error
• the file representing error messages that emanate from command or shell

• even though the shell associates each of these files with a default physical device,
this association is not permanent
• the shell can unhook a stream from default device and connect it to a disk file
( for example) the moment it sees some characters in the command line
• to do so, the user has to instruct the shell by using symbols like > and < in
the command line
Standard Input

• the cat and wc commands can be used to read disk files


• when these are used without arguments, they read the file representing the
standard input
• this file can represent three input sources
• the keyboard, the default source
• a file using redirection with < symbol
• another program using a pipeline
• when wc is used without an argument and there are no special symbols like < and | in
the command line, wc obtains its input from the default source, i.e the keyboard, and
the end of input is marked with [Ctrl-d]
$ wc
Standard input can be redirected
It can come from a file
or a pipeline
[Ctrl-d]
3 14 71
• the shell can reassign the standard input file to a disk file
• this means it can redirect the standard input to originate from a file on disk.
• this reassignment requires < symbol
$ wc < sample.txt
3 4 17
• the filename is missing in the output; this means that wc didn’t open sample.txt
itself ( the shell did, and wc simply read its standard input )
How it works?

• command: wc < sample.txt


• on seeing the <, the shell opens the disk file, sample.txt for reading
• it unplugs the standard input file from its default source and assigns it to
sample.txt
• wc reads from standard input which has earlier been reassigned by the shell to
sample.txt
Standard Output

• all commands displaying output on the terminal actually write to the standard
output file as a stream of characters, and not directly to the terminal as such
• three possible destinations
• the terminal, the default destination
• a file using the redirection symbols > and >>
• as input to another program using a pipeline
• the shell can effect redirection of this stream when it sees the > and >> symbols
in the command line
• we can replace the default destination ( the terminal ) with any file by using the >
symbols followed by the filename
$ wc sample.txt > newfile
• if the file newfile exists the shell overwrites it
• the shell provides the >> symbol to append to a file
$ wc sample.txt >> newfile

How It Works?
• Command: wc sample.txt > newfile
• on seeing the >, the shell opens the disk file, newfile for writing
• it unplugs the standard output file from its default destination and assigns it to
newfile
• wc opens the file sample.txt for reading
• wc writes to standard output which has earlier been assigned by the shell to
newfile
• redirection is also useful for concatenating the contents of a number of
files into a single file
$ cat *.c > c_progs_all.txt
Standard Error

• each of the three standard files is represented by a number called a file descriptor
• a file is opened by referring to its pathname, but subsequent read and write
operations identify the file by this file descriptor
• 0 - standard input
• 1 - standard output
• 2 - standard error
• we need to explicitly use one of these descriptors when handling the standard
error stream
• when you enter an incorrect command or try to open a nonexistent file, certain
diagnostic message show up on the screen; this is the standard error stream
whose default destination is the terminal
$ cat foo
cat: cannot open foo
• you can redirect this stream to a file
• using the symbol for standard output obviously won’t do:
$ cat foo > errorfile
cat: cannot open foo
• the diagnostic error has not been sent to errorfile
$ cat foo 2>errorfile
$ cat errorfile
cat: cannot open foo
• we can also append diagnostic output as below
$ cat foo 2>>errorfile
• if you have a program that runs for a long time and is not error-free, you can
redirect the standard error to a separate file and then stay away from the
terminal
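• for example ( the script name update_db.sh is hypothetical ):
$ update_db.sh > update.log 2> update.err
# normal output goes to update.log and diagnostics to update.err,
# so nothing appears on the terminal while the program runs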
PIPES
• standard input and standard output are two separate streams that can be
individually manipulated by the shell
• the shell can connect these streams so that one command takes input from the
other
$ who > user.txt
$ wc -l < user.txt
• using an intermediate file ( user.txt ) we effectively counted the number of users
• this method has 2 disadvantages
• for long-running commands this process can be slow
• you need an intermediate file that has to be removed after completion of the job
• in the previous example, who’s standard output was redirected to a file, and wc’s
standard input was redirected to come from that same file
• the shell can connect these streams using a special operator, the | (pipe), and avoid
creation of the disk file
$ who | wc -l
5
• here, the output of who has been passed directly to the input of wc, and who is said to
be piped to wc
• when multiple commands are connected this way a pipeline is said to be formed
• it is the shell that sets up this connection and the commands have no knowledge of it
$ ls | wc -l
15
• the output of wc can also be redirected to a new file
$ ls | wc -l > fcount
• there is no restriction on the number of commands you can use in a pipeline
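• for instance, a three-command pipeline ( using the standard sort command ):
$ who | sort | wc -l
# who's output is sorted, and the number of lines ( users ) is then counted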
Filters using Regular Expressions - grep and sed
• you often need to search a file for a pattern, either to see the lines containing ( or
not containing) it or to have it replaced with something else
• there are two important filters that can be used for these tasks - grep and sed
• grep takes care of all search requirements you may have
• sed can even manipulate the individual characters in a line
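• sed is not covered in detail here, but a minimal sketch of a sed substitution ( using the
emp.lst file from the grep examples below ) looks like this:
$ sed 's/director/member/' emp.lst
# replaces the first occurrence of director with member on every line and writes
# the result to standard output; emp.lst itself is not changed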
grep : Searching for a Pattern

• UNIX has a special family of commands for handling search requirements and the
principal member of this family is grep command
• grep scans its input for a pattern and displays the lines containing the pattern, the line
numbers or the filenames where the pattern occurs
• the command uses the following syntax
• grep options pattern filename(s)
• grep searches for pattern in one or more filenames, or the standard input if no
filename is specified
$ grep “sales” emp.lst
• the above command will display all the lines containing sales from the file emp.lst
• the pattern can also be specified without double quotes if it is a single word
$ grep sales emp.lst
• grep silently returns the prompt if the pattern can’t be located
$ grep president emp.lst
_
• when grep is used with multiple filenames, it displays the filenames along with
the output
$ grep “director” emp1.lst emp2.lst
emp1.lst:1006 | chanchal | director | sales | 6700
emp2.lst:2365 | jai sharma| director | marketing | 7000
• quoting is essential when the pattern contains multiple words
$ grep ‘jai sharma’ emp.lst
grep Options
Options Significance
-i ignores case for matching
-v doesn’t display lines matching the expression
-n displays line numbers along with lines
-c displays count of the number of occurrences
-l displays list of filenames only
-e exp specifies expression with this option. Can be used multiple times. Also used for matching an expression
beginning with a hyphen
-x matches pattern with entire line
-f file takes pattern from file, one per line
-E treats pattern as an extended regular expression
-F matches multiple fixed strings
Ignoring Case ( -i )
• when you look for a name but are not sure of the case use -i option
$ grep -i ‘agarwal’ emp.lst
• this will match either Agarwal or agarwal
Deleting Lines ( -v )
• grep can play inverse role too; the -v option selects all lines except those
containing the pattern
$ grep -v ‘director’ emp.lst > otherlist
• the original file wont be changed
Displaying line numbers ( -n )
• displays the line numbers containing the pattern, along with the lines
• line numbers are shown at the beginning of each line, separated from the actual
line by a :
$ grep -n ‘marketing’ emp.lst
Counting lines containing pattern ( -c )
• -c option counts the number of lines containing the pattern
$ grep -c ‘director’ emp.lst
4
• if you use this option with multiple files, the filename is prefixed to the line count
$ grep -c ‘director’ emp*.lst
emp.lst:4
emp1.lst:2

Displaying filenames ( -l )
• this option displays only the names of the file containing the pattern
$ grep -l ‘manager’ *.lst
design.lst
emp.lst
emp1.lst
Matching multiple patterns ( -e )
• if you want to match multiple patterns -e option can be used
$ grep -e “Agarwal” -e “agarwal” -e “aggarwal” emp.lst

Taking patterns from a file ( -f )


• we can place the patterns in a separate file, one pattern per line.
• grep uses the -f option to take patterns from a file
$ grep -f pattern.lst emp.lst
Basic Regular Expressions

• in the previous example, it was tedious to specify multiple patterns with -e option
• UNIX has a special feature which allows you to locate a pattern without knowing
exactly how it is spelled
• grep uses an expression to match a group of similar patterns
• this expression is a feature of the command that uses it and has nothing to do
with the shell
• the expression is called regular expression; the metacharacter set is shown below
• some of these metacharacters are also meaningful to the shell, so expressions
should be quoted
• POSIX identifies regular expressions as belonging to two categories - basic and
extended
• grep supports basic regular expressions ( BRE ) by default and extended regular
expressions ( ERE ) with the -E option
• sed supports only the BRE set
Expression Matches
* zero or more occurrences of the previous character
g* nothing or g, gg, ggg etc
. a single character
.* nothing or any number of characters
[pqr] a single character p, q or r
[c1-c2] a single character within the ASCII range represented by c1 and c2
[1-3] a single digit between 1 and 3
[^pqr] a single character which is not p, q, r
[^a-zA-Z] a nonalphabetic character
^pat pattern pat at beginning of line
pat$ pattern pat at end of line
bash$ bash at end of line
^bash$ bash as the only word in line
^$ lines containing nothing
The Character Class

• a regular expression lets you specify a group of characters enclosed within a pair
of rectangular brackets, [ ], in which case the match is performed for a single
character in the group
• the expression [ra] matches either r or an a
• to match Agarwal and agrawal use the following regular expression
• [aA]g[ar][ar]wal
• the model [ar][ar] matches any of the four patterns aa ar ra rr
$ grep “[aA]g[ar][ar]wal” emp.lst
• a single pattern has matched two similar strings
• we can also use ranges - [a-zA-Z0-9] matches a single alphanumeric character
• while using range, the left side character must have a lower ASCII value than the
right character

Negating a Class ( ^ )
• regular expressions use the ^ symbol to negate the character class
• when a character class begins with ^ symbol, all characters other than the ones
grouped in the class are matched
• [^a-zA-Z] will match a single nonalphabetic character
The *

• the symbol * refers to the immediately preceding character


• it indicates that the previous character can occur many times or not at all ( zero
or more occurrences )
• the pattern g* matches a single g, any number of gs, and also a null string
• to match a string beginning with g, don’t use g*, use gg*
• if you want to match Agarwal, agrawal or aggrawal use following regular
expression
$ grep “[aA]gg*[ar][ar]wal” emp.lst
• the above regular expression matches all three names, no need to use -e option
three times
The Dot
• a . ( dot ) matches a single character
• the pattern 2... matches a four-character pattern beginning with 2
The regular expression .*
• the dot with the * signifies any number of characters or none
• consider you want to look up the name j. saxena but not sure whether it actually
exists in the file as j.b. saxena or as joginder saxena.
• use the following regular expression
• $ grep “j.*saxena” emp.lst
• note that if you look for the name j.b.saxena, the expression should be
j\.b\.saxena
• the dots need to be escaped with a \
Specifying Pattern Locations ( ^ and $ )

• there are 2 characters that can match a pattern at the beginning or end of a line
• ^ - for matching at the beginning of a line
• $ - for matching at the end of line
• Consider an example; you want to extract those lines where the emp-id begins
with a 2.
• if you use the regular expression 2...
• this won’t do, because the character 2 followed by three characters can occur
anywhere in the line
• you must indicate to grep that the pattern occurs at the beginning of the line,
and the ^ does it easily
$ grep “^2” emp.lst
• similarly, to select those lines where the salary lies between 7000 and 7999, you
have to use the $ at the end of the pattern
$ grep “7...$” emp.lst
• how do you reverse the search?
• if you want to select only those lines where the emp-id doesn’t begin with 2, use the
regular expression “^[^2]”
$ grep “^[^2]” emp.lst
• to list only directories ( UNIX has no command for this ), we can use a pipeline to grep
those lines from the listing that begin with a d
$ ls -l | grep “^d”
Extended Regular Expression(ERE) and grep

• ERE makes it possible to match dissimilar patterns with a single expression


• to use this feature you should use -E option along with grep
• it uses some additional characters
• ch+ - matches one or more occurrences of character ch
• ch? - matches zero or one occurrence of character ch
• exp1|exp2 - matches exp1 or exp2
• GIF|JPEG - matches GIF or JPEG
• (x1|x2)x3 - matches x1x3 or x2x3
• (lock|ver)wood - matches lockwood or verwood
• note - if your version of grep doesn’t support -E option then use egrep without -E
option
The + and ?

• the ERE set includes two special characters + and ? to restrict the matching scope
• + symbol matches one or more occurrences of the previous character
• ? symbol matches zero or one occurrence of the previous character
• in both cases, the emphasis is on the previous character
• b+ matches b, bb, bbb etc., but unlike b*, it does not match a null string
• the expression b? matches either a single b or nothing
• to match Agarwal and aggarwal we can use below command
$ grep -E “[aA]gg?arwal” emp.lst
Matching multiple patterns ( |, ( and ) )

• the | is the delimiter of multiple patterns


• using it we can locate both sengupta and dasgupta from the file and without
using the -e option
$ grep -E ‘sengupta|dasgupta’ emp.lst
• using the characters ( and ), you can group patterns, and when you use the | inside
the parentheses, you can frame an even more compact pattern
$ grep -E ‘(sen|das)gupta’ emp.lst
Shell Programming
Shell Scripts

• when a group of commands has to be executed regularly, they can be stored in a
file and the file itself executed as a shell script or shell program
• we normally use the .sh extension for shell scripts, though it’s not mandatory
• shell scripts are executed in a separate child shell process and this sub-shell need
not be of the same type as your login shell
• in other words, even if your login shell is Bourne, you can use a Korn sub-shell to
run your script
• by default, parent and child shells belong to same type, but you can provide a
special interpreter line in the first line of the script to specify a different shell for
your script
• use a text editor ( vi editor ) to create the shell script, script.sh
• the below script runs three echo commands and shows the use of variable
evaluation and command substitution

#!/bin/sh
# script.sh: Sample shell script
echo "Today's date: `date`"
echo "This month's calendar:"
cal `date "+%m 20%y"`
echo "My shell: $SHELL"
• to run this script, make it executable first and then invoke it by name
$ chmod +x script.sh
$ script.sh
$ ./script.sh             needed if the current directory is not in your PATH
read : Making Scripts Interactive

• the read statement is the shell’s internal tool for taking input from the user; i.e
making scripts interactive
• it is used with one or more variables
• input supplied through the standard input is read into these variables
• when you use a statement like
read name
• the script pauses at that point to take input from the keyboard
• whatever you enter is stored in the variable name
• since this is a form of assignment no $ is used before name
• A single read statement can be used with one or more variables
• read pname filename
#!/bin/bash
echo "Enter the pattern to be searched"
read pname
echo "Enter the filename to be used"
read fname
echo "Searching for $pname from file $fname"
grep "$pname" $fname
echo "Selected records shown above"
Using command line arguments

• shell scripts can accept arguments from the command line


• they can run noninteractively and be used with redirection and pipelines
• when arguments are specified with a shell script, they are assigned to certain
special variables - positional parameters
• the first argument is read by the shell into the parameter $1, the second
argument into $2 and so on
• there are a few other special parameters used by the shell
• $* - stores the complete set of positional parameters as a single string
• $# - it is set to the number of arguments specified
• $0 - Holds the command name itself
• to use a multi word string as a single command line argument, you must quote it
#!/bin/bash
echo "Program: $0"
echo "The number of arguments specified is $#"
echo "The arguments are $*"
grep "$1" $2
echo "Job Over"
Exit and Exit status of command

• shell scripts use exit command to terminate a program


• the command is generally run with a numeric argument
• exit 0 - used when everything went fine
• exit 1 - used when something went wrong
• it is through the exit command or function that every command returns an exit
status to the caller
The Parameter $?
• this parameter stores the exit status of the last command
• it has the value 0 if the command succeeds and a non zero value if it fails
• this parameter is set by exit’s argument
• if no exit status is specified, $? holds the exit status of the last command executed
$ grep director emp.lst > /dev/null; echo $?
0
$ grep manager emp.lst > /dev/null; echo $?
1
$ grep director emp1.lst > /dev/null; echo $?
grep: can’t open emp1.lst
2
The logical operators && and || - conditional execution

• cmd1 && cmd2


• cmd1 || cmd2
• the && delimits two commands; the cmd2 is executed only when cmd1 succeeds
$ grep ‘director’ emp.lst && echo “Pattern found in the file”
1006 | chanchal | director | sales
Pattern found in the file
• the || plays inverse role; cmd2 is executed only when cmd1 fails
$ grep ‘manager’ emp.lst || echo “Pattern not found in the file”
Pattern not found in the file
• you can use || whenever you need to terminate a script when a command fails
• grep “$1” $2 || exit 2
The if - Conditional
• the if statement makes two-way decisions depending on the fulfillment of a certain condition
• form 1
if command is successful
then
execute commands
else
execute commands
fi
• form 2
if command is successful
then
execute commands
fi
• form 3
if command is successful
then
execute commands
elif command is successful
then
execute commands
else
execute commands
fi
#! /bin/sh
if grep "^$1" /etc/passwd 2>/dev/null
then
echo "Pattern found"
else
echo "Pattern not found"
fi
Using test And [ ] to evaluate expressions

• when you use if to evaluate expressions, you need the test statement because
the true and false values returned by expressions can’t be directly handled by if
• test uses certain operators to evaluate the condition on its right and returns
either a true or false exit status, which is then used by if to make decisions
• test works in three ways
• compares two numbers
• compares two strings or a single one for a null value
• checks a file’s attributes
Numeric Comparision

• -eq - equal to
• -ne - not equal to
• -gt - greater than
• -ge - greater than or equal to
• -lt - less than
• -le - less than or equal to
• numeric comparison operators always start with a -, followed by a two-letter
string, and are enclosed on either side by whitespace
• numeric comparison in the shell is confined to integer values only; decimal
values are simply truncated
$ x=5; y=7; z=7.2
$ test $x -eq $y ; echo $?          1 ( 5 is not equal to 7 )
$ test $x -lt $y ; echo $?          0 ( 5 is less than 7 )
$ test $z -gt $y ; echo $?          1 ( 7.2 is truncated to 7, which is not greater than 7 )
$ test $z -eq $y ; echo $?          0 ( 7.2 is treated as 7 )
#! /bin/bash
if test $# -eq 0; then
echo "Usage: $0 pattern file" >/dev/tty
elif test $# -eq 2; then
grep "$1" $2 || echo "$1 not found in $2" >/dev/tty
else
echo "You didn't enter two arguments" >/dev/tty
fi
shorthand for test

• a pair of rectangular brackets enclosing the expression can replace test


• test $x -eq $y can be replaced by
• [ $x -eq $y ]
• you must provide whitespace around the operators and their operands, and also
after the [ and before the ]
• [ $x ] is a shorthand that tests whether $x is assigned and not null
String comparision

• test can be used to compare strings with another set of operators as listed below
• s1 = s2 - string s1 is equal to s2
• s1 != s2 - string s1 is not equal to s2
• -n stg - stg is not a null string
• -z stg - stg is a null string
• stg - string stg is assigned and not null
• s1 == s2 - string s1 is equal to s2 ( Korn and Bash shell only )
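• a minimal sketch of string tests ( the variable values shown are hypothetical ):
$ name=kumar; flag=""
$ [ "$name" = "kumar" ]; echo $?
0
$ [ -z "$flag" ]; echo $?
0
$ [ -n "$flag" ]; echo $?
1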
The case conditional

• the case statement is another conditional offered by the shell


• the statement matches an expression for more than one alternative and uses a
compact construct to permit multiway branching
• case also handles string tests, but in a more efficient manner than if
• the general syntax
case expression in
pattern 1) commands1 ;;
pattern 2) commands2 ;;
pattern 3) commands3 ;;
.....
esac
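• a minimal sketch of a menu script using case ( the menu items are illustrative ):
#!/bin/sh
echo "1. List files  2. Show date  3. Show users"
echo "Enter your choice: "
read choice
case "$choice" in
1) ls -l ;;
2) date ;;
3) who ;;
*) echo "Invalid option" ;;
esac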
while : Looping

• loops let you perform a set of instructions repeatedly


• the shell features three types of loops - while, until and for
• all of them repeat the instruction set enclosed by certain keywords as often as
their control command permits
• while - syntax
while condition is true
do
commands
done
• the commands enclosed by do and done are executed repeatedly as long as
condition remains true
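• for example, a sketch that repeatedly searches emp.lst ( from the grep examples )
until the user answers n:
#!/bin/sh
answer=y
while [ "$answer" = "y" ]
do
echo "Enter the pattern to be searched: "
read pname
grep "$pname" emp.lst || echo "Pattern not found"
echo "Search another pattern ( y/n )? "
read answer
done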
for : looping with a list

• there is no three part structure as used in C


• for doesn’t test a condition, but uses a list instead
• syntax
for variable in list
do
commands
done
• each whitespace separated word in list is assigned to variable in turn and
commands are executed until list is exhausted
$ for file in chap01 chap02 chap03 ; do
> cp $file ${file}.bak
> echo $file copied to $file.bak
>done
chap01 copied to chap01.bak
chap02 copied to chap02.bak
chap03 copied to chap03.bak
• Possible sources of the list
• list from variables
• list from command substitution
• list from wildcards
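• brief sketches of each source ( clist and the backup directory are hypothetical ):
$ for var in $HOME $SHELL $TERM ; do echo $var ; done          list from variables
$ for file in `cat clist` ; do cp $file backup ; done          list from command substitution
$ for file in *.c ; do echo $file ; done                       list from wildcards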
