Unit 2
Chapter 6
Chapter 8
Chapter 13
Chapter 14
ls -l : Listing file attributes
• the ls command is used to list the contents of a directory
• ls with the -l ( long ) option gives more details about the contents of a
directory
• these details are the attributes of a file, such as its permissions, size and ownership
• ls looks up the file’s inode to fetch its attributes
$ ls -l
total 72
-rw-r--r-- 1 kumar metal 19514 May 10 13:45 chap01
drwxr-xr-x 2 kumar metal 512 May 9 10:31 progs
• total 72 indicates that a total of 72 blocks are occupied by these files on disk,
each block consisting of 512 bytes ( 1024 in Linux)
File Type and Permission
• the first column shows the type and permission associated with each file
• the first character tells the type of the file, - means ordinary file, d means
directory
• then there is a series of characters r, w, x and -
• in UNIX files can have three types of permissions - read, write and execute
Links
• the second column indicates the number of links associated with each file
• this is the number of filenames maintained by the system for that file
• UNIX lets a file have as many names as you want it to have, even though there is
a single file on disk
Ownership
• third column shows who is the owner of the files
• when you create a file or directory, you automatically become its owner
Group Ownership
• when opening a user account, the system administrator also assigns the user to
some group
• the fourth column represents the group owner of the file
File Size
• the fifth column shows the size of the file in bytes, i.e. the amount of data it
contains
• it is only a character count, not a measure of disk space that it occupies
• disk space is usually larger than this count since files are written to disk in blocks
of 1024 bytes or more
Last Modification Time
• the sixth, seventh and eighth columns indicate the last modification time of the
file, which is stored to the nearest second
• the file is said to be modified only if the contents have changed in any way
• if you change only the permissions or ownership of the file, the modification time
remains unchanged
Filename
• the last column displays the filenames arranged in ASCII collating sequence
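The column layout described above can be checked on a scratch file. This is a minimal sketch, assuming a POSIX shell with mktemp available; the temporary directory and the file chap01 are created only for illustration, and the owner and group names will differ from system to system.

```shell
# Create a scratch file with known contents (6 bytes: "hello" plus a newline).
tmpdir=$(mktemp -d)
printf 'hello\n' > "$tmpdir/chap01"

# Split the long listing into its columns using the positional parameters.
set -- $(ls -l "$tmpdir/chap01")
perms=$1; links=$2; owner=$3; group=$4; size=$5

echo "type and permissions: $perms"
echo "links: $links  owner: $owner  group: $group"
echo "size in bytes: $size"

rm -r "$tmpdir"
```

The size column reports the 6 bytes of data written, not the disk blocks the file occupies.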
ls -d : Listing Directory attributes
• when ls is used with a directory name, it lists the files in that directory
• to force ls to list the attributes of a directory, rather than its contents, you need to
use the -d ( directory ) option
• $ ls -ld helpdir progs
• for directories the first character of the first column will be d, for ordinary files it
will be - and for device files it will be either b or c
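The difference between listing a directory's contents and listing the directory itself can be sketched as follows. The directory progs and the two files in it are hypothetical and live in a temporary directory.

```shell
tmpdir=$(mktemp -d)
mkdir "$tmpdir/progs"
touch "$tmpdir/progs/a.c" "$tmpdir/progs/b.c"

# Without -d, ls lists what is inside the directory.
contents=$(ls "$tmpdir/progs")

# With -d, ls describes the directory itself; the first character is d.
dir_type=$(ls -ld "$tmpdir/progs" | cut -c1)

echo "$contents"
echo "file type character: $dir_type"

rm -r "$tmpdir"
```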
File Ownership
• when you create a file you are the owner of that file
• if you can’t create files in other users’ home directories, it is because those directories
are not owned by you ( and the owner has not allowed you write access )
• several users may belong to a single group
• the privileges for the group are set by the owner of the file and not by the group
members
• when the system administrator creates a user, two parameters have to be assigned to the user
• the user id ( UID ) - both its name and numeric representation
• the group id ( GID ) - both its name and numeric representation
• the /etc/passwd file contains UID ( number and name ) and GID ( number )
• the /etc/group file contains GID ( number and name )
$ id
uid=655537(kumar) gid=655535(metal)
File Permissions
• when changing the permission in a relative manner, chmod only changes the
permissions specified in the command line and leaves the other permissions
unchanged
• chmod category operation permission filename(s)
• chmod takes as its argument an expression comprising some letters and symbols
that completely describe the user category and the type of permission being
assigned or removed
• the expression contains three components
• user category ( user, group, others )
• the operation to be performed ( assign or remove a permission )
• the type of permission ( read, write, execute )
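Category           Operation                   Permission
u - user           + - assigns permission      r - read permission
g - group          - - removes permission      w - write permission
o - others                                     x - execute permission
a - all ( ugo )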
• to assign execute permission to the user ( owner ) of the file xstart
$ chmod u+x xstart
• to assign execute permission to all three categories
$ chmod ugo+x xstart
$ chmod a+x xstart
$ chmod +x xstart
• chmod accepts multiple filenames in the command line
$ chmod u+x note note1 note2
• permissions are removed with the - operator
$ chmod go-r xstart
• chmod also accepts multiple expressions delimited by commas
$ chmod a-x,go+r xstart
• chmod also accepts more than one permission to be set; u+rwx is a valid chmod
expression
$ chmod o+wx xstart
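The relative expressions above can be traced step by step on a scratch file. This is a sketch under stated assumptions: xstart is created in a temporary directory, the octal form chmod 644 is used only to reach a known starting point, and show is a hypothetical helper that prints the first ten characters of the long listing.

```shell
tmpdir=$(mktemp -d)
f="$tmpdir/xstart"
touch "$f"
chmod 644 "$f"    # octal form, used here only to get a known starting point

# Hypothetical helper: file type plus the nine permission characters.
show() { ls -l "$f" | cut -c1-10; }

chmod u+x "$f";       step1=$(show)   # -rwxr--r--
chmod go-r "$f";      step2=$(show)   # -rwx------
chmod a-x,go+r "$f";  step3=$(show)   # -rw-r--r--
chmod o+wx "$f";      step4=$(show)   # -rw-r--rwx

echo "$step1 $step2 $step3 $step4"
rm -r "$tmpdir"
```

Each call changes only the permissions named in its expression; everything else is carried over from the previous step.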
Absolute Permissions
• when you log on to a UNIX system, you first see a prompt. This prompt remains
there until you key in something. Even though it may appear that the system is
idling, a UNIX command is in fact running at the terminal, but this command is
special; it’s with you all the time and never terminates until you log out. This
command is the shell
• when you key in a command, it goes as input to the shell
• the shell first scans the command line for metacharacters
• metacharacters are special characters that mean nothing to the command, but
mean something special to the shell
• when the shell sees metacharacters like >, |, * etc. it performs all actions represented
by the symbols before the command can be executed
• when all preprocessing is complete, the shell passes on the command line to the
kernel for ultimate execution
• while the command is running, the shell has to wait for notice of its termination
from the kernel
• after the command has completed execution, the shell once again issues the
prompt to take up your next command
Pattern Matching - The Wild Cards
• wild-cards are a set of special characters that the shell uses to match filenames
• often, you may need to enter multiple filenames in a command line
$ ls chap01 chap02 chap03 chap04 chapx chapy chapz
• if the filenames are similar, we can use the facility offered by the shell of
representing them by a single pattern or model
• the pattern chap* represents all filenames beginning with chap
• this pattern is framed with ordinary characters ( like chap ) and a metacharacter
( like * ) using well defined rules
• the pattern can then be used as an argument to the command, and the shell will
expand it suitably before the command is executed
Wild-Card    Matches
*            any number of characters, including none
?            a single character
• the wild cards * and ? are not very restrictive; using them it’s not easy to list only
chapy and chapz
• you can frame more restrictive patterns with the character class
• the character class comprises a set of characters enclosed by the rectangular
brackets, [ and ], but it matches a single character in the class
• the pattern [abcd] is a character class and it matches a single character - a, b, c or d
• this can be combined with any string or another wild card expression
• the pattern chap0[124] will match all the filenames that start with chap0 followed
by a number 1 or 2 or 4
• range specification is also possible inside the class with a - ( hyphen); the two
characters on either side of it form the range of characters to be matched
$ ls chap0[1-4]
$ ls chap[x-z]
• a valid range specification requires that the character on the left have a lower
ASCII value than the one on the right
• the expression [a-zA-Z]* matches all filenames beginning with a letter,
irrespective of case
Negating the Character Class ( ! )
• doesn’t work in the C shell
• you can use ! as the first character in the class to negate the class
• the expression [!a-zA-Z]* matches all the filenames that don’t begin with a
letter
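The patterns discussed so far can be tried against the chap files used in the earlier example. This sketch assumes a Bourne-type shell ( not the C shell ); the files are empty and are created in a temporary directory purely for illustration.

```shell
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
touch chap01 chap02 chap03 chap04 chapx chapy chapz

all=$(echo chap*)              # every name beginning with chap
digits=$(echo chap0[124])      # chap01 chap02 chap04
range=$(echo chap[x-z])        # chapx chapy chapz
only_yz=$(echo chap[yz])       # chapy chapz
not_digit=$(echo chap[!0-9]*)  # names whose fifth character is not a digit

echo "$all"
echo "$digits"
echo "$range"
echo "$only_yz"
echo "$not_digit"

cd "$OLDPWD" || exit 1
rm -r "$tmpdir"
```

echo is used instead of ls to make the point that the shell, not the command, expands the pattern.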
Matching Totally Dissimilar Patterns
• if the shell uses some special characters to match the filenames, then filenames
themselves must not contain any of these characters?
• this is not true, filenames can have any of these characters
$ ls chap*
chap chap* chap01 chap02
• now we want to remove the file chap*; which is not easy
• trying rm chap* would be dangerous; it would also remove the other filenames
beginning with chap
• we must be able to protect all special characters so the shell is not able to
interpret them
• the shell provides two solutions to prevent its own interference
Escaping
• providing a \ ( backslash ) before the wild-card to remove ( escape ) its special
meaning
Quoting
• enclosing the wild-card or even the entire pattern within quotes ( like 'chap*' )
Escaping
• apart from the metacharacters, there are other characters that are special, like
the space character
• the shell uses it to delimit command line arguments
• so to remove the file My Document.doc, which has an embedded space, use the
approach below
$ rm My\ Document.doc
• sometimes you may need to interpret the \ itself literally; you need another \
before it
$ echo \\
\
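Both protections can be tried safely in a scratch directory. In this sketch the filenames mirror the ones used above and are created only for the demonstration.

```shell
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
touch chap chap01 chap02 'chap*' 'My Document.doc'

rm chap\*              # the backslash makes * literal: only the file named chap* is removed
rm 'My Document.doc'   # quoting protects the embedded space ( My\ Document.doc also works )

remaining=$(echo *)    # unprotected * is expanded by the shell as usual
echo "$remaining"

cd "$OLDPWD" || exit 1
rm -r "$tmpdir"
```

The files chap, chap01 and chap02 survive because the shell never treated the escaped * as a wild-card.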
Escaping the Newline Character
• even though the shell associates each of these files with a default physical device,
this association is not permanent
• the shell can unhook a stream from default device and connect it to a disk file
( for example) the moment it sees some characters in the command line
• to do so, the user has to instruct the shell by using symbols like > and < in
the command line
Standard Input
• all commands displaying output on the terminal actually write to the standard
output file as a stream of characters and not directly to the terminal as such
• three possible destinations
• the terminal, the default destination
• a file using the redirection symbols > and >>
• as input to another program using a pipeline
• the shell can effect redirection of this stream when it sees the > and >> symbols
in the command line
• we can replace the default destination ( the terminal ) with any file by using the >
symbols followed by the filename
$ wc sample.txt > newfile
• if the file newfile exists the shell overwrites it
• the shell provides the >> symbol to append to a file
$ wc sample.txt >> newfile
How It Works?
• Command: wc sample.txt > newfile
• on seeing the >, the shell opens the disk file, newfile for writing
• it unplugs the standard output file from its default destination and assigns it to
newfile
• wc opens the file sample.txt for reading
• wc writes to standard output which has earlier been assigned by the shell to
newfile
• redirection is also useful when concatenating a number of files into a single
file
$ cat *.c > c_progs_all.txt
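The difference between > and >> shows up when the same command is run more than once. A minimal sketch, assuming a three-line sample.txt created in a temporary directory; the tr call only strips the padding that some versions of wc add.

```shell
tmpdir=$(mktemp -d)
printf 'one\ntwo\nthree\n' > "$tmpdir/sample.txt"

wc -l "$tmpdir/sample.txt" >  "$tmpdir/newfile"   # creates newfile
wc -l "$tmpdir/sample.txt" >  "$tmpdir/newfile"   # overwrites it: still one line
after_overwrite=$(wc -l < "$tmpdir/newfile" | tr -d ' ')

wc -l "$tmpdir/sample.txt" >> "$tmpdir/newfile"   # appends a second line
after_append=$(wc -l < "$tmpdir/newfile" | tr -d ' ')

echo "lines after >: $after_overwrite, lines after >>: $after_append"
rm -r "$tmpdir"
```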
Standard Error
• each of the three standard files is represented by a number called a file descriptor
• a file is opened by referring to its pathname, but subsequent read and write
operations identify the file by this file descriptor
• 0 - standard input
• 1 - standard output
• 2 - standard error
• we need to explicitly use one of these descriptors when handling the standard
error stream
• when you enter an incorrect command or try to open a nonexistent file, certain
diagnostic messages show up on the screen; this is the standard error stream,
whose default destination is the terminal
$ cat foo
cat: cannot open foo
• you can redirect this stream to a file
• using the symbol for standard output obviously won’t do:
$ cat foo > errorfile
cat: cannot open foo
• the diagnostic has not been sent to errorfile; it still appears on the terminal
$ cat foo 2>errorfile
$ cat errorfile
cat: cannot open foo
• we can also append diagnostic output as below
$ cat foo 2>>errorfile
• if you have a program that runs for a long time and is not error-free, you can
redirect the standard error to a separate file and then stay away from the
terminal
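Separating the two output streams can be sketched as below. The names outfile and errorfile are illustrative, foo is deliberately nonexistent, and the exact wording of the diagnostic varies between systems.

```shell
tmpdir=$(mktemp -d)
status=0

# foo does not exist, so cat writes nothing to standard output ( descriptor 1 )
# and a diagnostic to standard error ( descriptor 2 ).
cat "$tmpdir/foo" > "$tmpdir/outfile" 2> "$tmpdir/errorfile" || status=$?

out_size=$(wc -c < "$tmpdir/outfile" | tr -d ' ')
err_text=$(cat "$tmpdir/errorfile")

echo "exit status: $status"
echo "bytes on standard output: $out_size"
echo "captured diagnostic: $err_text"
rm -r "$tmpdir"
```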
PIPES
• standard input and standard output are two separate streams that can be
individually manipulated by the shell
• the shell can connect these streams so that one command takes input from the
other
$ who > user.txt
$ wc -l < user.txt
• using an intermediate file ( user.txt ) we effectively counted the number of users
• this method has 2 disadvantages
• for long-running commands this process can be slow
• you need an intermediate file that has to be removed after completion of the job
• in the previous example, who’s standard output was redirected to a file and wc’s standard input was redirected from the same file
• the shell can connect these streams using a special operator, the | (pipe), and avoid
creation of the disk file
$ who | wc -l
5
• here, the output of who has been passed directly to the input of wc, and who is said to
be piped to wc
• when multiple commands are connected this way a pipeline is said to be formed
• it is the shell that sets up this connection and the commands have no knowledge of it
$ ls | wc -l
15
• the output of wc can also be redirected to a new file
$ ls | wc -l > fcount
• there is no restriction on the number of commands you can use in a pipeline
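The two approaches, intermediate file and pipeline, give the same count. A small sketch, assuming three empty files in a temporary directory; listfile is a hypothetical scratch file standing in for user.txt.

```shell
tmpdir=$(mktemp -d)
listfile=$(mktemp)
touch "$tmpdir/a" "$tmpdir/b" "$tmpdir/c"

# Two-step version: an intermediate file that has to be cleaned up afterwards.
ls "$tmpdir" > "$listfile"
via_file=$(wc -l < "$listfile" | tr -d ' ')
rm "$listfile"

# Pipeline version: the shell connects ls directly to wc.
via_pipe=$(ls "$tmpdir" | wc -l | tr -d ' ')

echo "via file: $via_file, via pipe: $via_pipe"
rm -r "$tmpdir"
```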
Filters using Regular Expressions - grep and sed
• you often need to search a file for a pattern, either to see the lines containing ( or
not containing) it or to have it replaced with something else
• there are two important filters that can be used for these tasks - grep and sed
• grep takes care of all search requirements you may have
• sed can even manipulate the individual characters in a line
grep : Searching for a Pattern
• UNIX has a special family of commands for handling search requirements and the
principal member of this family is the grep command
• grep scans its input for a pattern and displays the lines containing the pattern, the line
numbers or the filenames where the pattern occurs
• the command uses the following syntax
• grep options pattern filename(s)
• grep searches for pattern in one or more filenames, or the standard input if no
filename is specified
$ grep "sales" emp.lst
• the above command will display all the lines containing sales from the file emp.lst
• the pattern can also be specified without double quotes if it is a single word
$ grep sales emp.lst
• grep silently returns the prompt in case the pattern can’t be located
$ grep president emp.lst
_
• when grep is used with multiple filenames, it displays the filenames along with
the output
$ grep "director" emp1.lst emp2.lst
emp1.lst:1006 | chanchal | director | sales | 6700
emp2.lst:2365 | jai sharma| director | marketing | 7000
• quoting is essential when the pattern contains multiple words
$ grep 'jai sharma' emp.lst
grep Options
Option     Significance
-i         ignores case for matching
-v         doesn’t display lines matching the expression
-n         displays line numbers along with lines
-c         displays a count of the lines containing the pattern
-l         displays list of filenames only
-e exp     specifies expression with this option; can be used multiple times; also used for matching an expression beginning with a hyphen
-x         matches pattern with entire line
-f file    takes patterns from file, one per line
-E         treats pattern as an extended regular expression
-F         matches multiple fixed strings
Ignoring Case ( -i )
• when you look for a name but are not sure of the case, use the -i option
$ grep -i 'agarwal' emp.lst
• this will match either Agarwal or agarwal
Deleting Lines ( -v )
• grep can play inverse role too; the -v option selects all lines except those
containing the pattern
$ grep -v 'director' emp.lst > otherlist
• the original file won’t be changed
Displaying line numbers ( -n )
• displays the line numbers containing the pattern, along with the lines
• line numbers are shown at the beginning of each line, separated from the actual
line by a :
$ grep -n 'marketing' emp.lst
Counting lines containing pattern ( -c )
• -c option counts the number of lines containing the pattern
$ grep -c 'director' emp.lst
4
• if you use this option with multiple files, the filename is prefixed to the line count
$ grep -c 'director' emp*.lst
emp.lst:4
emp1.lst:2
Displaying filenames ( -l )
• this option displays only the names of the files containing the pattern
$ grep -l 'manager' *.lst
design.lst
emp.lst
emp1.lst
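The options above can be compared side by side on a small data file. This is a sketch only: the two .lst files and every record in them are made up for the demonstration, loosely following the layout of the sample output shown earlier.

```shell
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
cat > emp.lst <<'EOF'
1006|chanchal singhvi|director|sales|6700
2365|jai sharma|director|marketing|7000
2476|anil aggarwal|manager|sales|6000
3564|sudhir Agarwal|executive|personnel|7500
EOF
printf '%s\n' '9876|r.k. mehta|director|production|8000' > emp2.lst

nocase=$(grep -i 'agarwal' emp.lst)                        # finds the line spelled Agarwal
others=$(grep -v 'director' emp.lst | wc -l | tr -d ' ')   # 2 lines are not directors
numbered=$(grep -n 'marketing' emp.lst)                    # match prefixed with its line number, 2
count=$(grep -c 'director' emp.lst)                        # 2 matching lines
per_file=$(grep -c 'director' emp.lst emp2.lst)            # one count per file, filename prefixed
files=$(grep -l 'manager' *.lst)                           # only emp.lst contains manager

echo "$nocase"; echo "$others"; echo "$numbered"
echo "$count"; echo "$per_file"; echo "$files"

cd "$OLDPWD" || exit 1
rm -r "$tmpdir"
```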
Matching multiple patterns ( -e )
• if you want to match multiple patterns, the -e option can be used
$ grep -e "Agarwal" -e "agarwal" -e "aggarwal" emp.lst
• in the previous example, it was tedious to specify multiple patterns with the -e option
• UNIX has a special feature which allows you to locate a pattern without knowing
exactly how it is spelled
• grep uses an expression to match a group of similar patterns
• this expression is a feature of the command that uses it and has nothing to do
with the shell
• the expression is called regular expression; the metacharacter set is shown below
• some of these metacharacters are also meaningful to the shell, so expressions
should be quoted
• POSIX identifies regular expressions as belonging to 2 categories - basic and
extended
• grep supports basic regular expressions ( BRE ) by default and extended regular
expressions ( ERE ) with the -E option
• sed supports only the BRE set by default
Expression    Matches
*             zero or more occurrences of the previous character
g*            nothing or g, gg, ggg etc.
.             a single character
.*            nothing or any number of characters
[pqr]         a single character p, q or r
[c1-c2]       a single character within the ASCII range represented by c1 and c2
[1-3]         a single digit between 1 and 3
[^pqr]        a single character which is not p, q or r
[^a-zA-Z]     a nonalphabetic character
^pat          pattern pat at beginning of line
pat$          pattern pat at end of line
bash$         bash at end of line
^bash$        bash as the only word in line
^$            lines containing nothing
The Character Class
• a regular expression lets you specify a group of characters enclosed within a pair
of rectangular brackets, [ ], in which case the match is performed for a single
character in the group
• the expression [ra] matches either an r or an a
• to match Agarwal and agrawal use the following regular expression
• [aA]g[ar][ar]wal
• the model [ar][ar] matches any of the four patterns aa ar ra rr
$ grep "[aA]g[ar][ar]wal" emp.lst
• a single pattern has matched two similar strings
• we can also use ranges - [a-zA-Z0-9] matches a single alphanumeric character
• while using range, the left side character must have a lower ASCII value than the
right character
Negating a Class ( ^ )
• regular expressions use the ^ symbol to negate the character class
• when a character class begins with ^ symbol, all characters other than the ones
grouped in the class are matched
• [^a-zA-Z] will match a single nonalphabetic character
The *
• there are 2 characters that can match a pattern at the beginning or end of a line
• ^ - for matching at the beginning of a line
• $ - for matching at the end of line
• Consider an example: you want to extract those lines where the emp-id begins
with a 2.
• the regular expression 2... won’t do, because the character 2 followed by three
characters can occur anywhere in the line
• you must indicate to grep that the pattern occurs at the beginning of the line,
and the ^ does it easily
$ grep "^2" emp.lst
• similarly, to select those lines where the salary lies between 7000 and 7999, you
have to use the $ at the end of the pattern
$ grep "7...$" emp.lst
• how to reverse the search?
• if you want to select only those lines where the emp-id doesn’t begin with 2, use the
regular expression "^[^2]"
$ grep "^[^2]" emp.lst
• to list only directories ( UNIX has no dedicated command for this ), we can use a
pipeline to grep those lines from the listing that begin with a d
$ ls -l | grep "^d"
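The character class and the two anchors can be exercised together on a small file. In this sketch the records in emp.lst and the directory progs are invented for illustration; the salary is assumed to be the last field on each line, as in the examples above.

```shell
tmpdir=$(mktemp -d)
mkdir "$tmpdir/progs"
cat > "$tmpdir/emp.lst" <<'EOF'
1006|chanchal singhvi|director|sales|6700
2365|jai sharma|director|marketing|7000
2476|anil agrawal|manager|sales|6000
3564|sudhir Agarwal|executive|personnel|7500
5678|sumit chakrobarty|d.g.m.|marketing|6000
EOF
emp="$tmpdir/emp.lst"

both_spellings=$(grep -c '[aA]g[ar][ar]wal' "$emp")   # agrawal and Agarwal: 2
id_starts_2=$(grep -c '^2' "$emp")                    # 2365 and 2476: 2
salary_7xxx=$(grep -c '7...$' "$emp")                 # 7000 and 7500: 2
id_not_2=$(grep -c '^[^2]' "$emp")                    # 1006, 3564 and 5678: 3
dirs=$(ls -l "$tmpdir" | grep -c '^d')                # only progs is a directory: 1

echo "$both_spellings $id_starts_2 $salary_7xxx $id_not_2 $dirs"
rm -r "$tmpdir"
```

Single quotes are used here so that the shell leaves $ and the brackets for grep to interpret.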
Extended Regular Expression(ERE) and grep
• the ERE set includes two special characters + and ? to restrict the matching scope
• + symbol matches one or more occurrences of the previous character
• ? symbol matches zero or one occurrence of the previous character
• in both cases, the emphasis is on the previous character
• b+ matches b, bb, bbb etc. but unlike b*, it doesn’t match nothing
• the expression b? matches either a single b or nothing
• to match Agarwal and aggarwal we can use the command below
$ grep -E "[aA]gg?arwal" emp.lst
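The contrast between *, + and ? is easiest to see on a few short strings. A minimal sketch with made-up input; the names file and its three records are hypothetical.

```shell
tmpdir=$(mktemp -d)
printf '%s\n' 'sudhir Agarwal' 'anil aggarwal' 'v.k. agrawal' > "$tmpdir/names"

# gg? allows one or two g's, so Agarwal and aggarwal match but agrawal does not.
matched_count=$(grep -E -c '[aA]gg?arwal' "$tmpdir/names")

plus=$(printf 'ac\nabc\nabbc\n' | grep -E -c 'ab+c')       # abc and abbc: 2
star=$(printf 'ac\nabc\nabbc\n' | grep -c 'ab*c')          # ac, abc and abbc: 3
optional=$(printf 'ac\nabc\nabbc\n' | grep -E -c 'ab?c')   # ac and abc: 2

echo "$matched_count $plus $star $optional"
rm -r "$tmpdir"
```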
Matching multiple patterns ( |, ( and ) )
• when a group of commands has to be executed regularly, they can be stored in a
file and the file itself executed as a shell script or shell program
• we normally use the .sh extension for shell scripts, though it’s not mandatory
• shell scripts are executed in a separate child shell process and this sub-shell need
not be of the same type as your login shell
• in other words, even if your login shell is Bourne, you can use a Korn sub-shell to
run your script
• by default, parent and child shells belong to same type, but you can provide a
special interpreter line in the first line of the script to specify a different shell for
your script
• use a text editor ( vi editor ) to create the shell script, script.sh
• the below script runs three echo commands and shows the use of variable
evaluation and command substitution
#!/bin/sh
# script.sh: Sample shell script
echo "Today's date: `date`"
echo "This month's calendar:"
cal `date "+%m %Y"`
echo "My shell: $SHELL"
• to run this script, make it executable first and then invoke the script name
$ chmod +x script.sh
$ script.sh
$ ./script.sh
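The create, chmod and run sequence can be sketched end to end. The script body here is a one-line stand-in, not the script shown above, and it is written to a temporary directory that is assumed to allow execution.

```shell
tmpdir=$(mktemp -d)

# Write a tiny script; the quoted 'EOF' keeps the shell from expanding anything in it.
cat > "$tmpdir/script.sh" <<'EOF'
#!/bin/sh
echo "Hello from script.sh"
EOF

chmod +x "$tmpdir/script.sh"            # make it executable first
by_path=$("$tmpdir/script.sh")          # then invoke it by pathname
by_shell=$(sh "$tmpdir/script.sh")      # alternatively, hand the file to a shell explicitly

echo "$by_path"
echo "$by_shell"
rm -r "$tmpdir"
```

Invoking the script by a pathname such as ./script.sh avoids depending on the current directory being in PATH.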
read : Making Scripts Interactive
• the read statement is the shell’s internal tool for taking input from the user, i.e.
making scripts interactive
• it is used with one or more variables
• input supplied through the standard input is read into these variables
• when you use a statement like
read name
• the script pauses at that point to take input from the keyboard
• whatever you enter is stored in the variable name
• since this is a form of assignment no $ is used before name
• A single read statement can be used with one or more variables
• read pname filename
#!/bin/bash
echo "Enter the pattern to be searched"
read pname
echo "Enter the filename to be used"
read fname
echo "Searching for $pname from file $fname"
grep "$pname" "$fname"
echo "Selected records shown above"
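read can be tried without a keyboard by supplying its standard input from a here-document. This is a sketch for experimentation; the input words are arbitrary.

```shell
# One read, two variables: the words on the input line are assigned in order.
read pname fname <<EOF
director emp.lst
EOF
echo "Searching for $pname from file $fname"

# With more words than variables, the last variable receives the remainder.
read first rest <<EOF
jai sharma director
EOF
echo "first=$first rest=$rest"
```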
Using command line arguments
if command is successful
then
execute commands
elif command is successful
then
execute commands
else
execute commands
fi
#!/bin/sh
if grep "^$1" /etc/passwd 2>/dev/null
then
echo "Pattern found"
else
echo "Pattern not found"
fi
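The same if construct can be tested without touching /etc/passwd. In this sketch, users is a made-up stand-in file and find_user is a hypothetical function wrapping the logic of the script above.

```shell
tmpdir=$(mktemp -d)
printf 'kumar:x:1001\nsharma:x:1002\n' > "$tmpdir/users"   # stand-in for /etc/passwd

find_user() {
    # grep's exit status drives the if: 0 means at least one line matched.
    if grep "^$1" "$tmpdir/users" >/dev/null 2>&1
    then
        echo "Pattern found"
    else
        echo "Pattern not found"
    fi
}

r1=$(find_user kumar)
r2=$(find_user henry)
echo "$r1 / $r2"
rm -r "$tmpdir"
```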
Using test And [ ] to evaluate expressions
• when you use if to evaluate expressions, you need the test statement because
the true and false values returned by expressions can’t be directly handled by if
• test uses certain operators to evaluate the condition on its right and returns
either a true or false exit status, which is then used by if to make the decisions
• test works in three ways
• compares two numbers
• compares two strings or a single one for a null value
• checks a file’s attributes
Numeric Comparison
• -eq - equal to
• -ne - not equal to
• -gt - greater than
• -ge - greater than or equal to
• -lt - less than
• -le - less than or equal to
• numerical comparison operators always start with a -, followed by a two-letter
string, and are enclosed on either side by whitespace
• numeric comparison in the shell is confined to integer values only; some older shells
simply truncate decimal values, while others ( bash among them ) reject them with an "integer expression expected" error
$ x=5; y=7; z=7.2
$ test $x -eq $y ; echo $?
$ test $x -lt $y ; echo $?
$ test $z -gt $y ; echo $?
$ test $z -eq $y ; echo $?
#!/bin/bash
if test $# -eq 0; then
echo "Usage: $0 pattern file" >/dev/tty
elif test $# -eq 2; then
grep "$1" "$2" || echo "$1 not found in $2" >/dev/tty
else
echo "You didn't enter two arguments" >/dev/tty
fi
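The exit statuses returned by test can be captured and inspected directly. A minimal sketch using integer values only; the set -- line simulates two command-line arguments so that the $# check can run outside a script.

```shell
x=5; y=7

# An exit status of 0 means true, 1 means false.
test "$x" -eq "$y" && eq_status=0 || eq_status=$?   # 5 equal to 7? false
test "$x" -lt "$y" && lt_status=0 || lt_status=$?   # 5 less than 7? true
test "$y" -ge "$x" && ge_status=0 || ge_status=$?   # 7 greater than or equal to 5? true
echo "-eq: $eq_status  -lt: $lt_status  -ge: $ge_status"

# The argument-count check from the script, with the arguments simulated.
set -- pattern file
if test $# -eq 2; then
    argcheck="two arguments"
else
    argcheck="usage error"
fi
echo "$argcheck"
```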
shorthand for test
• test can be used to compare strings with another set of operators as listed below
• s1 = s2 - string s1 is equal to s2
• s1 != s2 - string s1 is not equal to s2
• -n stg - stg is not a null string
• -z stg - stg is a null string
• stg - string stg is assigned and not null
• s1 == s2 - string s1 is equal to s2 ( Korn and Bash shell only )
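The string operators can be checked the same way as the numeric ones. In this sketch the variable names and values are arbitrary; the operands are double-quoted so that an empty string still counts as an argument to test.

```shell
name="kumar"; empty=""

test "$name" = "kumar"  && same=0    || same=$?      # equal strings: true
test "$name" != "metal" && differ=0  || differ=$?    # different strings: true
test -n "$name"         && nonnull=0 || nonnull=$?   # name is not null: true
test -z "$empty"        && isnull=0  || isnull=$?    # empty is null: true
test -z "$name"         && wrong=0   || wrong=$?     # name is not null, so -z is false

echo "$same $differ $nonnull $isnull $wrong"
```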
The case conditional