18cs56 Unix Notes
Chapter 1: Introduction
1. Unix Components/Architecture
2. Features of Unix
3. The UNIX Environment and UNIX Structure, Posix and Single Unix specification
4. General features of Unix commands/ command structure. Command arguments and options
5. Basic Unix commands such as echo, printf, ls, who, date, passwd, cal, Combining commands
Unix Components/Architecture
Apart from providing support to user's program, kernel also does important housekeeping.
It manages the system's memory, schedules processes, and decides their priorities and so on.
The kernel has to do a lot of this work even if no user program is running.
The kernel is also called the operating system: a program's gateway to the computer's
resources.
The Shell
Computers don't have any capability of translating commands into action.
That requires a command interpreter, also called as the shell.
The shell acts as the interface between the user and the kernel.
Most of the time there is only one kernel running on the system, but there could be
several shells running, one for each user logged in.
The shell accepts commands from the user, rebuilds a command if required,
and finally communicates with the kernel to see that the command is executed.
Example:
$ echo VTU      Belagavi      # Shell rebuilds the echo command by removing the extra spaces
VTU Belagavi
The Process
A process is the name given to a file when it is executed as a program (a process is a
program under execution).
We can say a process is the “time image” of an executable file.
UNIX provides tools to control processes: move them between foreground and background, and
kill them.
1. A Multiuser System
UNIX is a multiprogramming system: it permits multiple programs to run and compete for the
attention of the CPU.
This can happen in two ways:
- Multiple users can run separate jobs
- A single user can also run multiple jobs
In a single-user system, the CPU, memory and hard disk are all dedicated to a single user.
In UNIX, the resources are shared between all users, so UNIX is also a multiuser system.
2. A Multitasking System
A single user can also run multiple tasks concurrently.
UNIX is a multitasking system.
It is usual for a user to edit a file, print another one on the printer, send email to a friend and browse
the WWW, all without leaving any of the applications.
The kernel is designed to handle a user's multiple needs.
In a multitasking environment, a user sees one job running in the foreground; the rest run in the
background.
User can switch jobs between background and foreground, suspend, or even terminate them.
5. Pattern Matching
UNIX has very sophisticated pattern matching features.
Example: The * (which matches zero or more characters) is a special character used by the system.
6. Programming Facility
The UNIX shell is also a programming language; it was designed for the programmer, not for the end user.
It has all the necessary ingredients, like control structures, loops and variables, that establish it as a
powerful programming language.
These features are used to design shell scripts – programs that can also invoke UNIX commands.
Many of the system's functions can be controlled and automated by using these shell scripts.
7. Documentation
The principal online help facility available is the man command, which remains the most
important reference for commands and their configuration files.
Apart from the man documentation, there's a vast ocean of UNIX resources available on the Internet.
A variable that specifies how an operating system or another program runs, or the devices
that the operating system recognizes.
Set of dynamic values that can affect the way running processes will behave on a computer.
Basic Unix commands such as echo, printf, ls, who, date, passwd, cal, Combining
commands
The echo command is used in shell scripts to display a message on the terminal, or to issue a
prompt for taking user input.
$echo $SHELL
/usr/bin/bash
echo can be used with different escape sequences:
Constant   Meaning
\a         Audible Alert (Bell)
\b         Back Space
\f         Form Feed
\n         New Line
\r         Carriage Return
\t         Horizontal Tab
\v         Vertical Tab
\\         Backslash
\0n        ASCII character represented by the octal value n
The printf command is available on most modern UNIX systems, and is the one we can use
instead of echo. The command in its simplest form can be used in the same way as echo:
printf also accepts all escape sequences used by echo, but unlike echo, it doesn't
automatically insert a newline unless \n is used explicitly. printf also uses formatted
strings in the same way the C language function of the same name uses them:
The %s format string acts as a placeholder for the value of $SHELL, and printf replaces %s
with the value of $SHELL. %s is the standard format used for printing strings. printf uses
many of the formats used by C's printf function. Here are some of the commonly used ones:
%s – String
Prof. Mamatha B Dept of CSE Page 7
Module 1_UNIX Programming (18CS56) 2022-2023
Example:
$ printf "The value of 255 is %o in octal and %x in hexadecimal\n" 255 255
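The placeholders described above can be tried out with a short sketch. The message text and the variable name conversions are only illustrative.

```shell
# printf does not add a newline by itself, so \n is written explicitly.
printf "My current shell is %s\n" "$SHELL"

# %d, %o and %x print the same integer in decimal, octal and hexadecimal.
conversions=$(printf "%d %o %x" 255 255 255)
echo "$conversions"
```

Each placeholder consumes one argument, in order, so three placeholders need three values.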
$ who
root tty7 2017-09-04 16:38 (:0)
root tty17 2017-09-04 16:38 (:0)
$_
One can display the current date with the date command, which shows the date and time to
the nearest second:
$ date
Mon Sep 4 16:40:02 IST 2017
The command can also be used with suitable format specifiers as arguments. Each format is
preceded by the + symbol, followed by the % operator and a single character describing the
format. For instance, you can print only the month using the format +%m:
$ date +%m
09
Or the month name:
$ date +%h
Sep
Or you can combine them in one command:
$ date +"%h %m"
Sep 09
There are many other format specifiers, and the useful ones are listed below:
- d – The day of month (01 - 31)
- y – The last two digits of the year.
- H, M and S – The hour, minute and second, respectively.
- D – The date in the format mm/dd/yy
- T – The time in the format hh:mm:ss
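As a minimal sketch, the specifiers listed above can be combined inside one quoted format string. The variable names today and now are illustrative, and the output depends on when the commands are run.

```shell
# Day/month/year and the time, each built from single-letter specifiers.
today=$(date +"%d/%m/%y")
now=$(date +"%H:%M:%S")
echo "Date: $today  Time: $now"

# %D and %T are shorthands for %m/%d/%y and %H:%M:%S respectively.
date +"%D %T"
```

Quoting the format string is needed as soon as it contains a space.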
6. Passwd: Changing your password
cal command can be used to see the calendar of any specific month or a complete year.
Syntax:
cal [ [ month] year ]
Everything within the square brackets is optional. So cal can be used without any arguments, in
which case it displays the calendar of the current month:
$ cal
   September 2017
Su Mo Tu We Th Fr Sa
                1  2
 3  4  5  6  7  8  9
10 11 12 13 14 15 16
17 18 19 20 21 22 23
24 25 26 27 28 29 30
The syntax shows that cal can be used with arguments: the month is optional, but the year must be given whenever a month is specified.
To see the calendar for August 2017, we need to use two arguments as shown below:
$ cal 8 2017
    August 2017
Su Mo Tu We Th Fr Sa
       1  2  3  4  5
 6  7  8  9 10 11 12
13 14 15 16 17 18 19
20 21 22 23 24 25 26
27 28 29 30 31
You can't hold the calendar of a year in a single screen page; it scrolls off rapidly before you
can use [Ctrl-s] to make it pause. To make cal pause, connect it to a pager such as more
using the | symbol:
$ cal 2017 | more
8. Combining commands
UNIX allows you to specify more than one command on a single command line.
Example:
$ wc note; ls -l note #Two commands combined here using ; (semicolon)
2 3 16 note
-rw-rw-r-- 1 mahesh mahesh 16 Jan 30 09:35 note
This is
a three-line text
message
When the shell executes a command from its own set of built-in commands, which are not
stored as separate files in the /bin directory, it is called an internal command.
Internal commands can be executed at any time and do not depend on a separate file.
If the command has an independent existence as a file in the /bin directory, it is called an
external command.
External commands are loaded when the user requests them, and each one runs as a separate
process.
Examples:
$ type echo # echo is an internal command
echo is shell built-in
$ type ls # ls is an external command
ls is /bin/ls
If a command exists both as an internal and an external one, the shell executes the internal command
only.
Internal commands have higher priority than external commands of the same name.
The type command describes how its argument would be interpreted if used as a command. It
is also used to find out whether a command is built-in or an external binary file.
When you execute the date command, the shell locates this file in the /bin directory and makes
arrangements to execute it.
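A short sketch of type applied to a few commands. The exact wording of the messages and the path reported for ls vary between shells and systems; the variable name cd_kind is illustrative.

```shell
# type reports whether the shell will run a built-in or an external file.
type echo
type ls

# cd has to be a built-in: it changes the shell's own working directory.
cd_kind=$(type cd)
echo "$cd_kind"
```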
The UNIX system provides a special login name for the exclusive use of the administrator;
it is called root. This account doesn't need to be separately created but comes with
every system. Its password is generally set at the time of installation of the system and
has to be used on logging in:
Becoming the superuser at login time
login: root
Password: ********* [Enter]
#-
/sbin:/bin:/usr/sbin:/usr/bin:/usr/dt/bin
#pwd
/home/abc
Though the current directory doesn't change, the # prompt indicates that abc now has the
powers of a superuser. To be in root's home directory on superuser login, use su -l.
Users often rush to the administrator with the complaint that a program has
stopped running. The administrator first tries running it in a simulated environment.
The su command, when used with a - (minus), recreates the user's environment without
taking the login-password route:
$ su - abc
This sequence executes abc's .profile and temporarily creates abc's environment.
su runs in a separate sub-shell, so this mode is terminated by pressing [Ctrl-d] or
using exit.
1. Naming files.
5. Reaching required files- the PATH variable, manipulating the PATH, Relative and absolute pathnames.
7. The dot (.) and double dots (..) notations to represent present and parent directories and their usage in
relative path names.
8. File related commands – cat, mv, rm, cp, wc and od commands.
The File
The file is the container for storing information.
Neither a file's size nor its name is stored in the file.
All file attributes such as file type, permissions, links, owner, group owner etc are kept in a
separate area of the hard disk, not directly accessible to humans, but only to kernel.
UNIX treats directories and devices as files as well.
All physical devices like hard disk, memory, CD-ROM, printer and modem are treated as files.
Directory File
A directory contains no data but keeps some details of the files and subdirectories that it contains.
A directory file contains an entry for every file and subdirectory that it houses. If you have 20
files in a directory, there will be 20 entries in the directory.
Each entry has two components:
- the filename
- a unique identification number for the file or directory (called the inode number).
When any file is created or removed, the kernel automatically updates its corresponding directory by
adding or removing the entry (inode number and filename) associated with that file.
Device File
Activities such as installing software from a CD-ROM, printing files and backing up data files to tape
are all performed by reading or writing the file representing the device.
An advantage of device files is that some of the commands used to access an ordinary file also work
with a device file.
Device filenames are generally found in a single directory structure, /dev.
The attributes of every file are stored on the disk.
The kernel identifies a device from its attributes and then uses them to operate the device.
A key feature of the UNIX file system is that there is a top, which serves as the reference point for all files.
This top is called root and is represented by a / (forward slash).
The root is actually a directory.
The root directory (/) has a number of subdirectories under it.
The subdirectories in turn have more subdirectories and other files under them.
Every file apart from root, must have a parent, and it should be possible to trace the ultimate
parentage of a file to root.
In parent-child relationship, the parent is always a directory.
The home directory is the parent of mthomas. / is the parent of home and the grandparent of
mthomas.
If a user logs in using the login name kumar, the user will land in a directory that could have the
pathname /home/kumar.
The shell variable HOME holds the pathname of the home directory.
$echo $HOME
/home/kumar
2.5 Reaching required files- the PATH variable, manipulating the PATH, Relative and
absolute pathnames.
PATH
A shell variable that contains a colon-delimited list of directories that the shell will look through to locate a
command invoked by the user.
The PATH generally includes /bin and /usr/bin for nonprivileged users and /sbin and /usr/sbin for the
superuser.
Use echo to evaluate this variable and display the directory list, separated by colons:
$ echo $PATH
/bin:/usr/bin:/usr/local/bin:/usr/ccs/bin:/usr/local/java/bin:.
There are six directories in this colon separated list.
When you issue a command, the shell searches this list in the sequence specified to locate and execute it.
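Manipulating PATH amounts to reassigning the variable. The sketch below appends one directory for the current session; $HOME/bin is a hypothetical directory chosen for illustration, and last_entry is an illustrative variable name.

```shell
# Show each directory of PATH on its own line, in search order.
echo "$PATH" | tr ':' '\n'

# Append a directory (hypothetical: $HOME/bin) to the search list.
PATH="$PATH:$HOME/bin"

# The last entry is now the directory just added.
last_entry=$(echo "$PATH" | tr ':' '\n' | tail -n 1)
echo "$last_entry"
```

Because the shell searches the list in sequence, a directory appended at the end is consulted only after all the earlier ones.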
Absolute Pathnames:
It shows a file’s location with reference to the top, i.e., root
It is simply a sequence of directory names separated by slashes
e.g.: /home/kumar
Suppose we are placed in /usr and want to access the file login.sql which is present in /home/kumar,
we can give the pathname of the file as below:
cat /home/kumar/login.sql
If the first character of a pathname is /, the file's location must be determined with respect to root (the first /). Such
a pathname is called an absolute pathname.
If you know the location of a particular command, you can precede its name with the complete path.
Since date resides in /bin (or /usr/bin), you can use the absolute pathname:
$ /bin/date
Thu Sep 1 09:39:55 IST 2020
2.6 The dot (.) and double dots (..) notations to represent present and parent directories
and their usage in relative path names.
A relative path is defined as the path relative to the present working directory (shown by pwd). It starts at your
current directory and never starts with a /.
A relative pathname can specify the location of a file using the symbols . and ..
. (single dot) –represents the current directory
.. (two dots) –represents the parent directory
Assume that we are in /home/kumar/progs/data/text. We can use .. as an argument to cd to move to the
parent directory, /home/kumar/progs/data, as shown below:
$ pwd
/home/kumar/progs/data/text
$ cd ..          # moves one level up
$ pwd
/home/kumar/progs/data
To move to /home , we can use the relative path name as follows:
$ pwd
/home/kumar/pis
$ cd ../..       # moves two levels up
$ pwd
/home
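The same navigation can be rehearsed safely in a scratch directory. This is only a sketch: the tree created under mktemp -d and the names top, one_up and final are assumptions made for the demonstration.

```shell
# Build a small tree under a scratch directory so nothing real is touched.
top=$(mktemp -d)
mkdir "$top/kumar" "$top/kumar/progs" "$top/kumar/progs/data"

cd "$top/kumar/progs/data"
cd ..                      # one level up: .../kumar/progs
one_up=$(basename "$PWD")
cd ../..                   # two levels up: the scratch directory itself
cd ./kumar                 # . names the current directory explicitly
final=$(basename "$PWD")
echo "$one_up $final"
```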
User can move around the UNIX file system using cd (change directory) command.
When used with an argument, cd changes the current directory to the directory specified as the argument, here progs:
$ pwd
/home/kumar
$ cd progs
$ pwd
/home/kumar/progs
Here we are using the relative pathname of the progs directory. The same can be done with the absolute
pathname also.
$ cd /home/kumar/progs
$ pwd
/home/kumar/progs
$ cd /bin
$ pwd
/bin
cd can also be used without arguments:
$ pwd
/home/kumar/progs
$ cd
$ pwd
/home/kumar
cd without an argument changes the working directory to the home directory.
$ cd /home/sharma
$ pwd
/home/sharma
$ cd
$ pwd
/home/kumar
mkdir: MAKING DIRECTORIES
$mkdir patch
You can create a number of subdirectories with one mkdir command:
$mkdir patch dba doc
For instance the following command creates a directory tree:
$mkdir progs progs/cprogs progs/javaprogs
This creates three directories: progs, and the two subdirectories cprogs and javaprogs under progs.
The order of specifying arguments is important. You cannot create subdirectories before creation of
the parent directory.
For instance, the following command doesn't work:
The rmdir (remove directory) command removes directories. You have to do this to remove
progs:
$ rmdir progs
If progs is an empty directory, it will be removed from the system.
The argument order that works with mkdir fails with rmdir.
With the reverse order, rmdir first removes cprogs and javaprogs from the progs directory and then removes progs from the system.
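The ordering rule for mkdir and rmdir can be checked with a small sketch run inside a scratch directory. The scratch directory and the variable names first_try and left are assumptions for the demonstration.

```shell
work=$(mktemp -d)      # scratch directory, keeps the demo harmless
cd "$work"

# Parent first, then children: the order of arguments matters.
mkdir progs progs/cprogs progs/javaprogs

# rmdir refuses to remove progs while it still has entries.
if rmdir progs 2>/dev/null; then
    first_try="removed"
else
    first_try="refused"
fi

# Children first, parent last: the reverse of the mkdir order.
rmdir progs/cprogs progs/javaprogs progs
[ -d progs ] && left="still there" || left="gone"
echo "$first_try, $left"
```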
rmdir : Things to remember
2.8 File related commands – cat, mv, rm, cp, wc and od commands.
cat: DISPLAYING AND CREATING FILES
cat command is used to display the contents of a small file on the terminal.
$ cat cprogram.c
#include <stdio.h>
void main()
{
printf("hello");
}
Like many other commands, cat accepts more than one filename as arguments.
The contents of the second file are shown immediately after the first file without any header
information. So cat concatenates two files, hence its name.
cat OPTIONS
Without any option, cat displays text files. Nonprinting ASCII characters can be displayed
with the -v option.
Numbering Lines (-n)
The -n option numbers lines. This numbering option helps programmers in debugging programs.
cat is also useful for creating a file. Enter the command cat, followed by the > character and the
filename.
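A minimal sketch of creating, numbering and concatenating files with cat. A here-document stands in for typing the text at the terminal, and the filenames note and twice, like the scratch directory, are illustrative.

```shell
scratch=$(mktemp -d)
cd "$scratch"

# cat > file copies standard input into the file; the here-document
# replaces typing the lines and pressing [Ctrl-d].
cat > note <<'EOF'
This is
a three-line text
message
EOF

cat -n note                      # number the lines while displaying them
cat note note > twice            # two arguments are concatenated in order
line_count=$(wc -l < twice)
echo "$line_count"
```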
The mv command renames (moves) files. Its two main functions are:
1. It renames a file (or directory)
2. It moves a group of files to a different directory
It doesn't create a copy of the file; it merely renames it. mv replaces the filename in the existing
directory entry with the new name, so no additional space is consumed on disk during renaming.
E.g.: To rename the file csb as csa, we can use the following command:
$ mv csb csa
If the destination file doesn't exist, it is created by the rename. If it already exists, mv
replaces it with the renamed file.
A group of files can be moved to a directory.
E.g.: To move the three files ch1, ch2 and ch3 to the directory module:
$ mv ch1 ch2 ch3 module
Similarly, to move three chapter files to the directory unix:
$ mv chap1 chap2 chap3 unix
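Both uses of mv can be sketched in a scratch directory. The empty files created with touch and the variable name moved are assumptions for the demonstration.

```shell
scratch=$(mktemp -d)
cd "$scratch"
touch csb ch1 ch2 ch3
mkdir module

mv csb csa              # rename: the directory entry changes, the data is not copied
mv ch1 ch2 ch3 module   # several sources: the last argument must be a directory

moved=$(ls module | tr '\n' ' ')
echo "$moved"
```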
rm: DELETING FILES
The rm command deletes one or more files.
Eg: Following command deletes three files:
rm options
Interactive Deletion (-i): Asks the user for confirmation before removing each file:
$ rm -i ch1 ch2
rm: remove ch1 (yes/no)? y
A y removes the file (ch1); any other response, like n or any other key, leaves the file
undeleted.
Recursive deletion (-r or -R): It performs a recursive search for all directories and files
within these subdirectories. At each stage it deletes everything it finds.
$ rm -r *          # Also removes directories, as rmdir does
It deletes all files in the current directory and all its subdirectories.
Forcing Removal (-f): rm prompts for removal if a file is write-protected. The -f
option overrides this minor protection and forces removal.
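The options above can be tried without risk inside a scratch directory. This is a sketch only: the file names and the variable remaining are illustrative, and -i is left out because it needs a person at the keyboard.

```shell
scratch=$(mktemp -d)
cd "$scratch"
mkdir old old/sub
touch ch1 ch2 old/a old/sub/b

rm ch1 ch2              # plain removal of ordinary files
rm -r old               # -r descends into old and removes it with its contents

# -f suppresses the prompt for a write-protected file.
touch locked
chmod a-w locked
rm -f locked

remaining=$(ls | wc -l)
echo "$remaining"
```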
The cp command can be used to copy more than one file with a single invocation of the
command. In this case the last filename must be a directory.
E.g.: To copy the files ch1, ch2 and ch3 to the directory module, use cp as
$ cp ch1 ch2 ch3 module
The files will have the same names in module. If the files are already resident in module,
they will be overwritten. The module directory should already exist, because
cp cannot create the directory here.
UNIX system uses * as a shorthand for multiple filenames.
Eg:
$ cp ch* usp #Copies all the files beginning with ch
cp options
Interactive Copying (-i): The -i option warns the user before overwriting the
destination file. If unit1 exists, cp prompts for a response:
$ cp -i ch1 unit1
cp: overwrite unit1 (yes/no)? y
A y at this prompt overwrites the file; any other response leaves it uncopied.
Copying Directory Structures (-R): Many UNIX commands can behave recursively, descending a
directory and examining all files in its subdirectories.
-R : behaves recursively to copy an entire directory structure
$ cp -R usp newusp
$ cp -R class newclass
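A short sketch of the three forms of cp discussed above, run in a scratch directory. The file contents, the notes subdirectory and the variable name copied are assumptions made for the demonstration.

```shell
scratch=$(mktemp -d)
cd "$scratch"
mkdir usp usp/notes module
echo "unit one" > ch1
echo "unit two" > ch2
echo "draft" > usp/notes/todo

cp ch1 ch2 module        # copies keep their names inside module
cp ch* usp               # the shell expands ch* before cp runs
cp -R usp newusp         # newusp does not exist, so it becomes a copy of usp

copied=$(cat newusp/notes/todo)
echo "$copied"
```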
$ wc ofile
4 20 97 ofile
$ od -b file
0000000 164 150 151 163 040 146 151 154 145 040 151 163 040 141 156 040
0000020 145 170 141 155 160 154 145 040 146 157 162 040 157 144 040 143
0000040 157 155 155 141 156 144 012 136 144 040 165 163 145 144 040 141
0000060 163 040 141 156 040 151 156 164 145 162 162 165 160 164 040 153
0000100 145 171
-c character option
Now it shows each printable character below its corresponding ASCII octal representation:
$ od -bc file
0000000 164 150 151 163 040 146 151 154 145 040 151 163 040 141 156 040
          t   h   i   s       f   i   l   e       i   s       a   n
0000020 145 170 141 155 160 154 145 040 146 157 162 040 157 144 040 143
          e   x   a   m   p   l   e       f   o   r       o   d       c
0000040 157 155 155 141 156 144 012 136 144 040 165 163 145 144 040 141
          o   m   m   a   n   d  \n   ^   d       u   s   e   d       a
0000060 163 040 141 156 040 151 156 164 145 162 162 165 160 164 040 153
          s       a   n       i   n   t   e   r   r   u   p   t       k
0000100 145 171
          e   y
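The correspondence between characters and octal values can be checked on a tiny input. In this sketch the three-byte string and the variable name bytes are illustrative; -An, which suppresses the offset column, is used only to make the result easy to compare.

```shell
# Octal bytes (-b) with the matching characters (-c) printed underneath.
printf 'od\n' | od -bc

# Each character maps to one octal value: o is 157, d is 144, newline is 012.
bytes=$(printf 'od\n' | od -An -b | tr -s ' ' | sed 's/^ //; s/ $//')
echo "$bytes"
```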
2. Changing file permissions: the relative and absolute permissions changing methods.
4. Directory permissions.
For example,
$ ls -d
This command will not list all subdirectories in the current directory.
For example,
$ ls -ld helpdir progs
Directories are easily identified in the listing by the first character of the first column, which
here shows d.
The significance of the attributes of a directory differs a good deal from an ordinary file.
To see the attributes of a directory rather than the files contained in it, use ls -ld with the
directory name. Note that simply using ls -d will not list all subdirectories in the current
directory. Strange though it may seem, ls has no option to list only directories.
2. Changing file permissions: the relative and absolute permissions changing methods.
File Ownership
When you create a file, you become its owner. Every owner is attached to a group owner.
Several users may belong to a single group, but the privileges of the group are set by the owner
of the file and not by the group members.
When the system administrator creates a user account, he has to assign these parameters to the
user:
UNIX follows a three-tiered file protection system that determines a file's access rights.
rwx           r-x            r--
owner/user    group owner    others
The first group (rwx) has all three permissions. The file is readable, writable and executable by the
owner of the file.
The second group (r-x) has a hyphen in the middle slot, which indicates the absence of write
permission by the group owner of the file.
The third group (r--) has the write and execute bits absent. This set of permissions is applicable to
others.
You can set different permissions for the three categories of users – owner, group and others.
A file or a directory is created with a default set of permissions, which can be determined by
umask.
Let us assume that the file permission for the created file is -rw-r--r--. Using the chmod
command, we can change the file permissions and allow the owner to execute his file.
Relative Permissions
chmod only changes the permissions specified in the command line and leaves the other
permissions unchanged.
Its syntax is:
chmod category operation permission filename(s)
chmod takes an expression as its argument which contains:
user category (user, group, others)
operation to be performed (assign or remove a permission)
type of permission (read, write, execute)
Category        Operation       Permission
u – user        + assign        r – read
g – group       - remove        w – write
o – others      = absolute      x – execute
a – all (ugo)
Initially,
-rw-r--r-- 1 kumar metal 1906 sep 23:38 xstart
$ chmod u+x xstart
The command assigns (+) execute (x) permission to the user (u); other permissions remain
unchanged.
To assign execute permission to all categories:
$ chmod ugo+x xstart or chmod a+x xstart or chmod +x xstart
$ ls -l xstart
Let the permissions initially be
-rwxr-xr-x 1 kumar metal 1906 sep 23:38 xstart
$ chmod go-r xstart
Then, it becomes
$ ls -l xstart
-rwx--x--x 1 kumar metal 1906 sep 23:38 xstart
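The relative method can be rehearsed on a throwaway file. In this sketch the scratch directory and the names after_add and after_remove are assumptions; the first ten characters of the ls -l line are taken as the permission string.

```shell
scratch=$(mktemp -d)
cd "$scratch"
touch xstart
chmod u=rw,go=r xstart    # start from -rw-r--r-- using the = operator

chmod u+x xstart          # add execute for the owner only
after_add=$(ls -l xstart | cut -c1-10)

chmod go-r xstart         # remove read from group and others
after_remove=$(ls -l xstart | cut -c1-10)

echo "$after_add -> $after_remove"
```

Only the named category and permission change at each step; everything else is left as it was.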
Absolute Permissions
Here, we need not know the current file permissions. We can set all nine permissions
explicitly.
A string of three octal digits is used as an expression.
The permissions of each category are represented by one octal digit, obtained by adding
the values of the individual permissions:
Read permission – 4 (binary 100)
Write permission – 2 (binary 010)
Execute permission – 1 (binary 001)
Octal Permissions Significance
0 --- no permissions
For example, chmod 761 xstart will assign all permissions to the owner, read and write permissions
to the group and only execute permission to the others.
777 signifies all permissions for all categories, but we can still prevent a file from being deleted.
000 signifies the absence of all permissions for all categories, but we can still delete the file.
It is the directory permissions that determine whether a file can be deleted or not.
Only the owner can change the file permissions. A user cannot change the permissions of another user's files.
But the system administrator can do anything.
The UNIX system, by default, never allows this situation, as you could never have a secure
system. Hence, directory permissions also play a very vital role here.
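The octal arithmetic can be verified directly. In this sketch the scratch directory and the names mode_761 and mode_644 are illustrative.

```shell
scratch=$(mktemp -d)
cd "$scratch"
touch xstart

# owner rwx = 4+2+1 = 7, group rw- = 4+2 = 6, others --x = 1
chmod 761 xstart
mode_761=$(ls -l xstart | cut -c1-10)

# 644 is read-write for the owner and read-only for everyone else
chmod 644 xstart
mode_644=$(ls -l xstart | cut -c1-10)

echo "$mode_761 $mode_644"
```

Unlike the relative method, each call sets all nine permission bits regardless of what they were before.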
4. Directory Permissions
It is possible that a file cannot be accessed even though it has read permission, and can be
removed even when it is write protected. The default permissions of a directory are,
rwxr-xr-x (755)
$ mkdir c_progs
$ ls -ld c_progs
Usually, on BSD and AT&T systems, there are two commands meant to change the ownership of
a file or directory. Let kumar be the owner and metal be the group owner. If sharma copies a file
of kumar, then sharma will become its owner and he can manipulate the attributes.
chown changes the file owner and chgrp changes the group owner.
On BSD, only the system administrator can use chown.
On other systems, only the owner can change both.
chown
Changing ownership requires superuser permission, so use the su command.
$ ls -l note
-rwxr----x 1 kumar metal 347 may 10 20:30 note
Once ownership of the file has been given away to sharma, the user file permissions that
previously applied to kumar now apply to sharma. Thus, kumar can no longer edit note since
there is no write privilege for group and others. He cannot get back the ownership either. But he
can copy the file to his own directory, in which case he becomes the owner of the copy.
chgrp
This command changes the file's group owner. No superuser permission is required.
# ls -l dept.lst
-rw-r--r-- 1 kumar metal 139 jun 8 16:43 dept.lst
THE SHELL
The shell sits between you and the operating system, acting as a command interpreter.
It reads your terminal input and translates the commands into actions taken by the system.
When you log into the system you are given a default shell.
When the shell starts up it reads its startup files and may set environment variables,
command search paths, and command aliases, and executes any commands specified
in these files.
The original shell was the Bourne shell, sh. Every Unix platform will either have the
Bourne shell, or a Bourne compatible shell available.
Even though the shell appears not to be doing anything meaningful when there is no
activity at the terminal, it swings into action the moment you key in something.
The following activities are typically performed by the shell in its interpretive cycle:
The shell issues the prompt and waits for you to enter a command.
After a command is entered, the shell scans the command line for metacharacters
and expands abbreviations (like the * in rm *) to recreate a simplified
command line.
It then passes on the command line to the kernel for execution.
The shell waits for the command to complete and normally can't do any work
while the command is running.
After the command execution is complete, the prompt reappears and the shell
returns to its waiting role to start the next cycle. You are free to enter another command.
1. Wild-Cards
Wild-card        Matches
*                Any number of characters including none
?                A single character
[ijk]            A single character, either an i, j or k
[x-z]            A single character that is within the ASCII range of characters x and z
[!ijk]           A single character that is not an i, j or k (Not in C shell)
[!x-z]           A single character that is not within the ASCII range of the characters x and z (Not in C shell)
{pat1,pat2...}   Pat1, pat2, etc. (Not in Bourne shell)
The * and ?
The metacharacter * is one of the characters of the shell's wild-card set.
It matches any number of characters (including none).
To list all files that begin with chap:
$ ls chap*
chap chap01 chap02 chap03 chap04 chap15 chap16 chap17 chapx chapy chapz
chap* matches the string chap. When the shell encounters this command line, it identifies the *
immediately as a wild card.
It then looks in the current directory and recreates the command line as below from the
filenames that match the pattern chap*:
ls chap chap01 chap02 chap03 chap04 chap15 chap16 chap17 chapx chapy chapz
Both * and ? operate with some restrictions. For example, the * doesn't match
filenames beginning with a . (dot) or the / of a pathname.
If you wish to list all hidden filenames in your directory having at least three
characters after the dot, the dot must be matched explicitly.
$ ls .???*
.bash_profile .exrc .netscape .profile
However, if the filename contains a dot anywhere but at the beginning, it need not be
matched explicitly.
Similarly, these characters don't match the / in a pathname. So, you cannot use
cd /usr?local to change to /usr/local.
The character class comprises a set of characters enclosed by square brackets, [
and ], but it matches a single character in the class.
The pattern [abd] is character class, and it matches a single character – an a,b or d.
Examples:
$ ls chap0[124]
chap01 chap02 chap04
$ls chap[x-z]
chapx chapy chapz
*.[!co] - To match all filenames with a single-character extension but not the .c or .o files
[!a-zA-Z]* - To match all filenames that don’t begin with an alphabetic character
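The patterns discussed so far can be compared side by side. In this sketch the files are created in a scratch directory purely for illustration, and echo is used instead of ls so that the expansion done by the shell is shown directly.

```shell
scratch=$(mktemp -d)
cd "$scratch"
touch chap chap01 chap02 chap03 chapx chapy chapz prog.c prog.o prog.h .profile

star=$(echo chap*)            # chap followed by anything, including nothing
one=$(echo chap?)             # chap followed by exactly one character
class=$(echo chap0[13])       # chap0 followed by 1 or 3
range=$(echo chap[x-z])       # chap followed by x, y or z
negated=$(echo *.[!co])       # one-character extension other than c or o
hidden=$(echo .???*)          # the leading dot is matched explicitly

printf '%s\n' "$star" "$one" "$class" "$range" "$negated" "$hidden"
```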
Quoting is enclosing the wild-card, or even the entire pattern, within quotes.
Anything within these quotes (barring a few exceptions) is left alone by the shell and
not interpreted.
When a command argument is enclosed in quotes, the meanings of all enclosed
special characters are turned off.
Examples:
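As a hedged illustration of quoting, the sketch below passes the same pattern to echo with and without quotes. The two files and the variable names are assumptions for the demonstration.

```shell
scratch=$(mktemp -d)
cd "$scratch"
touch chap01 chap02

unquoted=$(echo chap*)       # the shell expands * before echo runs
single=$(echo 'chap*')       # single quotes: * is passed literally
double=$(echo "chap*")       # double quotes also protect *

printf '%s\n' "$unquoted" "$single" "$double"
```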
Standard input
The standard input can represent three input sources:
Standard output
Standard error:
A file is opened by referring to its pathname, but subsequent read and write
operations identify the file by a unique number called a file descriptor.
The kernel maintains a table of file descriptors for every process running in the system.
The first three slots are generally allocated to the three standard streams as:
0 – Standard input
1 – Standard output
2 – Standard error
These descriptors are implicitly prefixed to the redirection symbols.
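The descriptor numbers can be written explicitly in front of the redirection symbols. In this sketch the file names present, missing, out.txt and err.txt are illustrative, and missing deliberately does not exist so that cat writes a diagnostic to standard error.

```shell
scratch=$(mktemp -d)
cd "$scratch"
echo "hello" > present

# 1> (or plain >) redirects standard output, 2> redirects standard error.
# cat fails on the missing file, so its exit status is ignored here.
cat present missing 1> out.txt 2> err.txt || true

# 0< (or plain <) connects a file to standard input.
wc -l 0< out.txt

out=$(cat out.txt)
[ -s err.txt ] && err="has a message" || err="empty"
echo "$out / $err"
```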
The shell can connect these streams using a special operator, the | (pipe), and avoid the creation of
a disk file.
The pipe is the third source and destination of standard input and standard output.
Examples:
$ ls -l | wc -l     # Counts the lines of the listing (one per file, plus the total line)
In a pipeline, all programs run simultaneously. A pipe has a built-in mechanism to control the
flow of the stream.
Since a pipe is both being read and written, the reader and writer have to act in unison.
If one operates faster than the other, then the appropriate driver has to readjust the flow.
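A minimal sketch of pipelines. The three empty files and the fruit names are assumptions for the demonstration, as are the variable names files and sorted_first.

```shell
scratch=$(mktemp -d)
cd "$scratch"
touch a b c

# ls writes one name per line into the pipe and wc -l counts them.
# No temporary file is created.
files=$(ls | wc -l)

# More than two commands can be chained; each reads what the previous one wrote.
sorted_first=$(printf 'pear\napple\nfig\n' | sort | head -n 1)

echo "$files $sorted_first"
```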
grep scans its input for a pattern and displays the lines containing the pattern, the line numbers or
the filenames where the pattern occurs. The command uses the following syntax:
$grep options pattern filename(s)
grep searches for pattern in one or more filename(s), or the standard input if no filename is
specified.
The first argument (except the options) is the pattern and the remaining arguments are filenames.
Examples:
grep silently returns the prompt when the pattern, here “president”, is not found in the file emp.lst.
When grep is used with multiple filenames, it displays the filenames along with the output.
Option        Significance
-i            Ignores case for matching
-v            Doesn't display lines matching expression
-n            Displays line numbers along with lines
-c            Displays count of lines containing the pattern
-l            Displays list of filenames only
-e exp        Matches multiple patterns
-f filename   Takes patterns from file, one per line
-E            Treats patterns as an ERE
-F            Matches multiple fixed strings
When you look for a name but are not sure of the case, use the -i (ignore) option.
The -v option selects all the lines except those containing the pattern.
It plays an inverse role by selecting the lines that do not contain the pattern.
The -n (number) option displays the line numbers containing the pattern, along with
the lines.
Here, the first column displays the line number in emp.lst where the pattern is found.
With the -e option, you can match the three agarwals by using grep like this:
You can place all the patterns in a separate file, one pattern
per line. grep uses the -f option to take patterns from a file:
$ grep -f patterns.lst emp.lst
Like the shell's wild-cards, which match similar filenames with a single expression,
grep uses an expression of a different type to match a group of similar patterns.
Unlike the shell's wild-cards, grep uses the following set of meta-characters to design an
expression that matches different patterns.
If an expression uses any of these meta-characters, it is termed a Regular Expression (RE).
The table below shows the BASIC REGULAR EXPRESSION (BRE) character set:
An RE lets you specify a group of characters enclosed within a pair of rectangular brackets, [ ],
in which case the match is performed for a single character in the group.
The *
Expression Significance
ch+ Matches one or more occurrences of character ch
ch? Matches zero or one occurrence of character ch
exp1 | exp2 Matches exp1 or exp2
GIF | JPEG Matches GIF or JPEG
(x1|x2)x3 Matches x1x3 or x2x3
(hard|soft)ware Matches hardware or software
The + and ?
+ - Matches one or more occurrences of the previous character
? - Matches zero or one occurrence of the previous character.
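The extended expressions in the table above can also be tried from a C program. The following is a minimal sketch that uses the POSIX regcomp and regexec functions with the REG_EXTENDED flag; the helper name ere_match is an assumption made for illustration and is not part of grep or of any library.

```c
#include <regex.h>
#include <stddef.h>

/* Returns 1 if text contains a match for the extended regular
   expression pattern, 0 if it does not, and -1 if the pattern
   cannot be compiled. ere_match is a hypothetical helper name. */
int ere_match(const char *pattern, const char *text)
{
    regex_t re;
    int rc;

    /* REG_EXTENDED selects ERE syntax, like grep -E */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

For example, ere_match("(hard|soft)ware", "software update") returns 1, while the same pattern applied to "firmware" returns 0.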
VARIABLE NAMES:
A variable is a character string to which we assign a value. The value assigned could be a
number, text, filename, device, or any other type of data.
A variable is nothing more than a pointer to the actual data. The shell enables you to create,
assign, and delete variables.
The name of a variable can contain only letters (a to z or A to Z), numbers (0 to 9) or the
underscore character ( _ ).
By convention, Unix shell variables have their names in UPPERCASE.
The following examples are valid variable names:-
VAR_1
VAR_2
TOKEN_A
DEFINING VARIABLE:
Variables are defined as follows:
variable_name=variable_value
There must be no spaces on either side of the = sign.
For example
NAME="Sumithabha Das"
ACCESSING VARIABLES:
To access the value stored in a variable, prefix its name with the dollar sign ($).
For example, the following statement accesses the value of the defined variable NAME and
prints it on standard output:
echo $NAME
ENVIRONMENT VARIABLES:
An environment variable is a variable that is available to any child process of the shell. Some programs need
environment variables in order to function correctly. Usually a shell script defines only those
environment variables that are needed by the programs that it runs.
SHELL: points to the shell defined as default.
DISPLAY: contains the identifier for the display that X11 programs should use by default.
HOME: Indicates the home directory of the current user; it is the default argument for the cd built-in
command.
IFS: Indicates the Internal Field Separator that is used by the parser for word splitting after
expansion.
PATH: Indicates the search path for commands. It is a colon-separated list of directories in
which the shell looks for commands.
PWD: Indicates the current working directory as set by the cd command.
RANDOM: Generates a random integer between 0 and 32767 each time it is referenced.
SHLVL: Increments by one each time an instance of bash is started.
UID: Expands to the numeric user ID of the current user, initialized at shell startup.
The following sample shows a few environment variables:
$ echo $HOME
/root
$ echo $DISPLAY
$ echo $TERM
xterm
$ echo $PATH
/usr/local/bin:/bin:/usr/bin:/home/amrood/bin:/usr/local/bin
$
Files in a UNIX and POSIX system may be any one of the following types:
Regular file
Directory file
FIFO file
Block device file
Character device file
Symbolic link file
There are special APIs to create these types of files. There is also a set of generic APIs that can be used to
manipulate and create more than one type of file. These APIs are:
open
This is used to establish a connection between a process and a file, i.e. it is used to open an existing
file for data transfer, or it may also be used to create a new file.
The value returned by the open system call is the file descriptor (the row number in the file table), whose
entry leads to the inode information of the file.
The prototype of open function is
#include<sys/types.h>
#include<fcntl.h>
int open(const char *pathname, int accessmode, mode_t permission);
There are other access modes, which are termed as access modifier flags, and one or more of the
following can be specified by bitwise-ORing them with one of the above access mode flags to alter the
access mechanism of the file.
To illustrate the use of the above flags, the following example statement opens a file called
/usr/divya/usp for read and write in append mode:
int fdesc = open("/usr/divya/usp", O_RDWR | O_APPEND, 0);
If the file is opened read-only, modifier flags that only affect writing (such as O_APPEND and O_TRUNC) should not be used.
If a file is opened write-only or read-write, any of the modifier flags may be used along with the access mode.
The third argument is used only when a new file is being created. The symbolic names for file
permission are given in the table in the previous page.
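The use of all three arguments of open can be sketched as follows. This is a minimal illustration, assuming that the file should be created when it does not yet exist; the helper name open_append and the chosen permission bits are assumptions, not taken from the notes.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/* Opens path for reading and writing in append mode. If the file does
   not exist it is created with permissions rw-r--r-- (further limited
   by the process umask). Returns the file descriptor, or -1 on error.
   open_append is a hypothetical helper name. */
int open_append(const char *path)
{
    return open(path,
                O_RDWR | O_APPEND | O_CREAT,             /* access mode + modifiers */
                S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH);  /* used only on creation */
}
```

Because O_APPEND is set, every write through the returned descriptor goes to the end of the file, regardless of the current file offset.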
creat
read
The read function fetches a fixed size of block of data from a file referenced by a given file descriptor.
The prototype of read function is:
#include<sys/types.h>
#include<unistd.h>
ssize_t read(int fdesc, void *buf, size_t nbyte);
write
The prototype of the write function is:
#include<sys/types.h>
#include<unistd.h>
ssize_t write(int fdesc, const void *buf, size_t size);
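The two prototypes above are usually used together in a loop. The sketch below copies data from one descriptor to another; the helper name copy_fd and the 4096-byte buffer size are assumptions chosen for illustration.

```c
#include <sys/types.h>
#include <unistd.h>

/* Copies everything that can be read from in_fd to out_fd.
   Returns the number of bytes copied, or -1 on a read or write error.
   copy_fd is a hypothetical helper name. */
ssize_t copy_fd(int in_fd, int out_fd)
{
    char buf[4096];
    ssize_t total = 0, n;

    /* read returns 0 at end of file and -1 on error */
    while ((n = read(in_fd, buf, sizeof buf)) > 0) {
        ssize_t done = 0;
        while (done < n) {      /* write may accept fewer bytes than asked */
            ssize_t w = write(out_fd, buf + done, (size_t)(n - done));
            if (w < 0)
                return -1;
            done += w;
        }
        total += n;
    }
    return n < 0 ? -1 : total;
}
```

The inner loop matters because write is allowed to transfer fewer bytes than requested, for example when the destination is a pipe.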
close
The close system call is used to terminate the connection to a file from a process.
The prototype of the close is
#include<unistd.h>
int close(int fdesc);
If successful, close returns 0.
If unsuccessful, close returns –1.
The argument fdesc refers to an opened file.
Close function frees the unused file descriptors so that they can be reused to reference other files.
This is important because a process may open up to OPEN_MAX files at any time and the close
function allows a process to reuse file descriptors to access more than OPEN_MAX files in the
course of its execution.
The close function de-allocates system resources such as the file table entry and the memory buffers allocated to
hold the read/write data.
fcntl
The fcntl function helps a user to query or set the access control flags and the close-on-exec flag of any file descriptor.
The prototype of fcntl is
#include<fcntl.h>
int fcntl(int fdesc, int cmd, ...);
The fcntl function is useful in changing the access control flag of a file descriptor.
For example, after a file is opened for blocking read-write access, a process that needs to change the
access to non-blocking and write-append mode can call:
int cur_flags=fcntl(fdesc,F_GETFL);
int rc=fcntl(fdesc,F_SETFL,cur_flags | O_APPEND | O_NONBLOCK);
The following example reports the close-on-exec flag of fdesc, then turns it on:
cout<<fdesc<<" close-on-exec: "<<fcntl(fdesc,F_GETFD)<<endl;
(void)fcntl(fdesc,F_SETFD,1); //turn on close-on-exec flag
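The following statements change the standard input of a process to a file called FOO:
int fdesc=open("FOO",O_RDONLY); //open FOO for read
close(0); //close standard input
if(fcntl(fdesc,F_DUPFD,0)==-1) perror("fcntl"); //descriptor 0 now refers to FOO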
Prof. Mamatha B Dept of CSE Page 6
Module 3_UNIX Programming (18CS56) 2022-2023
char buf[256];
int rc=read(0,buf,256); //read data from FOO
The dup and dup2 functions in UNIX perform the same file duplication function
as fcntl. They can be implemented using fcntl as:
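One possible way to express dup and dup2 in terms of fcntl is sketched below. The names my_dup and my_dup2 are assumptions used to avoid clashing with the real functions, and the sketch is not atomic the way the real dup2 is.

```c
#include <fcntl.h>
#include <unistd.h>

/* Sketch of dup: returns the lowest unused descriptor (>= 0) that
   refers to the same open file as fdesc, or -1 on error. */
int my_dup(int fdesc)
{
    return fcntl(fdesc, F_DUPFD, 0);
}

/* Sketch of dup2: makes new_fd refer to the same open file as fdesc.
   Unlike the real dup2, the close and the duplication are two
   separate steps here. */
int my_dup2(int fdesc, int new_fd)
{
    if (fcntl(fdesc, F_GETFD) == -1)    /* fdesc must be a valid descriptor */
        return -1;
    if (fdesc == new_fd)                /* nothing to do */
        return new_fd;
    close(new_fd);                      /* free new_fd if it was in use */
    return fcntl(fdesc, F_DUPFD, new_fd);
}
```

After close(new_fd), the F_DUPFD command returns the lowest free descriptor greater than or equal to new_fd, which in a single-threaded program is new_fd itself.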
lseek
The lseek function is used to change the file offset to a different value.
Thus lseek allows a process to perform random access of data on any opened file.
The prototype of lseek is
#include<sys/types.h>
#include<unistd.h>
off_t lseek(int fdesc, off_t pos, int whence);
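As an illustration of random access with lseek, the following sketch finds the size of an open file by seeking to its end and then restores the original offset. The helper name file_size is an assumption made for this example.

```c
#include <sys/types.h>
#include <unistd.h>

/* Returns the size in bytes of the file open on fdesc and leaves the
   current file offset unchanged. Returns -1 if the descriptor does not
   support seeking (for example a pipe or FIFO).
   file_size is a hypothetical helper name. */
off_t file_size(int fdesc)
{
    off_t cur, end;

    cur = lseek(fdesc, 0, SEEK_CUR);        /* remember where we are */
    if (cur == (off_t)-1)
        return -1;
    end = lseek(fdesc, 0, SEEK_END);        /* jump to the end of the file */
    if (end == (off_t)-1)
        return -1;
    if (lseek(fdesc, cur, SEEK_SET) == (off_t)-1)   /* go back */
        return -1;
    return end;
}
```

The third argument of lseek selects the reference point: SEEK_SET is the start of the file, SEEK_CUR the current offset, and SEEK_END the end of the file.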
link
The link function creates a new link for the existing file.
The prototype of the link function is
#include <unistd.h>
int link(const char *cur_link, const char *new_link);
#include<iostream.h>
#include<stdio.h>
#include<unistd.h>
unlink
The unlink function deletes a link of an existing file.
This function decreases the hard link count attribute of the named file, and removes the file name
entry of the link from its directory file.
The prototype of the unlink function is
#include <unistd.h>
int unlink(const char * cur_link);
If successful, the unlink function returns 0.
If unsuccessful, unlink returns –1.
The argument cur_link is a path name that references an existing file.
ANSI C defines the rename function, which performs a similar operation: it changes the name of a file.
The prototype of the rename function is:
#include<stdio.h>
int rename(const char * old_path_name,const char * new_path_name);
The UNIX mv command can be implemented using the link and unlink APIs as shown:
#include <iostream>
#include <unistd.h>
#include <string.h>
using namespace std;
int main(int argc, char *argv[])
{
    // two distinct path names are required
    if (argc != 3 || !strcmp(argv[1], argv[2]))
        cerr << "usage: " << argv[0] << " <old_link> <new_link>\n";
    else if (link(argv[1], argv[2]) == 0)   // create the new name
        return unlink(argv[1]);             // then remove the old name
    return 1;
}
stat, fstat
The stat and fstat functions retrieve the file attributes of a given file.
The only difference between stat and fstat is that the first argument of stat is a file pathname, whereas
the first argument of fstat is a file descriptor.
The prototypes of these functions are
#include<sys/stat.h>
#include<unistd.h>
int stat(const char *path_name, struct stat *statv);
int fstat(int fdesc, struct stat *statv);
The second argument to stat and fstat is the address of a struct stat-typed variable which is defined in
the <sys/stat.h> header.
Its declaration is as follows:
struct stat
{
    dev_t   st_dev;    /* file system ID */
    ino_t   st_ino;    /* file inode number */
    mode_t  st_mode;   /* contains file type and permission */
    nlink_t st_nlink;  /* hard link count */
    uid_t   st_uid;    /* file user ID */
    gid_t   st_gid;    /* file group ID */
    dev_t   st_rdev;   /* contains major and minor device# */
    off_t   st_size;   /* file size in bytes */
    time_t  st_atime;  /* last access time */
    time_t  st_mtime;  /* last modification time */
    time_t  st_ctime;  /* last status change time */
};
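The fields of struct stat are normally examined with the S_IS* test macros from <sys/stat.h>. The sketch below shows two small queries built on stat; the helper names is_directory and regular_file_size are assumptions made for illustration.

```c
#include <sys/types.h>
#include <sys/stat.h>

/* Returns 1 if path names a directory, 0 if it names some other kind
   of file, and -1 if stat fails (for example, no such file).
   is_directory is a hypothetical helper name. */
int is_directory(const char *path)
{
    struct stat statv;

    if (stat(path, &statv) == -1)
        return -1;
    return S_ISDIR(statv.st_mode) ? 1 : 0;   /* file type is kept in st_mode */
}

/* Returns the size in bytes of a regular file, or -1 if path cannot
   be examined or is not a regular file.
   regular_file_size is a hypothetical helper name. */
off_t regular_file_size(const char *path)
{
    struct stat statv;

    if (stat(path, &statv) == -1 || !S_ISREG(statv.st_mode))
        return -1;
    return statv.st_size;
}
```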
access
The access system call checks the existence and access permission of user to a named file.
The prototype of access function is:
#include<unistd.h>
int access(const char *path_name, int flag);
if(access("/usr/divya/usp.txt", F_OK)==-1)
    printf("file does not exist");
else
    printf("file exists");
chmod, fchmod
The chmod and fchmod functions change file access permissions for owner, group & others as
well as the set_UID, set_GID and sticky flags.
The process must have the effective UID of either the super-user or the owner of the file.
#include<sys/types.h>
#include<sys/stat.h>
#include<unistd.h>
int chmod(const char *pathname, mode_t flag);
int fchmod(int fdesc, mode_t flag);
The pathname argument of chmod is the path name of a file whereas the fdesc argument of fchmod
is the file descriptor of a file.
The chmod function operates on the specified file, whereas the fchmod function operates on a file
that has already been opened.
To change the permission bits of a file, the effective user ID of the process must be equal to the
owner ID of the file, or the process must have super-user permissions. The mode is specified as
the bitwise OR of the constants shown below.
Mode Description
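A common pattern is to read the current mode with stat, change a few bits, and pass the result to chmod. The following sketch removes write permission for group and others; the helper name drop_shared_write is an assumption made for this example.

```c
#include <sys/types.h>
#include <sys/stat.h>

/* Removes write permission for group and others from path while
   keeping all other permission bits. Returns 0 on success, -1 on error.
   drop_shared_write is a hypothetical helper name. */
int drop_shared_write(const char *path)
{
    struct stat statv;
    mode_t mode;

    if (stat(path, &statv) == -1)
        return -1;
    /* keep only the permission, set-UID, set-GID and sticky bits */
    mode = statv.st_mode & 07777;
    /* the new mode is a bitwise combination of the S_I* constants */
    return chmod(path, mode & ~(S_IWGRP | S_IWOTH));
}
```

Unlike open, chmod does not apply the process umask, so the file ends up with exactly the mode that is passed in.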
if (UID == (uid_t)-1)
    cerr << "Invalid user name";
else for (int i = 2; i < argc; i++)
    if (stat(argv[i], &statv)==0)
    {
utime Function
The utime function modifies the access time and the modification time stamps of a file.
The prototype of utime function is
#include<sys/types.h>
#include<unistd.h>
#include<utime.h>
int utime(const char *path_name, const struct utimbuf *times);
The time_t datatype is an arithmetic type (usually a signed integer) and its value is the number of seconds elapsed since the
birthday of UNIX: 00:00:00 UTC, January 1, 1970.
If the times (variable) is specified as NULL, the function will set the named file access and
modification time to the current time.
If the times (variable) is an address of the variable of the type struct utimbuf, the function will set the
file access time and modification time to the value specified by the variable.
Multiple processes may perform read and write operations on the same file concurrently.
This provides a means for data sharing among processes, but it also makes it difficult for any process to
determine when another process may overwrite data in the file.
To overcome this drawback, UNIX and POSIX systems support a file locking mechanism.
File locking is applicable to regular files.
A process can impose a write lock or a read lock on either a portion of a file or on the entire file.
The difference between a read lock and a write lock is that a write lock prevents other
processes from setting any overlapping read or write lock on the locked region.
A read lock, in contrast, only prevents other processes from setting overlapping write locks on
the locked region.
The intention of the write lock is to prevent other processes from both reading and writing the locked
region while the process that sets the lock is modifying the region, so a write lock is termed an "exclusive
lock".
The use of a read lock is to prevent other processes from writing to the locked region while the process that
sets the lock is reading data from the region.
Other processes are allowed to lock and read data from the locked region. Hence a read lock is also called
a "shared lock".
File locks are mandatory if they are enforced by the operating system kernel.
If a mandatory exclusive lock is set on a file, no process can use the read or write system calls to access the
data on the locked region.
These mechanisms can be used to synchronize reading and writing of shared files by multiple processes.
If a process locks up a file, other processes that attempt to write to the locked regions are blocked until the
former process releases its lock.
The problem with mandatory locks is that if a runaway process sets a mandatory exclusive lock on a file and never
unlocks it, no other process can access the locked region of the file until the runaway process is killed
or the system is rebooted.
If locks are not mandatory, they are advisory.
The kernel does not enforce advisory locks at the system call level.
If a process sets a read lock on a file, for example from address 0 to 256, then sets a write lock on the file
from address 0 to 512, the process will own only one write lock on the file from 0 to 512, the previous read
lock from 0 to 256 is now covered by the write lock and the process does not own two locks on the region
from 0 to 256. This process is called “Lock Promotion”.
Furthermore, if the process now unlocks the file from 128 to 480, it will own two write locks on the file:
one from 0 to 127 and the other from 481 to 512. This process is called "Lock Splitting".
UNIX systems provide fcntl function to support file locking. By using fcntl it is possible to impose read or
write locks on either a region or an entire file.
The prototype of fcntl is
#include<fcntl.h>
int fcntl(int fdesc, int cmd_flag, ...);
For file locking purposes, the third argument to fcntl is the address of a struct flock type variable.
This variable specifies a region of a file where lock is to be set, unset or queried.
struct flock
{
    short l_type;    /* what lock to be set or to unlock file */
    short l_whence;  /* reference address for the next field */
    off_t l_start;   /* offset from the l_whence reference addr */
    off_t l_len;     /* how many bytes in the locked region */
    pid_t l_pid;     /* pid of a process which has locked the file */
};
The l_whence, l_start, and l_len define a region of a file to be locked or unlocked.
The possible values of l_whence and their uses are:
A lock set by the fcntl API is an advisory lock but we can also use fcntl for mandatory locking purpose
with the following attributes set before using fcntl
1. Turn on the set-GID flag of the file.
2. Turn off the group execute right permission of the file.
Example Program
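#include <unistd.h>
#include <fcntl.h>
int main()
{
    int fd;
    struct flock lock;
    fd=open("divya",O_RDONLY);
    if(fd==-1) return 1;
    lock.l_type=F_RDLCK;       /* shared (read) lock */
    lock.l_whence=SEEK_SET;    /* offsets are measured from the start of the file */
    lock.l_start=10;
    lock.l_len=15;             /* lock bytes 10 to 24 */
    return fcntl(fd,F_SETLK,&lock)==-1;
}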
To allow a process to scan directories in a file system independent manner, a directory record is defined as
struct dirent in the <dirent.h> header for UNIX.
Some of the functions that are defined for directory file operations in the above header are
#include<unistd.h>
int rmdir (const char * path_name);
UNIX systems have defined additional functions for random access of directory file records.
Function Use
telldir Returns the file pointer of a given dir_fdesc
seekdir Changes the file pointer of a given dir_fdesc to a specified address
The following list_dir.C program illustrates the uses of the mkdir, opendir, readdir, closedir and rmdir APIs:
#include<iostream>
#include<stdio.h>
#include<sys/types.h>
#include<unistd.h>
#include<string.h>
#include<sys/stat.h>
#if defined(BSD) && !_POSIX_SOURCE
#include<sys/dir.h>
typedef struct direct Dirent;
#else
#include<dirent.h>
typedef struct dirent Dirent;
#endif
using namespace std;
int main(int argc, char* argv[])
{
    Dirent* dp;
    DIR* dir_fdesc;
    while(--argc>0)
    {
        if(!(dir_fdesc=opendir(*++argv)))
        {
            // the directory cannot be opened: try to create it
            if(mkdir(*argv,S_IRWXU | S_IRWXG | S_IRWXO)==-1) perror("opendir");
            continue;
        }
        // pass 0 counts the entries, pass 1 prints them
        for(int i=0;i<2;i++)
        {
            int cnt=0;
            while((dp=readdir(dir_fdesc)))
            {
                if(i) cout<<dp->d_name<<endl;
                if(strcmp(dp->d_name,".") && strcmp(dp->d_name,"..")) cnt++;
            }
            if(!cnt)
            {
                rmdir(*argv);          // empty directory: remove it
                break;
            }
            rewinddir(dir_fdesc);      // reset for the second pass
        }
        closedir(dir_fdesc);
    }
    return 0;
}
#include<sys/stat.h>
#include<unistd.h>
int mknod(const char* path_name, mode_t mode, int device_id);
The first argument pathname is the pathname of a device file to be created.
The second argument mode specifies the access permission, for the owner, group and others, also
the S_IFCHR or S_IFBLK flag to be assigned to the file.
The third argument device_id contains the major and minor device number.
Example
mknod("SCSI5",S_IFBLK | S_IRWXU | S_IRWXG | S_IRWXO,(15<<8) | 3);
The above call creates a block device file "SCSI5", for which all three permissions (read, write and execute)
are granted to user, group and others, with major number 15 and minor number 3.
On success mknod API returns 0 , else it returns -1
The following test_mknod.C program illustrates the use of the mknod, open, read, write and close APIs
on a character device file.
#include<iostream.h>
if(argc!=4)
{
    cout<<"usage:"<<argv[0]<<"<file><major_no><minor_no>"; return 0;
}
int major=atoi(argv[2]), minor=atoi(argv[3]);
(void) mknod(argv[1], S_IFCHR | S_IRWXU | S_IRWXG | S_IRWXO, (major<<8) | minor);
#include<unistd.h>
int pipe(int fds[2]);
The following test_fifo.C example illustrates the use of mkfifo, open, read, write and close APIs for a FIFO file:
#include<iostream>
#include<stdio.h>
#include<sys/types.h>
#include<unistd.h>
#include<fcntl.h>
#include<sys/stat.h>
#include<string.h>
#include<errno.h>
using namespace std;
int main(int argc,char* argv[])
{
    if(argc!=2 && argc!=3)
    {
        cout<<"usage:"<<argv[0]<<"<file> [<arg>]"; return 0;
    }
    int fd;
    char buf[256];
    ssize_t n;
    // an error is ignored here, e.g. when the FIFO already exists
    (void) mkfifo(argv[1], S_IFIFO | S_IRWXU | S_IRWXG | S_IRWXO );
    if(argc==2)     // reader
    {
        fd=open(argv[1],O_RDONLY | O_NONBLOCK);
        // wait while no data is available yet
        while((n=read(fd,buf,sizeof(buf)-1))==-1 && errno==EAGAIN)
            sleep(1);
        while(n>0)
        {
            buf[n]='\0';    // read does not null-terminate the data
            cout<<buf<<endl;
            n=read(fd,buf,sizeof(buf)-1);
        }
    }
    else            // writer
    {
        fd=open(argv[1],O_WRONLY);
        write(fd,argv[2],strlen(argv[2]));
    }
    close(fd);
}
int symlink(const char *org_link, const char *sym_link);
int readlink(const char* sym_link,char* buf,int size);
int lstat(const char * sym_link, struct stat* statv);
The org_link and sym_link arguments to a symlink call specify the original file path name and the
symbolic link path name to be created.
1. Introduction
2. main function,
3. Process Termination
4. Command-Line Arguments
5. Environment List
6. Memory Layout of a C Program
7. Shared Libraries
8. Memory Allocation
9. Environment Variables
10. setjmp and longjmp Functions
11. getrlimit, setrlimit Functions
12. UNIX Kernel Support for Processes.
1. INTRODUCTION
2. main FUNCTION
3. PROCESS TERMINATION
There are eight ways for a process to terminate. Normal termination occurs in five ways:
1. Return from main
2. Calling exit
3. Calling _exit or _Exit
4. Return of the last thread from its start routine
5. Calling pthread_exit from the last thread
Abnormal termination occurs in three ways:
a. Calling abort
b. Receipt of a signal
c. Response of the last thread to a cancellation request
Exit Functions
Three functions terminate a program normally: _exit and _Exit, which return to the kernel immediately,
and exit, which performs certain cleanup processing and then returns to the kernel.
#include <stdlib.h>
void exit(int status);
void _Exit(int status);
#include <unistd.h>
void _exit(int status);
All three exit functions expect a single integer argument, called the exit status. Returning an
integer value from the main function is equivalent to calling exit with the same value.
Thus exit(0); is the same as return(0); from the main function.
In the following situations the exit status of the process is undefined.
1. any of these functions is called without an exit status.
atexit Function
With ISO C, a process can register at least 32 functions that are automatically called by exit. These
are called exit handlers and are registered by calling the atexit function.
#include <stdlib.h>
int atexit(void (*func)(void));
Returns: 0 if OK, nonzero on error
This declaration says that we pass the address of a function as the argument to atexit. When this
function is called, it is not passed any arguments and is not expected to return a value.
The exit function calls these functions in reverse order of their registration. Each function is
called as many times as it was registered.
#include "apue.h"
static void my_exit1(void);
static void my_exit2(void);
int main(void)
{
if (atexit(my_exit2) != 0)
err_sys("can't register my_exit2");
if (atexit(my_exit1) != 0)
err_sys("can't register my_exit1");
if (atexit(my_exit1) != 0)
err_sys("can't register my_exit1");
printf("main is done\n");
return(0);
}
static void
my_exit1(void)
{
printf("first exit handler\n");
}
static void
my_exit2(void)
{
printf("second exit handler\n");
}
Output:
$ ./a.out
main is done
first exit handler
first exit handler
second exit handler
The below figure summarizes how a C program is started and the various ways it can terminate.
4. COMMAND-LINE ARGUMENTS
When a program is executed, the process that does the exec can pass command-line arguments
to the new program.
#include "apue.h"
int main(int argc, char *argv[])
{
int i;
for (i = 0; i < argc; i++) /* echo all command-line args */
printf("argv[%d]: %s\n", i, argv[i]);
exit(0);
}
Output:
$ ./echoarg arg1 TEST foo
argv[0]: ./echoarg
argv[1]: arg1
argv[2]: TEST
argv[3]: foo
5. ENVIRONMENT LIST
Each program is also passed an environment list. Like the argument list, the environment list is an
array of character pointers, with each pointer containing the address of a null-terminated C string.
The address of the array of pointers is contained in the global variable environ:
extern char **environ;
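To make the layout of the environment list concrete, the sketch below walks the environ array and looks up one name=value string. It only illustrates the data structure; in real programs getenv should be used instead. The helper name find_env is an assumption made for this example.

```c
#include <stddef.h>
#include <string.h>

extern char **environ;

/* Searches the environment list for name and returns a pointer to its
   value (the text after the '='), or NULL if name is not present.
   find_env is a hypothetical helper name that mimics getenv. */
const char *find_env(const char *name)
{
    size_t len = strlen(name);
    char **ep;

    if (environ == NULL)
        return NULL;
    /* the array of pointers is terminated by a null pointer */
    for (ep = environ; *ep != NULL; ep++)
        if (strncmp(*ep, name, len) == 0 && (*ep)[len] == '=')
            return *ep + len + 1;
    return NULL;
}
```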
7. SHARED LIBRARIES
Nowadays most UNIX systems support shared libraries. Shared libraries remove the common library
routines from the executable file, instead maintaining a single copy of the library routine somewhere
in memory that all processes reference.
This reduces the size of each executable file but may add some runtime overhead, either when the
program is first executed or the first time each shared library function is called. Another advantage
of shared libraries is that library functions can be replaced with new versions without having to
relink-edit every program that uses the library. Compiler and linker options control whether shared
libraries are used; with gcc, for example, the -static option prevents their use.
8. MEMORY ALLOCATION
#include <stdlib.h>
void *malloc(size_t size);
void *calloc(size_t nobj, size_t size);
void *realloc(void *ptr, size_t newsize);
All three return: non-null pointer if OK, NULL on error
void free(void *ptr);
The pointer returned by the three allocation functions is guaranteed to be suitably aligned so that it
can be used for any data object.
Because the three alloc functions return a generic void * pointer, if we #include <stdlib.h> (to
obtain the function prototypes), we do not explicitly have to cast the pointer returned by these
functions when we assign it to a pointer of a different type.
The function free causes the space pointed to by ptr to be deallocated. This freed space is usually put
into a pool of available memory and can be allocated in a later call to one of the three alloc
functions.
The realloc function lets us increase or decrease the size of a previously allocated area. For example,
if we allocate room for 512 elements in an array that we fill in at runtime but find that we need room
for more than 512 elements, we can call realloc. If there is room beyond the end of the existing
region for the requested space, then realloc doesn't have to move anything; it simply allocates the
additional area at the end and returns the same pointer that we passed it. But if there isn't room at the
end of the existing region, realloc allocates another area that is large enough, copies the existing 512-
element array to the new area, frees the old area, and returns the pointer to the new area.
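The realloc behaviour described above suggests a careful calling pattern: because the block may move, the result is stored in a temporary pointer first so that the old block is not lost on failure. This is a minimal sketch; the helper name grow_array is an assumption made for illustration.

```c
#include <stdlib.h>

/* Resizes the int array *arr so that it has room for new_count
   elements (new_count must be greater than 0). On success *arr is
   updated and 0 is returned. On failure *arr is left untouched and
   still valid, and -1 is returned.
   grow_array is a hypothetical helper name. */
int grow_array(int **arr, size_t new_count)
{
    int *tmp = realloc(*arr, new_count * sizeof **arr);

    if (tmp == NULL)
        return -1;      /* the old area has not been freed */
    *arr = tmp;         /* may or may not be the same address as before */
    return 0;
}
```

Any other pointers into the old area must be considered invalid after a successful call, since realloc is free to move the data.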
The allocation routines are usually implemented with the sbrk(2) system call. Although sbrk can
expand or contract the memory of a process, most versions of malloc and free never decrease their
memory size.
The space that we free is available for a later allocation, but the freed space is not usually returned to
the kernel; that space is kept in the malloc pool.
It is important to realize that most implementations allocate a little more space than is requested and
use the additional space for record keeping: the size of the allocated block, a pointer to the next
allocated block, and the like. This means that writing past the end of an allocated area could
overwrite this record-keeping information in a later block. These types of errors are often
catastrophic, but difficult to find, because the error may not show up until much later. Also, it is
possible to overwrite this record keeping by writing before the start of the allocated area.
Because memory allocation errors are difficult to track down, some systems provide versions of
these functions that do additional error checking every time one of the three alloc functions or free is
called. These versions of the functions are often specified by including a special library for the link
editor. There are also publicly available sources that you can compile with special flags to enable
additional runtime checking.
libmalloc
SVR4-based systems, such as Solaris, include the libmalloc library, which provides a set of
interfaces matching the ISO C memory allocation functions. The libmalloc library includes mallopt,
a function that allows a process to set certain variables that control the operation of the storage
allocator. A function called mallinfo is also available to provide statistics on the memory allocator.
vmalloc
Vo describes a memory allocator that allows processes to allocate memory using different
techniques for different regions of memory. In addition to the functions specific to vmalloc, the
library also provides emulations of the ISO C memory allocation functions.
quick-fit
Historically, the standard malloc algorithm used either a best-fit or a first-fit memory allocation
strategy. Quick-fit is faster than either, but tends to use more memory. Free implementations of
malloc and free based on quick-fit are readily available from several FTP sites.
alloca Function
The function alloca has the same calling sequence as malloc; however, instead of allocating memory
from the heap, the memory is allocated from the stack frame of the current function. The advantage
is that we don't have to free the space; it goes away automatically when the function returns. The
alloca function increases the size of the stack frame. The disadvantage is that some systems can't
support alloca, if it's impossible to increase the size of the stack frame after the function has been
called.
9. ENVIRONMENT VARIABLES
#include <stdlib.h>
char *getenv(const char *name);
Returns: pointer to value associated with name, NULL if not found.
Note that this function returns a pointer to the value of a name=value string. We should always use
getenv to fetch a specific value from the environment, instead of accessing environ directly.
In addition to fetching the value of an environment variable, sometimes we may want to set an
environment variable. We may want to change the value of an existing variable or add a new
variable to the environment. The prototypes of these functions are
#include <stdlib.h>
int putenv(char *str);
int setenv(const char *name, const char *value, int rewrite);
int unsetenv(const char *name);
All return: 0 if OK, nonzero on error.
The putenv function takes a string of the form name=value and places it in the environment
list. If name already exists, its old definition is first removed.
The setenv function sets name to value. If name already exists in the environment, then
(a) if rewrite is nonzero, the existing definition for name is first removed;
(b) if rewrite is 0, an existing definition for name is not removed, name is not set to the new
value, and no error occurs.
The unsetenv function removes any definition of name. It is not an error if such a definition does not
exist.
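The effect of the rewrite argument of setenv can be sketched with a small helper that supplies a default value only when the variable is not already set. The helper name set_default_env is an assumption made for this example.

```c
#include <stdlib.h>

/* Gives name the value value only if name is not already defined.
   Returns 1 if the variable was newly set, 0 if an existing
   definition was kept, and -1 on error.
   set_default_env is a hypothetical helper name. */
int set_default_env(const char *name, const char *value)
{
    int existed = (getenv(name) != NULL);

    /* rewrite == 0: an existing definition is not replaced */
    if (setenv(name, value, 0) != 0)
        return -1;
    return existed ? 0 : 1;
}
```

Passing a nonzero rewrite argument to setenv instead would replace the existing definition.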
Note the difference between putenv and setenv. Whereas setenv must allocate memory to
create the name=value string from its arguments, putenv is free to place the string passed to
it directly into the environment.
NOTE:
1. If we're modifying an existing name:
a) If the size of the new value is less than or equal to the size of the existing value, we can just
copy the new string over the old string.
b) If the size of the new value is larger than the old one, however, we must malloc to obtain
room for the new string, copy the new string to this area, and then replace the old pointer
in the environment list for name with the pointer to this allocated area.
2. If we're adding a new name, it's more complicated. First, we have to call malloc to allocate
room for the name=value string and copy the string to this area.
a) Then, if it's the first time we've added a new name, we have to call malloc to obtain room for a
new list of pointers. We copy the old environment list to this new area and store a pointer to the
name=value string at the end of this list of pointers. We also store a null pointer at the end of this
list, of course. Finally, we set environ to point to this new list of pointers.
b) If this isn't the first time we've added new strings to the environment list, then we know that
we've already allocated room for the list on the heap, so we just call realloc to allocate room for
one more pointer. The pointer to the new name=value string is stored at the end of the list (on
top of the previous null pointer), followed by a null pointer.
In C, we can't goto a label that's in another function. Instead, we must use the setjmp and longjmp
functions to perform this type of branching. As we'll see, these two functions are useful for handling
error conditions that occur in a deeply nested function call.
#include <setjmp.h>
int setjmp(jmp_buf env);
Returns: 0 if called directly, nonzero if returning from a call to longjmp
void longjmp(jmp_buf env, int val);
The setjmp function always returns ‘0’ on its success when it is called directly in a process (for the
first time).
The longjmp function is called to transfer a program flow to a location that was stored in the env
argument.
The program code marked by the env must be in a function that is among the callers of the current
function.
When the process jumps to the target function, all the stack space used by the current function
and its callers, up to the target function, is discarded by the longjmp function.
The process resumes execution by re-executing the setjmp statement in the target function that is
marked by env. The return value of setjmp function is the value(val), as specified in the longjmp
function call.
The ‘val’ should be nonzero, so that it can be used to indicate where and why the longjmp function
was invoked and process can do error handling accordingly.
Note: After a longjmp, the values of automatic and register variables that were changed after the setjmp
call are indeterminate, but static and global variables are unaltered. Variables that we don't want
rolled back after longjmp are declared with the keyword 'volatile'.
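The error-handling use of setjmp and longjmp can be sketched with a small call chain in which the innermost function jumps straight back to the outermost one. All function names here (process_token, parse_level1, parse_level2) and the rule that a negative token is an error are assumptions made for illustration.

```c
#include <setjmp.h>

static jmp_buf jmpbuffer;   /* location recorded by setjmp */

/* Innermost function: on an error it jumps back to process_token,
   skipping the normal return through parse_level1. */
static void parse_level2(int token)
{
    if (token < 0)
        longjmp(jmpbuffer, 1);   /* val must be nonzero */
}

static void parse_level1(int token)
{
    parse_level2(token);
}

/* Returns 0 if token was processed normally and -1 if a nested
   function reported an error through longjmp. */
int process_token(int token)
{
    if (setjmp(jmpbuffer) != 0)   /* nonzero: we got here via longjmp */
        return -1;
    parse_level1(token);          /* setjmp returned 0: direct call */
    return 0;
}
```

The jump is valid only while process_token, the function that called setjmp, is still active, which matches the rule that the target must be among the callers of the current function.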
Every process has a set of resource limits, some of which can be queried and changed by the getrlimit
and setrlimit functions.
#include <sys/resource.h>
int getrlimit(int resource, struct rlimit *rlptr);
int setrlimit(int resource, const struct rlimit *rlptr);
Both return: 0 if OK, nonzero on error
Each call to these two functions specifies a single resource and a pointer to the following structure:
struct rlimit
{
rlim_t rlim_cur; /* soft limit: current limit */
rlim_t rlim_max; /* hard limit: maximum value for rlim_cur */
};
The resource limits affect the calling process and are inherited by any of its children. This means
that the setting of resource limits needs to be built into the shells to affect all our future processes.
{
struct rlimit limit;
if (getrlimit(resource, &limit) < 0)
err_sys("getrlimit error for %s", name);
printf("%-14s ", name);
if (limit.rlim_cur == RLIM_INFINITY)
printf("(infinite) ");
else
printf(FMT, limit.rlim_cur);
if (limit.rlim_max == RLIM_INFINITY)
printf("(infinite)");
else
printf(FMT, limit.rlim_max);
putchar((int)'\n');
}
The data structure and execution of processes are dependent on operating system implementation.
A UNIX process consists minimally of a text segment, a data segment and a stack segment. A
segment is an area of memory that is managed by the system as a unit.
A text segment consists of the program text in machine executable instruction code format.
The data segment contains static and global variables and their corresponding data.
A stack segment contains runtime variables and the return addresses of all active functions for a
process.
The UNIX kernel has a process table that keeps track of all active processes present in the system. Some of
these processes belong to the kernel and are called “system processes”.
All processes in a UNIX system, except the process that is created by the system boot code, are created
by the fork system call. After the fork system call, once the child process is created, both the parent
and child processes resume execution. When a process is created by fork, it contains duplicated
copies of the text, data, and stack segments of its parent.
The process will be assigned with attributes, which are either inherited from its parent or will be set by the
kernel.
A real user identification number (rUID): the user ID of a user who created the parent process.
A real group identification number (rGID): the group ID of a user who created that parent process.
An effective user identification number (eUID): this allows the process to access and create files with
the same privileges as the program file owner.
An effective group identification number (eGID): this allows the process to access and create
files with the same privileges as the group to which the program file belongs.
Saved set-UID and saved set-GID: these are the assigned eUID and eGID of the process respectively.
Process group identification number (PGID) and session identification number (SID): these identify the
process group and session of which the process is a member.
Supplementary group identification numbers: this is a set of additional group IDs for a user who created
the process.
Current directory: this is the reference (inode number) to a working directory file.
Root directory: this is the reference to a root directory.
Signal handling: the signal handling settings.
Signal mask: a signal mask that specifies which signals are to be blocked.
Umask: a file mode mask that is used in creation of files to specify which access rights should
be taken out.
Nice value: the process scheduling priority value.
Controlling terminal: the controlling terminal of the process.
Process identification number (PID): an integer identification number that is unique per process in
an entire operating system.
Parent process identification number (PPID): the parent process PID.
Pending signals: the set of signals that are pending delivery to the parent process.
Alarm clock time: the process alarm clock time is reset to zero in the child process.
File locks: the set of file locks owned by the parent process is not inherited by the child process.
fork and exec are commonly used together to spawn a sub-process to execute a different program. The
advantages of this method are:
A process can create multiple processes to execute multiple programs concurrently.
Because each child process executes in its own virtual address space, the parent process is not
affected by the execution status of its child process.
1. Introduction
2. Process Identifiers
3. fork
4. vfork
5. exit
6. wait
7. waitpid
8. wait3
9. wait4 Functions
10. Race Conditions
11. exec
1. INTRODUCTION
Process control is concerned about creation of new processes, program execution, and process
termination.
2. PROCESS IDENTIFIERS
#include <unistd.h>
pid_t getpid(void);
Returns: process ID of calling process
pid_t getppid(void);
Returns: parent process ID of calling process
uid_t getuid(void);
Returns: real user ID of calling process
uid_t geteuid(void);
Returns: effective user ID of calling process
gid_t getgid(void);
Returns: real group ID of calling process
gid_t getegid(void);
Returns: effective group ID of calling process
3. fork FUNCTION
An existing process can create a new one by calling the fork function.
#include <unistd.h>
pid_t fork(void);
Returns: 0 in child, process ID of child in parent, -1 on error.
Example programs:
Program 1
/* Program to demonstrate fork function
   Program name - fork1.c */
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>
int main(void)
{
    fork();
    printf("\n hello USP");
    return 0;
}
Output:
$ cc fork1.c
$ ./a.out
hello USP
hello USP
Program 2
/* Program name - fork2.c */
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>
int main(void)
{
    printf("\n 6 sem ");
    fork();
    printf("\n hello USP");
    return 0;
}
Output:
$ cc fork2.c
$ ./a.out
6 sem
hello USP
hello USP
File Sharing
Consider a process that has three different files opened for standard input, standard output, and standard error. On
return from fork, we have the arrangement shown in Figure :
It is important that the parent and the child share the same file offset.
Consider a process that forks a child, then waits for the child to complete.
Assume that both processes write to standard output as part of their normal processing.
If the parent has its standard output redirected (by a shell, perhaps) it is essential that the parent's file
offset be updated by the child when the child writes to standard output.
In this case, the child can write to standard output while the parent is waiting for it; on completion of the
child, the parent can continue writing to standard output, knowing that its output will be appended to
whatever the child wrote.
If the parent and the child did not share the same file offset, this type of interaction would be more
difficult to accomplish and would require explicit actions by the parent.
There are two normal cases for handling the descriptors after a fork.
1. The parent waits for the child to complete. In this case, the parent does not need to do anything with its
descriptors. When the child terminates, any of the shared descriptors that the child read from or wrote to
will have their file offsets updated accordingly.
2. Both the parent and the child go their own ways. Here, after the fork, the parent closes the descriptors that
it doesn't need, and the child does the same thing. This way, neither interferes with the other's open
descriptors. This scenario is often the case with network servers.
There are numerous other properties of the parent that are inherited by the child:
Real user ID, real group ID, effective user ID, effective group ID
Supplementary group IDs
Process group ID
Session ID
Controlling terminal
The set-user-ID and set-group-ID flags
Current working directory
Root directory
File mode creation mask
Signal mask and dispositions
The close-on-exec flag for any open file descriptors
Environment
Attached shared memory segments
Memory mappings
Resource limits
4. vfork FUNCTION
The function vfork has the same calling sequence and same return values as fork.
The vfork function is intended to create a new process when the purpose of the new process is to exec a
new program.
The vfork function creates the new process, just like fork, without copying the address space of the
parent into the child, as the child won't reference that address space; the child simply calls exec (or exit)
right after the vfork.
Instead, while the child is running and until it calls either exec or exit, the child runs in the address space
of the parent. This optimization provides an efficiency gain on some paged virtual-memory
implementations of the UNIX System.
Another difference between the two functions is that vfork guarantees that the child runs first, until the
child calls exec or exit. When the child calls either of these functions, the parent resumes.
Example of vfork function
#include "apue.h"
int glob = 6; /* external variable in initialized data */
int main(void)
{
int var; /* automatic variable on the stack */
pid_t pid;
var = 88;
printf("before vfork\n"); /* we don't flush stdio */
if ((pid = vfork()) < 0) {
err_sys("vfork error");
} else if (pid == 0) { /* child */
glob++; /* modify parent's variables */
var++;
_exit(0); /* child terminates */
}
/*
* Parent continues here.
*/
printf("pid = %d, glob = %d, var = %d\n", getpid(), glob, var);
exit(0);
}
Output:
$ ./a.out
before vfork
pid = 29039, glob = 7, var = 89
5. exit FUNCTIONS
When a process terminates, either normally or abnormally, the kernel notifies the parent by sending the
SIGCHLD signal to the parent. Because the termination of a child is an asynchronous event - it can
happen at any time while the parent is running - this signal is the asynchronous notification from the
kernel to the parent.
The parent can choose to ignore this signal, or it can provide a function that is called when the signal
occurs: a signal handler.
A process that calls wait or waitpid can:
Block, if all of its children are still running
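Return immediately with the termination status of a child, if a child has terminated and is
waiting for its termination status to be fetched
Return immediately with an error, if it doesn't have any child processes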
#include <sys/wait.h>
pid_t wait(int *statloc);
pid_t waitpid(pid_t pid, int *statloc, int options);
Both return: process ID if OK, 0 (see later), or -1 on error.
If a child has already terminated and is a zombie, wait returns immediately with that child's status.
Otherwise, it blocks the caller until a child terminates. If the caller blocks and has multiple children, wait
returns when one terminates.
For both functions, the argument statloc is a pointer to an integer. If this argument is not a null pointer, the
termination status of the terminated process is stored in the location pointed to by the argument.
#include "apue.h"
#include <sys/wait.h>
void pr_exit(int status)
{
if (WIFEXITED(status))
printf("normal termination, exit status = %d\n", WEXITSTATUS(status));
else if (WIFSIGNALED(status))
printf("abnormal termination, signal number = %d%s\n",WTERMSIG(status),
#ifdef WCOREDUMP
WCOREDUMP(status) ? " (core file generated)" : "");
#else
"");
#endif
else if (WIFSTOPPED(status))
printf("child stopped, signal number = %d\n", WSTOPSIG(status));
}
#include "apue.h"
#include <sys/wait.h>
int main(void)
{
pid_t pid;
int status;
if ((pid = fork()) < 0)
err_sys("fork error");
else if (pid == 0) /* child */
exit(7);
if (wait(&status) != pid) /* wait for child */
err_sys("wait error");
pr_exit(status); /* and print its status */
if ((pid = fork()) < 0)
err_sys("fork error");
else if (pid == 0) /* child */
abort(); /* generates SIGABRT */
if (wait(&status) != pid) /* wait for child */
err_sys("wait error");
pr_exit(status); /* and print its status */
The interpretation of the pid argument for waitpid depends on its value:
pid == -1 Waits for any child process. In this respect, waitpid is equivalent to wait.
pid > 0 Waits for the child whose process ID equals pid.
pid == 0 Waits for any child whose process group ID equals that of the calling process.
pid < -1 Waits for any child whose process group ID equals the absolute value of pid.
Macros to examine the termination status returned by wait and waitpid
Macro Description
WIFEXITED(status) True if status was returned for a child that terminated normally. In
this case, we can execute WEXITSTATUS (status) to fetch the
low-order 8 bits of the argument that the child passed to exit,
_exit,or _Exit.
WIFSIGNALED (status) True if status was returned for a child that terminated abnormally,
by receipt of a signal that it didn't catch. In this case, we can
execute WTERMSIG (status) to fetch the signal number that
caused the termination. Additionally, some implementations (but
not the Single UNIX Specification) define the macro
WCOREDUMP (status) that returns true if a core file of the
terminated process was generated.
WIFSTOPPED (status) True if status was returned for a child that is currently stopped. In
this case, we can execute WSTOPSIG (status) to fetch the signal
number that caused the child to stop.
WIFCONTINUED (status) True if status was returned for a child that has been continued after
a job control stop
Constant Description
WCONTINUED If the implementation supports job control, the status of any child
specified by pid that has been continued after being stopped, but
whose status has not yet been reported, is returned.
WNOHANG The waitpid function will not block if a child specified by pid is
not immediately available. In this case, the return value is 0.
WUNTRACED If the implementation supports job control, the status of any child
specified by pid that has stopped, and whose status has not been
reported since it has stopped, is returned. The WIFSTOPPED
macro determines whether the return value corresponds to a
stopped child process.
The waitpid function provides three features that aren't provided by the wait function.
The waitpid function lets us wait for one particular process, whereas the wait function returns the status
of any terminated child. We'll return to this feature when we discuss the popen function.
The waitpid function provides a nonblocking version of wait. There are times when we want to fetch a
child's status, but we don't want to block.
The waitpid function provides support for job control with the WUNTRACED and WCONTINUED
options.
sleep(2);
printf("second child, parent pid = %d\n", getppid());
exit(0);
}
if (waitpid(pid, NULL, 0) != pid) /* wait for first child */
err_sys("waitpid error");
/*
* We're the parent (the original process); we continue executing,
* knowing that we're not the parent of the second child.
*/
exit(0);
}
Output:
$ ./a.out
$ second child, parent pid = 1
7. waitid FUNCTION
#include <sys/wait.h>
int waitid (idtype_t idtype, id_t id, siginfo_t *infop, int options);
Returns: 0 if OK, -1 on error
The options argument is a bitwise OR of the flags as shown below: these flags indicate which state changes
the caller is interested in.
Constant Description
The resource information includes such statistics as the amount of user CPU time, the amount of system
CPU time, number of page faults, number of signals received, etc. The resource information is available
only for terminated child processes, not for processes that were stopped due to job control.
9. RACE CONDITIONS
A race condition occurs when multiple processes are trying to do something with shared data and the final
outcome depends on the order in which the processes run.
Example: The program below outputs two strings: one from the child and one from the parent.
The program contains a race condition because the output depends on the order in which the processes are
run by the kernel and for how long each process runs.
#include "apue.h"
static void charatatime(char *);
int main(void)
{
pid_t pid;
if ((pid = fork()) < 0) {
err_sys("fork error");
} else if (pid == 0) {
charatatime("output from child\n");
} else {
charatatime("output from parent\n");
}
exit(0);
}
static void
charatatime(char *str)
{
char *ptr;
int c;
setbuf(stdout, NULL); /* set unbuffered */
for (ptr = str; (c = *ptr++) != 0; )
putc(c, stdout);
}
Output:
$ ./a.out
ooutput from child
utput from parent
$ ./a.out
ooutput from child
utput from parent
$ ./a.out
output from child
output from parent
program modification to avoid race condition
#include "apue.h"
static void charatatime(char *);
int main(void)
{
pid_t pid;
+ TELL_WAIT();
+
if ((pid = fork()) < 0) {
err_sys("fork error");
} else if (pid == 0) {
+ WAIT_PARENT(); /* parent goes first */
charatatime("output from child\n");
} else {
charatatime("output from parent\n");
+ TELL_CHILD(pid);
}
exit(0);
}
static void
charatatime(char *str)
{
char *ptr;
int c;
setbuf(stdout, NULL); /* set unbuffered */
for (ptr = str; (c = *ptr++) != 0; )
putc(c, stdout);
}
When we run this program, the output is as we expect; there is no intermixing of output from the two
processes.
#include <unistd.h>
The first difference in these functions is that the first four take a pathname argument, whereas the last two
take a filename argument. When a filename argument is specified
The above table shows the differences among the 6 exec functions.
We've mentioned that the process ID does not change after an exec, but the new program inherits
additional properties from the calling process:
Process ID and parent process ID
Real user ID and real group ID
Supplementary group IDs
Process group ID
Session ID
Controlling terminal
Time left until alarm clock
Current working directory
Root directory
File mode creation mask
File locks
Process signal mask
Pending signals
Resource limits
Values for tms_utime, tms_stime, tms_cutime, and tms_cstime.
Note that the shell prompt appeared before the printing of argv[0] from the second exec. This is because
the parent did not wait for this child process to finish.
We can make a few statements about the three user IDs that the kernel maintains.
Only a superuser process can change the real user ID. Normally, the real user ID is set by the
login(1) program when we log in and never changes. Because login is a superuser process, it sets
all three user IDs when it calls setuid.
The effective user ID is set by the exec functions only if the set-user-ID bit is set for the program
file. If the set-user-ID bit is not set, the exec functions leave the effective user ID as its current
value. We can call setuid at any time to set the effective user ID to either the real user ID or the
saved set-user-ID. Naturally, we can't set the effective user ID to any random value.
The saved set-user-ID is copied from the effective user ID by exec. If the file's set-user-ID bit is
set, this copy is saved after exec stores the effective user ID from the file's user ID.
setreuid and setregid Functions
The setreuid function supports swapping of the real user ID and the effective user ID.
#include <unistd.h>
POSIX.1 includes the two functions seteuid and setegid. These functions are similar to setuid and
setgid, but only the effective user ID or effective group ID is changed.
#include <unistd.h>
Figure: Summary of all the functions that set the various user Ids
2. INTERPRETER FILES
These files are text files that begin with a line of the form
#! pathname [ optional-argument ]
The space between the exclamation point and the pathname is optional. The most common of these
interpreter files begin with the line
#!/bin/sh
The pathname is normally an absolute pathname, since no special operations are performed on it
(i.e., PATH is not used).
The recognition of these files is done within the kernel as part of processing the exec system call.
The actual file that gets executed by the kernel is not the interpreter file, but the file specified by the
pathname on the first line of the interpreter file.
Be sure to differentiate between the interpreter file (a text file that begins with #!) and the interpreter,
which is specified by the pathname on the first line of the interpreter file.
Be aware that systems place a size limit on the first line of an interpreter file. This limit includes the
#!, the pathname, the optional argument, the terminating newline, and any spaces.
Output:
$ cat /home/sar/bin/testinterp
#!/home/sar/bin/echoarg foo
$ ./a.out
argv[0]: /home/sar/bin/echoarg
argv[1]: foo
argv[2]: /home/sar/bin/testinterp
argv[3]: myarg1
argv[4]: MY ARG2
system FUNCTION
#include <stdlib.h>
int system(const char *cmdstring);
1. If either the fork fails or waitpid returns an error other than EINTR, system returns -1 with
errno set to indicate the error.
2. If the exec fails, implying that the shell can't be executed, the return value is as if the shell had
executed exit(127).
3. Otherwise, all three functions fork, exec, and waitpid succeed, and the return value from
system is the termination status of the shell, in the format specified for waitpid.
if (cmdstring == NULL)
return(1); /* always a command processor with UNIX */
return(status);
}
#include "apue.h"
#include <sys/wait.h>
int main(void)
{
int status;
if ((status = system("date")) < 0)
err_sys("system() error");
pr_exit(status);
exit(0);
}
#include "apue.h"
int status;
if (argc < 2)
err_quit("command-line argument required");
exit(0);
}
#include "apue.h"
int main(void)
{
printf("real uid = %d, effective uid = %d\n", getuid(), geteuid());
exit(0);
}
3. PROCESS ACCOUNTING
Most UNIX systems provide an option to do process accounting. When enabled, the kernel
writes an accounting record each time a process terminates.
These accounting records are typically a small amount of binary data with the name of the
command, the amount of CPU time used, the user ID and group ID, the starting time, and so on.
A super user executes accton with a pathname argument to enable accounting.
The accounting records are written to the specified file, which is usually /var/account/acct.
Accounting is turned off by executing accton without any arguments.
The data required for the accounting record, such as CPU times and number of characters
transferred, is kept by the kernel in the process table and initialized whenever a new process is
created, as in the child after a fork.
Each accounting record is written when the process terminates.
This means that the order of the records in the accounting file corresponds to the termination order
of the processes, not the order in which they were started.
The accounting records correspond to processes, not programs.
A new record is initialized by the kernel for the child after a fork, not when a new program is
executed. The structure of the accounting records is defined in the header <sys/acct.h> and
looks something like
typedef u_short comp_t; /* 3-bit base 8 exponent; 13-bit fraction */
struct acct
{
char ac_flag; /* flag */
char ac_stat; /* termination status (signal & core flag only) */
/* (Solaris only) */
uid_t ac_uid; /* real user ID */
gid_t ac_gid; /* real group ID */
dev_t ac_tty; /* controlling terminal */
time_t ac_btime; /* starting calendar time */
comp_t ac_utime; /* user CPU time (clock ticks) */
comp_t ac_stime; /* system CPU time (clock ticks) */
comp_t ac_etime; /* elapsed time (clock ticks) */
comp_t ac_mem; /* average memory usage */
comp_t ac_io; /* bytes transferred (by read and write) */
/* "blocks" on BSD systems */
comp_t ac_rw; /* blocks read or written */
/* (not present on BSD systems) */
char ac_comm[8]; /* command name: [8] for Solaris, */
/* [10] for Mac OS X, [16] for FreeBSD, and */
/* [17] for Linux */
};
The ac_flag member records certain events during the execution of the process.
ac_flag Description
AFORK process is the result of fork, but never called exec
ASU process used superuser privileges
ACOMPAT process used compatibility mode
ACORE process dumped core
AXSIG process was killed by a signal
AEXPND expanded accounting entry
4. USER IDENTIFICATION
Any process can find out its real and effective user ID and group ID.
Sometimes, however, we want to find out the login name of the user who's running the program.
We could call getpwuid(getuid()), but what if a single user has multiple login names, each with the
same user ID? (A person might have multiple entries in the password file with the same user ID to
have a different login shell for each entry).
The system normally keeps track of the name we log in and the getlogin function provides a way to
fetch that login name.
#include <unistd.h>
char *getlogin(void);
This function can fail if the process is not attached to a terminal that a user logged in to.
5. PROCESS TIMES
We describe three times that we can measure: wall clock time, user CPU time, and system CPU
time. Any process can call the times function to obtain these values for itself and any terminated
children.
#include <sys/times.h>
clock_t times(struct tms *buf);
Note that the structure does not contain any measurement for the wall clock time.
Instead, the function returns the wall clock time as the value of the function, each time it's called.
This value is measured from some arbitrary point in the past, so we can't use its absolute value; instead, we
use its relative value.
6. I/O Redirection
Unix provides the capability to change where standard input comes from or where output goes using a
concept called Input/Output (I/O) redirection. The shell scans the command line for occurrences of the
special redirection characters <, >, or >>.
I/O redirection is accomplished using a redirection operator which allows the user to specify the input
or output data be directed to a file.
The output redirection operator is the >(greater than) symbol and general syntax:
command > output_file_spec
Spaces around the redirection operator are not mandatory, but they make the command more readable.
Eg:
$ ls > my_files [Enter]
$ cat my_files [Enter]
foo
bar
fred
dino
$ echo "Hello World!" > my_files [Enter]
$ cat my_files [Enter]
Hello World!
$
Each > redirection creates the file if it does not exist or overwrites its contents if it does, so after
the second command my_files contains only the string "Hello World!". To add output to the end of an
existing file instead, use the append redirection operator >>.
When using the append redirection operator, if the file does not exist, >> will cause its
creation and append the output (to the empty file).
The ability also exists to redirect the standard input using the input redirection operator, the < (less
than) symbol
The general syntax of input redirection:
command < input_file_spec
INTRODUCTION
IPC enables one application to control another application, and for
several applications to share the same data without interfering with one
another. IPC is required in all multiprocessing systems, but it is not
generally supported by single-process operating systems.
The various forms of IPC that are supported on a UNIX system are as
follows :
1) Half duplex Pipes
2) FIFO’s
3) Full duplex Pipes
4) Named full duplex Pipes
5) Message queues
6) Shared memory
7) Semaphores
8) Sockets
9) STREAMS
The first seven forms of IPC are usually restricted to IPC between processes
on the same host.
The final two i.e. Sockets and STREAMS are the only two that are
generally supported for IPC between processes on different hosts.
1. PIPES
Pipes are the oldest form of UNIX System IPC. Pipes have two
limitations.
Historically, they have been half duplex (i.e., data flows in only one
direction).
Pipes can be used only between processes that have a common ancestor.
Normally, a pipe is created by a process, that process calls fork, and
the pipe is used between the parent and the child.
A pipe is created by calling the pipe function.
#include <unistd.h>
int pipe(int filedes[2]);
Returns: 0 if OK, -1 on error.
For a pipe from the child to the parent, the parent closes fd[1], and the
child closes fd[0]. When one end of a pipe is closed, the following two
rules apply.
If we read from a pipe whose write end has been closed, read
returns 0 to indicate an end of file after all the data has been
read.
If we write to a pipe whose read end has been closed, the signal
SIGPIPE is generated. If we either ignore the signal or catch it
and return from the signal handler, write returns -1 with errno set
to EPIPE.
PROGRAM: shows the code to create a pipe between a parent and its child and
to send data down the pipe.
#include "apue.h"
int main(void)
{
    int n;
    int fd[2];
    pid_t pid;
    char line[MAXLINE];
    if (pipe(fd) < 0)
        err_sys("pipe error");
    if ((pid = fork()) < 0) {
        err_sys("fork error");
    } else if (pid > 0) { /* parent */
        close(fd[0]);
        write(fd[1], "hello world\n", 12);
    } else { /* child */
        close(fd[1]);
        n = read(fd[0], line, MAXLINE);
        write(STDOUT_FILENO, line, n);
    }
    exit(0);
}
The function popen does a fork and exec to execute the cmdstring,
and returns a standard I/O file pointer.
If type is "r", the file pointer is connected to the standard output of
cmdstring.
Figure 4 Result of fp = popen(cmdstring, "r")
3. COPROCESSES
A UNIX system filter is a program that reads from standard input and
writes to standard output.
Filters are normally connected linearly in shell pipelines.
A filter becomes a coprocess when the same program generates
the filter's input and reads the filter's output.
A coprocess normally runs in the background from a shell, and its
standard input and standard output are connected to another
program using a pipe.
The process creates two pipes: one is the standard input of the
coprocess, and the other is the standard output of the coprocess.
Figure 6 shows this arrangement.
Figure 6. Driving a coprocess by writing its standard input and reading its standard output
#include "apue.h"
int main(void)
{
    int n, int1, int2;
    char line[MAXLINE];
    while ((n = read(STDIN_FILENO, line, MAXLINE)) > 0) {
        line[n] = 0; /* null terminate */
        if (sscanf(line, "%d%d", &int1, &int2) == 2) {
            sprintf(line, "%d\n", int1 + int2);
            n = strlen(line);
            if (write(STDOUT_FILENO, line, n) != n)
                err_sys("write error");
        } else {
            if (write(STDOUT_FILENO, "invalid args\n", 13) != 13)
                err_sys("write error");
        }
    }
    exit(0);
}
4. FIFOs
FIFOs are sometimes called named pipes. Pipes can be used
only between related processes when a common ancestor has
created the pipe.
#include <sys/stat.h>
int mkfifo(const char *pathname, mode_t mode);
Returns: 0 if OK, -1 on error
Once we have used mkfifo to create a FIFO, we open it using open. When
we open a FIFO, the nonblocking flag (O_NONBLOCK) affects what
happens.
In the normal case (O_NONBLOCK not specified), an open for
read-only blocks until some other process opens the FIFO for
writing. Similarly, an open for write-only blocks until some other
process opens the FIFO for reading.
If O_NONBLOCK is specified, an open for read-only returns
immediately. But an open for write-only returns -1 with errno set to
ENXIO if no process has the FIFO open for reading.
With a FIFO and the UNIX program tee(1), we can accomplish this
procedure without using a temporary file. (The tee program copies its
standard input to both its standard output and to the file named on its
command line.)
mkfifo fifo1
prog3 < fifo1 &
We create the FIFO and then start prog3 in the background, reading
from the FIFO. We then start prog1 and use tee to send its input to
both the FIFO and prog2. Figure shows the process arrangement.
FIGURE : Using a FIFO and tee to send a stream to two different
processes
Example Client-Server Communication Using a FIFO
FIFO’s can be used to send data between a client and a server. If we
have a server that is contacted by numerous clients, each client can
write its request to a well-known FIFO that the server creates. Since
there are multiple writers for the FIFO, the requests sent by the clients
to the server need to be less than PIPE_BUF bytes in size.
This prevents any interleaving of the client writes. The problem in
using FIFOs for this type of client server communication is how to
send replies back from the server to each client.
A single FIFO can’t be used, as the clients would never know when to
read their response versus responses for other clients. One solution is
for each client to send its process ID with the request. The server then
creates a unique FIFO for each client, using a pathname based on the
client's process ID.
For example, the server can create a FIFO with the name
/vtu/ser.XXXXX, where XXXXX is replaced with the client's process ID.
This arrangement works, although it is impossible for the server to tell
whether a client crashes. This causes the client-specific FIFOs to be
left in the file system.
The server also must catch SIGPIPE, since it’s possible for a client to
send a request and terminate before reading the response, leaving the
client-specific FIFO with one writer (the server) and no reader.
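The naming scheme for client-specific FIFOs can be sketched as a small helper. This is an illustration only: the function name client_fifo_path is an assumption, and the process ID is printed without padding, which the notes do not specify.

```c
#include <stdio.h>
#include <sys/types.h>

/* Builds the pathname of the reply FIFO for one client, following the
 * /vtu/ser.XXXXX pattern from the text. Returns 0 on success, or -1 if
 * the buffer is too small. (client_fifo_path is a hypothetical name.) */
int client_fifo_path(char *buf, size_t len, pid_t client_pid)
{
    int n = snprintf(buf, len, "/vtu/ser.%ld", (long)client_pid);

    return (n > 0 && (size_t)n < len) ? 0 : -1;
}
```

Both sides can call the same helper: the client with its own process ID before opening the FIFO for reading, and the server with the process ID found in the request before opening it for writing.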
Figure : Clients sending requests to a server
using a FIFO
5. System V IPC
The client and the server can agree on a key by defining the key in
a common header, for example. The server then creates a new IPC
structure specifying this key. The problem with this approach is
that it's possible for the key to already be associated with an IPC
structure, in which case the get function (msgget, semget, or
shmget) returns an error. The server must handle this error, deleting
the existing IPC structure, and try to create it again.
The client and the server can agree on a pathname and project ID
(the project ID is a character value between 0 and 255) and call the
function ftok to convert these two values into a key. This key is
then used in the same way as in the previous approach. The only
service provided by ftok is a way of generating a key from a
pathname and project ID.
#include <sys/ipc.h>
key_t ftok(const char *path, int id);
Returns: key if OK, (key_t)-1 on error
The path argument must refer to an existing file. Only the lower 8 bits
of id are used when generating the key.
The key created by ftok is usually formed by taking parts of the
st_dev and st_ino fields in the stat structure corresponding to the
given pathname and combining them with the project ID.
If two pathnames refer to two different files, then ftok usually
returns two different keys for the two pathnames. However,
because both i-node numbers and keys are often stored in long
integers, there can be information loss creating a key. This means
that two different pathnames to different files can generate the same
key if the same project ID is used.
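The statement that only the lower 8 bits of id are used can be checked with a short sketch. The helper name ftok_ignores_high_bits is an assumption, and the pathname passed to it must refer to an existing file.

```c
#include <sys/ipc.h>
#include <sys/types.h>

/* Returns 1 if ftok yields the same key for id and id + 256 on the given
 * path (the two values share their lower 8 bits), 0 if the keys differ,
 * and -1 if ftok fails. (ftok_ignores_high_bits is a hypothetical name.) */
int ftok_ignores_high_bits(const char *path, int id)
{
    key_t a = ftok(path, id);
    key_t b = ftok(path, id + 256);

    if (a == (key_t)-1 || b == (key_t)-1)
        return -1;
    return a == b;
}
```

A client and a server that both call ftok with the same existing pathname and the same project ID obtain the same key without having to hard-code it in a header.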
Permission Structure
XSI IPC associates an ipc_perm structure with each IPC structure.
This structure defines the permissions and owner and includes at
least the following members:
struct ipc_perm
{
uid_t uid; /* owner's effective user id */
gid_t gid; /* owner's effective group id */
uid_t cuid; /* creator's effective user id */
gid_t cgid; /* creator's effective group id */
mode_t mode; /* access modes */
.
.
};
All the fields are initialized when the IPC structure is created. At a
later time, we can modify the uid, gid, and mode fields by calling
msgctl, semctl, or shmctl. To change these values, the calling
process must be either the creator of the IPC structure or the
superuser. Changing these fields is similar to calling chown or
chmod for a file.
Permission             Bit
user-read              0400
user-write (alter)     0200
group-read             0040
group-write (alter)    0020
other-read             0004
other-write (alter)    0002
XSI IPC permissions
6. MESSAGE QUEUES
struct msqid_ds
{
struct ipc_perm msg_perm;
msgqnum_t msg_qnum; /* # of messages on queue */
msglen_t msg_qbytes; /* max # of bytes on queue */
pid_t msg_lspid; /* pid of last msgsnd() */
pid_t msg_lrpid; /* pid of last msgrcv() */
time_t msg_stime; /* last-msgsnd() time */
time_t msg_rtime; /* last-msgrcv() time */
time_t msg_ctime; /* last-change time */
.
.
};
msgget
#include <sys/msg.h>
int msgget(key_t key, int flag);
Returns: message queue ID if OK, -1 on error
When a new queue is created, the following members of the msqid_ds structure
are initialized.
The ipc_perm structure is initialized. The mode member of this
structure is set to the corresponding permission bits of flag.
msg_qnum, msg_lspid, msg_lrpid, msg_stime, and msg_rtime are all set
to 0.
msg_ctime is set to the current time.
msg_qbytes is set to the system limit.
On success, msgget returns the non-negative queue ID. This value is then used
with the other three message queue functions.
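The initial values listed above can be inspected by creating a queue and reading its msqid_ds back with msgctl. This is a minimal sketch under stated assumptions: the helper name inspect_new_queue is hypothetical, and IPC_PRIVATE is used as the key so that the example never collides with an existing queue.

```c
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/msg.h>
#include <sys/types.h>

/* Creates a private queue with mode 0600, copies its msqid_ds into *out,
 * and removes the queue again. Returns 0 on success, -1 on error.
 * (inspect_new_queue is a hypothetical name.) */
int inspect_new_queue(struct msqid_ds *out)
{
    int rc;
    int msqid = msgget(IPC_PRIVATE, IPC_CREAT | 0600);

    if (msqid == -1)
        return -1;
    rc = msgctl(msqid, IPC_STAT, out);   /* fetch the queue's msqid_ds */
    msgctl(msqid, IPC_RMID, NULL);       /* remove the queue from the system */
    return rc;
}
```

After a successful call, the permission bits in out->msg_perm.mode should match the 0600 passed in flag, and the counters and time stamps other than msg_ctime should be 0.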
msgctl
The msgctl function performs various operations on a queue.
#include <sys/msg.h>
int msgctl(int msqid, int cmd, struct msqid_ds *buf );
Returns: 0 if OK, -1 on error
msgsnd
#include <sys/msg.h>
int msgsnd(int msqid, const void *ptr, size_t nbytes, int flag);
Returns: 0 if OK, -1 on error
msgrcv
#include <sys/msg.h>
ssize_t msgrcv(int msqid, void *ptr, size_t nbytes, long type, int flag);
Returns: size of data portion of message if OK, -1 on error
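A round trip through msgsnd and msgrcv can be sketched as follows. The structure name text_msg, the limit TEXT_MAX and the helper queue_round_trip are assumptions made for this illustration. The only layout requirement is that a message starts with a positive long type field followed by the data.

```c
#include <stddef.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/msg.h>
#include <sys/types.h>

#define TEXT_MAX 64              /* hypothetical payload limit */

struct text_msg {                /* hypothetical message layout */
    long mtype;                  /* message type, must be > 0 */
    char mtext[TEXT_MAX];        /* message data */
};

/* Sends text (including its terminating null byte) with the given type
 * through a temporary private queue and receives it into out.
 * Returns the number of data bytes received, or -1 on error. */
ssize_t queue_round_trip(long type, const char *text, char *out, size_t outlen)
{
    struct text_msg snd, rcv;
    size_t len = strlen(text) + 1;
    ssize_t n = -1;
    int msqid;

    if (type <= 0 || len > TEXT_MAX || outlen < len)
        return -1;
    msqid = msgget(IPC_PRIVATE, IPC_CREAT | 0600);
    if (msqid == -1)
        return -1;
    snd.mtype = type;
    memcpy(snd.mtext, text, len);
    /* nbytes counts only the data portion, not the mtype field */
    if (msgsnd(msqid, &snd, len, 0) == 0)
        n = msgrcv(msqid, &rcv, TEXT_MAX, type, 0);
    if (n > 0)
        memcpy(out, rcv.mtext, (size_t)n);
    msgctl(msqid, IPC_RMID, NULL);   /* remove the temporary queue */
    return n;
}
```

Passing a positive type to msgrcv selects the first message of that type. A real client and server would use different type values to keep their messages apart on a shared queue.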
7. SEMAPHORES
A semaphore is a counter used to provide access to a shared data object for
multiple processes.
To obtain a shared resource, a process needs to do the following:
1. Test the semaphore that controls the resource.
2. If the value of the semaphore is positive, the process can use the
resource. In this case, the process decrements the semaphore value
by 1, indicating that it has used one unit of the resource.
3. Otherwise, if the value of the semaphore is 0, the process goes
to sleep until the semaphore value is greater than 0. When the
process wakes up, it returns to step 1.
semget
The first function to call is semget to obtain a semaphore ID.
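#include <sys/sem.h>
int semget(key_t key, int nsems, int flag);
Returns: semaphore ID if OK, -1 on error
When a new set is created, the following members of the semid_ds structure
are initialized.
sem_otime is set to 0.
sem_ctime is set to the current time.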
semctl
#include <sys/sem.h>
int semctl(int semid, int semnum, int cmd,... /* union semun arg */);
The optional fourth argument is a union of command-specific arguments:
union semun
{
int val; /* for SETVAL */
struct semid_ds *buf; /* for IPC_STAT and IPC_SET */
unsigned short *array; /* for GETALL and SETALL */
};
The cmd argument specifies one of ten commands (IPC_STAT, IPC_SET, IPC_RMID,
GETVAL, SETVAL, GETPID, GETNCNT, GETZCNT, GETALL, or SETALL) to be performed
on the set specified by semid.
semop
The function semop atomically performs an array of operations on a
semaphore set.
#include <sys/sem.h>
int semop(int semid, struct sembuf semoparray[ ], size_t nops);
Returns: 0 if OK, -1 on error.
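The test-and-decrement steps for obtaining a resource map onto semget, semctl and semop roughly as follows. This is a sketch under stated assumptions: the names semun_arg, sem_change and semaphore_demo are hypothetical. The union is given its own name because some systems declare union semun in <sys/sem.h> while others leave it to the caller.

```c
#include <sys/ipc.h>
#include <sys/sem.h>
#include <sys/types.h>

/* Same members as union semun in the text, under a hypothetical name. */
union semun_arg {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/* Adds delta to semaphore 0 of the set. A negative delta blocks until
 * the semaphore value is large enough to be decremented. */
static int sem_change(int semid, short delta)
{
    struct sembuf op;

    op.sem_num = 0;
    op.sem_op = delta;
    op.sem_flg = 0;
    return semop(semid, &op, 1);
}

/* Creates a one-member set with the given initial value (at least 1),
 * acquires one unit, releases it, and removes the set. The semaphore
 * values seen after each step are stored through the two pointers.
 * Returns 0 on success, -1 on error. */
int semaphore_demo(int initial, int *after_acquire, int *after_release)
{
    union semun_arg arg;
    int ok = -1;
    int semid;

    if (initial < 1)
        return -1;               /* acquiring would block forever */
    semid = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);
    if (semid == -1)
        return -1;
    arg.val = initial;
    if (semctl(semid, 0, SETVAL, arg) == 0 && sem_change(semid, -1) == 0) {
        *after_acquire = semctl(semid, 0, GETVAL);
        if (sem_change(semid, 1) == 0) {
            *after_release = semctl(semid, 0, GETVAL);
            ok = 0;
        }
    }
    semctl(semid, 0, IPC_RMID);  /* remove the set */
    return ok;
}
```

With an initial value of 1 the semaphore behaves like a lock: the value drops to 0 while the resource is held and returns to 1 when it is released.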
When a signal is sent to a process, it is pending on the process to handle it. The process can react to
pending signals in one of three ways:
Accept the default action of the signal, which for most signals will terminate the process.
Ignore the signal. The signal will be discarded and it has no effect whatsoever on the recipient
process.
Invoke a user-defined function. The function is known as a signal handler routine and the
signal is said to be caught when this function is called.
2. SIGNAL
The function prototype of the signal API is:
#include<signal.h>
void (*signal(int sig_no, void (*handler)(int)))(int);
The formal arguments of the API are: sig_no is a signal identifier like SIGINT or SIGTERM, and
handler is a pointer to a user-defined signal handler function.
The following example attempts to catch the SIGTERM signal, ignores the SIGINT signal, and
accepts the default action of the SIGSEGV signal. The pause API suspends the calling process until
it is interrupted by a signal and the corresponding signal handler does a return:
#include<iostream>
#include<signal.h>
#include<unistd.h>
using namespace std;
/*signal handler function*/
void catch_sig(int sig_num)
{
signal(sig_num,catch_sig); /*re-install the handler*/
cout<<"catch_sig:"<<sig_num<<endl;
}
/*main function*/
int main()
{
signal(SIGTERM,catch_sig);
signal(SIGINT,SIG_IGN);
signal(SIGSEGV,SIG_DFL);
pause( ); /*wait for a signal interruption*/
return 0;
}
SIG_IGN specifies that a signal is to be ignored: if the signal is sent to the
process, it is discarded without interrupting the process.
SIG_DFL specifies that the default action of a signal is to be accepted.
3. SIGNAL MASK
A process initially inherits the parent’s signal mask when it is created, but any pending signals for the parent
process are not passed on. A process may query or set its signal mask via the sigprocmask API:
#include <signal.h>
int sigprocmask(int cmd, const sigset_t *new_mask, sigset_t *old_mask);
Returns: 0 if OK, -1 on error
The new_mask argument defines a set of signals to be set or reset in a calling process signal mask,
and the cmd argument specifies how the new_mask value is to be used by the API.
BSD UNIX and POSIX.1 define a set of APIs known as the sigsetops functions:
#include<signal.h>
int sigemptyset(sigset_t* sigmask);
int sigaddset(sigset_t* sigmask, const int signal_num);
int sigdelset(sigset_t* sigmask, const int signal_num);
int sigfillset(sigset_t* sigmask);
The sigemptyset API clears all signal flags in the sigmask argument.
The sigaddset API sets the flag corresponding to the signal_num signal in the sigmask argument.
The sigdelset API clears the flag corresponding to the signal_num signal in the sigmask argument.
The sigfillset API sets all the signal flags in the sigmask argument.
[ all the above functions return 0 if OK, -1 on error ]
int sigismember(const sigset_t* sigmask, const int signal_num);
The sigismember API returns 1 if the flag is set, 0 if it is not set and -1 if the call fails.
The following example checks whether the SIGINT signal is present in a process signal mask and adds it to
the mask if it is not there.
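#include<stdio.h>
#include<signal.h>
int main()
{
sigset_t sigmask;
sigemptyset(&sigmask); /*initialise set*/
if(sigprocmask(SIG_BLOCK,0,&sigmask)==-1) /*get current signal mask*/
{ perror("sigprocmask"); return 1; }
if(sigismember(&sigmask,SIGINT)==0) sigaddset(&sigmask,SIGINT); /*add SIGINT if absent*/
if(sigprocmask(SIG_SETMASK,&sigmask,0)==-1) /*install the updated mask*/
perror("sigprocmask");
}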
A process can query which signals are pending for it via the sigpending API:
#include<signal.h>
int sigpending(sigset_t* sigmask);
Returns 0 if OK, -1 if fails.
The sigpending API can be useful to find out whether one or more signals are pending for a process
and to set up special signal handling methods for these signals before the process calls the
sigprocmask API to unblock them.
The following example reports to the console whether the SIGTERM signal is pending for the process:
#include<iostream>
#include<stdio.h>
#include<signal.h>
using namespace std;
int main()
{
sigset_t sigmask;
sigemptyset(&sigmask);
if(sigpending(&sigmask)==-1)
perror("sigpending");
else cout << "SIGTERM signal is: "
<< (sigismember(&sigmask,SIGTERM) ? "Set" : "Not set") << endl;
return 0;
}
In addition to the above, UNIX also supports the following APIs for signal mask manipulation:
#include<signal.h>
4. SIGACTION
The sigaction API blocks the signal it is catching, and it allows a process to specify additional
signals to be blocked while the signal handler is running.
The sigaction API prototype is:
#include<signal.h>
int sigaction(int signal_num, struct sigaction* action, struct sigaction* old_action);
Returns: 0 if OK, -1 on error
The struct sigaction data type is defined in the <signal.h> header as:
struct sigaction
{
void (*sa_handler)(int);
sigset_t sa_mask;
int sa_flags;
};
The sigsetjmp and siglongjmp APIs were created to support signal mask processing.
Specifically, it is implementation-dependent whether a process signal mask is saved and restored
when it invokes the setjmp and longjmp APIs respectively.
The only difference between these functions and the setjmp and longjmp functions is that sigsetjmp has
an additional argument, savemask:
#include<setjmp.h>
int sigsetjmp(sigjmp_buf env, int savemask);
void siglongjmp(sigjmp_buf env, int val);
If savemask is nonzero, then sigsetjmp also saves the current signal mask of the process in env. When
siglongjmp is called, if the env argument was saved by a call to sigsetjmp with a nonzero savemask,
then siglongjmp restores the saved signal mask.
The siglongjmp API is usually called from user-defined signal handling functions. This is because a
process signal mask is modified when a signal handler is called, and siglongjmp should be called to
ensure the process signal mask is restored properly when "jumping out" from a signal handling function.
The following program illustrates the uses of sigsetjmp and siglongjmp APIs.
#include<iostream>
#include<stdio.h>
#include<unistd.h>
#include<signal.h>
#include<setjmp.h>
using namespace std;
sigjmp_buf env;
void callme(int sig_num)
{
cout<<"catch signal:"<<sig_num<<endl;
siglongjmp(env,2); /*jump back; sigsetjmp then returns 2*/
}
int main()
{
sigset_t sigmask;
struct sigaction action,old_action;
sigemptyset(&sigmask);
if(sigaddset(&sigmask,SIGTERM)==-1 || sigprocmask(SIG_SETMASK,&sigmask,0)==-1)
perror("set signal mask");
sigemptyset(&action.sa_mask);
sigaddset(&action.sa_mask,SIGSEGV);
action.sa_handler=callme;
action.sa_flags=0;
if(sigaction(SIGINT,&action,&old_action)==-1)
perror("sigaction");
if(sigsetjmp(env,1)!=0)
{
cerr<<"return from signal interruption"<<endl;
return 0;
}
else
cerr<<"return from first time sigsetjmp is called"<<endl;
pause();
}
7. KILL
A process can send a signal to a related process via the kill API. This is a simple means of
inter-process communication or control. The function prototype of the API is:
#include<signal.h>
int kill(pid_t pid, int signal_num);
Returns: 0 on success, -1 on failure.
The signal_num argument is the integer value of a signal to be sent to one or more processes
designated by pid. The possible values of pid and its use by the kill API are:
pid > 0     The signal is sent to the process whose process ID is pid.
pid == 0    The signal is sent to all processes whose process group ID equals the process group ID of
            the sender and for which the sender has permission to send the signal.
pid < -1    The signal is sent to all processes whose process group ID equals the absolute value of pid
            and for which the sender has permission to send the signal.
pid == -1   The signal is sent to all processes on the system for which the sender has permission to send
            the signal.
The following program illustrates the implementation of the UNIX kill command using the kill API:
#include<iostream>
#include<stdio.h>
#include<unistd.h>
#include<string.h>
#include<signal.h>
using namespace std;
int main(int argc,char** argv)
{
int pid, sig = SIGTERM;
if(argc==3)
{
if(sscanf(argv[1],"%d",&sig)!=1) /*first argument is the signal number*/
{
cerr<<"invalid number:" << argv[1] << endl;
return -1;
}
if(sig<0) sig=-sig; /*accept the -<signal_num> form as well*/
argv++,argc--;
}
while(--argc>0)
if(sscanf(*++argv, "%d", &pid)==1) /*remaining arguments are process IDs*/
{
if(kill(pid,sig)==-1)
perror("kill");
}
else
cerr<<"invalid pid:" << argv[0] <<endl;
return 0;
}
The UNIX kill command is invoked as kill [-<signal_num>] <pid>..., where signal_num can be an integer
number or the symbolic name of a signal, and <pid> is a process ID.
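The pid > 0 case can be sketched with a parent that signals its own child and then collects the termination status. The function name kill_child_with is an assumption made for this illustration.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a child that sleeps until a signal arrives, sends it sig with
 * kill, and returns the number of the signal that terminated the child,
 * or -1 on error. (kill_child_with is a hypothetical name.) */
int kill_child_with(int sig)
{
    int status;
    pid_t pid = fork();

    if (pid == -1)
        return -1;
    if (pid == 0) {
        for (;;)
            pause();                 /* child: wait for a signal */
    }
    if (kill(pid, sig) == -1)        /* pid > 0: one specific process */
        return -1;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

The child installs no handler, so the default action of the signal terminates it and the parent sees the signal number in the wait status.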
8. ALARM
The alarm API can be called by a process to request the kernel to send the
SIGALRM signal after a certain number of real clock seconds. The function
prototype of the API is:
#include<unistd.h>
unsigned int alarm(unsigned int time_interval);
Returns: 0 or number of seconds until previously set alarm
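Both the delivery of SIGALRM and the return value of alarm can be seen in a short sketch. The handler on_alarm, the function wait_for_alarm and the 10-second first request are assumptions made for this illustration.

```c
#include <signal.h>
#include <unistd.h>

static volatile sig_atomic_t alarm_fired = 0;

static void on_alarm(int sig_num)
{
    (void)sig_num;
    alarm_fired = 1;
}

/* Requests SIGALRM after the given number of seconds (at least 1) and
 * waits for it. *previous receives the value returned by the second
 * alarm call: the time that was left on the 10-second alarm it replaced.
 * Returns 0 on success, -1 on error. */
int wait_for_alarm(unsigned int seconds, unsigned int *previous)
{
    if (seconds == 0)
        return -1;                   /* alarm(0) would only cancel the alarm */
    alarm_fired = 0;
    if (signal(SIGALRM, on_alarm) == SIG_ERR)
        return -1;
    alarm(10);                       /* first request */
    *previous = alarm(seconds);      /* replaces it; only one alarm per process */
    while (!alarm_fired)
        pause();                     /* returns once a handler has run */
    return 0;
}
```

A process has a single alarm clock, so the second call cancels the first request rather than adding another one.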
9. INTERVAL TIMERS
The interval timer can be used to schedule a process to do some tasks at a fixed time
interval, to time the execution of some operations, or to limit the time allowed for
the execution of some tasks.
In addition to the alarm API, UNIX also provides the setitimer API, which can be used to
define up to three different types of timers in a process:
Real time clock timer
Timer based on the user time spent by a process
Timer based on the total user and system times spent by a process
The getitimer API is also defined for users to query the timer values that are set
by the setitimer API. The setitimer and getitimer function prototypes are:
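#include<sys/time.h>
int setitimer(int which, const struct itimerval * val, struct itimerval * old);
int getitimer(int which, struct itimerval * old);
The which argument to the above APIs specifies which timer to process. Its possible values and
the corresponding timer types are:
ITIMER_REAL      Timer based on the real time clock. Generates a SIGALRM signal when it expires.
ITIMER_VIRTUAL   Timer based on the user time spent by a process. Generates a SIGVTALRM signal when it expires.
ITIMER_PROF      Timer based on the total user and system times spent by a process. Generates a SIGPROF signal when it expires.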
Example program:
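#include<stdio.h>
#include<unistd.h>
#include<signal.h>
#include<sys/time.h>
#define INTERVAL 5
void callme(int sig_no)
{
/*do scheduled tasks*/
}
int main()
{
struct itimerval val;
struct sigaction action;
sigemptyset(&action.sa_mask);
action.sa_handler=callme;
action.sa_flags=SA_RESTART;
if(sigaction(SIGALRM,&action,0)==-1) /*catch the timer signal*/
{
perror("sigaction");
return 1;
}
val.it_interval.tv_sec=INTERVAL; /*reload value for a periodic timer*/
val.it_interval.tv_usec=0;
val.it_value.tv_sec=INTERVAL; /*time until the first expiry*/
val.it_value.tv_usec=0;
if(setitimer(ITIMER_REAL,&val,0)==-1)
perror("alarm");
else while(1)
{
/*do normal operation*/
}
return 0;
}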
The setitimer and getitimer APIs return a zero value if they succeed or a -1 value if they fail.
DAEMON PROCESSES
11. INTRODUCTION
Daemons are processes that live for a long time. They are often started when the
system is bootstrapped and terminate only when the system is shut down.
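#include<sys/types.h>
#include<sys/stat.h>
#include<stdlib.h>
#include<unistd.h>
int daemon_initialise( )
{
pid_t pid;
if (( pid = fork() ) < 0)
return -1;
else if ( pid != 0)
exit(0); /* parent exits */
/* child continues */
setsid( ); /* become a session leader */
chdir("/"); /* change the working directory */
umask(0); /* clear the file mode creation mask */
return 0;
}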
Normally, the syslogd daemon reads all three forms of log messages. On start-up,
this daemon reads a configuration file, usually /etc/syslog.conf, which determines
where different classes of messages are to be sent. For example, urgent messages
can be sent to the system administrator (if logged in) and printed on the console,
whereas warnings may be logged to a file. Our interface to this facility is through
the syslog function.
#include <syslog.h>
void openlog(const char *ident, int option, int facility);
void syslog(int priority, const char *format, ...);
void closelog(void);
int setlogmask(int maskpri);
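A typical calling sequence for these four functions might look like the following sketch. The ident string "notes-demo", the message texts and the function name log_with_mask are assumptions made for illustration.

```c
#include <syslog.h>

/* Opens the log, restricts logging to priorities LOG_WARNING and more
 * urgent, writes two messages, and closes the log. Returns the priority
 * mask that was in effect at the end.
 * (log_with_mask and the "notes-demo" ident are hypothetical.) */
int log_with_mask(void)
{
    int current;

    openlog("notes-demo", LOG_PID, LOG_USER);
    setlogmask(LOG_UPTO(LOG_WARNING));        /* drop notice, info and debug */
    syslog(LOG_WARNING, "disk usage at %d percent", 91);   /* passes the mask */
    syslog(LOG_DEBUG, "this message is filtered out");
    current = setlogmask(0);                  /* 0 leaves the mask unchanged */
    closelog();
    return current;
}
```

Where the warning ends up depends on the syslogd configuration file mentioned above, not on the calling program.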