
Unit Iii

Uploaded by

rahulraj1152004
Copyright
© © All Rights Reserved

UNIT-III

FILES
Working with Files
In this chapter we learn how to create, open, read, write, and close files.

UNIX File Structure


In UNIX, everything is a file.

Programs can use disk files, serial ports, printers and other devices in exactly the same
way as they would use a file.
Directories, too, are special sorts of files.

File types
Most files on a UNIX system are regular files or directories, but there are additional
types of files:

1. Regular files: The most common type of file, which contains data of some form. There is
no distinction to the UNIX kernel whether this data is text or binary.

2. Directory file: A file that contains the names of other files and pointers to information on
these files. Any process that has read permission for a directory file can read the contents
of the directory, but only the kernel can write to a directory file.

3. Character special file: A type of file used for certain types of devices on a system.

4. Block special file: A type of file typically used for disk devices. All devices on a system
are either character special files or block special files.

5. FIFO: A type of file used for interprocess communication. It’s
sometimes called a named pipe.

6. Socket: A type of file used for network communication between processes. A socket can
also be used for nonnetwork communication between processes on a single host.

7. Symbolic link: A type of file that points to another file.


The file type is encoded in the st_mode member of the stat structure. Each of the following
macros takes st_mode as its argument and tests for one file type:

Macro        Type of file

S_ISREG()    Regular file

S_ISDIR()    Directory file

S_ISCHR()    Character special file

S_ISBLK()    Block special file

S_ISFIFO()   Pipe or FIFO

S_ISLNK()    Symbolic link

S_ISSOCK()   Socket

File System Structure


Files are arranged in directories, which also contain subdirectories.

A user, neil, usually has his files stored in a 'home' directory, perhaps /home/neil.
Files and Devices
Even hardware devices are represented (mapped) by files in UNIX. For example, as root,
you mount a CD-ROM drive as a file:
$ mount -t iso9660 /dev/hdc /mnt/cd_rom
$ cd /mnt/cd_rom

/dev/console - This device represents the system console.

/dev/tty - This special file is an alias (logical device) for the controlling terminal (keyboard and
screen, or window) of a process.

/dev/null - This is the null device. All output written to this device is discarded.

File Metadata

I-nodes
• A structure that is maintained in a separate area of the hard disk.
• File attributes are stored in the inode.
• Every file is associated with a table called the inode.
• The inode is accessed by the inode number.
• The inode contains the following attributes of a file: file type, file permissions, number of links,
UID of the owner, GID of the group owner, file size, and the date and time of last modification,
last access and last change.

File attributes
Attribute value meaning

File type type of the file

Access permission file access permission for owner, group and others

Hard link count number of hard links to the file.

UID file owner user ID.

GID the file group ID.

File size file size in bytes.

Inode number system inode number of the file.

File system ID file system ID where the file is stored.


Kernel Support For Files:
UNIX supports the sharing of open files between different processes. The kernel uses three data
structures, and the relationships among them determine the effect one process has on
another with regard to file sharing.

1. Every process has an entry in the process table. Within each process table entry is a table
of open file descriptors, which can be thought of as a vector, with one entry per descriptor.
Associated with each file descriptor are

a. The file descriptor flags.

b. A pointer to a file table entry.

2. The kernel maintains a file table for all open files. Each file table entry contains

a. The file status flags for the file (read, write, append, sync, nonblocking, etc.),
b. The current file offset,
c. A pointer to the v-node table entry for the file.

3. Each open file (or device) has a v-node structure. The v-node contains information about
the type of file and pointers to functions that operate on the file. For most files the
v-node also contains the i-node for the file. This information is read from disk when the file
is opened, so that all the pertinent information about the file is readily available.

Consider the arrangement of these three tables for a single process that has two different files
open: one file is open on standard input (file descriptor 0) and the other is open on standard
output (file descriptor 1).

Now consider two independent processes that have the same file open. The first process might have
the file open on descriptor 3 and the second process on descriptor 4. Each process that opens the
file gets its own file table entry, but only a single v-node table entry is needed for the file.
One reason each process gets its own file table entry is so that each process has its own current
offset for the file.

After each ‘write’ is complete, the current file offset in the file table entry is incremented
by the number of bytes written. If this causes the current file offset to exceed the current
file size, the current file size in the i-node table entry is set to the current file offset (i.e., the
file is extended).

If a file is opened with O_APPEND flag, a corresponding flag is set in the file status flags of
the file table entry. Each time a ‘write’ is performed for a file with this append flag
set, the current file offset in the file table entry is first set to the current file size from
the i-node table entry. This forces every ‘write’ to be appended to the current end of
file.

The ‘lseek’ function only modifies the current offset in the file table entry. No I/O takes
place.
If a file is positioned to its current end of file using lseek, all that happens is the current file
offset in the file table entry is set to the current file size from the i-node table entry.

It is possible for more than one descriptor entry to point to the same file table entry. The file
descriptor flags apply only to a single descriptor in a single process, while the file status flags
apply to all descriptors in any process that point to the given file table entry.

System Calls and Device Drivers


System calls are provided by UNIX to access and control files and devices.

A number of device drivers are part of the kernel.

The system calls to access the device drivers include open, read, write, close and ioctl.

Library Functions

To provide a higher-level interface to device and disk files, UNIX provides a number
of standard libraries.
Low-level File Access
Each running program, called a process, has associated with it a number of file descriptors.

When a program starts, it usually has three of these descriptors already opened. These are:

0 Standard input
1 Standard output
2 Standard error

write
The write system call arranges for the first nbytes bytes from buf to be written to the file
associated with the file descriptor fildes.
With this knowledge, let's write our first program, simple_write.c:

Here is how to run the program and its output.

$ simple_write
Here is some data
$
read
The read system call reads up to nbytes of data from the file associated with the file
descriptor fildes and places them in the data area buf.
This program, simple_read.c, copies the first 128 bytes of the standard input to the standard
output.

If you run the program, you should see:

$ echo hello there | simple_read


hello there
$ simple_read < draft1.txt
Files

open

To create a new file descriptor we need to use the open system call.

open establishes an access path to a file or device.

The name of the file or device to be opened is passed as a parameter, path, and the
oflags parameter is used to specify actions to be taken on opening the file.
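The oflags are specified as a bitwise OR of a mandatory file access mode and other optional
modes. The open call must specify one of the following file access modes:

O_RDONLY Open for read-only
O_WRONLY Open for write-only
O_RDWR Open for reading and writing

The call may also include a combination (bitwise OR) of the following optional modes in the
oflags parameter:

O_APPEND Place written data at the end of the file
O_TRUNC Set the length of the file to zero, discarding existing contents
O_CREAT Create the file, if necessary, with permissions given in mode
O_EXCL Used with O_CREAT; ensures that the caller creates the file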

Initial Permissions

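When we create a file using the O_CREAT flag with open, we must use the three-parameter
form. mode, the third parameter, is made from a bitwise OR of the flags defined in the header
file sys/stat.h. These are:

S_IRUSR Read permission, owner
S_IWUSR Write permission, owner
S_IXUSR Execute permission, owner
S_IRGRP Read permission, group
S_IWGRP Write permission, group
S_IXGRP Execute permission, group
S_IROTH Read permission, others
S_IWOTH Write permission, others
S_IXOTH Execute permission, others

For example, the call

open("myfile", O_CREAT, S_IRUSR|S_IXOTH);

has the effect of creating a file called myfile, with read permission for the owner and execute
permission for others, and only those permissions.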

umask

The umask is a system variable that encodes a mask for file permissions to be used when a file
is created.
You can change the variable by executing the umask command to supply a new value.

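The value is a three-digit octal value. Each digit is the result of ORing values from 1, 2, or 4
(1 blocks execute, 2 blocks write, 4 blocks read); the digits apply to user, group and other, in that order.
For example, to block 'group' write and execute, and 'other' write, the umask would be:

Digit Value
1 0
2 2 (write), 1 (execute)
3 2 (write)

Values for each digit are ORed together; so digit 2 will have 2 | 1, giving 3. The resulting
umask is 032.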
close

We use close to terminate the association between a file descriptor, fildes, and its file.

ioctl

ioctl is a bit of a rag-bag of things. It provides an interface for controlling the behavior of
devices, their descriptors and configuring underlying services.
ioctl performs the function indicated by cmd on the object referenced by the descriptor fildes.

Try It Out - A File Copy Program

We now know enough about the open, read and write system calls to write a low-level
program, copy_system.c, to copy one file to another, character by character.
Running the program will give the following:

We used the UNIX time facility to measure how long the program takes to run. It took 2 and one
half minutes to copy the 1Mb file.
We can improve by copying in larger blocks. Here is the improved copy_block.c program.
Now try the program, first removing the old output file:

The revised program took under two seconds to do the copy.

Other System Calls for Managing Files

Here are some system calls that operate on these low-level file descriptors.

lseek

The lseek system call sets the read/write pointer of a file descriptor, fildes. You use it to set
where in the file the next read or write will occur.
The offset parameter is used to specify the position and the whence parameter specifies how the
offset is used.
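whence can be one of the following:

SEEK_SET offset is an absolute position
SEEK_CUR offset is relative to the current position
SEEK_END offset is relative to the end of the file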
dup and dup2

The dup system calls provide a way of duplicating a file descriptor, giving two or more, different
descriptors that access the same file.
File Status Information-Stat Family: fstat, stat and lstat

The fstat system call returns status information about the file associated with an open file
descriptor.
The members of the structure, stat, may vary between UNIX systems, but will include:
The permissions flags are the same as for the open system call above. File-type flags include:

Other mode flags include:

Masks to interpret the st_mode flags include:

There are some macros defined to help with determining file types. These include:
To test that a file doesn't represent a directory and has execute permission set for the owner and no
other permissions, we can use the test:

File and record locking-fcntl function

• File locking is applicable only for regular files.

• It allows a process to impose a lock on a file so that other processes can not modify the
file until it is unlocked by the process.

• Write lock: it prevents other processes from setting any overlapping read / write locks on
the locked region of a file.

• Read lock: it prevents other processes from setting any overlapping write locks on the
locked region of a file.

• A write lock is also called an exclusive lock and a read lock is also called a shared lock.

• fcntl API can be used to impose read or write locks on either a segment or an entire file.

• Function prototype:
#include <fcntl.h>

int fcntl(int fdesc, int cmd_flag, ...);

• All file locks set by a process will be unlocked when the process terminates.

File Permission-chmod

You can change the permissions on a file or directory using the chmod system call. This forms the
basis of the chmod shell program.
chown

A superuser can change the owner of a file using the chown system call.

Links-soft link and hard link

Soft links (symbolic links): Refer to a symbolic path indicating the abstract location of another
file.

Used to provide alternative means of referencing files.

Users may create links for files using the ln command with the -s option.

Hard links: Refer to the specific location of physical data.

A hard link is a UNIX path name for a file.

Most files have only one hard link. However, users may create additional hard links for
files using the ln command.

Limitations:

Users cannot create hard links for directories unless they have superuser privileges.
Users cannot create hard links that reference files on a different file system.

unlink, link, symlink

We can remove a file using unlink.

The unlink system call decrements the link count on a file.

The link system call creates a new link to an existing file.

The symlink system call creates a symbolic link to an existing file.


Directories
As well as its contents, a file has a name and 'administrative information', i.e. the file's
creation/modification date and its permissions.
The permissions are stored in the inode, which also contains the length of the file and where on
the disc it's stored.
A directory is a file that holds the inodes and names of other files.

mkdir, rmdir

We can create and remove directories using the mkdir and rmdir system calls.

The mkdir system call makes a new directory with path as its name.

The rmdir system call removes an empty directory.

chdir

A program can navigate directories using the chdir system call.

Current Working Directory- getcwd

A program can determine its current working directory by calling the getcwd library function.

The getcwd function writes the name of the current directory into the given buffer, buf.
Scanning Directories

The directory functions are declared in a header file, dirent.h. They use a structure, DIR, as a
basis for directory manipulation.
Here are these functions:

opendir

The opendir function opens a directory and establishes a directory stream.

readdir

The readdir function returns a pointer to a structure detailing the next directory entry in the
directory stream dirp.
The dirent structure containing directory entry details includes the following entries:

ino_t d_ino The inode of the file
char d_name[] The name of the file
telldir

The telldir function returns a value that records the current position in a directory stream.
seekdir

The seekdir function sets the directory entry pointer in the directory stream given by dirp.

closedir

The closedir function closes a directory stream and frees up the resources associated with it.

Try It Out - A Directory Scanning Program

1. The printdir function prints out the current directory. It will recurse for subdirectories.
2. Now we move onto the main function:
The program produces output like this (edited for brevity):

How It Works

After some initial error checking, using opendir, to see that the directory exists, printdir
makes a call to chdir to the directory specified. While the entries returned by readdir aren't
null, the program checks to see whether the entry is a directory. If it isn't, it prints the file entry
with indentation depth.

Here is one way to make the program more general.

You can run it using the command:

$ printdir /usr/local | more


PROCESSES
Processes and Signals
Processes and signals form a fundamental part of the UNIX operating environment, controlling
almost all activities performed by a UNIX computer system.
Here are some of the things you need to understand.

3.1 What is a Process

The X/Open Specification defines a process as an address space and single thread of control that
executes within that address space and its required system resources.
A process is, essentially, a running program.

3. 2 Layout of a C program

Here is how a couple of processes might be arranged within the operating system.

Each process is allocated a unique number, a process identifier, or PID.


The program code that will be executed by the grep command is stored in a disk file.

The system libraries can also be shared.

A process has its own stack space.

Image in main memory

The UNIX process table may be thought of as a data structure describing all of the processes that
are currently loaded.
Viewing Processes

We can see what processes are running by using the ps command.

Here is some sample output:

The PID column gives the PIDs, the TTY column shows which terminal started the process,
the STAT column shows the current status, TIME gives the CPU time used so far and
the COMMAND column shows the command used to start the process.

Let's take a closer look at some of these:

The initial login was performed on virtual console number one (v01). The shell is running bash.
Its status is s, which means sleeping. This is because it's waiting for the X Window System to
finish.
X Windows was started by the command startx. It won't finish until we exit from X. It too is
sleeping.

The fvwm is a window manager for X, allowing other programs to be started and windows to be
arranged on the screen.

This process represents a window in the X Windows system. The shell, bash, is running in the
new window. The window is running on a new pseudo terminal (/dev/ptyp0) abbreviated pp0.

This is the EMACS editor session started from the shell mentioned above. It uses the pseudo
terminal.

This is a clock program started by the window manager. It's in the middle of a one-minute wait
between updates of the clock hands.

Process environment

Let's look at some other processes running on this Linux system. The output has been
abbreviated for clarity:
Here we can see one very important process indeed:

In general, each process is started by another, known as its parent process. A process so started
is known as a child process.
When UNIX starts, it runs a single program, the prime ancestor and process number one: init.

One such example is the login procedure: init starts the getty program once for each terminal that
we can use to log in.
These are shown in the ps output like this:

When interacting with your server through a shell session, there are many pieces of information
that your shell compiles to determine its behavior and access to resources. Some of these settings
are contained within configuration settings and others are determined by user input.

One way that the shell keeps track of all of these settings and details is through an area it maintains
called the environment. The environment is an area that the shell builds every time that it starts a
session that contains variables that define system properties.

In this guide, we will discuss how to interact with the environment and read or set environmental
and shell variables interactively and through configuration files. We will be using an Ubuntu
12.04 VPS as an example, but these details should be relevant on any Linux system.

Every time a shell session spawns, a process takes place to gather and compile information that
should be available to the shell process and its child processes. It obtains the data for these settings
from a variety of different files and settings on the system.

Basically the environment provides a medium through which the shell process can get or set
settings and, in turn, pass these on to its child processes.

Environment List

The environment is implemented as strings that represent key-value pairs. If multiple values are
passed, they are typically separated by colon (:) characters. Each pair will generally look
something like this:
KEY=value1:value2:...

If the value contains significant white-space, quotations are used:

KEY="value with spaces"

The keys in these scenarios are variables. They can be one of two types, environmental variables
or shell variables.

Environmental variables are variables that are defined for the current shell and are inherited by
any child shells or processes. Environmental variables are used to pass information into
processes that are spawned from the shell.

Shell variables are variables that are contained exclusively within the shell in which they were
set or defined. They are often used to keep track of ephemeral data, like the current working
directory.

By convention, these types of variables are usually defined using all capital letters. This helps
users distinguish environmental variables within other contexts.

Environment variables- getenv, setenv

On Windows, every process has an environment block that contains a set of environment variables
and their values. There are two types of environment variables: user environment variables (set for
each user) and system environment variables (set for everyone).

By default, a child process inherits the environment variables of its parent process. Programs
started by the command processor inherit the command processor's environment variables. To
specify a different environment for a child process, create a new environment block and pass a
pointer to it as a parameter to the CreateProcess function.

The command processor provides the set command to display its environment block or to create
new environment variables. You can also view or modify the environment variables by selecting
System from the Control Panel, selecting Advanced system settings, and clicking Environment
Variables.

Each environment block contains the environment variables in the following format:

Var1=Value1\0

Var2=Value2\0

Var3=Value3\0

...
VarN=ValueN\0\0

The name of an environment variable cannot include an equal sign (=).

The GetEnvironmentStrings function returns a pointer to the environment block of the calling
process. This should be treated as a read-only block; do not modify it directly. Instead, use the
SetEnvironmentVariable function to change an environment variable. When you are finished
with the environment block obtained from GetEnvironmentStrings, call the
FreeEnvironmentStrings function to free the block. Calling SetEnvironmentVariable has no effect
on the system environment variables.

Kernel support for process


The kernel runs the show, i.e. it manages all the operations in a UNIX-flavored environment. The
kernel architecture must support the primary UNIX requirements. These requirements fall into two
categories, namely functions for process management and functions for file management (files
include device files). Process management entails allocating resources, including CPU and memory,
and offering services that processes may need. File management involves handling all
the files required by processes, communicating with device drivers and regulating transmission of
data to and from peripherals. The kernel gives user processes a feel of synchronous
operation, hiding all underlying asynchronism in peripheral and hardware operations (like the time
slicing by the clock). In summary, we can say that the kernel handles the following operations:

1. It is responsible for scheduling running of user and other processes.


2. It is responsible for allocating memory.
3. It is responsible for managing the swapping between memory and disk.
4. It is responsible for moving data to and from the peripherals.
5. It receives service requests from the processes and honors them.

Process Identification:

Every process has a unique process ID, a non-negative integer. There are two special processes.

Process ID 0 is usually the scheduler process and is often known as the ‘swapper’. No program on
disk corresponds to this process; it is part of the kernel and is known as a system process.

Process ID 1 is usually the ‘init’ process and is invoked by the kernel at the end of the bootstrap
procedure. The program file for this process was /etc/init in older versions of UNIX and is
/sbin/init in newer versions. ‘init’ usually reads the system-dependent initialization files and
brings the system to a certain state. The ‘init’ process never dies. ‘init’ becomes the parent process of
any orphaned child process.


Process control
One further ps output example is the entry for the ps command itself:

This indicates that process 192 is in a run state (R) and is executing the command ps -ax.

We can set the process priority using nice and adjust it using renice. By default, nice raises the
nice value of a process by 10, which reduces its priority. High-priority jobs have negative nice values.
Using ps -l (for long output), we can view the priority of processes. The value we are
interested in is shown in the NI (nice) column:

Here we can see that the oclock program is running with a default nice value. If it had been
started with the command,

it would have been allocated a nice value of +10.

We can change the priority of a running process by using the renice command,

So that now the clock program will be scheduled to run less often. We can see the modified nice
value with the ps again:

Notice that the status column now also contains N, to indicate that the nice value has changed
from the default.

Process Creation
Starting New Processes

We can cause a program to run from inside another program and thereby create a new process by
using the system library function.

The system function runs the command passed to it as string and waits for it to complete.

The command is executed as if the command,

$ sh -c string

has been given to a shell.

Try It Out - system

1. We can use system to write a program to run ps for us.

2. When we compile and run this program, system.c, we get the following:
3. The system function uses a shell to start the desired program.

We could put the task in the background by changing the function call to system("ps -ax &");

Now, when we compile and run this version of the program, we get:

How It Works

In the first example, the program calls system with the string "ps -ax", which executes
the ps program. Our program returns from the call to system when the ps command is finished.
In the second example, the call to system returns as soon as the shell command finishes. The
shell returns as soon as the ps program is started, just as would happen if we had typed,

$ ps -ax &

at a shell prompt.

Replacing a Process Image

There is a whole family of related functions grouped under the exec heading. They differ in the
way that they start processes and present program arguments.
The exec family of functions replace the current process with another created according to the
arguments given.
If we wish to use an exec function to start the ps program as in our previous examples, we have
the following choices:

Try It Out - execlp

Let's modify our example to use an execlp call.


Now, when we run this program, pexec.c, we get the usual ps output, but no Done. message at
all.
Note also that there is no reference to a process called pexec in the output:

How It Works

The program prints its first message and then calls execlp, which searches the directories given
by the PATH environment variable for a program called ps.
It then executes this program in place of our pexec program, starting it as if we had given the
shell command:

$ ps -ax

Waiting for a Process

We can arrange for the parent process to wait until the child finishes before continuing by
calling wait.
The wait system call causes a parent process to pause until one of its child processes dies or is
stopped.
We can interrogate the status information using macros defined in sys/wait.h. These include:

Try It Out - wait


1. Let's modify our program slightly so we can wait for and examine the child process exit
status. Call the new program wait.c.
2. This section of the program waits for the child process to finish:
When we run this program, we see the parent wait for the child. The output isn't confused and
the exit code is reported as expected.

How It Works

The parent process uses the wait system call to suspend its own execution until status
information becomes available for a child process.

Zombie Processes
When a child process terminates, an association with its parent survives until the parent in turn
either terminates normally or calls wait.
This terminated child process is known as a zombie process.
Try It Out - Zombies

fork2.c is just the same as fork.c, except that the number of messages printed by the child and
parent processes is reversed.
Here are the relevant lines of code:

How It Works

If we run the above program with fork2 & and then call the ps program after the child has
finished but before the parent has finished, we'll see a line like this:

There's another system call that you can use to wait for child processes. It's called waitpid and
you can use it to wait for a specific process to terminate.

If we want to have a parent process regularly check whether a specific child process has
terminated, we could use the call waitpid(child_pid, (int *) 0, WNOHANG);
which will return zero if the child has not terminated or stopped, or child_pid if it has.

Orphan Process
• When the parent dies first, the child becomes an orphan.
• The kernel clears the process table slot for the parent.
System call interface for process management

In addition to the process ID, there are other identifiers for every process. The following
functions return these identifiers

#include <sys/types.h>
#include <unistd.h>

pid_t getpid(void);   Returns: process ID of calling process
pid_t getppid(void);  Returns: parent process ID of calling process
uid_t getuid(void);   Returns: real user ID of calling process
uid_t geteuid(void);  Returns: effective user ID of calling process
gid_t getgid(void);   Returns: real group ID of calling process
gid_t getegid(void);  Returns: effective group ID of calling process

fork Function

The only way a new process is created by the UNIX kernel is when an existing process calls the
fork function.

#include <sys/types.h>
#include <unistd.h>

pid_t fork(void);

Returns: 0 in child, process ID of child in parent, -1 on error

The new process created by fork is called the child process. fork is called once but returns twice:
the return value in the child is 0, while the return value in the parent is the process ID of the new
child. The child’s process ID is returned to the parent because a process can have
more than one child, and there is no function that allows a process to obtain the process IDs of its
children. fork returns 0 to the child because a process can have only a single parent,
so the child can always call getppid to obtain the process ID of its parent.

Both the child and parent continue executing with the instruction that follows the call to fork. The
child is a copy of the parent. For example, the child gets a copy of the parent’s data space, heap and
stack. This is a copy for the child; the parent and child don’t share these portions of memory.
Often the parent and child share the text segment, if it is read-only.

There are two uses for fork:

1. When a process wants to duplicate itself so that the parent and child can each execute
different sections of code at the same time. This is common for network servers: the parent
waits for a service request from a client. When the request arrives, the parent calls fork
and lets the child handle the request. The parent goes back to waiting for the next service
request to arrive.

2. When a process wants to execute a different program. This is common for shells. In this
case the child does an exec right after it returns from the fork.

vfork Function

The function vfork has the same calling sequence and share return values as fork. But the semantics
of the two functions differ. vfork is intended to create a new process when the purpose of the new
process is to exec a new program. vfork creates the new process, just like fork, without fully
copying the address space of the parent into the child, since the child won’t reference the address
space – the child just calls exec right after the vfork. Instead, while the child is running, until it
calls either exec or exit, the child runs in the address space of the parent. This optimization provides
an efficiency gain on some paged virtual memory implementations of UNIX.

Another difference between the two functions is that vfork guarantees that the child runs first,
until the child calls exec or exit. When the child calls either of these functions, the parent
resumes.

exit Function

There are three ways for a process to terminate normally, and two forms of abnormal
termination.

1. Normal termination:

a. Executing a return from the main function. This is equivalent to calling exit

b. Calling the exit function

c. Calling the _exit function

2. Abnormal termination

a. Calling abort: It generates the SIGABRT signal

b. When the process receives certain signals. The signal can be generated by the
process itself, by some other process, or by the kernel

Regardless of how a process terminates, the same code in the kernel is eventually executed. This
kernel code closes all the open descriptors for the process, releases the memory that it was using,
and the like.

For any of the preceding cases we want the terminating process to be able to notify its parent how
it terminated. For the exit and _exit functions this is done by passing an exit status as the argument
to these two functions. In the case of an abnormal termination however, the kernel generates a
termination status to indicate the reason for the abnormal termination. In any case, the parent of
the process can obtain the termination status from either the wait or waitpid function. The exit
status is converted into a termination status by the kernel when _exit is finally called. If the child
terminated normally, then the parent can obtain the exit status of the child.
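
As a minimal sketch of the difference between the two kinds of termination status, the function
below (how_child_ended is a made-up name) forks a child that either calls exit or calls abort, and
the parent inspects the status. The macros WIFEXITED, WEXITSTATUS, WIFSIGNALED and WTERMSIG are
the standard <sys/wait.h> macros for decoding a termination status; they are not named elsewhere
in these notes.

```c
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a child that terminates normally (exit) when use_abort is 0 and
 * abnormally (abort) otherwise. Returns the exit status for a normal
 * termination, the negated signal number for an abnormal one, and -1000
 * if something went wrong. */
int how_child_ended(int use_abort)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1000;

    if (pid == 0) {
        if (use_abort)
            abort();            /* generates SIGABRT: abnormal termination */
        exit(7);                /* normal termination with exit status 7 */
    }

    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1000;
    if (WIFEXITED(status))      /* child called exit or _exit */
        return WEXITSTATUS(status);
    if (WIFSIGNALED(status))    /* child was terminated by a signal */
        return -WTERMSIG(status);
    return -1000;
}
```

In the normal case the parent recovers the argument that the child passed to exit; in the abnormal
case there is no exit status, only the number of the signal that ended the child.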

If the parent terminates before the child, the init process becomes the parent of any process
whose parent has terminated; we say that the process has been inherited by init. Whenever a
process terminates, the kernel goes through all active processes to see whether the terminating
process is the parent of any process that still exists. If so, the parent process ID of the
still-existing process is changed to 1 (the process ID of init). This ensures that every process
has a parent.

When a child terminates before the parent, the child cannot disappear completely: if it did, the
parent would be unable to fetch its termination status when it later checks whether the child has
terminated. Instead, the kernel maintains this information for every terminated process, and the
parent obtains it by calling wait or waitpid.

wait and waitpid Functions

When a process terminates, either normally or abnormally, the kernel notifies the parent by
sending it the SIGCHLD signal. Since the termination of a child is an asynchronous event,
this signal is the asynchronous notification from the kernel to the parent. The default action for
this signal is to ignore it. A parent may wait for one of its children to terminate and then accept
the child's termination code by executing wait.

A process that calls wait or waitpid can

1. block (if all of its children are still running),

2. return immediately with the termination status of a child (if a child has terminated and is
waiting for its termination status to be fetched), or

3. return immediately with an error (if it doesn't have any child processes).

If the process is calling wait because it received the SIGCHLD signal, we expect wait to return
immediately. But if we call it at any random point in time, it can block.

#include<sys/types.h>

#include<sys/wait.h>

pid_t wait(int *statloc);

pid_t waitpid(pid_t pid, int *statloc, int options);

Both return: process ID if OK, 0 or -1 on error

The differences between these two functions are:

1. wait can block the caller until a child process terminates, while waitpid has an option that
prevents it from blocking.

2. waitpid does not have to wait for the first child to terminate; it has a number of options
that control which process it waits for.

If a child has already terminated and is a zombie, wait returns immediately with that child's
status. Otherwise, it blocks the caller until a child terminates. If the caller blocks and has
multiple children, wait returns when one of them terminates; we can tell which one from the
process ID returned by the function.

For both functions, the argument statloc is a pointer to an integer. If this argument is not a null
pointer, the termination status of the terminated process is stored in the location pointed to by
the argument.

If we have more than one child, wait returns on termination of any of the children. To wait for
a specific process, we use the waitpid function.
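
The behaviour of wait with several children can be sketched as follows. The name
reap_all_children and the constant NCHILD are illustrative only. Each child exits with a different
status, and the parent uses the process ID returned by wait to work out which child each status
belongs to, since the children may terminate in any order.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define NCHILD 3

/* Forks NCHILD children; child i exits with status i + 1. The parent calls
 * wait until every child has been collected. Returns the number of children
 * whose status matched the expected value, or -1 on error. */
int reap_all_children(void)
{
    pid_t pids[NCHILD];

    for (int i = 0; i < NCHILD; i++) {
        pids[i] = fork();
        if (pids[i] < 0)
            return -1;
        if (pids[i] == 0)
            _exit(i + 1);       /* child i terminates at once */
    }

    int matched = 0;
    for (int n = 0; n < NCHILD; n++) {
        int status;
        pid_t done = wait(&status);     /* returns when any child terminates */
        if (done < 0)
            return -1;
        for (int i = 0; i < NCHILD; i++) {
            if (pids[i] == done && WIFEXITED(status) &&
                WEXITSTATUS(status) == i + 1)
                matched++;              /* status belongs to child i */
        }
    }
    return matched;
}
```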

The interpretation of the pid argument for waitpid depends on its value:

pid == -1    waits for any child process. Here, waitpid is equivalent to wait

pid > 0      waits for the child whose process ID equals pid

pid == 0     waits for any child whose process group ID equals that of the calling process

pid < -1     waits for any child whose process group ID equals the absolute value of pid

waitpid returns the process ID of the child that terminated, and its termination status is returned
through statloc. With wait the only error is if the calling process has no children. With waitpid,
however, it's also possible to get an error if the specified process or process group does not exist
or is not a child of the calling process.

The options argument lets us further control the operation of waitpid. This argument is either 0
or is constructed from the bitwise OR of the following constants.

WNOHANG      waitpid will not block if a child specified by pid is not immediately available.
             In this case, the return value is 0.

WUNTRACED    the status of any child specified by pid that has stopped, and whose status has
             not been reported since it stopped, is returned

The waitpid function provides three features that are not provided by the wait function:

1. waitpid lets us wait for one particular process

2. waitpid provides a non-blocking version of wait

3. waitpid supports job control (with the WUNTRACED option)
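
The first two features can be sketched together. In the illustrative function below
(waitpid_nohang_demo is a made-up name), a pipe is used only to keep the child alive until the
parent has polled it once; this is an assumption of the sketch, not something waitpid requires.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if waitpid behaved as described above: with WNOHANG it returns 0
 * while the chosen child is still running, and a later blocking call returns
 * that child's process ID and exit status. Returns 0 otherwise and -1 if the
 * setup failed. */
int waitpid_nohang_demo(void)
{
    int fds[2];
    if (pipe(fds) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        char c;
        close(fds[1]);                  /* keep only the read end */
        if (read(fds[0], &c, 1) < 0)    /* blocks until the parent closes its end */
            _exit(1);
        _exit(42);
    }
    close(fds[0]);

    int status = 0;
    pid_t early = waitpid(pid, &status, WNOHANG);   /* child still running */

    close(fds[1]);                      /* child sees end of file and exits */
    pid_t late = waitpid(pid, &status, 0);          /* blocks for this child */

    return early == 0 && late == pid &&
           WIFEXITED(status) && WEXITSTATUS(status) == 42;
}
```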

exec Function

One use of the fork function is to create a new process that then causes another program to be
executed by calling one of the exec functions. When a process calls one of the exec functions, that process is
completely replaced by the new program and the new program starts executing at its main function.
The process ID doesn’t change across an exec because a new process is not created. exec merely
replaces the current process with a brand new program from disk.

There are six different exec functions. These six functions round out the UNIX process control primitives.
With fork we can create new processes, and with the exec functions we can initiate new programs.
The exit function and the two wait functions handle termination and waiting for termination. These
are the only process control primitives we need.

#include<unistd.h>

int execl(const char *pathname, const char *arg0, . . . /* (char *) 0 */);

int execv(const char *pathname, char *const argv[]);

int execle(const char *pathname, const char *arg0, . . . /* (char *) 0, char *const envp[] */);

int execve(const char *pathname, char *const argv[], char *const envp[]);

int execlp(const char *filename, const char *arg0, . . . /* (char *) 0 */);

int execvp(const char *filename, char *const argv[]);

All six return: -1 on error, no return on success.

The first difference in these functions is that the first four take a pathname argument, while the
last two take a filename argument. When a filename argument is specified:

If filename contains a slash, it is taken as a pathname.

Otherwise, the executable file is searched for in the directories specified by the PATH
environment variable.

The PATH variable contains a list of directories (called path prefixes) that are separated by
colons. For example, the name=value environment string

PATH=/bin:/usr/bin:/usr/local/bin/:.

specifies four directories to search, where the last one is the current working directory.

If either of the two functions, execlp or execvp finds an executable file using one of the path
prefixes, but the file is not a machine executable that was generated by the link editor, it assumes
the file is a shell script and tries to invoke /bin/sh with filename as input to the shell.
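
A shell-like combination of fork, execvp and waitpid can be sketched as below. The function name
run_and_wait is made up, and the value 127 for a failed exec is only a convention borrowed from
shells. Because argv[0] is passed as a filename without a slash, execvp searches the directories
listed in PATH.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs the program named by argv[0], searched for through PATH, with the
 * given argument vector and waits for it to finish. Returns its exit status,
 * 127 if the exec failed in the child, or -1 on any other failure. */
int run_and_wait(char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        execvp(argv[0], argv);  /* returns only if the exec failed */
        _exit(127);
    }

    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, passing the vector { "sh", "-c", "exit 3", (char *) 0 } should give back 3, provided
a program called sh can be found through PATH.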

The next difference concerns the passing of the argument list. The functions execl, execlp and
execle require each of the command-line arguments to the new program to be specified as a separate
argument. The end of the argument list is marked with a null pointer. For the other three functions,
execv, execvp and execve, we have to build an array of pointers to the arguments, and the address
of this array is the argument to these three functions.
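
The two ways of passing arguments can be compared side by side. In this sketch the helper
exit_status_of and the two run_with_ functions are made-up names, and the command
sh -c "exit 5" is chosen only because its result is easy to check.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Waits for pid and returns its exit status, or -1 on failure. */
static int exit_status_of(pid_t pid)
{
    int status;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}

/* List form: each argument is a separate parameter, ended by a null pointer. */
int run_with_execl(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execl("/bin/sh", "sh", "-c", "exit 5", (char *)0);
        _exit(127);             /* reached only if the exec failed */
    }
    return exit_status_of(pid);
}

/* Vector form: the same arguments collected in an array of pointers. */
int run_with_execv(void)
{
    char *const argv[] = { "sh", "-c", "exit 5", (char *)0 };

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execv("/bin/sh", argv);
        _exit(127);             /* reached only if the exec failed */
    }
    return exit_status_of(pid);
}
```

Both functions start the same program with the same arguments, so they should report the same
exit status.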

The final difference is the passing of the environment list to the new program. The two functions
execle and execve allow us to pass a pointer to an array of pointers to the environment strings.
The other four functions, however, use the environ variable in the calling process to copy the
existing environment for the new program.
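
The effect of the environment argument can be sketched with a small check. The function name
run_check and the variables GREETING and PARENT_ONLY are hypothetical; the shell's test command is
used only to report whether a variable is visible to the new program.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs sh -c script in a child and returns the shell's exit status
 * (0 means the test in the script succeeded), or -1 on failure.
 * custom_env != 0: execle passes only GREETING=hello to the new program.
 * custom_env == 0: execl copies the caller's environment through environ. */
int run_check(const char *script, int custom_env)
{
    char *const envp[] = { "GREETING=hello", (char *)0 };

    /* Hypothetical variable that exists only in the caller's environment. */
    if (setenv("PARENT_ONLY", "1", 1) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        if (custom_env)
            execle("/bin/sh", "sh", "-c", script, (char *)0, envp);
        else
            execl("/bin/sh", "sh", "-c", script, (char *)0);
        _exit(127);             /* reached only if the exec failed */
    }

    int status;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

With execl the new program sees PARENT_ONLY because the caller's environment is copied; with
execle it sees GREETING but not PARENT_ONLY, because only the strings in envp are passed on.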

Differences Between Threads and Processes

UNIX processes can cooperate; they can send each other messages and they can interrupt one
another.
There is a class of execution unit known as a thread; threads are distinct from processes in that
they are separate execution streams within a single process.

Key difference: Thread and process are two closely related terms in multi-threading. The main
difference between the two terms is that threads are a part of a process, i.e. a process may
contain one or more threads, but a thread cannot contain a process.

In programming, there are two basic units of execution: processes and threads. They both
execute a series of instructions. Both are initiated by a program or the operating system. This
section helps to differentiate between the two units.

A process is an instance of a program that is being
executed. It contains the program code and its current activity. Depending on the operating
system, a process may be made up of multiple threads of execution that execute instructions
concurrently. A program is a collection of instructions; a process is the actual execution of those
instructions.

A process has a self-contained execution environment. It has a complete set of private basic run-
time resources; in particular, each process has its own memory space. Processes are often
considered similar to other programs or applications. However, the running of a single
application may in fact be a set of cooperating processes. To facilitate communication between
the processes, most operating systems use Inter Process Communication (IPC) resources, such as
pipes and sockets. The IPC resources can also be used for communication between processes on
different systems. Most applications in a virtual machine run as a single process. However, an
application can create additional processes using a process builder object.

In computers, a thread is the smallest sequence of programmed instructions that
can be managed independently by an operating system. The applications of threads and processes
differ from one operating system to another. However, the threads are made of and exist within a
process; every process has at least one. Multiple threads can also exist in a process and share
resources, which helps in efficient communication between threads.

On a single processor, multitasking takes place as the processor switches between different
threads; it is known as multithreading. The switching happens so frequently that the threads or
tasks are perceived to be running at the same time. Threads can truly be concurrent on a
multiprocessor or multi-core system, with every processor or core executing the separate threads
simultaneously.

In summary, threads may be considered lightweight processes, as they contain simple sets of
instructions and can run within a larger process. Computers can run multiple threads and
processes at the same time.
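
The difference in memory sharing can be made concrete with a small sketch. It uses the POSIX
thread calls pthread_create and pthread_join, which are not covered in these notes, and on some
systems it has to be compiled with the -pthread option. The function and variable names are
illustrative only.

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int shared_value = 0;    /* one copy per process */

/* Thread body: runs in the same address space as its creator. */
static void *thread_body(void *arg)
{
    (void)arg;
    shared_value += 1;
    return NULL;
}

/* Lets a thread increment shared_value and returns what the caller then
 * sees, or -1 on failure. pthread_join makes the update visible here. */
int value_after_thread(void)
{
    pthread_t tid;

    shared_value = 0;
    if (pthread_create(&tid, NULL, thread_body, NULL) != 0)
        return -1;
    if (pthread_join(tid, NULL) != 0)
        return -1;
    return shared_value;        /* the thread's change is visible */
}

/* Lets a child process increment its copy of shared_value and returns what
 * the parent then sees, or -1 on failure. */
int value_after_child_process(void)
{
    shared_value = 0;

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        shared_value += 1;      /* changes only the child's copy */
        _exit(0);
    }
    if (waitpid(pid, NULL, 0) != pid)
        return -1;
    return shared_value;        /* the child's change is not visible */
}
```

The thread's increment shows up in the caller because both run in one memory space, while the
child process works on a separate copy and leaves the parent's value untouched.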

Comparison between Process and Thread:

Aspect | Process | Thread
-------------------------------------------------------------------------------------------------
Definition | An executing instance of a program is called a process. | A thread is a subset of the process.
Data segment | It has its own copy of the data segment of the parent process. | It has direct access to the data segment of its process.
Communication | Processes must use inter-process communication to communicate with sibling processes. | Threads can directly communicate with other threads of their process.
Overheads | Processes have considerable overhead. | Threads have almost no overhead.
Creation | New processes require duplication of the parent process. | New threads are easily created.
Control | Processes can only exercise control over child processes. | Threads can exercise considerable control over threads of the same process.
Changes | Any change in the parent process does not affect child processes. | Any change in the main thread may affect the behavior of the other threads of the process.
Memory | Run in separate memory spaces. | Run in shared memory spaces.
File descriptors | Most file descriptors are not shared. | It shares file descriptors.
File system | There is no sharing of file system context. | It shares file system context.
Signal | It does not share signal handling. | It shares signal handling.
Controlled by | Process is controlled by the operating system. | Threads are controlled by the programmer in a program.
Dependence | Processes are independent. | Threads are dependent.
