
CPS 303: OPERATING SYSTEM II

The Concept of Concurrency: Concurrency is a property of a system in which several behaviors can overlap in time – the ability to perform two or more tasks at once. In the sequential paradigm, the next step in a process can be performed only after the previous one has completed; in a concurrent system some steps are executed in parallel.

Figure 1. Sequential flow

Figure 2. Concurrent flow (parallel split)

UML and Concurrency: UML supports concurrency and makes it possible to represent the concept in different kinds of diagrams. This article covers the three most commonly used – the activity diagram, sequence diagram, and state machine diagram. Note that the OCUP 2 Foundation level examination covers concurrency only in the activity diagram; concurrency in sequence and state machine diagrams is covered at the Intermediate and Advanced levels.
Activity diagram: In activity diagrams, concurrent execution can be shown
implicitly or explicitly. If there are two or more outgoing edges from an action it is
considered an implicit split. Two or more incoming edges signify an implicit join.

Figure 3. Implicit concurrency

The action at an implicit join will not execute until at least one token is offered on
every incoming control flow. When the action begins execution, it will consume all
tokens offered on all incoming control flows. Concurrent execution can also be
drawn explicitly using fork and join nodes:

Figure 4. Explicit concurrency using fork and join nodes

Sequence diagram: Concurrency can be shown in a sequence diagram using a combined fragment with the par operator or using a coregion area. A coregion can be used if the exact order of event occurrences on one lifeline is irrelevant or unknown. A coregion is shorthand for a parallel combined fragment within a single lifeline.
Figure 5. Parallel combined fragment covering one lifeline

Figure 6. Coregion

Figures 5 and 6 describe exactly the same situation, where the order of event occurrences on the first lifeline (a) is not significant, but the sequence on the second lifeline (b) is fixed and cannot be changed. A combined fragment with the par operator denotes parallel execution of operands. The order of message occurrences of the different operands can be interleaved in any way as long as the ordering imposed by each operand is preserved:

Figure 7. Parallel combined fragment with two operands

In Figure 7, while m3 must be sent before m4, and m5 must be received before m6 is sent, the parallel operator indicates that the messages of the two operands may be interleaved. This allows each lifeline to see six possible orders of the message-send/message-arrive events. (It is left as an exercise for the reader to list and count them.) In addition, because the messages may be transmitted at different speeds, the order seen by lifeline c is independent of the order seen by lifeline d.

State machine diagram: Concurrency on a state machine diagram can be expressed by an orthogonal state (a composite state with multiple regions). If an entering transition terminates on the edge of the orthogonal state, then all of its regions are entered. When exiting from an orthogonal state, each of its regions is exited.
Figure 8. Orthogonal state

Concurrency can be shown explicitly using fork and join pseudostates. A fork is represented by a bar with two or more outgoing arrows terminating on orthogonal regions (i.e. states in different regions); a join merges two or more transitions.

Figure 9. Fork and Join Pseudostates


States: A state represents a time period during which a predicate is true (e.g., budget - expenses > 0), an action is being performed (e.g., check inventory for order items), or someone waits for an event to happen (e.g., arrival of a missing order item). A state can be "on" or "off". When a state is "on", all its outgoing transitions are eligible to fire. For a transition to fire, its event must occur and its condition must be true. When a transition does fire, its action is carried out. States can have associated activities. Special activity constructs include:
• do/stateDiagramName(parameterList) -- "calls" another state diagram;
• entry/action -- carry out the action when entering the state;
• exit/action -- carry out the action when exiting the state;

States
• A stable state represents a condition in which an object may exist for some identifiable period of time. When an event occurs, the object may move from state to state (a transition). Events may also trigger self-transitions and internal transitions, in which the source and the target of the transition are the same state. In reaction to an event or a state change, the object may respond by dispatching an action.
A Five-State Process Model
The not-running state in the two-state model has now been split into a ready state and a blocked state:
• Running — currently being executed
• Ready — prepared to execute
• Blocked — waiting for some event to occur (for an I/O operation to complete, or a resource to become available, etc.)
• New — just been created
• Exit — just been terminated
State transition diagram:

New --(Admit)--> Ready --(Dispatch)--> Running --(Release)--> Exit
Running --(Timeout)--> Ready
Running --(Event wait)--> Blocked
Blocked --(Event occurs)--> Ready

Context Switch: Stopping one process and starting another is called a context switch. When the OS stops a process, it stores the hardware registers (PC, SP, etc.) and any other state information in that process's PCB. When the OS is ready to execute a waiting process, it loads the hardware registers (PC, SP, etc.) with the values stored in the new process's PCB, and restores any other state information. Performing a context switch is a relatively expensive operation; even so, time-sharing systems may do 100–1000 context switches a second. Put another way, a context switch is the act of storing and restoring the state (more specifically, the execution context) of a process or thread so that execution can be resumed from the same point at a later time. This enables multiple processes to share a single CPU and is an essential feature of a multitasking operating system. The precise meaning of "context switch" varies in usage, most often meaning "thread switch or process switch" or "process switch only", either of which may be referred to as a "task switch". More finely, one can distinguish a thread switch (switching between two threads within a given process), a process switch (switching between two processes), a mode switch (domain crossing: switching between user mode and kernel mode within a given thread), a register switch, a stack frame switch, and an address space switch (memory map switch: changing the virtual-to-physical memory mapping). The computational cost of a context switch varies significantly depending on what precisely it entails, from little more than a subroutine call for lightweight user processes to very expensive, though typically much less than the cost of saving or restoring a process image. Context switching occurs when one process temporarily stops executing and another process resumes execution in its place; it is performed by the scheduler.

To give each process on a multiprogrammed machine a fair share of the CPU, a hardware clock generates interrupts periodically. This allows the operating system to schedule every process in main memory (via its scheduling algorithm) to run on the CPU at regular intervals. Each time a clock interrupt occurs, the interrupt handler checks how much time the currently running process has used; if it has used up its entire time slice, the CPU scheduling algorithm (in the kernel) picks a different process to run. Each switch of the CPU from one process to another is called a context switch.

What actions are taken by a kernel to context switch: First, consider a context switch among threads. Threads share many resources with peer threads belonging to the same process, so a context switch between threads of the same process is easy: it involves switching the register set, the program counter, and the stack, and it is comparatively easy for the kernel to achieve. Context switches among processes, by contrast, are expensive. Before a process can be switched, its PCB (process control block) must be saved by the operating system. The PCB contains the following information: the process state, the program counter, the values of the different registers, the CPU scheduling information for the process, memory management information for the process, possible accounting information for the process, and the I/O status information of the process. When the PCB of the currently executing process has been saved, the operating system loads the PCB of the next process that is to run on the CPU. This is a substantial job, and it takes a noticeable amount of time.
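A kernel's context switch is done in machine-specific assembly and cannot be reproduced here, but the save-and-restore idea can be illustrated in user space. The sketch below (in C, using the POSIX ucontext API purely for illustration; this is not how a kernel implements it) saves an execution context, switches away, and later resumes from the exact point where execution stopped:

/* User-space illustration of saving and restoring an execution
 * context with the POSIX ucontext API (illustrative only). */
#include <stdio.h>
#include <ucontext.h>

static ucontext_t main_ctx, task_ctx;
static char task_stack[64 * 1024];

static void task(void) {
    puts("task: running");
    swapcontext(&task_ctx, &main_ctx);  /* save task state, resume main  */
    puts("task: resumed from its saved context");
}

int main(void) {
    getcontext(&task_ctx);              /* initialize the task context   */
    task_ctx.uc_stack.ss_sp = task_stack;
    task_ctx.uc_stack.ss_size = sizeof task_stack;
    task_ctx.uc_link = &main_ctx;       /* where to go when task ends    */
    makecontext(&task_ctx, task, 0);

    swapcontext(&main_ctx, &task_ctx);  /* "dispatch" the task           */
    puts("main: task yielded; dispatching it again");
    swapcontext(&main_ctx, &task_ctx);  /* task resumes where it stopped */
    puts("main: done");
    return 0;
}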

Schedulers:
• Long-term scheduler (job scheduler): selects a job from the spooled jobs and loads it into memory. Executes infrequently, maybe only when a process leaves the system. Controls the degree of multiprogramming. The essence of the long-term scheduler is to keep a good mix of CPU-bound and I/O-bound processes; it does not really exist on most modern timesharing systems.
• Medium-term scheduler: on time-sharing systems, does some of what the long-term scheduler used to do. May swap processes out of memory temporarily, and may suspend and resume processes in order to balance the load for better throughput.
• Short-term scheduler (CPU scheduler): executes frequently, about one hundred times per second (every 10 ms). Runs whenever a process is created or terminated, a process switches from running to blocked, or an interrupt occurs. Selects a process from those that are ready to execute and allocates the CPU to that process. Its goals: minimize response time (e.g., program execution, character to screen); minimize the variance of average response time (predictability may be important); maximize throughput; minimize overhead (OS overhead, context switching, etc.); use resources efficiently; and share the CPU in an equitable fashion (fairness). A toy selection loop is sketched after this list.
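As a rough illustration of the short-term scheduler's decision, here is a toy round-robin selection function in C (the process-table layout is hypothetical; real systems work over full PCBs and ready queues):

#define NPROC 8

/* Toy process-table entry: a real system would use full PCBs. */
typedef struct { int pid; int ready; } proc;

/* Round-robin pick: starting just after the current process,
 * return the index of the next ready process, or -1 if none. */
int pick_next(proc table[NPROC], int current) {
    for (int i = 1; i <= NPROC; i++) {
        int candidate = (current + i) % NPROC;
        if (table[candidate].ready)
            return candidate;
    }
    return -1;  /* nothing ready: the CPU would idle */
}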

Interrupts in Operating Systems: To successfully control several processes, the core of the operating system makes use of what is known as an interrupt. An interrupt is a mechanism used for implementing the multitasking concept: it is a signal from hardware or software to indicate the occurrence of an event. When one or more processes are running and the user launches an additional process at the same time, an interrupt takes place. If the CPU does not poll the control bit, but instead receives an interrupt when the device is ready for the next byte, the data transfer is said to be interrupt driven.

Hardware Interrupt: A hardware interrupt occurs when an I/O operation completes, such as reading some data into the computer from a tape drive. In other words, hardware interrupts are used by devices to communicate that they need attention from the operating system. Some familiar examples are a hard disk signaling that it has read a sequence of data blocks, or a network device signaling that it has processed a buffer containing network packets. Interrupts are also used for asynchronous events, such as the arrival of new data from an external network. Hardware interrupts are delivered straight to the CPU via a small network of interrupt management and routing devices.

Hardware interrupts are referenced by an interrupt number. These numbers are mapped back to the piece of hardware that produced the interrupt. This enables the system to determine which device generated the interrupt and when it occurred. In most computer systems, interrupts are handled as quickly as possible. When an interrupt is acknowledged, any current activity is suspended and an interrupt handler is executed. The handler preempts any other running programs and system activities, which can slow the entire system down and generate latencies. MRG Real-time modifies the way interrupts are handled in order to improve performance and reduce latency.
Software interrupts: A software interrupt occurs when an application program terminates or requests certain services from the operating system. A software interrupt is generated within a processor by executing an instruction. Software interrupts are frequently used to implement system calls, because they perform a subroutine call together with a change of CPU ring (privilege) level.

Timed interrupts: A timed interrupt is used when a certain event MUST happen at a specified frequency.

Interrupt vector: An interrupt vector is the memory address of an interrupt handler, or an index into a table called an interrupt vector table or dispatch table. Interrupt vector tables contain the memory addresses of interrupt handlers. When an interrupt is generated, the processor saves its execution state via a context switch and begins execution of the interrupt handler at the interrupt vector.
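Schematically, an interrupt vector table is an array of handler addresses indexed by interrupt number. A C sketch (the vector numbers and handler names are made up for illustration; real tables live at hardware-defined locations):

#define NVECTORS 256

typedef void (*isr_t)(void);          /* an interrupt handler        */

static isr_t vector_table[NVECTORS];  /* one entry per vector number */

static void timer_isr(void)    { /* acknowledge timer, update clock */ }
static void keyboard_isr(void) { /* read the scan code              */ }

void install_handlers(void) {
    vector_table[0x20] = timer_isr;     /* illustrative vector numbers */
    vector_table[0x21] = keyboard_isr;
}

/* On an interrupt, hardware supplies the vector number; dispatch
 * just indexes the table and calls the saved handler address. */
void dispatch_interrupt(int vector) {
    if (vector >= 0 && vector < NVECTORS && vector_table[vector])
        vector_table[vector]();
}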

The Problem: Users must face a number of new problems on systems which allow
programs to be constructed from multiple concurrent processes. Some of these
problems have been widely studied and a number of solutions are known. Among
these, the mutual exclusion problem and the producer-consumer problem are
particularly well known and should be understood by all programmers attempting
to write programs consisting of multiple processes. These problems have actually
been hinted at in previous sections. The queues used to communicate between an
interrupt service routine and a user program is an example of a special case of the
producer-consumer problem. The mutual exclusion problem was also mentioned in
the context of interrupt driven input/output; the special solution used in that
context was to disable interrupts during a critical section.

Although most operating systems support only a few basic mechanisms to aid in
solving the problems of concurrent programming, many experimental
programming languages have incorporated such mechanisms. Some of these
languages have advanced beyond the experimental stage and may become widely
used; among these, the Unix shell and the Ada programming language are
particularly important.
The Mutual Exclusion Problem

When two or more processes must share some object, an arbitration mechanism is
needed so that they do not try to use it at the same time. The particular object being
shared does not have a great impact on the choice of such mechanisms. Consider
the following examples: Two processes sharing a printer must take turns using it; if
they attempt to use it simultaneously, the output from the two processes may be
mixed into an arbitrary jumble which is unlikely to be of any use. Two processes
attempting to update the same bank account must take turns; if each process reads
the current balance from some database, updates it, and then writes it back, one of
the updates will be lost.

Both of the above examples can be solved if there is some way for each process to
exclude the other from use of the shared object during critical sections of code.
Thus the general problem is described as the mutual exclusion problem. The
mutual exclusion problem was recognized (and successfully solved) as early as
1963 in the Burroughs AOSP operating system, but the problem is sufficiently
difficult that it was not widely understood for some time after that. A significant
number of attempts to solve the mutual exclusion problem have suffered from two
specific problems, the lockout problem, in which a subset of the processes can
conspire to indefinitely lock some other process out of a critical section, and the
deadlock problem, where two or more processes simultaneously trying to enter a
critical section lock each other out.

On a uniprocessor system with non-preemptive scheduling, mutual exclusion is easily obtained: The process which needs exclusive use of a resource simply
refuses to relinquish the processor until it is done with the resource. A similar
solution works on a preemptively scheduled uniprocessor: The process which
needs exclusive use of a resource disables interrupts to prevent preemption until
the resource is no longer needed. These solutions are appropriate and have been
widely used for short critical sections, such as those involving updating a shared
variable in main memory. On the other hand, these solutions are not appropriate for
long critical sections, for example, those which involve input/output. As a result,
users are normally forbidden to use these solutions; when they are used, their use is
restricted to system code.
Semaphores can be used to solve the mutual exclusion problem at the user level to
control access to a shared variable, but the semaphore could just as well be used to
control access to a shared file or device. Because the initial count in the semaphore
is 1, the first process to execute a wait operation on that semaphore will be allowed
into the critical section. While a process is in the critical section, on the other hand,
the value in the semaphore will be zero, forcing other processes to wait. If there are
multiple shared resources, each may be protected by a separate semaphore. If no
process ever attempts to use more than one shared resource at a time, this solution
is safe and free of risk. On the other hand, if two or more processes must obtain
exclusive use of more than one resource at some point during their execution, the
use of separate semaphores to guard each resource may lead to deadlock. Such
deadlocks may be avoided by complex solutions such as the banker's algorithm, or
by requiring that multiple resources be obtained in a fixed order.

The statement that semaphores can be used to solve the mutual exclusion problem
is somewhat misleading because it assumes that the semaphores themselves have
been implemented. In a multiprocessor environment, or in a uniprocessor
environment where disabling interrupts is not allowed, other solutions to the
mutual exclusion problem must be found. A very common solution rests on the use
of special machine instructions, variously known as test-and-set (TS on the IBM
360/370, BSET on the Motorola 68000, SBITL on the National 32000), branch-
and-set (BBSSI on the DEC VAX) or exchange (EXCH on the DEC PDP-10,
LOCK XCHG on the Intel 8086). Whatever their name, these instructions allow a
program to both inspect the old value of a variable and set a new value in that
variable in a single indivisible operation. In the case of test-and-set instructions,
the condition codes are set to indicate the value of a Boolean variable prior to
setting it. Branch-and-set instructions do a conditional branch on a Boolean
variable prior to setting it. Exchange instructions allow the old value of a variable
to be fetched from memory and replaced with a new value; having fetched the old
value, it may be inspected at any later time.

A simple or binary semaphore is a semaphore which has only two values, zero and
one; when used to solve the mutual exclusion problem, the value zero indicates
that some process has exclusive use of the associated resource and the value one
indicates that the resource is free. Alternately, the states of the semaphore are
sometimes named "claimed" and "free"; the wait operation claims the semaphore,
the signal operation frees it. Binary semaphores which are implemented using busy
waiting loops are sometimes called spin-locks because a process which is waiting
for entry to a critical section spends its time spinning around a very tight polling
loop.

This implementation would be quite appropriate for use on a multiprocessor system; in fact, it
appears to have been understood by the designers of the Burroughs AOSP
operating system in 1963. This is also an appropriate implementation on a
uniprocessor if a call to a relinquish service is included in the busy waiting loop.

The machine code needed to implement spin-locks frequently involves only two or
three instructions for the code of "spinwait" and one instruction for the code of
"spinsignal" shown in Figure 4; as a result, they take very little time when no
waiting is required because the semaphore was already available. On the other
hand, this implementation wastes processor time if there is much contention for
entry to critical sections. As a result, spin-locks are preferred for entry to critical
sections which have low contention, but they are usually discouraged for high
contention critical sections.
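The spinwait/spinsignal pair referred to above can be rendered today with the C11 test-and-set primitive. A minimal sketch (a modern restatement, not the historical Figure 4 code):

#include <stdatomic.h>

static atomic_flag lock = ATOMIC_FLAG_INIT;   /* clear = free */

void spinwait(void) {
    /* Atomically set the flag and test its old value; keep
     * spinning while another thread already holds the lock. */
    while (atomic_flag_test_and_set(&lock))
        ;  /* busy wait */
}

void spinsignal(void) {
    atomic_flag_clear(&lock);   /* release the lock */
}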

When general semaphores are implemented using a first-in first-out queue of waiting processes, they guarantee that processes contending for entry to a critical
section will be granted access in the order they arrived at the entrance to the critical
section; this, in turn, guarantees that no waiting process will be locked out by other
processes. Spin-locks provide no such guarantee of the order in which waiting
processes will be granted access to the critical section. However, spin-locks are not
inherently unfair; on most multiprocessor systems, the order in which contending
processes are granted access to a critical section will usually appear to be random.
This random order, in turn, is sufficient to prevent a waiting process from being
intentionally locked out, and it makes the probability of an accidental lockout
infinitesimal.

It is worth noting that instructions such as test-and-set or exchange solve the mutual exclusion problem by shifting the burden to the hardware. The hardware
designer must, in turn, solve the same problem! The reason is that the test-and-set
operation is implemented as a sequence of two primitive memory operations, first
reading the value from memory, and then writing a new value back. If multiple
processors are sharing one memory, a processor executing this sequence must first
claim exclusive use of the memory. The original motivation for the inclusion of
instructions such as exchange in the instruction sets of early computers was that,
with core memory, they were particularly easy to implement. The reason for this is
that, with the commonly used core memory technologies, read operations had the
side effect of clearing the location from which the value was read. Thus, hardware
designers were forced to build memory units which automatically followed each
read operation by a write operation to restore what had been read, and these
frequently allowed the write operation to optionally use a value provided by the
central processor instead of the value just read from memory. The advent of
semiconductor memory has removed the original justification for instructions such
as exchange, but their utility in solving the mutual exclusion problem ensures that
future machines will continue to support them.

Mutual exclusion can be assured even when there is no underlying mechanism such as the test-and-set instruction. This was first realized by T. J. Dekker and published (by Dijkstra) in 1965. Dekker's algorithm uses busy waiting and works for only two processes. The basic idea is that processes record their interest in entering a critical section (in Boolean variables called "need") and take turns (using a variable called "turn") when both need entry at the same time.

Dekker's solution to the mutual exclusion problem requires that each of the contending processes have a unique process identifier, called "me", which is passed to the wait and signal operations. Although none of the previously
mentioned solutions require this, most systems provide some form of process
identifier which can be used for this purpose. It should be noted that Dekker's
solution does rely on one very simple assumption about the underlying hardware; it
assumes that if two processes attempt to write two different values in the same
memory location at the same time, one or the other value will be stored and not
some mixture of the two. This is called the atomic update assumption. The
atomically updatable unit of memory varies considerably from one system to
another; on some machines, any update of a word in memory is atomic, but an
attempt to update a byte is not atomic, while on others, updating a byte is atomic
while words are updated by a sequence of byte updates.
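A sketch of Dekker's algorithm in C follows, using the "need" and "turn" variables described above and the process identifier "me" (0 or 1). The variables are declared _Atomic so a modern compiler and CPU preserve the ordering the algorithm depends on, which is one way of honoring the atomic update assumption:

#include <stdatomic.h>

static _Atomic int need[2];   /* need[me] = 1: I want to enter     */
static _Atomic int turn;      /* tie-breaker when both want entry  */

void dekker_wait(int me) {
    int other = 1 - me;
    need[me] = 1;
    while (need[other]) {            /* contention with the other  */
        if (turn == other) {         /* not my turn: back off      */
            need[me] = 0;
            while (turn == other)
                ;                    /* busy wait for my turn      */
            need[me] = 1;            /* try again                  */
        }
    }
}

void dekker_signal(int me) {
    turn = 1 - me;                   /* give the other priority    */
    need[me] = 0;
}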

The Producer-Consumer Problem: Whenever data flows between concurrent processes, some variant of the producer-consumer problem must be solved.
Although this problem is usually discussed in the context of the flow of data
between concurrent processes on a single computer system, perhaps with multiple
processors, the problem generalizes to the flow of data between programs and
peripheral devices, or to the flow of data between programs on different computers
in a network. It is frequently possible to decompose large programs into multiple
processes in such a way that all inter process communication is formulated in terms
of producer-consumer relationships. For example, a compiler might be broken into
a lexical analysis process which feeds a stream of lexemes to a syntax analysis
process; the syntax analyzer might, in turn, feed a stream of syntactic information
to a code generator, which would feed a stream of assembly code to a one-pass
assembler. Linear arrangements of processes such as this are sometimes described
as pipelines, with each process acting as a filter in the pipe.

Some multiprocessor systems, such as the Intel iPSC computer, are based on networks of microprocessors which are interconnected by high-speed serial data
links. Unlike multiprocessors with shared memory, such systems limit all
interprocess communication to producer-consumer relationships; as a result,
programs decomposed into multiple processes which communicate in this way
should be able to be run on a wider class of machines than those which use a
shared memory model of interprocess communication.

Although it is fairly obvious that first-in first-out queues can be used to solve the
producer-consumer problem, this is actually an oversimplification! A first-in first-
out queue shared between two processes is a shared data structure, and care must
be taken to avoid mutual exclusion problems. Because of this, it is worth
examining some very crude solutions to the producer-consumer problem before
examining how it is solved with full-fledged queues. For example, consider the very
simple solution shown in Figure 6.

type item = ... { type of data items passed from producer to consumer };

var buffer: item;
    status: (empty, full) { initially, empty };

procedure produce( x: item );
{ called by the producer when an item is ready to pass to the consumer }
begin
   while status <> empty do { nothing };
   buffer := x;
   status := full;
end { produce };

procedure consume( var x: item );
{ called by the consumer when an item from the producer is needed }
begin
   while status <> full do { nothing };
   x := buffer;
   status := empty;
end { consume };

Figure 6. A very simple solution to the producer-consumer problem.

There are many similarities between this solution and the use of simple polling loops in
processing input/output, as was discussed in Chapter 9.

The solution shown in Figure 6 is correct as long as there is only a single producer and a single consumer, an acceptable limitation for many applications. The problem is that it provides very little buffering between the producer and consumer, because calls to produce and consume must alternate in strict lock step, even if some items take much longer to produce than to consume or vice versa. To overcome this, we must use more buffers. A simple solution would involve an array of buffers, each with its own status indicator; the producer would cycle through this array, filling one buffer after another, and the consumer would cycle through, emptying buffers. With two buffers, this is called double buffered communication.

When generalized to a relatively large array of buffers, the solution outlined above is very
similar to the bounded buffers discussed in Section 10.4; the difference is that an attempt to
enqueue a newly produced item will simply cause the producer to wait until there is space
instead of causing an error. As a result, there is little use for the special "full" and "empty"
functions used in Chapter 4. Of course, it is not necessary to have a separate flag for each buffer
in the queue; as suggested in Figure 10.10, the state of the queue can be inferred from the values
of the head and tail pointers. This leads to the bounded buffer solution to the producer-consumer
problem shown in Figure 7.

const size = ... { number of items to allow in the buffer };

var buffer: array [1..size] of item;
    head, tail: 1..size { initially 1 };

procedure produce( x: item );
{ called by the producer when an item is ready to pass to the consumer }
begin
   while ((tail mod size) + 1) = head do { nothing };
   buffer[ tail ] := x;
   tail := (tail mod size) + 1;
end { produce };

procedure consume( var x: item );
{ called by the consumer when an item from the producer is needed }
begin
   while head = tail do { nothing };
   x := buffer[ head ];
   head := (head mod size) + 1;
end { consume };

Figure 7. The bounded buffer solution to the producer-consumer problem.

On a multiprocessor system, it may be appropriate to use the code shown in Figures 17.6 and 17.7 as shown; on the other hand, on a uniprocessor, especially one with a non-preemptive scheduler, these loops should relinquish the processor. Although it is simple enough to put a call to a relinquish service into each polling loop, some processor time will still be consumed, at each time slice given to a waiting process, checking the condition and relinquishing again. This can all be avoided by using semaphores, as long as the semaphores put the waiting process on a waiting queue. The semaphore solution to this problem was hinted at in Figure 15.13, and is shown here in Figure 8.

const size = ... { number of items to allow in the buffer };

var buffer: array [1..size] of item;
    head, tail: 1..size { initially 1 };
    free: semaphore { the count should be initialized to size };
    data: semaphore { the count should be initially zero };

procedure produce( x: item );
{ called by the producer when an item is ready to pass to the consumer }
begin
   wait( free ) { wait for free space to be available };
   buffer[ tail ] := x;
   tail := (tail mod size) + 1;
   signal( data ) { signal that there is data };
end { produce };

procedure consume( var x: item );
{ called by the consumer when an item from the producer is needed }
begin
   wait( data ) { wait for data to be available in the queue };
   x := buffer[ head ];
   head := (head mod size) + 1;
   signal( free ) { signal that there is free space };
end { consume };

Figure 8. The bounded buffer solution with semaphores.

As with all of the previously mentioned solutions to the producer-consumer problem, this
solution works for only a single producer process and a single consumer.
There are many cases where multiple producers must communicate with multiple consumers in
solving a problem. For example, consider a time-sharing system with multiple line printers. Each
user terminal has an associated process, and it is appropriate to associate one process with each
printer. When a user program submits a request to print some file, that request is put in the print
request queue; when a printer process finishes dealing with some request, it reads the next
request from the print request queue. Each request might consist of a record containing the name
of a file to be printed, information about who to bill for the cost of paper, and the "banner line" to
print on the break page.

The problem with multiple producers and multiple consumers is that the simple code given in
Figures 17.6 through 17.8 contains critical sections where mutual exclusion must be assured. It
should be obvious, for example, that the incrementing of head and tail pointers in Figures 17.7
and 17.8 must be done within critical sections. Furthermore, it is important that, when two
processes try to produce or consume data at the same time, each uses a different value of the
head or tail pointer; if this was not the case, both producers could fill the same slot in the queue,
or both consumers could obtain copies of the same item. Considerations such as these lead to the
code shown in Figure 9:

const size = ... { number of items to allow in the buffer };

var buffer: array [1..size] of item;
    head, tail: 1..size { initially 1 };
    free: semaphore { initially size };
    data: semaphore { initially zero };
    mutexhead, mutextail: semaphore { initially one };

procedure produce( x: item );
{ called by the producer when an item is ready to pass to the consumer }
begin
   wait( free ) { wait for free space to be available };
   wait( mutextail ) { begin critical section };
   buffer[ tail ] := x;
   tail := (tail mod size) + 1;
   signal( mutextail ) { end critical section };
   signal( data ) { signal that there is data };
end { produce };

procedure consume( var x: item );
{ called by the consumer when an item from the producer is needed }
begin
   wait( data ) { wait for data to be available in the queue };
   wait( mutexhead ) { begin critical section };
   x := buffer[ head ];
   head := (head mod size) + 1;
   signal( mutexhead ) { end critical section };
   signal( free ) { signal that there is free space };
end { consume };

Figure 9. Bounded buffers for multiple producers and consumers.


Note, in this solution, that there are two separate mutual exclusion semaphores, one for the head
of the queue, and one for the tail; thus, producers do not compete with consumers, but only with
each other.

On multiprocessor systems, the code shown in Figure 9 should probably be coded using spin-
locks to solve the mutual exclusion problem. The reason is that these critical sections are very
short and it is unlikely that many processes will be contending for them. On the other hand, the
semaphores used to track the free space and data in the queue should probably not use busy
waiting unless the likelihood of the queue being either full or empty for very long is very low.
The reason for this is that a full or empty queue could cause all of the processes at one end or the
other of the queue to wait for some time, locking up a significant fraction of the processors
which might have other uses.
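For readers who want to run Figure 9, here is one possible rendering in C using POSIX threads and semaphores. The names follow the figure, the buffer is 0-based rather than 1-based, and the thread bodies are only a demonstration harness:

#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>

#define SIZE 8
static int buffer[SIZE];
static int head = 0, tail = 0;          /* 0-based indices           */
static sem_t free_slots, data_items, mutexhead, mutextail;

static void produce(int x) {
    sem_wait(&free_slots);              /* wait for free space       */
    sem_wait(&mutextail);               /* begin critical section    */
    buffer[tail] = x;
    tail = (tail + 1) % SIZE;
    sem_post(&mutextail);               /* end critical section      */
    sem_post(&data_items);              /* signal that there is data */
}

static int consume(void) {
    sem_wait(&data_items);              /* wait for data             */
    sem_wait(&mutexhead);               /* begin critical section    */
    int x = buffer[head];
    head = (head + 1) % SIZE;
    sem_post(&mutexhead);               /* end critical section      */
    sem_post(&free_slots);              /* signal free space         */
    return x;
}

static void *producer(void *arg) {
    (void)arg;
    for (int i = 0; i < 100; i++) produce(i);
    return NULL;
}

static void *consumer(void *arg) {
    (void)arg;
    for (int i = 0; i < 100; i++) printf("%d\n", consume());
    return NULL;
}

int main(void) {
    sem_init(&free_slots, 0, SIZE);     /* initially size            */
    sem_init(&data_items, 0, 0);        /* initially zero            */
    sem_init(&mutexhead, 0, 1);         /* initially one             */
    sem_init(&mutextail, 0, 1);
    pthread_t p, c;
    pthread_create(&p, NULL, producer, NULL);
    pthread_create(&c, NULL, consumer, NULL);
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    return 0;
}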

Structured Concurrent Programming: Semaphores provide a much more organized approach to controlling the interaction of multiple processes than would
be available if each user had to solve all interprocess communications using simple
variables, but more organization is possible. In a sense, semaphores are something
like the goto statement in early programming languages; they can be used to solve
a variety of problems, but they impose little structure on the solution and the
results can be hard to understand without the aid of numerous comments. Just as
there have been numerous control structures devised in sequential programs to
reduce or even eliminate the need for goto statements, numerous specialized
concurrent control structures have been developed which reduce or eliminate the
need for semaphores. One of the oldest proposals for a clean notation for parallel
programming is based on the idea that concurrent programs should be described in
terms of flowcharts where single flow lines may fork into more than one flow line,
and where multiple lines may be joined into one. These basic fork and join
operations are illustrated in Figure 10.
[Flowchart: a fork after action A splits a single flow line into two concurrent lines leading to B and C; a join merges the flow lines leaving D and E into a single line leading to F.]

Figure 10. The fork and join operations.


In the fork operation illustrated here, operations B and C may begin concurrently
as soon as A finishes, while in the join operation, operation F may only begin when
both D and E have finished. The notation shown in Figure 10 was developed in an
era when graphical flowcharting was the dominant approach to high level program
documentation, but it is hard to capture in textual form. Furthermore, the use of flowchart notation does nothing to prevent disorderly control structures; it simply makes them somewhat more obvious than they would be if written in terms of
assembly language branches and labels. Much of the work done on concurrent
programming since the mid 1960's has centered on the search for the ideal solution
to the problem of interprocess communication. This has resulted in proposals for a
diverse variety of control structures, mostly proposed as additions to languages
descended from Algol '60 such as Algol '68, Pascal, and later Ada. Among the
more important proposals were the concurrent block, the monitor, and the Ada
rendezvous. These will be discussed below, but the reader should be warned that,
although these structures are valuable contributions to concurrent programming,
none of them are implemented in more than a few languages, and of these, only
Ada has much chance of becoming widely used.

The reader should also beware that the search for the ideal solution to the problem of interprocess communication is as futile as the search for the ideal sequential control structure. It has long been known that any sequential program can be reformulated entirely in terms of while loops, although this sometimes requires the addition of auxiliary variables. This knowledge in no way changes our desire for a variety of expressive control structures in the languages we use, including if-then, if-then-else and case statements, for loops, until loops, and others. Similarly, although semaphores can be used to implement, and can usually be implemented by, any of the more recently proposed concurrent control structures, different problems are more naturally solved using different control structures.

Concurrent Blocks: Concurrent blocks provide a clean notation for one of the
simplest common uses of concurrent programming where a number of independent
operations can be done in parallel. A concurrent block consists of a fork where one
process splits into many; these processes perform their computations and then join
back into a single process. Concurrent block notation provides no help in dealing
with interprocess interactions, in fact, some languages which provide this notation
define it as being illegal for the processes to interact.

Textual representations of concurrent blocks in a number of languages are illustrated in Figure 11, along with examples of sequential blocks in the same languages.

LANGUAGE               CONCURRENT BLOCK           SEQUENTIAL BLOCK

Algol '60 pseudocode   cobegin A; B; C coend      begin A; B; C end

Algol '68              begin A, B, C end          begin A; B; C end

the Unix shell         (A&B&C)                    (A;B;C)

C with Unix            if (fork()==0) {           {A;B;C}
                           A; exit();
                       } else if (fork()==0) {
                           B; exit();
                       } else {
                           C; wait(); wait();
                       }

Figure 11. Concurrent blocks in various languages.

The keywords cobegin and coend have been widely used in pseudocode, so every
programmer should recognize them; even so, they do not appear in any major
programming language. The Algol '68 notation is interesting because it applies to
parameter lists as well as blocks. The Unix shell notation wasn't originally intended
as a programming language, but it also fits well in this context. The last example
illustrates how these ideas can be expressed, clumsily, in a common programming
environment, C (or C++) under Unix. Here, the fork() system call creates
processes, the exit() call is used to self-terminate a process, and the wait() call
allows a process to await termination of a process it created. The combination of
exit() and wait() is effectively a join.
Concurrent block notation does not help with interprocess communication; it merely allows programmers to avoid expressing unneeded sequencing relationships within a program so that parts can be done in parallel. In fact, most Algol '68 compilers did not start concurrent processes when they encountered a concurrent block; instead, such blocks were treated as giving permission to reorder code arbitrarily for faster execution by a single sequential process. All of these notations for concurrent blocks can be implemented identically, since all describe the same flowchart. Most implementations follow the Unix model, having the parent execute one statement while creating a new child process to run each of the others. If each child signals a semaphore as it terminates, the parent process can use this to wait for the children to finish.

C. A. R. Hoare's quicksort algorithm provides a good example of the use of concurrent blocks to utilize multiple processors. On a sequential machine,
quicksort divides the array to be sorted into two subarrays, where all elements of
one subarray are less than all elements of the other, and then calls itself recursively
to sort the subarrays. Since the two subarrays are disjoint, there is no reason not to
sort them in parallel.

The overhead of starting a new process and waiting for it to finish is not
insignificant. Since these costs are implicit in the "cobegin coend" block, such
blocks should not be used except when the processes involved will take some time.
The first concurrent block in the above program, for example, should probably be
done sequentially. Although it contains two loops, these will usually terminate very
quickly. The second concurrent block in the above program poses more difficult
problems. When the sub-arrays are large, there will be significant advantages to
sorting them concurrently, but when they are small, it will probably be faster to
sort them sequentially.
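The parallel quicksort program itself is not reproduced in these notes, but the idea can be sketched in C with POSIX threads standing in for the cobegin/coend block. The THRESHOLD value is an assumed tuning parameter that keeps small subarrays sequential, as argued above:

#include <pthread.h>

#define THRESHOLD 10000   /* assumed cutoff for going parallel */

typedef struct { int *a; int lo, hi; } span;

static void *qsort_task(void *arg);

static int partition(int *a, int lo, int hi) {
    int pivot = a[hi], i = lo;
    for (int j = lo; j < hi; j++)
        if (a[j] < pivot) { int t = a[i]; a[i] = a[j]; a[j] = t; i++; }
    int t = a[i]; a[i] = a[hi]; a[hi] = t;
    return i;
}

static void quicksort(int *a, int lo, int hi) {
    if (lo >= hi) return;
    int p = partition(a, lo, hi);
    if (hi - lo > THRESHOLD) {
        span left = { a, lo, p - 1 };      /* "cobegin": child sorts  */
        pthread_t t;                       /* the left subarray ...   */
        pthread_create(&t, NULL, qsort_task, &left);
        quicksort(a, p + 1, hi);           /* ... parent sorts right  */
        pthread_join(t, NULL);             /* "coend": the join       */
    } else {
        quicksort(a, lo, p - 1);           /* small: stay sequential  */
        quicksort(a, p + 1, hi);
    }
}

static void *qsort_task(void *arg) {
    span *s = arg;
    quicksort(s->a, s->lo, s->hi);
    return NULL;
}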

The Rendezvous Construct

There are times when the different branches of a parallel program must
communicate. When branches can be easily identified as either producers or
consumers, queues could be used, but this is not always possible. The inventors of
the Ada programming language have provided another model for process
interactions, the rendezvous.

The French word rendezvous (pronounced ron-day-voo) literally means "give (to) you," but meeting is the most appropriate one-word translation into English, and the idiomatic get together may come even closer.
A rendezvous can be described by the flowchart shown in Figure 13.

[Flowchart: Process X performs A, then the joined rendezvous body B, then C; Process Y performs D, calls R (which looks like a procedure call), and continues with E after the rendezvous.]

-- code of process X          -- code of process Y
A;                            D;
accept R do                   R; -- looks like a call
   B;                         E;
end;
C;
Figure 13. An Ada rendezvous.

Essentially, a rendezvous is a temporary joining of two streams of control, after which they fork to continue independently. The intent is that the operations done
during the rendezvous will be used to move information between the processes
which have joined. Although the flowchart notation is symmetrical, such symmetry
cannot be obtained by a textual notation, so the body of the rendezvous must be
coded in the body of one of the tasks, as the body of an "accept" statement, while
the other task has what looks syntactically like a procedure call to the "accept"
statement. In Ada, the code of the body of an "accept" statement may directly
access the local variables of the process in which it is embedded, and the accept
statement may have parameters passed to it from the other process.

A rendezvous may be implemented with the aid of two binary semaphores, one to
make sure that the first process waits until the second calls the rendezvous, and one
to suspend the second process until the rendezvous has completed. In fact, the Ada
rendezvous is more complex than this, since it has options to allow timed entry
calls and accept statements, where if the rendezvous is not executed after a certain
interval alternate code is executed, and conditional entry calls and accept
statements, where, if the rendezvous cannot be immediately executed, alternate
code is executed. The implementation of these constructs is beyond the scope of
this work.
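The two-semaphore implementation just described can be sketched in C with POSIX semaphores. Parameter passing and the timed/conditional variants are omitted, and the names are invented for this sketch:

#include <semaphore.h>

static sem_t called, done;          /* both start at 0 */

void rendezvous_init(void) {
    sem_init(&called, 0, 0);
    sem_init(&done, 0, 0);
}

/* Caller side: looks syntactically like a procedure call. */
void entry_call(void) {
    sem_post(&called);              /* announce arrival                 */
    sem_wait(&done);                /* wait until the accept body ends  */
}

/* Acceptor side: the body runs while the two flows are joined. */
void accept_entry(void (*body)(void)) {
    sem_wait(&called);              /* wait for a caller                */
    body();                         /* rendezvous body (Ada's B)        */
    sem_post(&done);                /* release the caller; both resume  */
}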

Monitors: An early proposal for organizing the operations required to establish mutual exclusion is the explicit critical section statement. In such a statement, usually proposed in the form "critical x do y", where "x" is the name of a semaphore and "y" is a statement, the actual wait and signal operations used to ensure mutual exclusion are implicit and automatically balanced. This allows the compiler to trivially check for the most obvious errors in concurrent programming, those where a wait or signal operation was accidentally forgotten. The problem with this statement is that it is not adequate for many critical sections. A common observation about critical sections is that many of the procedures for manipulating shared abstract data types such as files have critical sections making up their entire bodies. Such abstract data types have come to be known as monitors, a term coined by C. A. R. Hoare. Hoare proposed a programming notation where the critical sections and semaphores involved in using a monitor are all implicit. All that this notation requires is that the programmer enclose the declarations of the procedures and the representation of the data type in a monitor block; the compiler supplies the semaphores and the wait and signal operations that this implies. Using Hoare's suggested notation, shared counters might be implemented as in Figure 14.
Calls to procedures within the body of a monitor are done using record notation;
thus, to increment one of the counters declared in Figure 14, one would call
"i.increment". This call would implicitly do a wait operation on the semaphore
implicitly associated with "i", then execute the body of the "increment" procedure
before doing a signal operation on the semaphore. Note that the call to
"i.increment" implicitly passes a specific instance of the monitor as a parameter to
the "increment" procedure, and that fields of this instance become global variables
to the body of the procedure, as if there was an implicit "with" statement.

There are a number of problems with monitors which have been ignored in the
above example. For example, consider the problem of assigning a meaning to a call
from within one monitor procedure to a procedure within another monitor. This
can easily lead to deadlock, for example, when procedures within two different
monitors each call procedures in the other. It has sometimes been proposed that
such calls should never be allowed, but they are sometimes useful!

The most important problem with monitors is that of waiting for resources when
they are not available. For example, consider implementing a queue monitor with
internal procedures for the enqueue and dequeue operations. When the queue
empties, a call to dequeue must wait, but this wait must not block further entries to
the monitor through the enqueue procedure. In effect, there must be a way for a
process to temporarily step outside of the monitor, releasing mutual exclusion
while it waits for some other process to enter the monitor and do some needed
action.

Hoare's suggested solution to this problem involves the introduction of condition variables which may be local to a monitor, along with the operations wait and signal. Essentially, if s is the monitor semaphore and c is a semaphore representing a condition variable, "wait c" is equivalent to "signal(s); wait(c); wait(s)" and "signal c" is equivalent to "signal(c)". The details of Hoare's wait and signal operations were somewhat more complex than shown here, because the waiting process was given priority over other processes trying to enter the monitor, and condition variables had no memory: repeated signaling of a condition had no cumulative effect, and signaling a condition on which no process was waiting had no effect.
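The monitor-with-condition-variable pattern survives almost unchanged in POSIX threads. The sketch below implements the queue monitor discussed above: the mutex plays the role of the implicit monitor semaphore, and pthread_cond_wait releases it while waiting, letting the process "step outside the monitor". (Unlike Hoare's semantics, a POSIX waiter must re-test its condition in a loop; overflow handling is omitted for brevity.)

#include <pthread.h>

#define QSIZE 16

typedef struct {
    int items[QSIZE];
    int head, tail, count;
    pthread_mutex_t lock;       /* the implicit monitor semaphore */
    pthread_cond_t nonempty;    /* the condition variable         */
} queue;

void enqueue(queue *q, int x) {
    pthread_mutex_lock(&q->lock);       /* enter the monitor      */
    q->items[q->tail] = x;
    q->tail = (q->tail + 1) % QSIZE;
    q->count++;
    pthread_cond_signal(&q->nonempty);  /* wake a waiting dequeue */
    pthread_mutex_unlock(&q->lock);     /* leave the monitor      */
}

int dequeue(queue *q) {
    pthread_mutex_lock(&q->lock);       /* enter the monitor      */
    while (q->count == 0)               /* wait releases the lock */
        pthread_cond_wait(&q->nonempty, &q->lock);
    int x = q->items[q->head];
    q->head = (q->head + 1) % QSIZE;
    q->count--;
    pthread_mutex_unlock(&q->lock);     /* leave the monitor      */
    return x;
}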
Deadlocks: Deadlock refers to a situation in which a set of two or more processes are waiting for other members of the set to complete an operation in order to proceed, but none of the members is able to proceed. For instance, in the figure below, processes are blocked, waiting on events that can never happen.
System Model
• A system consists of a finite number of resources to be distributed among a number of competing processes.
• A resource can be used by only a single process at any single instant of time.
• Resources can be of different types, e.g. memory space, CPU cycles, files, I/O devices.
• A process must request a resource before using it, and must release the resource after using it.
• A normal operation consists of a sequence of events:
  - Request: if the request cannot be granted immediately, the process must wait until it can acquire the resource
  - Use: operate on the resource (e.g. printing)
  - Release: release the resource
• A set of processes is in a deadlock state when every process in the set is waiting for an event that can be caused only by another process in the set.
Deadlock modeling

Four necessary conditions for deadlock:
• Mutual exclusion -- only one process at a time can use the resource.
• Hold and wait -- there must exist a process that is holding at least one resource and is waiting to acquire additional resources that are currently being held by other processes.
• No preemption -- resources cannot be preempted; a resource can be released only voluntarily by the process holding it.
• Circular wait -- there is a set of processes { P0, P1, ..., Pn } such that P0 waits for a resource held by P1, P1 waits for a resource held by P2, ..., Pn-1 waits for a resource held by Pn, and Pn waits for a resource held by P0.

e.g. the resources are two books, Java Programming and OS Principles, and each process needs both books to finish its project.

Resource-allocation graphs: an edge from a resource to a process means the process is holding the resource; an edge from a process to a resource means the process is requesting the resource.
Deadlock conditions:
• if the graph contains no cycles => no process is in deadlock
• if the graph contains cycles => deadlock may exist
Examples

Example 1

1. Concurrent processes: A, B, C
2. Shared resources: R, S, T
3. Left to run on their own, each process might issue requests as follows:

PROCESS    A    B    C
request    R    S    T
request    S    T    R
release    R    S    T
release    S    T    R

4. Depending on how the OS schedules the processes A, B and C, this may or may not cause deadlock.
5. If the OS runs A to completion, then B, then C, there is no problem, but this is not usually the case (for efficiency reasons).
6. The OS blocks a process waiting for a resource that is not available and runs some other process.
7. A deadlock can occur if OS scheduling causes requests to occur as follows:

1 A→R
2 B→S
3 C→T
4 A→S
5 B→T
6 C→R

8. After request 4 has been made, process A blocks waiting for S, which is held by process B.
9. In the next two steps B and C also block, as they request resources held by C and A, respectively.
10. A cycle A-S-B-T-C-R-A is formed as shown in the figure, implying deadlock.

11. Could the OS avoid deadlock in the example?
• The OS can schedule A, B and C in a different order based on some analysis of the allocation graph.
• The OS could run only A and C together, avoiding the closed loop and hence the deadlock.
• Afterwards, B can be run on its own, with no danger of deadlock.
Reduction of Resource Graph: If a process's resource requests can be granted, then we say that the graph can be reduced by that process (i.e. the process is allowed to complete its execution and return all its resources).
• If a graph can be reduced by all the processes, then there is no deadlock.
• If a graph cannot be reduced by all the processes, the irreducible processes constitute the set of deadlocked processes in the graph.
An example of reduction of an RAG (Resource Allocation Graph )

Example of RAG with cycle but no deadlock


Deadlock handling strategies:
• Ignore the problem altogether
• Prevention -- use a protocol to ensure that the system will never enter a deadlock state
• Detection and recovery -- allow the system to enter a deadlock state and then recover
• Dynamic avoidance -- by careful resource allocation

a) The Ostrich Algorithm
Pretend there is no problem (Windows, UNIX). An undetected deadlock may result in deterioration of system performance; eventually, the system will stop functioning and will need to be restarted manually.

b) Detection and Recovery
• The system monitors requests and releases of resources.
• Check the resource graph to see if a cycle exists; kill a process if necessary, or if a process has been blocked for, say, one hour, kill it.

c) Deadlock Prevention
• Mutual exclusion: a printer cannot be shared by two processes, but a read-only file is sharable, and a process never has to wait for a sharable resource. In general, it is not possible to prevent deadlocks by denying the mutual-exclusion condition.
• Hold and wait: have all processes request all their resources before starting execution. This is inefficient, and a process doesn't know in advance which resources it will need. Alternatively, before a process can request any additional resources, it must temporarily release all the resources that it currently holds; if successful, it can get the original resources back. Either way, starvation is possible.
• No preemption: allow preemption. Fine for some resources, but what if the resource is a printer?
• Circular wait: order the resource types and require that each process requests resources in increasing order.

Let R = { R1, R2, ..., Rm } be the set of resource types, and define an ordering function F: R → N (the set of natural numbers), e.g.

F( tape drive ) = 1
F( disk drive ) = 5
F( printer ) = 12

Suppose a process first requests Ri; it can then request Rj only if F(Rj) > F(Ri). Under this constraint, the circular-wait condition cannot hold.

Proof (by contradiction): Suppose circular wait holds for { P0, P1, ..., Pn }, where Pi is waiting for resource Ri, which is held by Pi+1. Because Pi+1 is holding Ri while requesting Ri+1, we have F( Ri ) < F( Ri+1 ) for all i, so

F( R0 ) < F( R1 ) < ... < F( Rn ) < F( R0 )
=> F( R0 ) < F( R0 )

which is impossible. Thus circular wait cannot exist. A sketch of this ordering discipline in practice follows.
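In practice, resource ordering is most often applied to locks. A C sketch using the F values from the example above (the ranks appear as comments; the discipline is simply to always acquire locks in increasing rank):

#include <pthread.h>

static pthread_mutex_t tape_drive = PTHREAD_MUTEX_INITIALIZER; /* F = 1  */
static pthread_mutex_t disk_drive = PTHREAD_MUTEX_INITIALIZER; /* F = 5  */
static pthread_mutex_t printer    = PTHREAD_MUTEX_INITIALIZER; /* F = 12 */

/* Both functions take their locks in increasing F order, so no
 * circular wait, and hence no deadlock, can arise between them. */
void copy_tape_to_printer(void) {
    pthread_mutex_lock(&tape_drive);    /* rank 1 first          */
    pthread_mutex_lock(&printer);       /* then rank 12          */
    /* ... use both resources ... */
    pthread_mutex_unlock(&printer);
    pthread_mutex_unlock(&tape_drive);
}

void copy_disk_to_printer(void) {
    pthread_mutex_lock(&disk_drive);    /* rank 5 first          */
    pthread_mutex_lock(&printer);       /* then rank 12: the     */
    /* ... use both resources ... */     /* same global order     */
    pthread_mutex_unlock(&printer);
    pthread_mutex_unlock(&disk_drive);
}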
Memory Management: Memory is an important resource that needs to be managed by the OS. The memory manager is the component of the OS responsible for managing memory. Memory management systems fall into two classes:
• Memory managers that move processes back and forth between memory and disk during execution.
• Memory managers that require all processes to be in memory at all times.
We will study both techniques. The former requires swapping and paging techniques; the latter has only been used in simple operating systems.

Memory Manager Issues: Often an operating system must manage processes whose combined memory requirements exceed the amount of available memory. This can be achieved by:
• Using disk space as an extended or virtual memory
• Moving process images to and from disk into main memory
Disk access is much slower than memory access, so efficient memory management algorithms are needed to manage all system processes.
The common techniques used in memory management are:
• Swapping
• Paging
Swapping: When the memory requirements of the processes exceed the amount of main memory, some processes need to be kept on disk. Moving processes from main memory to disk and back again is called swapping. Swapping systems often use variable-sized partitions; using variable-sized partitions reduces the wasted memory associated with "fitting" processes into fixed-sized partitions that must be large enough to hold the process memory image. With variable partitions, the size and number of memory partitions varies dynamically.

Variable Partitions
In variable partitioning:
1. Allocate partitions just large enough for each process.
2. It is a good idea to make the variable partitions a little larger than needed, to allow for "growing" memory requirements.
3. Memory can be compacted to consolidate holes:
- Move memory partitions down as far as possible.
- Compaction is inefficient because of the excessive CPU time required to reorganize the memory partitions.
4. If memory requirements grow beyond a process's partition size, move the partition to a new partition (relocation).
5. The operating system must keep track of allocated and free areas of memory through:
I. Bit Maps
II. Linked Lists
III. Buddy System
Variable Partitions with Bit Maps
• Memory is divided into allocation units
- A few words to several KB per allocation unit
• Each allocation unit is tracked by a bit map
- 1 indicates the allocation unit is in use
- 0 indicates the allocation unit is free
• The size of the allocation unit is inversely proportional to the size of the bit map.
• Searching a bit map for free allocation units is slow
- Linear search of the bitmap array (sketched below)
• Heavy fragmentation over time
• Must balance allocation unit/bitmap size.
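The linear search mentioned above looks roughly like this in C (the unit count and layout are illustrative assumptions):

#define UNITS 1024                      /* allocation units tracked  */
static unsigned char bitmap[UNITS / 8]; /* 1 bit per allocation unit */

static int get_bit(int i) { return (bitmap[i / 8] >> (i % 8)) & 1; }

static void set_bit(int i) { bitmap[i / 8] |= (unsigned char)(1 << (i % 8)); }

/* First fit over the bit map: find n consecutive free units,
 * mark them used, and return the first unit, or -1 on failure. */
int bitmap_alloc(int n) {
    int run = 0;
    for (int i = 0; i < UNITS; i++) {
        run = get_bit(i) ? 0 : run + 1; /* count consecutive free    */
        if (run == n) {
            int start = i - n + 1;
            for (int j = start; j <= i; j++)
                set_bit(j);             /* mark the run as in use    */
            return start;
        }
    }
    return -1;                          /* no hole large enough      */
}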

Memory Management with Linked Lists
• Maintain a linked list of allocated and free memory segments
- Each node in the linked list represents a process's memory image or a hole between two processes
• Advantage of maintaining the list sorted by address:
- Easy to consolidate adjacent "small" holes into a single large hole
• Linked lists often take less storage to manage than bit maps
- Maintain a record per process rather than a bit per allocation unit.

Allocating Memory Using Linked Lists
Algorithms (a first-fit sketch in C follows below):
- First Fit: Scan the linked list and use the first block large enough to hold the process.
• Break the hole into two pieces.
- Next Fit: Same as first fit, but do not start the search from the beginning of the list.
• Maintain the location of the last allocation.
• Wrap around when the last record in the list is reached.
• Spreads out memory allocation.
- Best Fit: Search the entire linked list and choose the smallest hole that is large enough to hold the process memory image.
- Worst Fit: Search the entire linked list and choose the largest hole to split; this produces large leftover holes.
Simulation has shown that first fit is the most efficient.
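Here is the first-fit sketch in C; the node layout is an assumption, sizes are in bytes, and the list is kept sorted by address:

#include <stdlib.h>

typedef struct seg {
    size_t base, size;      /* where the segment starts, how big   */
    int    free;            /* 1 = hole, 0 = allocated             */
    struct seg *next;
} seg;

/* First fit: take the first hole big enough, splitting off the
 * remainder as a new hole. Returns the base, or (size_t)-1. */
size_t first_fit(seg *list, size_t want) {
    for (seg *s = list; s != NULL; s = s->next) {
        if (s->free && s->size >= want) {
            if (s->size > want) {           /* split the hole      */
                seg *hole = malloc(sizeof *hole);
                hole->base = s->base + want;
                hole->size = s->size - want;
                hole->free = 1;
                hole->next = s->next;
                s->next = hole;
                s->size = want;
            }
            s->free = 0;                    /* now allocated       */
            return s->base;
        }
    }
    return (size_t)-1;                      /* no hole big enough  */
}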
Memory Management with Linked Lists
• Optimization: maintain separate linked lists for the allocated and free memory blocks.
• This prevents the OS from having to search allocated entries in the linked list when trying to find a free block.
Fragmentation
• Internal Fragmentation: When the allocated partition is larger than the needed
memory requirements
- Allocation with fixed partitions

• External Fragmentation: When there are holes between the allocated partitions but
no wasted space within the partitions
- Bit Maps
- Linked Lists
- Also known as checker boarding.
• Excessive fragmentation reduces memory utilization

Swapping: In swapping systems, the disk is used to back main memory when there is not enough memory to hold all of the running processes. Disk space is allocated to each process so that there is a place on disk to hold the process when it is removed from memory. Swapping systems require that the entire memory image be in main memory in order to run the process. Often, however, only a subset of a process's memory image is actually needed to run it; this subset is known as the working set. It is desirable to keep only a process's working set in memory during execution, using virtual memory and paging.
Overlays: In the early days, when programs were too large to fit into their partitions, programmers created overlays. Overlays were physical divisions of the program established by the programmer. When the program started running, overlay 0 would be loaded into memory; as the program progressed, it dynamically loaded and unloaded its overlays. Using overlays is complex and error prone, and it yields hard-to-debug programs. The desire: have the OS, not the programmer, manage the process of splitting up the program and dynamically loading its pieces. This technique is known as virtual memory.
Virtual Memory: The size of the program, its data, and its stack may exceed the available amount of physical memory. The idea is to keep the parts of the program that are in use in main memory and the remaining parts on disk. The virtual memory manager dynamically moves process pieces between memory and disk. Virtual memory systems must be efficient to avoid excessive disk-to-memory and memory-to-disk activity.
Virtual Memory Architecture

[Diagram: the virtual memory manager sits between physical memory and the disk swap area, moving process pieces between the two.]
Virtual Memory: With virtual memory systems, programs run with a large virtual address space (4 GB in Windows NT). The virtual memory manager abstracts the location of the physical address away from the running program. A process in a virtual memory system generates and uses virtual addresses in a virtual address space; typically, the translation of a virtual address to a physical address is handled by the MMU hardware.
Pages: The virtual address space is divided into equally sized units called pages; pages are typically 1 to 8 KB (2-4 KB is common). Physical memory is likewise divided into page frames; the page frames in physical memory (and on disk) are the same size as the pages in the virtual address space. The virtual memory system maps pages from the virtual address space into page frames in physical memory or on disk. Such systems allow the process working set to be in memory by keeping only the active page frames resident in physical memory. The translation arithmetic is sketched below.
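The page-to-frame mapping is just integer arithmetic on the address bits. A toy C translation for 4 KB pages (the single-level table and the specific mapping are made up for illustration; real MMUs use multi-level tables in hardware):

#include <stdint.h>
#include <stdio.h>

#define PAGE_SIZE 4096u
#define NPAGES    256u

static uint32_t page_table[NPAGES];        /* frame number per page */

/* Split the virtual address into page number and offset, look up
 * the frame, and glue the frame base and offset back together. */
uint32_t translate(uint32_t vaddr) {
    uint32_t page   = vaddr / PAGE_SIZE;
    uint32_t offset = vaddr % PAGE_SIZE;
    uint32_t frame  = page_table[page];    /* the MMU's job         */
    return frame * PAGE_SIZE + offset;
}

int main(void) {
    page_table[2] = 7;                     /* map page 2 -> frame 7 */
    printf("0x%x\n", translate(2 * PAGE_SIZE + 0x123)); /* 0x7123   */
    return 0;
}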
Segmentation: Most processors support a flat memory model in which virtual memory is one-dimensional: one virtual address space per process. Segmentation is when the hardware supports many completely independent address spaces; each address space is called a segment. Segments grow and shrink independently of each other, which is an elegant solution for dynamic systems with growing and shrinking memory requirements.
