Multiprogramming: Concurrency Improves Throughput

Concurrency improves system throughput by allowing multiple programs or processes to utilize CPU and I/O resources simultaneously. Multiprogramming allows the main memory to contain more than one program so their execution phases like input, CPU processing, and output can overlap, improving efficiency. For statements in a program to execute concurrently while producing the same results, their read and write sets cannot intersect per Bernstein's conditions. Language constructs like fork/join and parbegin/parend allow specifying concurrency by executing statements in parallel.


Concurrency improves Throughput

Multiprogramming

 A single user cannot always keep the CPU or I/O devices busy at all times.
 Multiprogramming offers a more efficient approach to increase system
performance.
 In order to increase resource utilisation, systems supporting the
multiprogramming approach allow more than one job (program) to utilize CPU
time at any moment.
 The more programs competing for system resources, the better the
resource utilisation.

The idea is implemented as follows: the main memory of a system contains more
than one program.
The time needed to process n programs sequentially is t = I + CLE + O, where

I   = the sum of the input processing times,
CLE = the sum of the compile, load and execute times,
O   = the sum of the output processing times.

The throughput is n/t = n/(I + CLE + O).

 The fact that the three devices CPU, CR (card reader) and LP (line printer) operate almost independently makes an
important improvement possible.
 We get a much more efficient system if the input and output phases overlap in time.
 An even better result is obtained if all three phases overlap.

Throughput of a sequential system :

I + O = Σ_{k=1}^{n} (i_k + o_k)

with a throughput of

n/t = n/(I + CLE + O)

Throughput of an overlapped system :

Let i_k and o_k represent the input and output processing times of the kth program, and let
m_k = max(i_{k+1}, o_k) for k = 1, …, n-1.

The input and output processing time for n programs using overlapping can be as low as

M = i_1 + Σ_{k=1}^{n-1} m_k + o_n

with a throughput of

n/t = n/(CLE + M)
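As a hypothetical worked example (the timings are assumed purely for illustration): suppose n = 4 programs each need i_k = 2, c_k = 3 and o_k = 2 time units. Sequentially, I = 8, CLE = 12 and O = 8, so t = 28 and the throughput is 4/28 ≈ 0.14 programs per time unit. With overlap, m_k = max(i_{k+1}, o_k) = 2 for k = 1, 2, 3, so M = 2 + 6 + 2 = 10, t = CLE + M = 22, and the throughput rises to 4/22 ≈ 0.18 programs per time unit.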

Illustration :

[Illustration: timelines of the input, compute and output phases (i_k, c_k, o_k) of four
programs, comparing the sequential system with the overlapped system and showing the
time saved by overlapping.]

Throughput of overlapped system


Concurrent Processes

Two processes are said to be concurrent if they overlap in their execution.

Precedence Graphs

Consider the following program segment, which performs some simple


arithmetic operations.

Program1 :
a : = x + y;
b : = z + 1;
c : = a – b;
w : = c + 1;
 These statements have data dependences and hence precedence constraints.

 The point of this example is that, within a single program, there are
precedence constraints among the various statements.

Precedence Graph :

A precedence graph is a directed acyclic graph whose nodes correspond to
individual statements. An edge from node Si to node Sj means that
statement Sj can be executed only after statement Si has completed
execution.
[Figure: precedence graph with nodes S1–S7; the edges are described below.]

Precedence graph
 S2 and S3 can be executed after S1 completes.
 S4 can be executed after S2 completes.
 S5 and S6 can be executed after S4 completes.
 S7 can execute only after S5, S6, and S3 complete.

Note that S3 can be executed concurrently with S2, S4, S5, and S6.

Concurrency Conditions :

Qs. When can two statements in a program be executed


concurrently and still produce the same results?

The following three conditions must be satisfied for two successive


statements S1 and S2 to be executed concurrently and still produce the same
result. These conditions were first stated by Bernstein [1966] and are
commonly known as Bernstein’s conditions.

1. R(S1) ∩ W(S2) = {}
2. W(S1) ∩ R(S2) = {}
3. W(S1) ∩ W(S2) = {}

 R(Si) = {a1, a2, …, am}, the read set for Si, is the set of all variables
whose values are referenced in statement Si during its execution.

 W(Si) = {b1, b2, …, bn}, the write set for Si, is the set of all variables
whose values are changed (written) by the execution of statement Si.

Illustrations of READ and WRITE sets:

Example 1 : Consider the statement c := a – b.

The values of the variables a and b are used to compute the new value of c.
Hence a and b are in the read set. The (old) value of c is not used in the statement, but
a new value is defined as a result of the execution of the statement. Hence c is in the
write set, but not the read set.

R(c := a – b) = {a,b}
W(c := a – b) = {c}

Example 2 : For the statement w := c + 1, the read and write sets are:

R (w:= c + 1) = {c}
W (w:= c + 1) = {w}

Example 3 : Consider the statement read(a).

a is being read into, thus its value is changing. The read and write sets are:
R (read (a)) = {}
W (read (a)) = {a}

Example 4 : S1: a := x + y
S2: b := z + 1
S3: c := a – b

S1 and S2 statements can be executed concurrently because :

R(S1) = {x, y}
R(S2) = {z}
W(S1) = {a}
W(S2) = {b}

However, S3 cannot be executed concurrently with S2, since

W(S2) ∩ R(S3) = {b} ≠ {}
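Checking the three conditions is mechanical once the read and write sets are known. The following small C program is an illustrative sketch (it is not part of the original notes; the helper name intersects and the hard-coded sets are assumptions) that tests Bernstein's conditions for S1 and S2 of the example above:

#include <stdio.h>
#include <string.h>

/* return 1 if the two sets of variable names share an element */
static int intersects(const char *a[], int na, const char *b[], int nb) {
    for (int i = 0; i < na; i++)
        for (int j = 0; j < nb; j++)
            if (strcmp(a[i], b[j]) == 0)
                return 1;
    return 0;
}

int main(void) {
    /* S1: a := x + y      S2: b := z + 1 */
    const char *R1[] = {"x", "y"}; const char *W1[] = {"a"};
    const char *R2[] = {"z"};      const char *W2[] = {"b"};

    int ok = !intersects(R1, 2, W2, 1) &&   /* R(S1) ∩ W(S2) = {} */
             !intersects(W1, 1, R2, 1) &&   /* W(S1) ∩ R(S2) = {} */
             !intersects(W1, 1, W2, 1);     /* W(S1) ∩ W(S2) = {} */

    printf("S1 and S2 %s be executed concurrently\n", ok ? "may" : "may not");
    return 0;
}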
Implementing Concurrency :

1. The Fork and Join Constructs


2. Concurrent Statements

1. The Fork and Join Constructs :

The fork and join instructions were introduced by Conway [1963] and
Dennis and Van Horn [1966]. They were one of the first language
notations for specifying concurrency.

S1;
fork L;
S2;
 .
 .
 .
L: S3;

[Figure: S1 with edges to S2 and S3.]

Precedence graph for the fork construct

The fork L instruction produces two concurrent executions in a program.


 One execution starts at the statement labeled L, while the other is the
continuation of the execution at the statement following the fork
instruction.
 When the fork L statement is executed, a new computation is started at
S3. This new computation executes concurrently with the old
computation, which continues at S2.
JOIN :

 The join instruction provides the means to recombine two concurrent


computations into one.
 Each of the two computations must request to be joined with the other.
 Since computations may execute at different speeds, one may execute
the join before the other.

count := 2;
fork L1;
 .
 .
 .
S1;
go to L2;
L1: S2;
L2: join count;

[Figure: S1 and S2 with edges into S3.]

Precedence graph for the join construct

We need to know the number of computations which are to be joined, so
that we can terminate all but the last one. The join instruction has a
parameter to specify the number of computations to join. A join instruction with a
parameter count has the following effect:

count := count – 1;
if count ≠ 0 then quit;

where count is a non-negative integer variable, and quit is an instruction


which results in the termination of the execution. For 2 computations, the
variable count would be initialized to 2.
The join instruction must be executed atomically; that is, the concurrent
execution of two join statements is equivalent to the serial execution of
these two statements, in some undefined order.
Note that the execution of the join statement merges several concurrent
executions; hence the name join.

Consider again Program 1.

S1: a := x + y
S2: b := z + 1
S3: c := a – b
S4: w : = c + 1;

To allow the concurrent execution of the first two statements, this program
could be rewritten using fork and join instructions:

count := 2;
fork L1;
a := x + y;
go to L2;
L1: b := z + 1;
L2: join count;
c := a – b;
w := c + 1;
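The fork/join notation maps naturally onto a modern thread API. The following POSIX-threads sketch (an analogue written for illustration, not the notation used in the notes; the input values are assumed) computes the first two statements of Program 1 concurrently and then joins:

#include <pthread.h>
#include <stdio.h>

int x = 1, y = 2, z = 3;          /* example inputs (values assumed) */
int a, b, c, w;

/* the forked computation: S2 (b := z + 1) */
static void *compute_b(void *arg) {
    (void)arg;
    b = z + 1;
    return NULL;
}

int main(void) {
    pthread_t tid;
    pthread_create(&tid, NULL, compute_b, NULL);   /* plays the role of "fork L1" */
    a = x + y;                                     /* S1 runs concurrently        */
    pthread_join(tid, NULL);                       /* plays the role of "join"    */
    c = a - b;                                     /* S3 */
    w = c + 1;                                     /* S4 */
    printf("c = %d, w = %d\n", c, w);
    return 0;
}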
2. Concurrent Statements:
A higher-level language construct for specifying concurrency is the
parbegin/parend statement of Dijkstra [1965a], which has the following form:
parbegin S1; S2; ……Sn parend;

 Each Si is a single statement.


 All statements enclosed between parbegin and parend can be executed
concurrently.
[Figure: S0 precedes S1, S2, …, Sn, which all precede Sn+1.]

Precedence graph for the concurrent statement

Earlier example written using concurrent statements :

begin
parbegin
a : = x + y;
b : = z + 1;
parend;
c : = a – b;
w : = c + 1;
end.
Illustration :

[Figure: the precedence graph for S1–S7 shown earlier.]

Implementation of Precedence Graph :

Using FORK/JOIN:

S1;
count := 3;
fork L1;
S2;
S4;
fork L2;
S5;
go to L3;
L2: S6;
go to L3;
L1: S3;
L3: join count;
S7;

Using Concurrent Statements:

S1;
parbegin
  S3;
  begin
    S2;
    S4;
    parbegin
      S5;
      S6;
    parend
  end
parend;
S7;
EXAMPLE :

Copying a sequential file f to another file g using double-buffering
with buffers s and t: this program can read from f
concurrently with writing to file g.

var f, g : file of T;
s, t : T;
count: integer;
begin
reset(f);
read(f,s);
while not eof(f)
do begin
count : = 2;
t := s ;
fork L1;
write(g,t);
go to L2;
L1: read(f,s);
L2: join count;
end;
write(g,t);
end.
Using Concurrent Statement :

var f, g : file of T;
s, t : T;
begin
reset(f);
read(f,s);
while not eof(f)
do begin
t : = s;
parbegin
write(g,t);
read(f,s);
parend;
end;
write(g,t);
end.
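For comparison, a rough C sketch of the same double-buffering idea using POSIX threads follows (the file names, the line-sized record type and the writer helper are assumptions of this sketch; note that C reports end-of-file only after a failed read, so the trailing write of the Pascal version is not needed here):

#include <pthread.h>
#include <stdio.h>
#include <string.h>

static FILE *f, *g;
static char s[256], t[256];              /* the two buffers; a record is one text line */

static void *writer(void *arg) {         /* the write(g,t) branch of the parbegin */
    (void)arg;
    fputs(t, g);
    return NULL;
}

int main(void) {
    f = fopen("f.txt", "r");             /* assumed input file  */
    g = fopen("g.txt", "w");             /* assumed output file */
    if (!f || !g) return 1;

    if (fgets(s, sizeof s, f)) {                        /* read(f,s) */
        for (;;) {
            pthread_t tid;
            strcpy(t, s);                               /* t := s */
            pthread_create(&tid, NULL, writer, NULL);   /* write(g,t) ...              */
            int more = (fgets(s, sizeof s, f) != NULL); /* ... concurrently with read  */
            pthread_join(tid, NULL);                    /* parend */
            if (!more) break;
        }
    }
    fclose(f);
    fclose(g);
    return 0;
}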
PROCESS MANAGEMENT

Process Concept :

 A process may be appreciated as a program in execution; however, many
other definitions are also available.

 A program is a passive entity, whereas a process is an active entity.

 The key idea about a process is that it is an activity of some kind and
consists of a pattern of bytes (that the CPU interprets as machine instructions,
data, registers and stack).

 A single processor may be shared among several processes, with some
scheduling policy being used to allocate the processor to one process and
deallocate it from another.

Process hierarchies:

 An operating system needs some way to create and kill processes. A
process may create other process(es), which in turn create some
more process(es), and so on, thus forming a process hierarchy or process tree.

 In UNIX operating system, a process is created by the fork system call and
terminated by exit.
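As an illustration of the previous point, a minimal C sketch (not from the notes) in which a UNIX process creates a child with fork; the child terminates with exit while the parent waits for it:

#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    pid_t pid = fork();                  /* create a child process */
    if (pid == 0) {                      /* child branch */
        printf("child %d running\n", (int)getpid());
        exit(0);                         /* child terminates */
    } else if (pid > 0) {                /* parent branch */
        int status;
        wait(&status);                   /* wait for the child to exit */
        printf("child finished with status %d\n", WEXITSTATUS(status));
    } else {
        perror("fork");                  /* creation failed */
    }
    return 0;
}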

Process States:

The lifetime of a process can be divided into several stages, called states, each with
certain characteristics that describe the process. This means that as a process
executes, it moves from one state to another.

Each process may be in one of the following states:

1. New -The process has been created.


2. Ready - The process is waiting to be allocated a processor.
All ready processes (kept in a queue) wait for CPU time to be allocated
by the operating system in order to run. A program called the scheduler, which is a
part of the operating system, picks up one ready process for execution and passes
control to it.

3. Running- Instructions are being executed

4. Blocked/Suspended: A suspended process lacks some resource


other than the CPU.

5. Terminated: The process has finally stopped. A process terminates
when it finishes executing its last statement, or in some exceptional cases it is
terminated by the OS. A parent may terminate the execution of
one of its children for a variety of reasons, such as:
i. The task assigned to the child is no longer required.
ii. The child has exceeded its usage of some of the resources it has been
allocated.

A general form of process state diagram is illustrated:

[Figure: Process State Diagram — transitions among New, Ready, Running, Blocked/Suspended and Terminated.]


Process Implementation

 The operating system groups all information that it needs about a particular
process into a data structure called a Process Control Block (PCB).
 It simply serves as storage for all the information the OS keeps about a process.

 When a process is created, the operating system creates a corresponding


PCB and when it terminates, its PCB is released to the pool of free memory
locations from which new PCBs are drawn.

 A process is eligible to compete for system resources only when it has an
active PCB associated with it. A PCB is implemented as a record containing
many pieces of information associated with a specific process, including:
1. Process number: Each process is identified by its process number,
called process ID. (PID)
2. Priority.
3. Process state: Each process may be in any of these states: new, ready,
running, suspended and terminated.
4. Program counter: It indicates the address of the next instruction to be
executed for this process.
5. Registers: They include the accumulator, general-purpose registers, index
registers, etc. Whenever the processor switches from one process to
another, information about the current status of the old process is
saved in these registers, along with the program counter, so that the process
can continue correctly afterwards.
6. Accounting information: It includes actual CPU time used in executing
a process.
7. I/O status information: It includes outstanding I/O requests, the list of
open files, and information about the allocation of peripheral devices to
the process.
8. Processor scheduling details: It includes the priority of the process, pointers
to scheduling queues, and any other scheduling parameters.
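To make the idea concrete, a much-simplified PCB could be declared as a C structure along the following lines (the field names, sizes and types are illustrative assumptions; real kernels keep far more information):

enum proc_state { NEW, READY, RUNNING, SUSPENDED, TERMINATED };

struct pcb {
    int             pid;               /* 1. process number (PID)              */
    int             priority;          /* 2. scheduling priority               */
    enum proc_state state;             /* 3. current process state             */
    unsigned long   program_counter;   /* 4. address of the next instruction   */
    unsigned long   registers[16];     /* 5. saved general-purpose registers   */
    unsigned long   cpu_time_used;     /* 6. accounting information            */
    int             open_files[16];    /* 7. I/O status: open file descriptors */
    struct pcb     *next_in_queue;     /* 8. link used by scheduling queues    */
};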
Inter Process Communication &
The Critical Section Problem

The Producer-Consumer problem is quite common in operating systems.

 A Producer process produces information that is consumed by a Consumer


process.

 A line Printer driver produces characters which are consumed by the Line
Printer.

 A Compiler may produce assembly code which is consumed by an assembler.


The assembler in turn may produce Load modules which are consumed by the
loader.

Producer-Consumer Processes:

 For maximising throughput and better system performance, the Producer and
Consumer processes should be allowed to run concurrently.

 The Producer should produce into a buffer and the Consumer should consume from
a buffer.

 The Producer can produce into one buffer while the Consumer is consuming from
another buffer. Producer and Consumer must be synchronized so that the Consumer
does not try to consume items which have not yet been produced.

Unbounded Buffer Producer-Consumer :


 No limit on the number of buffers.
 The consumer may have to wait for new items, but the producer can always
produce new items, i.e. since buffers are of unlimited capacity, there are always
empty buffers.

Bounded Buffer Producer-Consumer : Assumes that there is a fixed number
‘n’ of buffers.

Solution to Bounded Buffer Problem :


Data Structure : 1. Circular Array ( shared Pool of Buffers)
2. Two Logical Pointers : ‘in’ and ‘out’
in points to the next free buffer.
out points to the first full buffer.
3. An integer variable counter initialised to 0.
counter is incremented every time a new full buffer is
added to the pool and decremented whenever one is
consumed.

type item = ….;

var buffer : array[0..n-1] of item;
    in, out : 0..n-1;
    counter : integer;
    nextp, nextc : item;

in := 0; out := 0; counter := 0;

parbegin

Producer: begin
            repeat
              …..
              produce an item in nextp
              …..
              while counter = n do skip;
              buffer[in] := nextp;
              in := (in + 1) mod n;
              counter := counter + 1;
            until false;
          end;

Consumer: begin
            repeat
              …..
              while counter = 0 do skip;
              nextc := buffer[out];
              out := (out + 1) mod n;
              counter := counter - 1;
              consume the item in nextc;
            until false;
          end;

parend;
This implementation may lead to wrong values if concurrent execution is uncontrolled.

e.g. If the value of counter is currently 5, and the Producer executes the statement
counter := counter + 1 while the Consumer concurrently executes the statement
counter := counter - 1, then the value of counter may be 4, 5 or 6. (The correct answer
should be 5, because one item was produced and one was consumed.)

How counter may be incorrect ?

i) counter := counter + 1 is implemented in machine language as:

   R1 := counter
   R1 := R1 + 1
   counter := R1

ii) counter := counter - 1 is implemented in machine language as:

   R2 := counter
   R2 := R2 - 1
   counter := R2

Concurrent execution of statements counter := counter +1 and


counter := counter –1 is equivalent to a sequential execution where the
lower level statements presented above are interleaved in some arbitrary order. If one
such interleaving is :

T0: Producer executes R1 := counter    {R1 = 5}
T1: Producer executes R1 := R1 + 1     {R1 = 6}
T2: Consumer executes R2 := counter    {R2 = 5}
T3: Consumer executes R2 := R2 - 1     {R2 = 4}
T4: Producer executes counter := R1    {counter = 6}
T5: Consumer executes counter := R2    {counter = 4}

 counter = 4, while the answer should be 5.

 If T4 and T5 were reversed, the incorrect result would be counter = 6.

WHY INCORRECT? Because we allowed both processes to manipulate
the variable counter concurrently.
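The lost update is easy to reproduce with threads. The sketch below (the iteration count and variable names are assumptions made for the demonstration; it deliberately contains a data race) lets one thread increment and another decrement a shared counter with no mutual exclusion, so the printed result is usually not the expected 0:

#include <pthread.h>
#include <stdio.h>

#define N 1000000
long counter = 0;                    /* shared, deliberately unprotected */

static void *producer(void *arg) {
    (void)arg;
    for (int i = 0; i < N; i++)
        counter = counter + 1;       /* load, add, store: not atomic */
    return NULL;
}

static void *consumer(void *arg) {
    (void)arg;
    for (int i = 0; i < N; i++)
        counter = counter - 1;       /* load, subtract, store: not atomic */
    return NULL;
}

int main(void) {
    pthread_t p, c;
    pthread_create(&p, NULL, producer, NULL);
    pthread_create(&c, NULL, consumer, NULL);
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    printf("counter = %ld (expected 0)\n", counter);
    return 0;
}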

REMEDY : We need to ensure that only one process at a


time may be manipulating the variable counter.

This observation leads us to the problem of the CRITICAL SECTION (CS): the
part of the program code in which a process accesses shared variables is known
as its Critical Section.

Problem Definition:

 Consider a system consisting of n cooperating processes {P1, P2, …, Pn}.

 Each process has a segment of code called a Critical Section (CS).

 When one process is executing in its CS, no other process is to be
allowed to execute in its CS.
 Thus the execution of CSs by the processes is Mutually Exclusive in
time.

Critical Section Problem :

To design a protocol which the processes may use to cooperate.

General structure may be:


……

Entry Section
……
Critical Section
…….

Exit Section

Remainder Section
A solution to the Mutual Exclusion (ME) problem must satisfy the following 3 requirements
:

1. Mutual Exclusion : If process Pi is executing in its CS then no


other process can be executing in its CS.

2. Progress : If no process is executing in its CS and there
exist some processes that wish to enter their CS,
then only those processes that are not executing in
their Remainder Section can participate in the
decision as to who will enter the CS next, and this
selection cannot be postponed indefinitely.

3. Bounded Waiting : There must exist a bound on the number of times that
other processes are allowed to enter their
CSs after a process has made a request to enter its
CS and before that request is granted.

It is assumed that each process is executing at a nonzero speed, and
no assumption can be made about the relative speeds of the n processes.

TWO PROCESS SOFTWARE SOLUTION :

General structure :
begin
common variable decl.
parbegin
P0;
P1;
parend
end.
1. Algorithm 1:

The first approach is to let the processes share a common integer variable
turn, initialised to 0 (or 1). If turn = i, then process Pi is
allowed to execute in its CS.

Pi :
repeat

while turn <> i do skip;

C.S

turn := j ;

Remainder section
until false

Analysis:
1. Ensures only one process at a time can be in its CS.
2. However, it does not satisfy the progress requirement: it forces
STRICT ALTERNATION of processes in the CS.
If turn = 0 and P1 wants to enter its CS, it cannot, even though P0 is in its
remainder section.

Algorithm 2:

The problem with Algorithm 1 is that it fails to remember the state of each process, but
remembers only which process is allowed to enter its CS.

As a solution :

var flag : array[0..1] of boolean ; (initially both false)

If flag[i] is true then Pi is executing in its CS.


Pi : repeat
while flag[j] do skip;
flag[i] := true;

CS;

flag[i] := false;

Remainder section ;
until false

Analysis : 1. A process first checks whether the other process is in its CS.

           2. If not, it sets its own flag and enters its CS.
           3. When it exits its CS, it resets its flag to false, allowing the other process in.
ME is not Ensured :
T0 : P0 enters the while statement and finds flag[1] = false
T1 : P1 enters the while statement and finds flag[0] = false
T2 : P1 sets flag[1] and enters CS.
T3 : P0 sets flag[0] and enters CS.
Algorithm 3:
The Problem with Algorithm 2 is that process Pi made a decision concerning
the state of Pj before Pj had the opportunity to change the state of the variable
flag[j]. We can try to correct this problem. As in Algorithm 2, we still maintain
the array flag. This time, however, the setting of flag[i] = true indicates only
that Pi wants to enter the critical section.
repeat
flag[i] := true;
while flag[j] do skip;

critical section

flag[i] := false;

remainder section ;
until false
 So, in this algorithm, we first set our flag[i] to be true, signaling that we
want to enter our critical section.

 Then we check whether the other process also wants to enter its critical section. If
so, we wait. Then we enter our critical section.

 When we exit the critical section, we set our flag to be false, allowing the
other process (if it is waiting) to enter its critical section.

In this solution, unlike Algorithm 2, the mutual–exclusion requirement is


satisfied. Unfortunately, the progress requirement is not met. To illustrate this
problem, consider the following execution sequence.

T0: P0 set flag [0] = true.


T1: P1 set flag [1] = true.

Now P0 and P1 are looping forever in their respective while statements

Algorithm 4 :

By now, one is probably convinced that there is no simple solution to the critical section
problem. It appears that every time we fix one bug in a solution, another bug appears.
However, we now (finally) present a correct solution, due to Peterson [1981].
This solution is basically a combination of Algorithm 3 and a slight modification of
Algorithm 1.
The processes share two variables in common:

var flag : array[0..1] of boolean ;


turn : 0..1;

Initially flag[0] = flag[1] = false and the value of turn is immaterial


(but either 0 or 1). The structure of process Pi is:

repeat
flag[i] := true;
turn := j;
while (flag[j] and turn=j) do skip;

critical section

flag[i] := false;

remainder section
until false;
To enter our critical section, we first set our flag[i] to be true, and assert that it is
the other process’ turn to enter if it wants to (turn = j). If both processes try to
enter at the same time, turn will be set to both i and j at roughly the same time.
Only one of these assignments will last; the other will occur, but be immediately
overwritten. The eventual value of turn decides which of the two processes is
allowed to enter its critical section first.
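A C rendering of Peterson's algorithm is sketched below. C11 atomics are used so that the compiler and CPU do not reorder the flag and turn accesses; with plain integer variables the algorithm is not reliable on modern hardware. The iteration count and the shared variable are assumptions of this sketch.

#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

atomic_bool flag[2];
atomic_int  turn;
long shared = 0;                           /* variable protected by the algorithm */

static void *worker(void *arg) {
    int i = (int)(long)arg, j = 1 - i;
    for (int k = 0; k < 100000; k++) {
        atomic_store(&flag[i], true);      /* flag[i] := true */
        atomic_store(&turn, j);            /* turn := j       */
        while (atomic_load(&flag[j]) && atomic_load(&turn) == j)
            ;                              /* busy-wait */
        shared++;                          /* critical section */
        atomic_store(&flag[i], false);     /* flag[i] := false */
    }
    return NULL;
}

int main(void) {
    pthread_t t0, t1;
    pthread_create(&t0, NULL, worker, (void *)0L);
    pthread_create(&t1, NULL, worker, (void *)1L);
    pthread_join(t0, NULL);
    pthread_join(t1, NULL);
    printf("shared = %ld (expected 200000)\n", shared);
    return 0;
}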

Hardware Solutions :

Many machines provide special hardware instructions that allow one to either test and
modify the content of a word, or to swap the contents of two words, in one memory
cycle. These special instructions can be used to solve the critical section problem. Let
us abstract the main concepts behind these types of instructions by defining the
Test-and-Set instruction as follows:

function Test-and-Set(var target: boolean): boolean;
begin
  Test-and-Set := target;
  target := true;
end;

The important characteristic is that these instructions are executed atomically, that is,
in one memory cycle. Thus if two Test-and-Set instructions are executed
simultaneously (each on a different CPU), they will be executed sequentially in some
arbitrary order.
If the machine supports the Test-and-Set instruction, then mutual
exclusion can be implemented by declaring a boolean variable lock, initialized to
false.
repeat

while Test-and-Set(lock) do skip;

critical section

lock := false;

remainder section
until false;
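C11 exposes this capability through the atomic_flag type, whose atomic_flag_test_and_set operation behaves like the Test-and-Set instruction described above. A minimal spin-lock sketch (the function names are illustrative):

#include <stdatomic.h>

static atomic_flag lock = ATOMIC_FLAG_INIT;     /* "initialized to false" */

void enter_cs(void) {
    while (atomic_flag_test_and_set(&lock))     /* while Test-and-Set(lock) do skip */
        ;                                       /* busy-wait */
}

void exit_cs(void) {
    atomic_flag_clear(&lock);                   /* lock := false */
}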

The hardware Swap instruction is defined as follows:

procedure Swap(var a, b: boolean);
var temp: boolean;
begin
  temp := a;
  a := b;
  b := temp;
end;
If the machine supports the Swap instruction, then mutual exclusion can be provided
in a similar manner. A global boolean variable lock is initialized to false. Each
process also has a local boolean variable key.
repeat
key := true;
repeat
Swap(lock, key);
until key = false;

critical section

lock := false;

remainder section
until false;
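The Swap-based loop corresponds to an atomic exchange in C11; a sketch (again with illustrative function names):

#include <stdatomic.h>
#include <stdbool.h>

static atomic_bool lock = false;       /* global lock, initialized to false */

void enter_cs_swap(void) {
    bool key = true;                   /* local key */
    do {
        /* Swap(lock, key): key receives the old lock value, lock becomes true */
        key = atomic_exchange(&lock, true);
    } while (key);                     /* repeat until key = false */
}

void exit_cs_swap(void) {
    atomic_store(&lock, false);        /* lock := false */
}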
N-Process Software Solutions:

Solves the critical section problem for ‘n’ processes.

1. Lamport's Bakery Algorithm [1974]: The bakery algorithm is based upon a
scheduling algorithm commonly used in bakeries, ice cream stores, etc.
Upon entering the store, each customer receives a number. The customer
with the lowest number is served next. (Ties may be resolved, e.g., in favour of
the process with the smaller process number; a rough sketch follows this list.)
2. Eisenberg and McGuire [1972]
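A rough C sketch of the bakery idea for a small, fixed number of threads is given below (the thread count N, the use of C11 atomics and the function names are assumptions of this sketch, not part of the notes). Each thread takes a number one larger than any it can currently see; ties are broken by thread index:

#include <stdatomic.h>
#include <stdbool.h>

#define N 4                                    /* number of competing threads (assumed) */

atomic_bool choosing[N];                       /* thread is picking its number */
atomic_int  number[N];                         /* 0 means "not interested"     */

void bakery_lock(int i) {
    atomic_store(&choosing[i], true);
    int max = 0;                               /* take a number larger than all visible ones */
    for (int j = 0; j < N; j++) {
        int nj = atomic_load(&number[j]);
        if (nj > max) max = nj;
    }
    atomic_store(&number[i], max + 1);
    atomic_store(&choosing[i], false);

    for (int j = 0; j < N; j++) {
        while (atomic_load(&choosing[j]))      /* wait while j is still choosing */
            ;
        while (atomic_load(&number[j]) != 0 && /* wait while j holds a smaller ticket, */
               (atomic_load(&number[j]) < atomic_load(&number[i]) ||
                (atomic_load(&number[j]) == atomic_load(&number[i]) && j < i)))
            ;                                  /* ties resolved by thread index */
    }
}

void bakery_unlock(int i) {
    atomic_store(&number[i], 0);               /* give the ticket back */
}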

Semaphores:

 The solutions to the mutual exclusion problem presented in the last section are
not easy to generalize to more complex problems.

 To overcome this difficulty, a new synchronization tool, called a semaphore,


was introduced by Dijkstra [1965a].

 A semaphore S is an integer variable that, apart from initialization, can be


accessed only through two standard atomic operations: P and V. The classical
definitions of P and V are:

P(S): while S ≤ 0 do skip;
      S := S – 1;

V(S): S := S + 1;

Modifications to the integer value of the semaphore in the P and V operations


are executed indivisibly. That is, when one process modifies the semaphore value, no
other process can simultaneously modify that same semaphore value. In addition, in
the case of P(S), the testing of the integer value of S (S ≤ 0), and its possible
modification (S := S – 1) must also be executed without interruption.
Usage:
1. Semaphores can be used in dealing with the n-process critical section
problem. The n processes share a common semaphore, mutex, initialized to 1.
Each process Pi is organized as follows:

repeat

P(mutex);

critical section

V(mutex);

remainder section
until false;
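POSIX exposes counting semaphores directly; sem_wait plays the role of P and sem_post the role of V. A small sketch (the thread count, loop count and the protected variable are assumptions) protecting a critical section shared by four threads:

#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>

sem_t mutex;                         /* the semaphore "mutex"        */
long balance = 0;                    /* shared data guarded by mutex */

static void *worker(void *arg) {
    (void)arg;
    for (int i = 0; i < 100000; i++) {
        sem_wait(&mutex);            /* P(mutex) */
        balance++;                   /* critical section */
        sem_post(&mutex);            /* V(mutex) */
    }
    return NULL;
}

int main(void) {
    pthread_t t[4];
    sem_init(&mutex, 0, 1);          /* initialized to 1 */
    for (int i = 0; i < 4; i++) pthread_create(&t[i], NULL, worker, NULL);
    for (int i = 0; i < 4; i++) pthread_join(t[i], NULL);
    printf("balance = %ld (expected 400000)\n", balance);
    return 0;
}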

2. Semaphores as a Synchronization Tool:

[Figure: precedence S1 → S2.]
Consider two concurrently running processes: P1 with a statement S1, and P2
with a statement S2. Suppose that we require that S2 be executed only after S1 has
completed. This scheme can be readily implemented by letting P1 and P2 share a
common semaphore synch, initialized to 0, and by inserting the statements:
S1;
V(synch); in process P1, and the statements
P(synch);
S2;
in process P2. Since synch is initialized to 0, P2 will execute S2 only after
P1 has invoked V(synch), which is after S1.
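The same ordering idea in a POSIX-threads sketch (the printed statements stand in for S1 and S2; the names are illustrative):

#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>

sem_t synch;                            /* initialized to 0 */

static void *proc1(void *arg) {         /* process P1 */
    (void)arg;
    printf("S1\n");                     /* S1 */
    sem_post(&synch);                   /* V(synch): signal that S1 is done */
    return NULL;
}

static void *proc2(void *arg) {         /* process P2 */
    (void)arg;
    sem_wait(&synch);                   /* P(synch): wait for S1 */
    printf("S2\n");                     /* S2 executes only after S1 */
    return NULL;
}

int main(void) {
    pthread_t a, b;
    sem_init(&synch, 0, 0);
    pthread_create(&b, NULL, proc2, NULL);
    pthread_create(&a, NULL, proc1, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return 0;
}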
Semaphore definition without busy-waiting:

To implement a semaphore that avoids busy-waiting, we define a semaphore as
a record.

type semaphore = record
       value : integer;
       L : list of process;
     end;

Each semaphore has an integer value and a list of processes. When a process must
wait on a semaphore, it is added to the list of processes. A V operation removes one
process from the list of waiting processes and awakens it.

The semaphore operations can now be defined as:

P(S): S.value := S.value – 1;
      if S.value < 0
      then begin
             add this process to S.L;
             block;
           end;

V(S): S.value := S.value + 1;
      if S.value ≤ 0
      then begin
             remove a process P from S.L;
             wakeup(P);
           end;
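One common way to realize this record-based definition on a real system is with a mutex and a condition variable, which together play the role of the process list. The sketch below is only an illustration, not the notes' own implementation: instead of letting the value go negative it simply blocks while the value is zero, which gives the same observable behaviour. All names are assumptions of the sketch.

#include <pthread.h>

typedef struct {
    int             value;
    pthread_mutex_t m;                 /* protects value        */
    pthread_cond_t  cv;                /* the "list" of waiters */
} semaphore;

void sema_init(semaphore *s, int v) {
    s->value = v;
    pthread_mutex_init(&s->m, NULL);
    pthread_cond_init(&s->cv, NULL);
}

void P(semaphore *s) {
    pthread_mutex_lock(&s->m);
    while (s->value == 0)              /* nothing available: block (no busy-wait) */
        pthread_cond_wait(&s->cv, &s->m);
    s->value--;
    pthread_mutex_unlock(&s->m);
}

void V(semaphore *s) {
    pthread_mutex_lock(&s->m);
    s->value++;
    pthread_cond_signal(&s->cv);       /* wake up one waiting process, if any */
    pthread_mutex_unlock(&s->m);
}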
The table shows a sequence of 14 states of 6 processes invoking P and V operations on
a semaphore s. Initially, none of the processes is in its critical region (c.r.).

        Action                        Resulting Status
     invoking   operation      running    blocked    value
     process    invoked        in c.r.    on s       of s
  1     A         P(s)            A                    1
  2     A         V(s)                                 0
  3     B         P(s)            B                    1
  4     C         P(s)            B          C         0
  5     D         P(s)            B          C,D       0
  6     E         P(s)            B          C,D,E     0
  7     B         V(s)            D          C,E       0
  8     F         P(s)            D          C,E,F     0
  9     D         V(s)                       C,F       0
 11     none of A,…,F             E          C,F       0
 12     E         V(s)            F          C         0
 13     F         V(s)                       C         0

P(s): if s = 1
      then s := 0                      /* lower the semaphore */
      else begin
             BLOCK the calling process on s;
             DISPATCH a ready process;
           end;

V(s): if the list of processes waiting on s is nonempty
      then WAKE UP a process waiting on s
      else s := 1                      /* raise the semaphore */
Classical Process Coordination Problems of Computer Science:

1. Dining Philosophers Problem


2. Bounded-Buffer Problem
3. Readers/Writers Problem
4. Cigarette Smokers Problem [Patil 1971]
5. Sleeping Barber Problem [Dijkstra 1965a]

