Homework 2
1. It is necessary to change our program so that it constructs all possible interleavings in an orderly way, rather than relying on true concurrent execution to generate them. Let us go further and write a new version of the program.
This program uses a recursive approach to generating all possible paths of execution. Here's
what you need to do:
1. Set a maximum path length of 6 (3 for p, 2 for q, and a null terminator).
2. The function generate_paths builds all possible paths recursively:
o It takes as parameters the current path, the indices into processes p and q, and the current position in the path.
o If the current path is complete, it prints it.
o Otherwise, it appends the next statement from p, if one is available, and recurses.
o It then appends the next statement from q, if available, and recurses.
3. The main function initializes the processes and prints all paths.
When you run this program, you should see output like this:
The output shows all 10 possible execution paths, exactly as discussed before. Each line represents a different possible interleaving of statements from processes p (ABC) and q (DE).
2. Overall, we must consider carefully how the two total() functions execute concurrently and how the shared variable tally is incremented. The problem can be analyzed step by step.
Analysis of the given program
The program increments a shared variable tally from two independent processes running in parallel. Each call to total() increments tally 60 times. main() sets tally to zero and then starts both processes, so the two instances of total() run simultaneously.
1. Program structure:
o Two concurrent calls to total() in main().
o Each total() function increments tally 60 times (n = 60).
o The increment operation (tally++) consists of three separate machine instructions:
a. load tally from memory into a register
b. add 1 to the register
c. store the register back to memory
2. Ideal case: If the two concurrent total() functions run in complete isolation with no interference, all 60 + 60 = 120 increments take effect; thus the upper bound is 120.
3. Lower bound analysis: The worst case for the lower bound occurs when the two concurrently executing processes interfere with each other. Consider this sequence:
o Process 1 loads the value of tally (0) into its register.
o Process 2 loads the same value (0) into its register.
o Process 1 adds 1, giving 1 in its register.
o Process 2 adds 1, giving 1 in its register.
o Process 1 stores the value 1 back into tally.
o Process 2 stores the value 1 back into tally.
In this pattern, two intended increments produce only one effective increment: one update is lost. If this behavior repeated throughout, the 120 intended increments from both processes together would leave tally at 60.
i. Determining the true lower bound: However, the lower bound is actually not 60, for the reason explained below:
o Each process must complete its loop 60 times.
o In the worst-case description above, after the first 59 iteration pairs (118 intended increments), the tally would stand at 59.
o The last iteration of each process contributes 2 more intended increments; even if they interfere, at least one further update must survive, so the tally is raised by at least one more unit beyond the fully colliding pattern.
The true lower bound is therefore 60 (from the colliding iterations) + 1 (the surviving final update) = 61.
ii. Conclusion:
o Lower Bound: 61
o Upper Bound: 120
The "casual quick answer" that people may give intuitively is that the lower bound is 60. Because of the subtlety above, this answer overlooks something: even in the most aggressive interference scenario, the last increment from each process cannot be completely lost, so tally must end at no less than 61.
To arrive at the lower and upper bounds of tally's final value, we must examine how the instructions of the two parallel processes can interleave.
Lower Bound: 60 (naive)
A final value of 60 results if the two processes interleave so that every increment of one process collides with an increment of the other, exactly as in the pattern shown earlier: each pair of intended increments then yields a single effective increment, so the 120 intended increments leave tally at 60.
Upper Bound: 120
The upper bound of 120 is realized when the operations of the two processes never interfere destructively: every load-add-store sequence completes before the other process touches tally (for example, when one process runs to completion before the other begins). In that case both processes successfully increment tally 60 times each, giving a final tally of 120.
Corrected code to achieve the desired bounds
Here is the corrected code in C:
OpenMP lets the user write multithreaded C code with minimal changes. The #pragma omp parallel construct runs two instances of total() in parallel, and each instance accumulates its count in a temporary (thread-private) variable before applying a single protected update to the shared tally.
This program will output:
A hasty, casual answer would be 120, on the grounds that each process increments tally 60 times. This answer is fundamentally flawed because it ignores the interleaving of machine instructions.
To repeat: the increment operation consists of three distinct machine instructions: load, add, and store. If two processes interleave their machine instructions, one process's increment may overwrite the other's, producing a lost update. As a result, the output may be less than 120.
By accumulating into a thread-private temporary variable and performing a single protected update to tally, the corrected program eliminates the lost-update problem: instead of an unpredictable result between the bounds derived above, the final tally is always 120.
3.
(A) Is it possible to have concurrency but not parallelism? Explain.
Yes, you can have concurrency without parallelism. Briefly, here's what these words mean:
1. Concurrency:
• Concurrency is when a system can support two or more processes at the same time.
• It is about dealing with multiple things at once, but not necessarily doing tasks
simultaneously.
• A programming language is concurrent if it's built in a way that allows lots of things to
happen at once. Typically this happens in an "interleaved" way.
2. Parallelism:
• Parallelism is when a system can execute several threads simultaneously, for example on multiple cores.
• It is about doing many things at literally the same time.
• We say a programming language is parallel if it provides language-level constructs to
facilitate parallel execution.
3. Concurrency without parallelism:
• On a single-core processor, you can have concurrency without parallelism.
• At any moment in time, the processor is executing the instructions of only one task, so there is no parallelism.
• The operating system rapidly switches execution from one process to another, creating the illusion of parallelism; this overlapping progress is concurrency.
4. Chef analogy:
• Picture a single-core "chef" who follows instructions from a recipe book: stir the sauce, chop the onions, or check the oven.
• Since the chef has only one pair of hands, he performs these individual actions one at a time, taking turns among them.
• This is concurrent work: the chef has many ongoing tasks and regularly switches among them, stirring a pot, checking the oven, or chopping the onions.
(B) Consider an environment in which blocking system calls issued by one or more threads of a process do not prevent the process's other threads from running. This corresponds to a one-to-one mapping between user and kernel threads.
1. A one-to-one threading model:
- Each user-level thread maps to exactly one kernel-level thread.
- Threads can be scheduled independently by the operating system.
2. A blocking system call:
- Under this model, when one thread issues a blocking system call, the other threads remain free to execute.
- In a single-threaded program, by contrast, a blocking call would stall the entire process.
3. Uniprocessor environment:
We are speaking of a computer with only one processor, hence true parallelism is not
possible.
4. Why this environment may offer advantages:
- Even on a uniprocessor, a multithreaded program that makes blocking calls can run faster than a single-threaded one.
This happens because:
a) For I/O-bound tasks, while some threads wait for I/O operations to complete, others can make use of the CPU.
b) Latency is reduced: the program remains responsive even while some operations are blocked.
c) CPU utilization improves: the CPU can keep working even when some threads are stuck.
5. Limitations:
- The improvement does not come from parallel execution; there is none on a uniprocessor.
- It comes from better CPU utilization during blocking operations.
- For CPU-bound tasks with no blocking, the gain may be negligible or zero.
Result: even though multiple threads cannot truly run at the same time on a single processor, this model can make programs faster than their single-threaded equivalents on a uniprocessor, particularly when I/O or other blocking calls dominate their work. The improvement does not arise from true parallel execution; it comes from better resource utilization, since some threads can make progress while others are blocked waiting for services.
However, the extent of the improvement will depend on the nature of the program and the
balance between CPU-bound and I/O-bound operations. Programs that combine both
computation and I/O are expected to benefit the most from this approach on uniprocessor
systems.
4. 1. NEW to READY (Admit):
- The NEW state holds a process that has been created and admitted into the system, but to which the OS has not yet allocated any CPU time.
- The "Admit" transition happens when:
a) Enough system resources, such as memory and CPU time, are available for the OS to execute the new process.
b) Memory has been allocated for the process control block (PCB) and for the process image itself.
c) The process's PCB and memory-allocation data have been loaded into memory.
- This transition is important for controlling the degree of multiprogramming in the system.
2. READY to RUNNING (Dispatch):
- In the READY state the process has been loaded into memory and is ready for execution.
- The "Dispatch" transition occurs when the OS's CPU scheduler selects the process. Common scheduling algorithms include Round Robin, priority scheduling, and FCFS.
- Dispatch should happen only when:
a) The currently running process has left the RUNNING state and a new process is to be executed.
b) The context of the previous process (program counter, registers, accumulator, etc.) has been saved and the context of the new process has been loaded.
- Efficient dispatching is a key factor in system performance, leading to better overall throughput.
3. RUNNING to EXIT (Release):
- In the RUNNING state the process is currently executing, i.e., it holds the CPU.
- The "Release" transition is triggered when:
a) The process has run its normal course and completed its execution.
b) The process has encountered an irrecoverable error and the OS decides to terminate it.
c) The user or another program explicitly kills the process.
- During this transition:
a) All of the process's resources, such as allocated memory and open files, are released by the OS.
b) The process's PCB is removed from memory.
c) Statistics such as creation time, termination time, and total time spent in the RUNNING state are gathered for accounting purposes.
4. RUNNING to READY (Timeout):
- This transition is important in time-sharing systems, where it gives every process a fair share of the CPU.
- Each process is assigned a time slice (quantum) by the CPU scheduler.
- The "Timeout" transition happens under two conditions:
a) The time slice (quantum) assigned to the process expires.
b) A new process with higher priority arrives and preempts the running process.
5. RUNNING to BLOCKED/WAITING (Event wait):
- This transition occurs when a running process must wait for an event or a resource.
- The most common cases are:
a) Waiting for an I/O operation to complete.
b) Waiting for a shared resource to become free.
c) Waiting for a signal from another process.
- The process voluntarily relinquishes the CPU, freeing it for other jobs.
6. BLOCKED/WAITING to READY (Event occur):
- This is a transition that occurs when an event for which the process was waiting has
occurred.
- The process moves from the blocked queue to the ready queue.
- The process is not immediately put in execution; it waits for the short-term scheduler to
select it.
7. READY to SUSPEND and BLOCKED/WAITING to SUSPEND (Suspend):
- The SUSPENDED state is for processes that have been swapped out of main memory onto
secondary storage.
- This transition occurs when:
a) The system is overloaded and the degree of multiprogramming must be reduced.
b) A higher-priority process needs to be brought into memory.
- Suspended processes are removed from main memory entirely.
8. SUSPENDED to READY (Activate):
- This transition occurs when a suspended process is brought back into main memory.
- Happens when:
a) More resources become available to the system.
b) Process priority increases.
c) Termination of processes leaves room for the process.
9. RUNNING to SUSPEND (not explicitly shown but possible):
- In some systems, a running process might directly suspend its execution for:
a) A user-requested suspension (for example, debugging).
b) A system action to balance or optimize the load or system resources.
The state transition model allows operating systems to manage multiple processes effectively, balancing system load and ensuring fair allocation of resources. It is a fundamental notion in today's multitasking operating systems, allowing them to juggle many processes while providing the illusion that they all execute at once on a single CPU.
5. Let's go through this code segment step by step to determine the number of unique processes and threads created.
Code continued.
a. The number of unique processes created:
1. The first fork() creates one child process, for a total of 2 processes: the parent and the child.
2. Inside the if statement (executed only by the child of the first fork):
- Another fork() creates a grandchild, so we now have 3 processes.
3. Outside the if statement: the final fork() is executed by every existing process: the parent, the child, and the grandchild (the grandchild also continues past the if block). This creates 3 more processes.
The total number of unique processes is:
1 (original) + 1 (first fork) + 1 (fork inside the if) + 3 (last fork) = 6 processes
b. The number of unique threads created:
thread_create() appears once, inside the if statement and after the inner fork(). It is therefore executed by both the child and the grandchild, since the grandchild continues from the inner fork() and runs the rest of the if body.
So the total number of unique threads created is: 2 threads.
In a nutshell:
• 6 unique processes are created.
• 2 unique threads are created.