Objectives • Characteristics of a distributed system • Differences between network and distributed operating systems • Various issues in distributed operating systems • IPC methods in distributed operating systems • Concept of global clock and logical clock • Lamport’s logical clock algorithm for synchronization • Global distributed system state • Achieving global state through the Chandy-Lamport algorithm • Algorithms to resolve the mutual exclusion problem • Deadlock detection through centralized and distributed algorithms • Deadlock prevention through wait-die and wound-wait algorithms • Distributed process scheduling algorithms • Distributed shared memory
Introduction to distributed systems • A distributed system is a loosely coupled architecture in which processors are interconnected by a communication network. • Distributed systems are multiprocessor systems, but with the following differences: ▪ A distributed system works over a wide area network and involves much more communication than computation. ▪ Each node in a distributed system is a complete computer with a full set of peripherals, including memory and communication hardware, and possibly a different operating system and file system. ▪ The users of a distributed system have the impression that they are working on a single machine.
Difference between network operating system & distributed operating system • A network operating system is loosely coupled operating system software on loosely coupled hardware. It allows the nodes and users of a distributed system to be largely independent of one another while interacting to a limited degree. • A distributed operating system is tightly coupled operating system software on loosely coupled hardware, i.e., a distributed system. • Since each node in a distributed system is a complete computer system, there is no global memory and no global clock.
Communication in distributed operating system • Inter-process communication (IPC) in distributed systems is implemented in two ways: ▪ Message passing model ▪ Remote procedure call (RPC)
Clock synchronization in distributed system Synchronizing Logical Clocks • Lamport defined a relation known as the happens-before relation, which captures the underlying dependencies between events. It is denoted x → y and holds in the following two situations: • x and y are events in the same process and x occurs before y. • x is the event of sending a message in one process and y is the event of receiving that message in another process. In this case x always precedes y. • The happens-before relation is transitive, i.e., if x → y and y → z, then x → z.
Clock synchronization in distributed system • In distributed systems, knowing the order of events under this relation helps in designing, debugging, and understanding the sequence of execution in a distributed computation. When an event changes the system state, it may affect the related events that happen after it. This influence among causally related events that satisfy the happens-before relation is known as a causal effect.
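The two rules and the transitivity property of the happens-before relation can be made concrete with a small sketch. The function name `happens_before` and the event names are hypothetical, chosen only for illustration; the sketch assumes every event has a unique label.

```python
from itertools import product


def happens_before(process_events, messages):
    """Return the set of pairs (x, y) such that x -> y.

    process_events: dict mapping a process name to its events in local order.
    messages: list of (send_event, receive_event) pairs.
    """
    hb = set()
    # Rule 1: events in the same process are ordered by their local order.
    for events in process_events.values():
        for i, x in enumerate(events):
            for y in events[i + 1:]:
                hb.add((x, y))
    # Rule 2: sending a message precedes receiving it.
    hb.update(messages)
    # Transitivity: keep adding (x, z) whenever x -> y and y -> z.
    changed = True
    while changed:
        changed = False
        for (x, y), (y2, z) in product(list(hb), repeat=2):
            if y == y2 and (x, z) not in hb:
                hb.add((x, z))
                changed = True
    return hb


# P1 runs a then b, P2 runs c then d, P3 runs e; b sends a message received at c.
hb = happens_before(
    {"P1": ["a", "b"], "P2": ["c", "d"], "P3": ["e"]},
    [("b", "c")],
)
a_before_d = ("a", "d") in hb                              # True, via b -> c
concurrent = ("a", "e") not in hb and ("e", "a") not in hb  # True, no causal link
```

Events such as `a` and `e`, which are unrelated in either direction, are concurrent: neither can causally affect the other.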
Lamport’s Logical Clock • To implement the logical clock, Lamport introduced the concept of a timestamp attached to each event in the system. The timestamp is a function that assigns a number Ti(x) to any event x in process Pi. These timestamps have nothing to do with the actual real-time clock. The timestamps must satisfy the following conditions: • For any two events x and y in a process Pi, if x → y, then Ti(x) < Ti(y). • If x is the message sending event in process Pi and y is the corresponding message receiving event in process Pj, then Ti(x) < Tj(y).
Lamport’s Logical Clock • The timestamp value T must always increase and never decrease. Thus, in the implementation, the clock is incremented by a positive integer between any two successive events, i.e., Ti(y) = Ti(x) + d, where d > 0. • Using the conditions above, we can assign a timestamp to every event in the system and thereby obtain a total ordering of all events, with ties between equal timestamps in different processes broken by process identifier.
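A minimal sketch of these clock rules follows. The class and method names are hypothetical. The receive rule, which sets the clock to the larger of the local value and the message timestamp and then adds d, is the usual way to guarantee Ti(x) < Tj(y) for a send/receive pair; the slides do not spell it out, so treat it as an assumption of this sketch.

```python
class LamportClock:
    """Logical clock of one process; d is the positive increment from the text."""

    def __init__(self, pid, d=1):
        self.pid = pid
        self.d = d
        self.time = 0

    def tick(self):
        """Local event: T(y) = T(x) + d."""
        self.time += self.d
        return self.time

    def send(self):
        """Sending is an event; the returned timestamp travels with the message."""
        return self.tick()

    def receive(self, msg_time):
        """Receiving: move past the sender's timestamp so that Ti(x) < Tj(y)."""
        self.time = max(self.time, msg_time) + self.d
        return self.time

    def stamp(self):
        """(time, pid) pairs compare lexicographically, which breaks ties."""
        return (self.time, self.pid)


p1, p2 = LamportClock(pid=1), LamportClock(pid=2)
a = p1.tick()        # local event in P1, timestamp 1
m = p1.send()        # P1 sends a message, timestamp 2
p2.tick()            # unrelated local event in P2, timestamp 1
r = p2.receive(m)    # P2 receives the message, timestamp max(1, 2) + 1 = 3
```

Here a < m < r, matching the chain a → send → receive, while the pair returned by `stamp()` gives the tie-breaking order between processes.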
Global State • Since a distributed system has no global memory, a process that wishes to know the state of the entire system has no direct access to it. • However, the up-to-date state of the full system is required for analyzing the system’s behavior, debugging, fault recovery, synchronization, etc. • In the absence of a global clock and global memory, a coherent global state of the system is difficult to obtain. • Chandy and Lamport proposed an algorithm to record such a state.
Chandy-Lamport Consistent State Recording Algorithm
Marker sending rule for a process Ps
1. Ps records its state.
2. for (i = 1; i <= n; ++i) { if a marker has not been sent on outgoing channel Ci, Ps sends a marker on Ci before sending any further message on Ci }
Marker receiving rule for a process Q
• Process Q receives the marker message through a channel Ci.
• If Q has not recorded its state earlier, Q records the state of Ci as empty and calls the marker sending rule to record its own state and to send the marker on each outgoing channel.
• Otherwise, Q records the state of Ci as the sequence of messages received on Ci after Q’s state was recorded and before Q received the current marker.
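The two marker rules can be exercised in a small single-threaded simulation. This is only a sketch under stated assumptions: channels are FIFO queues, the application state is an account balance, and the names `Process`, `connect`, and `MARKER` are hypothetical.

```python
from collections import deque

MARKER = "MARKER"  # sentinel marker message (hypothetical name)


class Process:
    def __init__(self, name, balance):
        self.name = name
        self.balance = balance       # application state
        self.out = {}                # peer name -> outgoing FIFO channel
        self.inc = {}                # peer name -> incoming FIFO channel
        self.recorded_state = None   # local snapshot, None until recorded
        self.channel_state = {}      # peer name -> messages recorded in transit
        self.recording = set()       # incoming channels still being recorded

    def send(self, peer, amount):
        self.balance -= amount
        self.out[peer].append(amount)

    def start_snapshot(self):
        """Marker sending rule: record own state, then send a marker on each outgoing channel."""
        self.recorded_state = self.balance
        self.recording = set(self.inc)
        self.channel_state = {peer: [] for peer in self.inc}
        for channel in self.out.values():
            channel.append(MARKER)

    def deliver(self, peer):
        """Marker receiving rule, applied to the next message from `peer`."""
        msg = self.inc[peer].popleft()
        if msg == MARKER:
            if self.recorded_state is None:
                self.start_snapshot()     # first marker: record state, relay markers
            self.recording.discard(peer)  # the state of this channel is now final
        else:
            self.balance += msg
            if peer in self.recording:    # arrived after the snapshot, before the marker
                self.channel_state[peer].append(msg)


def connect(p, q):
    """Create one FIFO channel in each direction between p and q."""
    p.out[q.name] = q.inc[p.name] = deque()
    q.out[p.name] = p.inc[q.name] = deque()


p, q = Process("P", 100), Process("Q", 50)
connect(p, q)
p.send("Q", 10)       # 10 is in transit from P to Q
q.send("P", 20)       # 20 is in transit from Q to P
p.start_snapshot()    # P records 90 and sends a marker behind the 10
p.deliver("Q")        # P receives 20 after recording: part of the channel state
q.deliver("P")        # Q receives 10 before the marker: part of Q's state
q.deliver("P")        # Q receives the marker, records 40, relays a marker
p.deliver("Q")        # P receives the marker, recording ends
total = (p.recorded_state + q.recorded_state
         + sum(p.channel_state["Q"]) + sum(q.channel_state["P"]))
```

The recorded global state (90 at P, 40 at Q, 20 in the channel from Q to P) adds up to the original 150, even though the processes were never stopped at the same instant.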
Mutual Exclusion To provide mutual exclusion among processes in a distributed system, the following algorithms are used: ▪ Centralized algorithm: A process on one node of the distributed system is designated as the coordinator that manages mutual exclusion. ▪ Ricart-Agrawala algorithm: This is a fully distributed algorithm. A process wishing to enter its critical section sends a time-stamped request message to all other processes and waits for a reply from each of them. ▪ Token-ring algorithm: This algorithm assumes a logical ring structure over all the processes, i.e., the processes are connected in a ring sequence and every process knows which process is next in that sequence.
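As an illustration of the fully distributed approach, here is a simplified sketch of the Ricart-Agrawala request and reply rules. It assumes message delivery can be modelled as direct method calls in a single thread, and all names (`Node`, `want_cs`, `leave_cs`, and so on) are hypothetical.

```python
class Node:
    def __init__(self, pid, peers):
        self.pid = pid
        self.peers = peers       # shared dict pid -> Node, stands in for the network
        self.clock = 0           # Lamport-style logical clock
        self.request = None      # (timestamp, pid) of the pending request, if any
        self.in_cs = False
        self.replies = set()
        self.deferred = []       # requesters whose reply is postponed

    def want_cs(self):
        """Send a time-stamped request to all other processes and wait."""
        self.clock += 1
        self.request = (self.clock, self.pid)
        self.replies = set()
        for pid, node in self.peers.items():
            if pid != self.pid:
                node.on_request(self.request)

    def on_request(self, req):
        ts, sender = req
        self.clock = max(self.clock, ts) + 1
        # Defer the reply while in the critical section, or while our own
        # pending request has the smaller (older) timestamp.
        if self.in_cs or (self.request is not None and self.request < req):
            self.deferred.append(sender)
        else:
            self.peers[sender].on_reply(self.pid)

    def on_reply(self, sender):
        self.replies.add(sender)
        if len(self.replies) == len(self.peers) - 1:
            self.in_cs = True    # every other process has agreed

    def leave_cs(self):
        self.in_cs = False
        self.request = None
        pending, self.deferred = self.deferred, []
        for sender in pending:
            self.peers[sender].on_reply(self.pid)


peers = {}
for pid in (1, 2, 3):
    peers[pid] = Node(pid, peers)
n1, n2, n3 = peers[1], peers[2], peers[3]

n1.want_cs()                     # nobody competes, so P1 enters
n2.want_cs()                     # P1 is inside, so it defers its reply to P2
before = (n1.in_cs, n2.in_cs)    # (True, False)
n1.leave_cs()                    # P1 leaves and sends the deferred reply
after = (n1.in_cs, n2.in_cs)     # (False, True)
```

At no point are both processes inside the critical section, and no coordinator is involved.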
Deadlock Detection • To provide deadlock detection in a distributed system, the following algorithms are used: – Centralized algorithm: One process is designated as the central coordinator, which maintains the resource allocation state of the entire system. – Distributed algorithms: All the nodes in the system cooperate to detect a cycle.
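A sketch of the centralized variant is given below, assuming each node reports its local wait-for edges as (waiting process, holding process) pairs and the coordinator searches the merged graph for a cycle. The function names and the report format are hypothetical.

```python
def merge_reports(reports):
    """Coordinator side: build one wait-for graph from the per-node reports."""
    graph = {}
    for edges in reports.values():
        for waiter, holder in edges:
            graph.setdefault(waiter, set()).add(holder)
    return graph


def find_cycle(wait_for):
    """Return the processes on one cycle of the wait-for graph, or None."""
    WHITE, GREY, BLACK = 0, 1, 2           # unvisited, on current path, finished
    color = {p: WHITE for p in wait_for}
    path = []

    def visit(p):
        color[p] = GREY
        path.append(p)
        for q in wait_for.get(p, ()):
            if color.get(q, WHITE) == GREY:    # back edge: q is on the current path
                return path[path.index(q):]
            if color.get(q, WHITE) == WHITE:
                cycle = visit(q)
                if cycle:
                    return cycle
        path.pop()
        color[p] = BLACK
        return None

    for p in list(wait_for):
        if color[p] == WHITE:
            cycle = visit(p)
            if cycle:
                return cycle
    return None


# Each node sees only one edge; the deadlock is visible only in the merged graph.
reports = {
    "node_a": [("P1", "P2")],
    "node_b": [("P2", "P3")],
    "node_c": [("P3", "P1")],
}
cycle = find_cycle(merge_reports(reports))
```

In this example the returned cycle contains P1, P2, and P3. A distributed algorithm would have to discover the same cycle without any single node holding the whole graph.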
Deadlock Prevention • To provide deadlock prevention in a distributed system, the following algorithms are used: – Wait-die algorithm: An older process is allowed to wait for a resource held by a younger process, but a younger process is killed (dies) if it tries to wait for an older process. – Wound-wait algorithm: An older process preempts (wounds) a younger process that holds the resource it needs, but a younger process that requests a resource held by an older process is allowed to wait.
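Both schemes reduce to a comparison of process timestamps. The sketch below assumes that each process carries the timestamp of its creation, so a smaller timestamp means an older process; the function names and return strings are hypothetical.

```python
def wait_die(requester_ts, holder_ts):
    """Wait-die: an older requester waits, a younger requester is killed."""
    if requester_ts < holder_ts:      # requester is older than the holder
        return "requester waits"
    return "requester dies"


def wound_wait(requester_ts, holder_ts):
    """Wound-wait: an older requester preempts the holder, a younger one waits."""
    if requester_ts < holder_ts:      # requester is older than the holder
        return "holder is preempted"
    return "requester waits"


# An old process (timestamp 5) and a young process (timestamp 9) compete.
old, young = 5, 9
outcomes = {
    "wait-die, old requests":     wait_die(old, young),
    "wait-die, young requests":   wait_die(young, old),
    "wound-wait, old requests":   wound_wait(old, young),
    "wound-wait, young requests": wound_wait(young, old),
}
```

In both schemes a process only ever waits in one direction of the age ordering (older for younger in wait-die, younger for older in wound-wait), so a cycle of waiting processes cannot form.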
Distributed Process Scheduling • In distributed process scheduling, load balancing is performed by transferring some processes from a heavily loaded node to a lightly loaded node. This transfer is known as process migration. • There are two types of schedulers: ▪ Local scheduler ▪ Global load scheduler • The global load scheduler is of two types: static and dynamic. A static scheduler assigns processes to processors at compile time only, while a dynamic scheduler assigns processes to processors when they start execution.
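One very simple load balancing policy is sketched below, assuming that load is measured as the number of processes on a node and that a global scheduler can see every node's load. The function name `balance` and the `threshold` parameter are hypothetical; real schedulers use richer load metrics and also weigh the cost of migration.

```python
def balance(loads, threshold=1):
    """Migrate processes from the busiest node to the idlest node until the
    difference in process count is at most `threshold`.

    loads: dict mapping node name -> list of process ids (modified in place).
    Returns the list of migrations as (process, source, destination).
    """
    if threshold < 1:
        raise ValueError("threshold must be at least 1 to guarantee termination")
    migrations = []
    while True:
        src = max(loads, key=lambda n: len(loads[n]))   # heavily loaded node
        dst = min(loads, key=lambda n: len(loads[n]))   # lightly loaded node
        if len(loads[src]) - len(loads[dst]) <= threshold:
            return migrations
        proc = loads[src].pop()                         # process migration
        loads[dst].append(proc)
        migrations.append((proc, src, dst))


loads = {"A": ["p1", "p2", "p3", "p4", "p5"], "B": ["p6"], "C": []}
moves = balance(loads)
sizes = sorted(len(procs) for procs in loads.values())  # [2, 2, 2]
```

In this run three processes leave node A and every node ends up with two processes.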
Distributed file systems • In a distributed file system, some nodes are dedicated to storing files. These nodes, known as file servers, perform the storage and retrieval operations on the files. The other nodes, which are used for computation, are known as clients.