Message Passing: Synchronous and Asynchronous
Asynchronous Execution:
A communication among processes is considered asynchronous when the
communicating processes may each observe a different order of the messages being
exchanged. In an asynchronous execution:
there is no processor synchrony and there is no bound on the drift rate of processor clocks
message delays are finite but unbounded
there is no upper bound on the time taken by a process to execute a step
Synchronous Execution:
A communication among processes is considered synchronous when every process
observes the same order of messages within the system. Likewise, an execution is
considered synchronous when every individual process in the system observes the same
total order of all the events that happen within it. In a synchronous execution:
processors are synchronized and the clock drift rate between any two processors is
bounded
message delivery times are such that delivery occurs in one logical step or round
there is a known upper bound on the time taken by a process to execute a step.
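The contrast can be made concrete with a small simulation. The Python sketch below is illustrative only; run_synchronous, run_asynchronous, and the step(pid, round, inbox) callback are invented names. In the synchronous model every message sent in round r is delivered at the start of round r+1; in the asynchronous model a random scheduler may hold a message back for arbitrarily many steps.

    import random
    from collections import defaultdict

    def run_synchronous(step, n, rounds):
        # Lock-step rounds: each process takes exactly one step per round,
        # and a message sent in round r is delivered at the start of r+1.
        inboxes = defaultdict(list)
        for r in range(rounds):
            sent = defaultdict(list)
            for pid in range(n):
                for dest, msg in step(pid, r, inboxes[pid]):
                    sent[dest].append(msg)
            inboxes = sent

    def run_asynchronous(step, n, max_events):
        # No rounds and no bounds: a pending message may sit in the
        # "network" for an arbitrary (finite) number of scheduler picks.
        pending = [m for pid in range(n) for m in step(pid, 0, [])]
        for _ in range(max_events):
            if not pending:
                break
            dest, msg = pending.pop(random.randrange(len(pending)))
            pending.extend(step(dest, None, [msg]))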
Logical time is relative and eliminates the overhead of providing physical time for
applications. Logical time can
(i) capture the logic and inter-process dependencies
(ii) track the relative progress at each process
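Lamport's scalar clock is the simplest realization of these two properties (a minimal sketch; the class and method names are illustrative): each process keeps one counter, and a receive event jumps past the timestamp carried by the message, so every send is ordered before the corresponding receive.

    class LamportClock:
        # Scalar logical clock: tracks relative progress, not physical time.
        def __init__(self):
            self.time = 0

        def local_event(self):
            self.time += 1
            return self.time

        def send(self):
            # Stamp outgoing messages with the sender's incremented time.
            self.time += 1
            return self.time

        def receive(self, msg_time):
            # The receive causally follows the send: jump past its stamp.
            self.time = max(self.time, msg_time) + 1
            return self.time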
Maintaining the global state of the system across space involves the role of the time
dimension for consistency; this can be done in a coordinated manner with extra effort.
Deriving appropriate measures of concurrency also involves the time dimension, as the
execution and communication speeds of threads may vary widely.
Synchronization/coordination mechanisms
Synchronization is essential for distributed processes to facilitate concurrent
execution without interfering with one another.
Synchronization mechanisms also involve resource management and concurrency
management mechanisms.
Some techniques for providing synchronization are:
Physical clock synchronization: Physical clocks usually diverge in their values due to
hardware limitations. Keeping them synchronized is a fundamental challenge in
maintaining a common notion of time.
Leader election: All the processes need to agree on which process will play the role of a
distinguished, or leader, process. A leader is necessary even for many distributed
algorithms because there is often some asymmetry; a minimal election sketch follows this
list.
Mutual exclusion: Access to the critical resource(s) has to be coordinated.
Deadlock detection and resolution: Deadlock detection should be coordinated to avoid
duplicate work, and deadlock resolution should be coordinated to avoid unnecessary
aborts of processes.
Termination detection: This requires cooperation among the processes to detect the
specific global state of quiescence.
Garbage collection: Detecting garbage requires coordination among the processes.
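As promised above, here is a minimal leader-election sketch in the Chang-Roberts style on a unidirectional ring, simulated in lock-step rounds (ring_election is an invented name; process ids are assumed unique): each process sends its id clockwise, larger ids are forwarded, smaller ids are swallowed, and the process whose id travels all the way around is elected.

    def ring_election(ids):
        # ids[i] is the unique id of the process at ring position i.
        n = len(ids)
        messages = list(ids)           # round 0: everyone sends its own id
        while True:
            nxt = [None] * n
            for i, m in enumerate(messages):
                if m is None:
                    continue
                j = (i + 1) % n        # deliver to the clockwise neighbour
                if m == ids[j]:
                    return m           # own id came back: j is the leader
                if m > ids[j]:
                    nxt[j] = m         # forward larger ids, swallow smaller
            messages = nxt

    assert ring_election([3, 1, 4, 2, 5]) == 5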
Group communication, multicast, and ordered message delivery
A group is a collection of processes that share a common context and collaborate on a
common task within an application domain. Group management protocols are needed for
group communication wherein processes can join and leave groups dynamically, or fail.
The concurrent execution of remote processes may sometimes violate the semantics and
order of the distributed program. Hence, a formal specification of the semantics of ordered
delivery needs to be formulated and then implemented; a sequencer-based sketch follows.
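One common way to implement totally ordered delivery is a fixed sequencer, sketched below under the assumption of reliable channels (Sequencer and Member are invented names; this is one of several known protocols): every multicast first obtains a global sequence number, and each member holds back messages that arrive early so that delivery happens strictly in sequence order.

    import heapq

    class Sequencer:
        # Assigns consecutive global sequence numbers to multicasts.
        def __init__(self):
            self.next_seq = 0

        def order(self, msg):
            seq, self.next_seq = self.next_seq, self.next_seq + 1
            return seq, msg

    class Member:
        # Delivers in sequence order even if messages arrive out of order.
        def __init__(self):
            self.expected = 0
            self.holdback = []         # min-heap of early (seq, msg) pairs
            self.delivered = []

        def receive(self, seq, msg):
            heapq.heappush(self.holdback, (seq, msg))
            while self.holdback and self.holdback[0][0] == self.expected:
                _, m = heapq.heappop(self.holdback)
                self.delivered.append(m)
                self.expected += 1

    s, m = Sequencer(), Member()
    a, b = s.order("a"), s.order("b")
    m.receive(*b)                      # arrives early: held back
    m.receive(*a)
    assert m.delivered == ["a", "b"]   # the agreed total order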
Monitoring distributed events and predicates
Predicates defined on program variables that are local to different processes are used for
specifying conditions on the global system state.
On-line algorithms for monitoring such predicates are hence important.
An important paradigm for monitoring distributed events is that of event streaming,
wherein streams of relevant events reported from different processes are examined collectively
to detect predicates.
The specification of such predicates uses physical or logical time relationships.
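A toy monitor along these lines might merge locally reported, timestamped events and evaluate a conjunctive global predicate over the latest known local values (a sketch only; monitor and the stream layout are invented for illustration):

    def monitor(streams, predicate):
        # streams: pid -> list of (logical timestamp, local value).
        # Replay all reported events in timestamp order and test the
        # global predicate on the latest known value of each process.
        events = sorted((ts, pid, value)
                        for pid, stream in streams.items()
                        for ts, value in stream)
        state = {}
        for ts, pid, value in events:
            state[pid] = value
            if len(state) == len(streams) and predicate(state):
                return ts              # time at which the predicate held
        return None

    # detect when x (at p1) and y (at p2) are simultaneously positive
    streams = {"p1": [(1, -2), (4, 3)], "p2": [(2, 5), (6, -1)]}
    assert monitor(streams, lambda s: s["p1"] > 0 and s["p2"] > 0) == 4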
Distributed program design and verification tools
Methodically designed and verifiably correct programs can greatly reduce the overhead of
software design, debugging, and engineering; designing such tools is itself a big challenge.
Debugging distributed programs
Debugging distributed programs is much harder because of concurrency and replication.
Adequate debugging mechanisms and tools are therefore needed.
Data replication, consistency models, and caching
Fast access to data and other resources is important in distributed systems.
Managing replicas and their updates raises concurrency problems.
Placement of the replicas in the system is also a challenge because resources usually
cannot be freely replicated.
World Wide Web design – caching, searching, scheduling
The WWW is a well-known distributed system.
The issues of object replication, caching, and prefetching of objects arise on the
WWW as well.
Object search and navigation on the web are important functions in the operation of the
web.
Distributed shared memory abstraction
A shared memory abstraction is easier to program with, since it relieves the application
of managing explicit communication tasks.
Self-stabilizing systems: All system executions have associated good (or legal) states and
bad (or illegal) states; during correct functioning, the system makes transitions among the good
states. A self-stabilizing algorithm guarantees to take the system to a good state even if a bad
state were to arise due to some error. Self-stabilizing algorithms require some in-built
redundancy to track additional variables of the state and do extra work; see the token-ring
sketch below.
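Dijkstra's K-state token ring is the classic example, simulated minimally below (privileged and stabilize are invented names; K >= N is assumed for convergence): in a legal state exactly one process holds the privilege (the token), and the move rule drives any corrupted state back to a legal one.

    def privileged(state):
        # Process 0 holds the privilege iff it equals its predecessor;
        # every other process holds it iff it differs from its predecessor.
        n = len(state)
        return [i for i in range(n)
                if (state[i] == state[i - 1]) == (i == 0)]

    def stabilize(state, k, steps=100):
        # Dijkstra's K-state rule (K >= number of processes): the chosen
        # privileged process increments (process 0) or copies its
        # predecessor's value, passing the privilege along the ring.
        for _ in range(steps):
            i = privileged(state)[0]
            state[i] = (state[0] + 1) % k if i == 0 else state[i - 1]
        return state

    s = stabilize([3, 0, 1, 2], k=5)   # arbitrary, possibly "bad", start
    assert len(privileged(s)) == 1     # converged: exactly one token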
Checkpointing and recovery algorithms: Checkpointing is periodically recording the
current state on secondary storage so that, in case of a failure, the entire computation is not lost
but can be recovered from one of the recently taken checkpoints. Checkpointing in a distributed
environment is difficult because if the checkpoints at the different processes are not
coordinated, the local checkpoints may become useless, being inconsistent with the
checkpoints at other processes.
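The sketch below shows the idea under strong simplifying assumptions (Process, checkpoint, and coordinated_checkpoint are invented names; a real protocol such as Chandy-Lamport would also record or quiesce channel state):

    import pickle

    class Process:
        def __init__(self, pid, state):
            self.pid, self.state = pid, state

        def checkpoint(self):
            # Persist the current local state to stable storage so the
            # computation can resume from here after a failure.
            with open(f"ckpt-{self.pid}.pkl", "wb") as f:
                pickle.dump(self.state, f)

        def recover(self):
            with open(f"ckpt-{self.pid}.pkl", "rb") as f:
                self.state = pickle.load(f)

    def coordinated_checkpoint(processes):
        # Snapshot all processes as one coordinated round so the local
        # checkpoints together form a consistent global state (no message
        # recorded as received before it was recorded as sent).
        for p in processes:
            p.checkpoint()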
Failure detectors: Asynchronous distributed systems do not have a bound on the message
transmission time. This makes message passing difficult, since a receiver does not know how
long to wait for a message. Failure detectors probabilistically suspect another process as having
failed and then converge on a determination of the up/down status of the suspected process.
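A minimal heartbeat-based detector might look as follows (HeartbeatDetector is an invented name; the timeout is a tunable guess, which is exactly why the output is a suspicion rather than a certainty):

    import time

    class HeartbeatDetector:
        def __init__(self, timeout=2.0):
            self.timeout = timeout
            self.last_seen = {}

        def heartbeat(self, pid):
            # Called whenever a heartbeat message arrives from pid.
            self.last_seen[pid] = time.monotonic()

        def suspects(self):
            # Processes silent for longer than the timeout are only
            # SUSPECTED; a late heartbeat revokes the suspicion.
            now = time.monotonic()
            return {pid for pid, t in self.last_seen.items()
                    if now - t > self.timeout}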
Load balancing: The objective of load balancing is to gain higher throughput and to
reduce user-perceived latency. Load balancing may be necessary because of a variety of
factors, such as high network traffic or a high request rate causing the network connection to
become a bottleneck, or high computational load. The following are some forms of load
balancing (a minimal sketch follows the list):
Data migration: The ability to move data around in the system, based on the access
patterns of the users.
Computation migration: The ability to relocate processes in order to perform a
redistribution of the workload.
Distributed scheduling: This achieves a better turnaround time for the users by using
idle processing power in the system more efficiently.
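As a small illustration of the idea, the greedy least-loaded policy below assigns each task to the currently lightest worker (assign is an invented name; real systems also weigh data locality, migration cost, and dynamic load):

    import heapq

    def assign(tasks, workers):
        # tasks: list of (name, cost); workers: list of worker ids.
        heap = [(0, w) for w in workers]     # (current load, worker)
        heapq.heapify(heap)
        placement = {}
        for task, cost in tasks:
            load, w = heapq.heappop(heap)    # lightest worker so far
            placement[task] = w
            heapq.heappush(heap, (load + cost, w))
        return placement

    print(assign([("t1", 5), ("t2", 1), ("t3", 2)], ["w1", "w2"]))
    # {'t1': 'w1', 't2': 'w2', 't3': 'w2'}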
Real-time scheduling
Real-time scheduling becomes more challenging when a global view of the system state is
absent and on-line or dynamic changes are frequent. The message propagation delays, which
are network-dependent, are hard to control or predict. This is a hindrance to meeting the QoS
requirements of the network.
Performance
User-perceived latency in distributed systems must be reduced. The common issues in
performance are:
Metrics: Appropriate metrics must be defined for measuring the performance of
theoretical distributed algorithms and of their implementations.
Measurement methods/tools: A distributed system is a complex entity, so appropriate
methodology and tools must be developed for measuring the performance metrics.
Sensor networks
o They are low-cost equipment with limited computational power and battery life.
They are designed to handle streaming data and route it to external computer
networks and processes.
o They are susceptible to faults and have to reconfigure themselves.
o These features introduce a whole new set of challenges, such as position
estimation and time estimation, when designing a distributed system.
Ubiquitous or pervasive computing
o In ubiquitous systems, the processors are embedded in the environment to
perform application functions in the background.
o Examples: intelligent devices, smart homes, etc.
o They are distributed systems that, with recent advancements, operate in wireless
environments through actuator mechanisms.
o They can be self-organizing and network-centric, with limited resources.
Peer-to-peer computing
o Peer-to-peer (P2P) computing is computing over an application-layer network
where all interactions among the processors are at the same level.
o This is a form of symmetric computation, in contrast to the client-server paradigm.
o P2P networks are self-organizing, with or without a regular structure to the network.
o Some of the key challenges include: object storage mechanisms; efficient object
lookup and retrieval in a scalable manner; dynamic reconfiguration with nodes as
well as objects joining and leaving the network randomly; replication strategies to
expedite object search; tradeoffs between object size, latency, and table sizes; and
anonymity, privacy, and security. A consistent-hashing lookup sketch follows.
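Scalable object lookup is often built on consistent hashing, sketched below (HashRing is an invented name; real P2P overlays such as Chord add routing tables on top of this placement rule): each object lives at the first node clockwise of its hash, so nodes joining or leaving move only a small fraction of the objects.

    import bisect, hashlib

    def h(key):
        # Map a string onto a large circular hash space.
        return int(hashlib.sha1(key.encode()).hexdigest(), 16)

    class HashRing:
        def __init__(self, nodes):
            self.ring = sorted((h(n), n) for n in nodes)

        def lookup(self, key):
            # First node clockwise of the key's hash, wrapping around.
            i = bisect.bisect(self.ring, (h(key),))
            return self.ring[i % len(self.ring)][1]

    ring = HashRing(["n1", "n2", "n3"])
    print(ring.lookup("object-42"))    # the node responsible for the object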
Publish-subscribe, content distribution, and multimedia
o Present-day users require only the information that is of interest to them.
o In a dynamic environment where the information constantly fluctuates, there is
great demand for:
o Publish: an efficient mechanism for distributing this information
o Subscribe: an efficient mechanism to allow end users to indicate interest in
receiving specific kinds of information
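A minimal topic-based broker captures both roles (Broker is an invented name; production systems add persistence, filtering languages, and wide-area routing): publishers need not know subscribers, and each subscriber receives only the kinds of information it registered interest in.

    from collections import defaultdict

    class Broker:
        def __init__(self):
            self.subscribers = defaultdict(list)

        def subscribe(self, topic, callback):
            # Register interest in one kind of information.
            self.subscribers[topic].append(callback)

        def publish(self, topic, message):
            # Distribute to every party that declared interest.
            for deliver in self.subscribers[topic]:
                deliver(message)

    broker = Broker()
    broker.subscribe("weather", lambda m: print("got:", m))
    broker.publish("weather", "rain expected")   # -> got: rain expected
    broker.publish("sports", "final score 2-1")  # no subscribers: dropped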