Atomicity
Implementing atomic broadcast is not quite as simple as it looks. The method of Fig. 2-33
fails because receiver overrun is possible at one or more machines. The only way to be
sure that every destination receives every message is to require them to send back an
acknowledgement upon message receipt. As long as machines never crash, this method will
do.
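The acknowledgement scheme can be illustrated with a minimal sketch. It assumes a simulated network rather than real sockets and timers: the names send_with_acks, lossy_link, and deliver are hypothetical, and each loop iteration stands in for one timer expiry followed by a retransmission. Note that the sender must stay up for the whole exchange, which is exactly the assumption that the text goes on to question.

```python
import random


def send_with_acks(members, deliver, max_rounds=50):
    """Retransmit until every member has acknowledged the message.

    deliver(member) is a hypothetical transport hook. It returns True when
    the message arrived and its acknowledgement came back, and False when
    either was lost (for example, through receiver overrun).
    Returns the number of transmission rounds that were needed.
    """
    pending = set(members)
    for round_no in range(1, max_rounds + 1):
        # Retry only the members whose acknowledgement is still missing.
        pending = {m for m in pending if not deliver(m)}
        if not pending:
            return round_no
    raise TimeoutError(f"no acknowledgement from {sorted(pending)}")


def lossy_link(loss_rate, seed=0):
    """Build a simulated deliver() hook that drops packets at random."""
    rng = random.Random(seed)
    received = set()

    def deliver(member):
        if rng.random() < loss_rate:
            return False          # packet or acknowledgement lost
        received.add(member)      # message accepted, acknowledgement returned
        return True

    return deliver, received


if __name__ == "__main__":
    deliver, received = lossy_link(loss_rate=0.5, seed=1)
    rounds = send_with_acks(range(5), deliver)
    print(f"all {len(received)} members acknowledged after {rounds} round(s)")
```

Bounding the number of rounds is a choice made for the sketch only; a real sender would have to decide for itself when an unresponsive destination should be treated as crashed.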
However, many distributed systems aim at fault tolerance, so for them it is essential
that atomicity hold even in the presence of machine failures. In this light, all the
methods of Fig. 2-33 are inadequate: some of the initial messages might be lost to
receiver overrun, and the sender might then crash before retransmitting them. Under these circumstances, some
members of the group will have received the message and others will not have, precisely the
situation that is unacceptable. Worse yet, the group members that have not received the
message do not even know they are missing anything, so they cannot ask for a
retransmission. Finally, with the sender now down, even if they did know, there is no one to
provide the message.
Nevertheless, there is hope. Here is a simple algorithm that demonstrates that atomic
broadcast is at least possible. The sender starts out by sending a message to all members of
the group. Timers are set and retransmissions sent where necessary. When a process
receives a message, if it has not yet seen this particular message, it, too, sends the message
to all members of the group (again with timers and retransmissions if necessary). If it has
already seen the message, this step is not necessary and the message is discarded. No
matter how many machines crash or how many packets are lost, as long as at least one
surviving process has received the message, eventually all the surviving processes will
get it. Later we will describe more efficient algorithms for ensuring
atomicity.
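The simple algorithm described above can be sketched as a small round-based simulation. This is an illustration under stated assumptions, not a definitive implementation: the function name flood_broadcast and its parameters are hypothetical, each round stands in for one timer expiry and retransmission, packet loss is random, and only the original sender is allowed to crash. Membership in the seen set plays the role of "has this process already seen the message", so duplicates are discarded simply because adding to a set is idempotent.

```python
import random


def flood_broadcast(members, sender, loss_rate, crash_sender_after=1,
                    seed=0, max_rounds=1000):
    """Simulate the relay-on-first-receipt broadcast in discrete rounds.

    Returns (delivered, survivors): the surviving processes that hold the
    message, and the set of all surviving processes.
    """
    rng = random.Random(seed)
    alive = set(members)
    seen = {sender}                      # processes that hold the message
    for round_no in range(1, max_rounds + 1):
        # Every live holder (re)transmits to every other live member.
        # Processes that receive the message now start relaying next round.
        for src in sorted(seen & alive):
            for dst in sorted(alive - {src}):
                if rng.random() >= loss_rate:
                    seen.add(dst)        # first copy kept, duplicates ignored
        if round_no == crash_sender_after:
            alive.discard(sender)        # the original sender goes down
        holders = seen & alive
        if not holders:
            return set(), set(alive)     # message died with the sender
        if holders == alive:
            return holders, set(alive)   # every survivor has the message
    raise RuntimeError("broadcast did not converge within max_rounds")


if __name__ == "__main__":
    delivered, survivors = flood_broadcast(range(6), sender=0, loss_rate=0.7)
    print("survivors:", sorted(survivors))
    print("delivered to:", sorted(delivered))
```

In every run the outcome is all-or-nothing among the survivors: either the sender crashed before any copy got through and no survivor has the message, or some survivor received it and its relaying eventually reaches the rest. The price is on the order of n squared transmissions for a group of n members.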