2022p-SDstatement en 1st Deliverable Solution
Authors: Joan Manuel Marquès, Antonio González, David Mor, Joan-Antoni Vilaseca.
Distributed Systems course
Spring 2022
Assignment Outline
Time Stamped Anti-Entropy (TSAE) protocol
Groups
To Deliver
D1. First Deliverable
1. Phase 1: Theoretical exercise of TSAE protocol
1.1. TSAE protocol exercise (no purged log)
1.1.1. Exercise 1
1.1.2. Exercise 2
1.2. TSAE protocol exercise (purged log)
1.3. TSAE protocol exercise
2. Phase 1: Implementation and testing of Log and TimestampVector data structures
Environment
2.1. Test locally
2.2. Formal evaluation of phase 1
2.3. Things to deliver
Annex A. Source code and documentation
Annex B. Activity simulation and dynamicity
References
Distributed Systems TSAE protocol 2
Assignment Outline
The aim of this practical assignment is to implement and evaluate a weak-consistency protocol for
data dissemination.
The project consists of:
● Implementing the Time Stamped Anti-Entropy (TSAE) protocol [1] in an application that
stores cooking recipes in a set of replicated servers.
● Adding a remove operation to the recipes application.
● Evaluating how TSAE behaves under different conditions.
Groups
Phase 1 must be done individually.
You are strongly advised to do phases 2 to 4 in groups of two students (from the same
classroom), although it is also possible to do them individually.
To Deliver
Two deliverables (more details in each phase):
1. Phase 1 (theoretical exercises and practice).
2. Phases 2 to 4 or second theoretical exercise.
Deadlines are in the course schedule.
1.1.1. Exercise 1
For the following sequence:
Time Operation
1 Host A executes operation A3
2 Host C executes operation C3
3 Host B executes operation B2
4 Host C does an anti-entropy session with host A
5 Host A does an anti-entropy session with host B
6 Host B executes operation B3
7 Host C executes operation C4
8 Host A executes operation A4
9 Host B does an anti-entropy session with host C
10 Host C does an anti-entropy session with host A
T1:
Summary A = A3, B1, C2 Log A = A1, A2, A3, B1, C1, C2
Summary B = A1, B1, C1 Log B = A1, B1, C1
Summary C = A1, B1, C2 Log C = A1, B1, C1, C2
T2:
Summary A = A3, B1, C2 Log A = A1, A2, A3, B1, C1, C2
Summary B = A1, B1, C1 Log B = A1, B1, C1
Summary C = A1, B1, C3 Log C = A1, B1, C1, C2, C3
T3:
Summary A = A3, B1, C2 Log A = A1, A2, A3, B1, C1, C2
Summary B = A1, B2, C1 Log B = A1, B1, B2, C1
Summary C = A1, B1, C3 Log C = A1, B1, C1, C2, C3
T4:
Host C sends to host A: C3
Host A sends to host C: A2, A3
Summary A = A3, B1, C3 Log A = A1, A2, A3, B1, C1, C2, C3
Summary B = A1, B2, C1 Log B = A1, B1, B2, C1
Summary C = A3, B1, C3 Log C = A1, A2, A3, B1, C1, C2, C3
T5:
Host A sends to host B: A2, A3, C2, C3
Host B sends to host A: B2
Summary A = A3, B2, C3 Log A = A1, A2, A3, B1, B2, C1, C2, C3
Summary B = A3, B2, C3 Log B = A1, A2, A3, B1, B2, C1, C2, C3
Summary C = A3, B1, C3 Log C = A1, A2, A3, B1, C1, C2, C3
T6:
Summary A = A3, B2, C3 Log A = A1, A2, A3, B1, B2, C1, C2, C3
Summary B = A3, B3, C3 Log B = A1, A2, A3, B1, B2, B3, C1, C2, C3
Summary C = A3, B1, C3 Log C = A1, A2, A3, B1, C1, C2, C3
T7:
Summary A = A3, B2, C3 Log A = A1, A2, A3, B1, B2, C1, C2, C3
Summary B = A3, B3, C3 Log B = A1, A2, A3, B1, B2, B3, C1, C2, C3
Summary C = A3, B1, C4 Log C = A1, A2, A3, B1, C1, C2, C3, C4
T8:
Summary A = A4, B2, C3 Log A = A1, A2, A3, A4, B1, B2, C1, C2, C3
Summary B = A3, B3, C3 Log B = A1, A2, A3, B1, B2, B3, C1, C2, C3
Summary C = A3, B1, C4 Log C = A1, A2, A3, B1, C1, C2, C3, C4
T9:
Host B sends to host C: B2, B3
Host C sends to host B: C4
Summary A = A4, B2, C3 Log A = A1, A2, A3, A4, B1, B2, C1, C2, C3
Summary B = A3, B3, C4 Log B = A1, A2, A3, B1, B2, B3, C1, C2, C3, C4
Summary C = A3, B3, C4 Log C = A1, A2, A3, B1, B2, B3, C1, C2, C3, C4
T10:
Host C sends to host A: B3, C4
Host A sends to host C: A4
Summary A = A4, B3, C4 Log A = A1, A2, A3, A4, B1, B2, B3, C1, C2, C3, C4
Summary B = A3, B3, C4 Log B = A1, A2, A3, B1, B2, B3, C1, C2, C3, C4
Summary C = A4, B3, C4 Log C = A1, A2, A3, A4, B1, B2, B3, C1, C2, C3, C4
Host B still has a pending operation (A4) to receive from host A or host C. The final state is not consistent.
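The trace above can be checked mechanically. The following is a minimal sketch, not the course implementation: the class and method names (TsaeReplay, Replica, newerThan, antiEntropy) are hypothetical, the initial state is inferred from the T1 rows (A starts with A1, A2, B1, C1, C2; B with A1, B1, C1; C with A1, B1, C1, C2), and it assumes logs without gaps, so updating the summary per received operation is equivalent to merging summaries at the end of the session.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

final class TsaeReplay {
    static final String HOSTS = "ABC";

    /** One replica: summary[i] is the highest sequence number seen from HOSTS.charAt(i). */
    static final class Replica {
        final char id;
        final int[] summary = new int[HOSTS.length()];
        // Sorted for printing; string order is only right for single-digit sequence numbers.
        final TreeSet<String> log = new TreeSet<String>();

        Replica(char id, int... initialSummary) {
            this.id = id;
            for (int h = 0; h < initialSummary.length; h++) {
                for (int seq = 1; seq <= initialSummary[h]; seq++) {
                    add(HOSTS.charAt(h), seq);
                }
            }
        }

        /** Stores an operation in the log and advances the summary entry of its issuer. */
        void add(char host, int seq) {
            log.add("" + host + seq);
            int h = HOSTS.indexOf(host);
            summary[h] = Math.max(summary[h], seq);
        }

        /** Local operation: uses the next sequence number of this host. */
        void execute() {
            add(id, summary[HOSTS.indexOf(id)] + 1);
        }

        /** Operations in this log that the given summary does not cover yet. */
        List<String> newerThan(int[] peerSummary) {
            List<String> out = new ArrayList<String>();
            for (String op : log) {
                int seq = Integer.parseInt(op.substring(1));
                if (seq > peerSummary[HOSTS.indexOf(op.charAt(0))]) {
                    out.add(op);
                }
            }
            return out;
        }
    }

    /** Each side sends the operations that the other side's summary lacks. */
    static void antiEntropy(Replica x, Replica y) {
        // Both lists are computed before either replica is modified.
        List<String> toY = x.newerThan(y.summary);
        List<String> toX = y.newerThan(x.summary);
        for (String op : toY) y.add(op.charAt(0), Integer.parseInt(op.substring(1)));
        for (String op : toX) x.add(op.charAt(0), Integer.parseInt(op.substring(1)));
    }

    static Replica[] replayExercise1() {
        Replica a = new Replica('A', 2, 1, 2);
        Replica b = new Replica('B', 1, 1, 1);
        Replica c = new Replica('C', 1, 1, 2);
        a.execute();        // T1: A3
        c.execute();        // T2: C3
        b.execute();        // T3: B2
        antiEntropy(c, a);  // T4
        antiEntropy(a, b);  // T5
        b.execute();        // T6: B3
        c.execute();        // T7: C4
        a.execute();        // T8: A4
        antiEntropy(b, c);  // T9
        antiEntropy(c, a);  // T10
        return new Replica[] {a, b, c};
    }

    public static void main(String[] args) {
        Replica[] r = replayExercise1();
        for (Replica x : r) {
            System.out.println(x.id + " summary=" + Arrays.toString(x.summary) + " log=" + x.log);
        }
        System.out.println("B is missing: " + r[0].newerThan(r[1].summary));
    }
}
```

Replaying the ten steps leaves A and C with the summary A4, B3, C4 and B with A3, B3, C4, so the only operation B lacks is A4, which matches the T10 rows.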
1.1.2. Exercise 2
For the following sequence:
Time Operation
1 Host A executes operation A3
2 Host A executes operation A4
3 Host B executes operation B2
4 Host C executes operation C3
5 Host B executes operation B3
6 Host B does an anti-entropy session with host C
7 Host C does an anti-entropy session with host A
8 Host A does an anti-entropy session with host B
T1:
Summary A = A3, B1, C2 Log A = A1, A2, A3, B1, C1, C2
Summary B = A1, B1, C1 Log B = A1, B1, C1
Summary C = A1, B1, C2 Log C = A1, B1, C1, C2
T2:
Summary A = A4, B1, C2 Log A = A1, A2, A3, A4, B1, C1, C2
Summary B = A1, B1, C1 Log B = A1, B1, C1
Summary C = A1, B1, C2 Log C = A1, B1, C1, C2
T3:
Summary A = A4, B1, C2 Log A = A1, A2, A3, A4, B1, C1, C2
Summary B = A1, B2, C1 Log B = A1, B1, B2, C1
Summary C = A1, B1, C2 Log C = A1, B1, C1, C2
T4:
Summary A = A4, B1, C2 Log A = A1, A2, A3, A4, B1, C1, C2
Summary B = A1, B2, C1 Log B = A1, B1, B2, C1
Summary C = A1, B1, C3 Log C = A1, B1, C1, C2, C3
T5:
Summary A = A4, B1, C2 Log A = A1, A2, A3, A4, B1, C1, C2
Summary B = A1, B3, C1 Log B = A1, B1, B2, B3, C1
Summary C = A1, B1, C3 Log C = A1, B1, C1, C2, C3
T6:
Host B sends to host C: B2, B3
Host C sends to host B: C2, C3
Summary A = A4, B1, C2 Log A = A1, A2, A3, A4, B1, C1, C2
Summary B = A1, B3, C3 Log B = A1, B1, B2, B3, C1, C2, C3
Summary C = A1, B3, C3 Log C = A1, B1, B2, B3, C1, C2, C3
T7:
Host C sends to host A: B2, B3, C3
Host A sends to host C: A2, A3, A4
Summary A = A4, B3, C3 Log A = A1, A2, A3, A4, B1, B2, B3, C1, C2, C3
Summary B = A1, B3, C3 Log B = A1, B1, B2, B3, C1, C2, C3
Summary C = A4, B3, C3 Log C = A1, A2, A3, A4, B1, B2, B3, C1, C2, C3
T8:
Host A sends to host B: A2, A3, A4
Host B sends to host A: -
Summary A = A4, B3, C3 Log A = A1, A2, A3, A4, B1, B2, B3, C1, C2, C3
Summary B = A4, B3, C3 Log B = A1, A2, A3, A4, B1, B2, B3, C1, C2, C3
Summary C = A4, B3, C3 Log C = A1, A2, A3, A4, B1, B2, B3, C1, C2, C3
All hosts have received the same operations. The final state is consistent.
1.2. TSAE protocol exercise (purged log)
B and E do an anti-entropy session. During the session each host knows the identity of the other (this
differs from the algorithm in Golding's thesis).
3. Which operations are exchanged during the anti-entropy session?
Operations sent by B = B3, B4, D2
Operations sent by E = A3, E4
4. Which AckSummary and log does each host have after the session ends?
Final state:
AckSummary B & E
\ A B C D E
A A3 B2 C2 D1 E4
B A3 B4 C3 D2 E4
C A2 B2 C3 D1 E4
D A2 B4 C2 D2 E4
E A3 B4 C3 D2 E4
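As an illustration of how such a table drives log purging, here is a minimal sketch. It rests on two assumptions that the text above does not state: each row is the summary that B and E hold for the host named on the left, with one column per issuing host, and an operation may be purged once every row covers it (the usual TSAE rule). The class and method names are hypothetical.

```java
import java.util.Arrays;

final class AckPurgeSketch {
    static final String HOSTS = "ABCDE";

    // Final AckSummary of B and E: one row per host, sequence numbers per issuer A..E.
    static final int[][] ACK = {
        {3, 2, 2, 1, 4},  // row A
        {3, 4, 3, 2, 4},  // row B
        {2, 2, 3, 1, 4},  // row C
        {2, 4, 2, 2, 4},  // row D
        {3, 4, 3, 2, 4},  // row E
    };

    /** For each issuer, the highest sequence number that every host is known to have. */
    static int[] minPerIssuer(int[][] ack) {
        int[] min = ack[0].clone();
        for (int[] row : ack) {
            for (int j = 0; j < row.length; j++) {
                min[j] = Math.min(min[j], row[j]);
            }
        }
        return min;
    }

    /** An operation such as "B2" is purgeable when all hosts are known to have it. */
    static boolean canPurge(String op, int[] minAck) {
        int issuer = HOSTS.indexOf(op.charAt(0));
        int seq = Integer.parseInt(op.substring(1));
        return seq <= minAck[issuer];
    }

    public static void main(String[] args) {
        int[] minAck = minPerIssuer(ACK);
        System.out.println(Arrays.toString(minAck));  // prints [2, 2, 2, 1, 4]
        System.out.println("B2 purgeable: " + canPurge("B2", minAck));  // true
        System.out.println("B3 purgeable: " + canPurge("B3", minAck));  // false
    }
}
```

Under these assumptions, B and E can only discard operations up to A2, B2, C2, D1 and E4; anything newer must stay in the log because at least one host is not yet known to have received it.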
1.3. TSAE protocol exercise
2. What are the extensions to the TSAE algorithm proposed in Golding's doctoral thesis? Explain in
detail at least one of the possible answers.
The TSAE protocol as presented requires loosely-synchronized clocks so that each principal can
acknowledge messages using a single timestamp (Figure 5.5). If clocks are not synchronized, the
clock at one principal may be much greater than the clock at another. If the minimum timestamp
were selected to summarize the messages a principal has received, messages from the principal
with the fast clock might never be acknowledged.
A principal’s summary vector is a more general and exact measure of the messages that have been
received. If the entire summary vector is used as an acknowledgment, then clock values from
different hosts need never be compared.
To use summary vectors for acknowledgment, each principal must maintain a two-dimensional
acknowledgment matrix of timestamps, as shown in Figure 5.12. The summary vector is part of
the acknowledgment matrix: the local principal's own column in the matrix is its summary vector.
The other columns are old copies of the summary vectors of the other principals.
The unsynchronized-clock version of the TSAE protocol differs little from the synchronized-clock
version. During anti-entropy sessions, principals exchange the entire matrix and update the
entire matrix using an element-wise maximum at the end of a session.
The only other difference arises when the message ordering component is called upon to
determine whether a message has been acknowledged by every principal. Consider a message
sent from principal p at time t. A principal q knows that every other principal has observed the
message when every timestamp in row p of the acknowledgment matrix at q is greater than t.
The text above would be a correct, extended answer to the question.
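The two matrix operations described in that answer can be sketched as follows. This is an illustration only, not the TimestampMatrix class of the project: the names are hypothetical, timestamps are plain long values, and the matrix is indexed as m[sender][observer], so that each column is one principal's summary vector and row p holds what every principal has seen from p.

```java
import java.util.Arrays;

final class AckMatrixSketch {

    /** End of a session: element-wise maximum of the local and the remote matrix. */
    static long[][] mergeMax(long[][] local, long[][] remote) {
        int n = local.length;
        long[][] merged = new long[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                merged[i][j] = Math.max(local[i][j], remote[i][j]);
            }
        }
        return merged;
    }

    /**
     * True when every entry in row p is greater than t, that is, when every
     * principal is known to have observed the message sent by p at time t.
     * The strict comparison follows the wording of the answer above.
     */
    static boolean acknowledgedByAll(long[][] m, int p, long t) {
        for (long seen : m[p]) {
            if (seen <= t) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        long[][] local  = {{5, 3, 0}, {2, 4, 1}, {0, 0, 7}};
        long[][] remote = {{4, 5, 6}, {2, 2, 3}, {1, 6, 7}};
        long[][] merged = mergeMax(local, remote);
        System.out.println(Arrays.deepToString(merged));  // [[5, 5, 6], [2, 4, 3], [1, 6, 7]]
        System.out.println(acknowledgedByAll(merged, 0, 4));  // true
        System.out.println(acknowledgedByAll(merged, 1, 2));  // false
    }
}
```

No clock values from different hosts are ever compared with each other: every comparison stays inside a single row, which holds timestamps issued by the same principal.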
2. Phase 1: Implementation and testing of Log and TimestampVector data structures
Environment
Requires Java 7.
We recommend using Eclipse as an IDE. We will provide an Eclipse project that contains
the implementation of the cooking recipes application except for the parts related to the TSAE protocol.
All scripts for running local tests are prepared for Ubuntu Linux, but other operating systems can be
used. In that case, you will be responsible for adapting the scripts to your OS.
Subdirectory Content
TSAE folder contains all packages and classes required for the practical assignment.
Important: do not implement new classes. Modify only the classes indicated in each phase
(unless you modify the basic modeling of parameters in phase 4).
Classes that you should modify:
1. package recipes_service.tsae.data_structures:
◦ Log: class that logs operations.
◦ TimestampVector: class to maintain the summary.
◦ TimestampMatrix: class to maintain the acknowledgment matrix of timestamps.
2. package recipes_service.tsae.sessions:
◦ TSAESessionOriginatorSide: Originator protocol for TSAE.
◦ TSAESessionPartnerSide: Partner's protocol for TSAE.
3. package recipes_service:
◦ ServerData: contains Server's data structures required by the TSAE protocol (log,
summary, ack) and the application (recipes). You can add any required method to
allow TSAESessionOriginatorSide and TSAESessionPartnerSide to
manipulate these data structures.
▪ addRecipe method: adds a new recipe.
▪ removeRecipe method: removes a recipe.
Classes that you should use but NOT modify:
4. package recipes_service.tsae.data_structures:
◦ Timestamp: a timestamp. A timestamp allows the ordering of operations issued from
the same host. It is a tuple <hostId, sequenceNumber>. The sequence number grows
monotonically. The first valid timestamp issued by a host has sequence number 0. A
negative sequence number means that the host has not yet issued any operation.
Timestamps cannot be used to order operations issued by different hosts. The next
timestamp is obtained by calling the method nextTimestamp() of class ServerData
(package recipes_service).
5. package recipes_service.data:
◦ AddOperation: an add operation (operations are logged in the Log and exchanged
with other partners).
◦ RemoveOperation: a remove operation (operations are logged in the Log and
exchanged with other partners).
◦ Recipe: a recipe.
◦ Recipes: class that contains all recipes.
6. package recipes_service.communication:
◦ MessageAErequest: message sent to request an anti-entropy session.
◦ MessageOperation: message sent for each operation exchanged during an
anti-entropy session.
◦ MessageEndTSAE: message sent to finish an anti-entropy session.
7. package communication:
◦ ObjectInputStream_DS: class that implements a modification of the
ObjectInputStream to simulate failures. Use the method readObject().
◦ ObjectOutputStream_DS: class that implements a modification of the
ObjectOutputStream to simulate failures. Use the method writeObject().
8. package recipes_service.activitysimulation: relevant only if, in phase 4, you
want to better understand or modify the modeling of activity and dynamicity.
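The timestamp rules listed under item 4 above can be illustrated with a small sketch. SketchTimestamp is a hypothetical stand-in, not the Timestamp class shipped with the project, and its methods (none, next, compareSameHost) are assumptions made for this example; in the project itself the next timestamp comes from nextTimestamp() in ServerData.

```java
final class SketchTimestamp {
    final String hostId;
    final long sequenceNumber;

    SketchTimestamp(String hostId, long sequenceNumber) {
        this.hostId = hostId;
        this.sequenceNumber = sequenceNumber;
    }

    /** A negative sequence number marks a host that has not issued any operation yet. */
    static SketchTimestamp none(String hostId) {
        return new SketchTimestamp(hostId, -1);
    }

    boolean isNull() {
        return sequenceNumber < 0;
    }

    /** Next timestamp of the same host; the first valid one has sequence number 0. */
    SketchTimestamp next() {
        return new SketchTimestamp(hostId, sequenceNumber + 1);
    }

    /**
     * Orders two timestamps of the same host: negative, zero or positive.
     * Timestamps of different hosts are not comparable, so that case is rejected.
     */
    long compareSameHost(SketchTimestamp other) {
        if (!hostId.equals(other.hostId)) {
            throw new IllegalArgumentException("timestamps come from different hosts");
        }
        return sequenceNumber - other.sequenceNumber;
    }

    public static void main(String[] args) {
        SketchTimestamp first = SketchTimestamp.none("A").next();
        System.out.println(first.sequenceNumber);                // 0
        System.out.println(first.next().compareSameHost(first)); // 1
    }
}
```

Rejecting cross-host comparisons at this level makes the restriction explicit: any ordering across hosts has to come from the summary vector or the acknowledgment matrix, never from two raw timestamps.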
References
[1] R. A. Golding. Weak-consistency group communication and membership. Ph.D. thesis,
published as Technical Report UCSC-CRL-92-52, Computer and Information Sciences Board,
University of California, Santa Cruz, December 1992 (chapter 5).
[2] R. A. Golding and D. D. E. Long. The performance of weak-consistency replication protocols.
UCSC Technical Report UCSC-CRL-92-30, July 1992.