Distributed System Answer Key
Contents

Question 7 - RPC semantics
Question 8 - Extended RPC: DOORS
  Doors are RPCs implemented for processes on the same machine
  A single mechanism for communication: procedure calls (but with doors, it is not transparent)
Question 9 - RMI static vs. dynamic
Question 10 - Parameter passing in RMI
Question 11 - Name resolution: a. Recursive b. Iterative
  Recursive name resolution
  Comparison
A few points on clock synchronization
  Why we need to synchronize clocks
  Problem with physical clocks
  Relative clock synchronization (Berkeley algorithm)
  Averaging algorithm
Question 16 - Lamport logical clock [problems with this approach]
Question 17 - Election of a coordinator: a. Bully algorithm b. Ring algorithm
  Bully algorithm
  Ring algorithm
Question 18 - Distributed mutual exclusion algorithms: a. Ricart-Agrawala algorithm b. Token ring algorithm
  Ricart-Agrawala algorithm
  Token ring algorithm
Question 19 - Explain transaction models
QP-Sep2010 - Distinguish between strict and sequential consistency with an example for each
  Data-centric consistency models
  Strict consistency (related to absolute global time)
  Sequential consistency (what we are used to - serializability)
  Sequential consistency is a slightly weaker consistency model than strict consistency
Question 20 - Client-centric consistency / monotonic read and write
  Client-centric consistency model
  Monotonic-read consistency
Question 21 - Replica placement, 3 types of replicas: a. Client driven, b. Server driven, c. Permanent
Question 22 - Primary-based consistency protocols
  Primary-based remote-write protocol with a fixed server to which all read and write operations are forwarded
  Primary-backup protocol in which the primary migrates to the process wanting to perform an update
Question 23 - Fault tolerance: different types of failures
Question 24 - Design issues: failure masking (process resilience)
Question 25 - Five classes of failure in an RPC system, with solutions: a. Scalable reliable multicasting, b. Hierarchical and nonhierarchical scalable feedback control (Fault_Tolerance.pdf, slides 26-34)
Question 26 - Explain virtual synchrony
  The logical organization of a distributed system to distinguish between message receipt and message delivery
Question 27 - General CORBA architecture and CORBA services
Question 28 - Messaging interoperability
Question 29 - DCOM client-server architecture; QP-Sep2010 - With a supporting diagram, explain DCOM in detail
Question 30 - Globe object model, architecture and services
Question 31 - NFS architecture (basic) and the file system operations supported
Question 32 - Naming scheme in NFS with different types of mounting
Question 33 - Caching and replication scheme in NFS
Question 34 - Organization of the Coda file system: fault tolerance and security; QP-Sep2010 - With reference to the Coda file system, explain communication, processes and server replication
Question 35 - DSM systems: different algorithms, granularity and page replacement
Question 36 - List and explain load distribution algorithms in distributed systems and their four components
Question 37 - Sender/receiver-initiated distributed algorithms
Question 38 - Adaptive load distribution algorithms
Questions 39 to 42 - questions and answers follow DS-Secu.pdf
Question 43 - Explain advantages of distributed systems; QP-Mar2010 - Explain the architectural model of a distributed system
Question 44 - Access protocol security for bus and ring topologies, CDMA
Question 45 - Message passing models used to develop communication primitives
QP-Sep2010 - Write short notes on memory coherence
Load distribution algorithm components (transfer policy, location policy, etc.)
Load distribution algorithms - sender-initiated distributed algorithms
Load distribution algorithms - adaptive load distribution algorithms
11. Name resolution: a. Recursive b. Iterative
12. DNS or X.500 name resolution and issues (QP-Sep2010: Write short notes on DNS and X.500)
13. Clock synchronization, including the Berkeley algorithm and averaging algorithms
14. Logical clocks with an example
15. Vector clocks
16. Lamport logical clock [problems with this approach]
17. Election of a coordinator: a. Bully algorithm b. Ring algorithm
18. Distributed mutual exclusion algorithms: a. Ricart-Agrawala algorithm b. Token ring algorithm
19. Explain transaction models (QP-Sep2010: Distinguish between strict and sequential consistency with an example for each)
20. Client-centric consistency / monotonic read and write
21. Replica placement, 3 types of replicas: a. Client driven b. Server driven c. Permanent
22. Primary-based consistency protocols
23. Fault tolerance: different types of failures
24. Design issues: failure masking (process resilience)
25. Five classes of failure in an RPC system, with solutions: a. Scalable reliable multicasting b. Hierarchical and nonhierarchical scalable feedback control
26. Explain virtual synchrony
27. General CORBA architecture and CORBA services
28. Messaging interoperability
29. DCOM client-server architecture (QP-Sep2010: With a supporting diagram, explain DCOM in detail)
30. Globe object model, architecture and services
31. NFS architecture (basic) and the file system operations supported
32. Naming scheme in NFS with different types of mounting
33. Caching and replication scheme in NFS
34. Organization of the Coda file system: fault tolerance and security (QP-Sep2010: With reference to the Coda file system, explain communication, processes and server replication)
35. DSM systems: different algorithms, granularity and page replacement
36. List and explain load distribution algorithms in distributed systems and their four components
37. Sender/receiver distributed algorithms
38. Adaptive load distribution algorithms (Load.ppt slides 7-23 for questions 36-38)
39. Authentication distribution techniques using: a. Key distribution centre b. Public key cryptography (DS-Secu.pdf slides 8-15)
40. Access control: general issues (DS-Secu.pdf slides 20-24)
41. Explain key establishment or key distribution techniques (DS-Secu.pdf slides 32-37)
42. Kerberos authentication issues (DS-Secu.pdf slides 36-37, but not complete)
43. Explain advantages of distributed systems (QP-Mar2010: Explain the architectural model of a distributed system)
    a. Three different distributed system architectures: i. Minicomputer model ii. Workstation model iii. Processor pool model
    b. Issues in designing distributed systems (any 3 issues)
44. Access protocol security for bus and ring topologies, CDMA
45. Message passing models used to develop communication primitives
46. Compatibility/resource management for specific issues

Exam questions:
QP-Sep2010 - Briefly explain reliable client-server communication
QP-Sep2010 - Write short notes on memory coherence
QP-Sep2010 - Explain in detail the block cipher DES
Solutions
Question 1 - Goals of distributed system. Define distributed system.
Question 2 -
{SI, MI} x {SD, MD} = {SISD, SIMD, MISD, MIMD}
SISD (Single Instruction, Single Data): e.g. classical von Neumann machines.
SIMD (Single Instruction, Multiple Data): e.g. array processors or data-parallel machines.
MISD (Multiple Instruction, Single Data): does not exist in practice.
MIMD (Multiple Instruction, Multiple Data): control parallelism.
Question 5 -
Failure cases for RPC:
1. The request message may be lost.
2. The reply message may be lost.
3. The server and/or the client may crash.
Some strategies for different RPC message delivery guarantees:
1. Retry request message: retransmit the request message until either a reply is received or the server is assumed to have failed.
2. Duplicate filtering: filter out duplicate requests at the server when retransmissions are used.
3. Retransmission of replies: keep a history of reply messages so that lost replies can be retransmitted without re-executing the operations at the server.
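A minimal sketch of how these three strategies fit together, assuming an in-memory transport; the names Server, handle_request and call_with_retry are illustrative and not taken from any particular RPC library. Duplicate filtering and the reply history are keyed on a per-request identifier.

```python
import time

class Server:
    """Illustrative server with duplicate filtering and a reply history."""
    def __init__(self):
        self.reply_history = {}   # request_id -> previously computed reply

    def handle_request(self, request_id, operation, args):
        # Duplicate filtering: if we already executed this request,
        # retransmit the stored reply instead of re-executing the operation.
        if request_id in self.reply_history:
            return self.reply_history[request_id]
        result = operation(*args)            # execute the operation once
        self.reply_history[request_id] = result
        return result

def call_with_retry(server, request_id, operation, args, retries=3, timeout=1.0):
    """Retry the request until a reply arrives or the server is presumed failed."""
    for attempt in range(retries):
        try:
            return server.handle_request(request_id, operation, args)
        except TimeoutError:                 # a real transport would raise on loss
            time.sleep(timeout)              # wait before retransmitting
    raise RuntimeError("server assumed to have failed")

# Usage: repeating the same request id does not re-execute the operation.
srv = Server()
print(call_with_retry(srv, "req-1", lambda a, b: a + b, (2, 3)))  # 5 (executed)
print(call_with_retry(srv, "req-1", lambda a, b: a + b, (2, 3)))  # 5 (from history)
```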
Essence: try to use the RPC mechanism as the only mechanism for interprocess communication (IPC). Doors are RPCs implemented for processes on the same machine, giving a single mechanism for communication: procedure calls (although with doors the mechanism is not transparent).
An alternative to iterative name resolution is to use recursion during name resolution. Instead of returning each intermediate result back to the client's name resolver, with recursive name resolution a name server passes the result to the next name server it finds. So, for example, when the root name server finds the address of the name server implementing the node named nl, it requests that name server to resolve the path name nl:<vu, cs, ftp, pub, globe, index.txt>. Using recursive name resolution as well, this next server will resolve the complete path and eventually return the file index.txt to the root server, which, in turn, will pass that file to the client's name resolver.
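A toy sketch contrasting the two resolution styles; the NameServer class and the way the name space is split here are assumptions made only for illustration, not the actual DNS or Globe mechanism. The difference is only in who performs the follow-up work: the client's resolver (iterative) or the servers themselves (recursive).

```python
class NameServer:
    """Toy name server: knows some children (delegations) and some leaf records."""
    def __init__(self, name, children=None, records=None):
        self.name = name
        self.children = children or {}   # label -> NameServer responsible for it
        self.records = records or {}     # remaining path (tuple) -> value

    def resolve_iterative(self, path):
        # Return either the final value or the next server the *client* must ask.
        if path in self.records:
            return ("value", self.records[path])
        return ("next", self.children[path[0]], path[1:])

    def resolve_recursive(self, path):
        # The *server* follows the delegation itself and returns only the answer.
        if path in self.records:
            return self.records[path]
        return self.children[path[0]].resolve_recursive(path[1:])

# Toy name space: the root delegates "nl" to the nl server, which holds the record.
nl = NameServer("nl", records={("vu", "cs", "ftp", "pub", "globe", "index.txt"): "file-index"})
root = NameServer("root", children={"nl": nl})
path = ("nl", "vu", "cs", "ftp", "pub", "globe", "index.txt")

# Iterative: the client's resolver keeps asking whichever server it is pointed at.
server, rest = root, path
while True:
    kind, *payload = server.resolve_iterative(rest)
    if kind == "value":
        print("iterative:", payload[0]); break
    server, rest = payload

# Recursive: the client asks the root once; the servers pass the work along.
print("recursive:", root.resolve_recursive(path))
```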
Comparison
1. The main drawback of recursive name resolution is that it puts a higher performance demand on each name server.
2. There are two important advantages to recursive name resolution:
   a. caching results is more effective compared to iterative name resolution;
   b. communication costs may be reduced.
3. With iterative name resolution, caching is necessarily restricted to the client's name resolver.
Reference: http://www.cs.vu.nl/~ast/books/ds1/04.pdf
1. The peers elect a master.
2. The master polls all nodes and asks them for their clock times.
3. The master estimates the local times of all nodes, taking the message transfer times into account.
4. The master uses the estimated local times to compute the arithmetic mean.
   a. Add fault tolerance.
5. The deviations from the mean are sent to the nodes.
   a. Is this better than sending the actual time?
Averaging algorithm
a) The time daemon asks all the other machines for their clock values.
b) The machines answer.
c) The time daemon tells everyone how to adjust their clocks.
Ref:http://www.cis.upenn.edu/~lee/07cis505/Lec/lec-ch6-synch1-PhysicalClock-v2.pdf
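A small sketch of the polling-and-averaging step described above (Berkeley style); the numeric example mirrors the classic 3:00 / 2:50 / 3:25 scenario, and the estimation of message transfer times is deliberately omitted.

```python
def berkeley_sync(master_clock, peer_clocks):
    """Sketch of Berkeley-style averaging: the master (time daemon) polls the
    peers, averages the clock values (including its own), and tells every node
    how much to adjust. Network delay estimation is omitted for brevity."""
    all_clocks = [master_clock] + list(peer_clocks)      # 1. poll everyone
    average = sum(all_clocks) / len(all_clocks)          # 2. arithmetic mean
    adjustments = [average - c for c in all_clocks]      # 3. deviations, not absolute times
    return average, adjustments

# The time daemon reads 3:00, the peers report 2:50 and 3:25 (minutes past some hour).
avg, adj = berkeley_sync(180, [170, 205])
print(avg)   # 185.0 -> agreed time 3:05
print(adj)   # [5.0, 15.0, -20.0] -> each node slews its clock by this amount
```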
Problem: How do we maintain a global view of the system's behaviour that is consistent with the happened-before relation?
Solution: attach a timestamp to each event.
Problem: How do we attach a timestamp to an event when there is no global clock?
Solution: maintain a consistent set of logical clocks, one per process.
Lamport's Algorithm
Each process Pi maintains a local counter Ci and adjusts this counter according to the following rules:
1. For any two successive events that take place within Pi, Ci is incremented by 1.
2. Each time a message m is sent by process Pi, the message receives a timestamp Tm = Ci.
3. Whenever a message m is received by a process Pj, Pj adjusts its local counter Cj to max(Cj, Tm) + 1.
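A compact sketch of the three rules, assuming messages are simple (payload, timestamp) tuples; the class and method names are illustrative.

```python
class LamportClock:
    """Sketch of the three rules above for one process."""
    def __init__(self):
        self.counter = 0

    def local_event(self):
        self.counter += 1                        # rule 1: tick between local events
        return self.counter

    def send(self, payload):
        self.counter += 1
        return (payload, self.counter)           # rule 2: timestamp Tm = Ci travels with m

    def receive(self, message):
        payload, tm = message
        self.counter = max(self.counter, tm) + 1  # rule 3: Cj := max(Cj, Tm) + 1
        return self.counter

p1, p2 = LamportClock(), LamportClock()
m = p1.send("hello")        # P1 sends at time 1
p2.local_event()            # P2 is at time 1
print(p2.receive(m))        # P2 jumps to max(1, 1) + 1 = 2
```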
Drawback of Lamport's clocks: one cannot directly compare the timestamps of two events to determine their (causal) precedence relationship; C(a) < C(b) does not imply that a happened before b.
Fidge's Algorithm
The Fidge logical clock is maintained as follows:
1. Initially, all clock values are set to the smallest value.
2. The local clock value is incremented at least once before each primitive event in a process.
3. The current value of the entire logical clock vector is delivered to the receiver for every outgoing message.
4. Values in the timestamp vectors are never decremented.
5. Upon receiving a message, the receiver sets the value of each entry in its local timestamp vector to the maximum of the two corresponding values in the local vector and in the remote vector received. The element corresponding to the sender is a special case; it is set to one greater than the value received, but only if the local value is not greater than that received.
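A sketch of a vector clock for a fixed number of processes. Note that it follows the common formulation (component-wise maximum, then tick the local entry), which differs in detail from Fidge's special-case rule for the sender's entry described above; class and method names are illustrative.

```python
class VectorClock:
    """Toy vector clock for a fixed set of n processes."""
    def __init__(self, pid, n):
        self.pid = pid
        self.clock = [0] * n                   # all entries start at the smallest value

    def event(self):
        self.clock[self.pid] += 1              # tick the local entry before a local event
        return list(self.clock)

    def send(self):
        self.event()
        return list(self.clock)                # the whole vector travels with the message

    def receive(self, remote):
        # component-wise maximum, then the receiver's own entry ticks as a local event
        self.clock = [max(a, b) for a, b in zip(self.clock, remote)]
        self.clock[self.pid] += 1
        return list(self.clock)

p0, p1 = VectorClock(0, 2), VectorClock(1, 2)
m = p0.send()          # p0: [1, 0]
print(p1.receive(m))   # p1: [1, 1]
```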
Example
Assign the Fidge logical clock values for all the events in the timing diagram below. Assume that each process's logical clock is set to 0 initially.
Solution
Note that if P receives a victory message from a process with a lower ID number, it immediately initiates a new election. This is how the algorithm gets its name - a process with a higher ID number will bully a lower ID process out of the coordinator position as soon as it comes online.
http://www.scribd.com/doc/6919757/BULLY-ALGORITHM
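The note above only covers the victory message; below is a hedged sketch of the usual bully election (ask the higher ids, the highest live id wins, victory is broadcast to all). The Process class and the synchronous, in-memory "messages" are simplifications made for illustration only.

```python
class Process:
    """Toy process in a bully election over an in-memory list of processes."""
    def __init__(self, pid):
        self.pid = pid
        self.alive = True
        self.coordinator = None

def bully_election(processes, starter_pid):
    procs = {p.pid: p for p in processes}
    # "Election" messages go to all higher-numbered processes.
    higher = [pid for pid in procs if pid > starter_pid and procs[pid].alive]
    if not higher:
        winner = starter_pid                  # nobody higher answered: the starter wins
    else:
        winner = max(higher)                  # the highest live id bullies everyone else
    for p in processes:
        if p.alive:
            p.coordinator = winner            # victory message to every live process
    return winner

procs = [Process(i) for i in range(1, 6)]
procs[4].alive = False                        # the old coordinator P5 has crashed
print(bully_election(procs, starter_pid=2))   # 4 becomes coordinator
```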
Ring algorithm
http://www2.cs.uregina.ca/~hamilton/courses/330/notes/distributed/distributed.html
We assume that the processes are arranged in a logical ring; each process knows the address of one other process, which is its neighbour in the clockwise direction. The algorithm elects a single coordinator, which is the process with the highest identifier. An election is started by a process which has noticed that the current coordinator has failed. The process places its identifier in an election message that is passed to the following process. When a process receives an election message, it compares the identifier in the message with its own. If the arrived identifier is greater, it forwards the received election message to its neighbour; if the arrived identifier is smaller, it substitutes its own identifier in the election message before forwarding it. If the received identifier is that of the receiver itself, then this process has the highest identifier and becomes the coordinator. The new coordinator sends an elected message around the ring.

Example: suppose we have four processes arranged in a ring: P1, P2, P3, P4 (and back to P1); P4 is the coordinator.
- Suppose P1 and P4 crash.
- P2 detects that coordinator P4 is not responding; P2 sets its active list to [ ].
- P2 sends an Elect(2) message to P3 and sets its active list to [2].
- P3 receives Elect(2). This is the first election message it has seen, so P3 sets its active list to [2, 3]. P3 sends Elect(3) towards P4 and then forwards Elect(2) towards P4.
- The messages pass the crashed P4 and P1 and reach P2.
- P2 receives Elect(3), adds 3 to its active list [2, 3] and forwards Elect(3) to P3.
- P2 receives its own Elect(2) message back. P2 chooses P3 as the highest process in its list [2, 3] and sends an Elected(P3) message.
- P3 receives its own Elect(3) message back. P3 chooses P3 as the highest process in its list [2, 3] and sends an Elected(P3) message.
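A sketch of the active-list variant used in the example above: each live process appends its id as the election message circulates, and the starter announces the maximum. The ring dictionary and the way crashed processes are skipped are simplifications for illustration.

```python
def ring_election(ring, starter):
    """Circulate an election message once around the logical ring, collecting
    the ids of live processes; the highest collected id becomes coordinator."""
    alive = {pid for pid, is_up in ring["alive"].items() if is_up}
    order = ring["order"]                     # clockwise order of process ids

    active = []
    n = len(order)
    idx = order.index(starter)
    for step in range(n):                     # one full trip around the ring
        pid = order[(idx + step) % n]
        if pid in alive:
            active.append(pid)                # a live process adds itself and forwards
        # crashed processes are simply skipped by the transport in this toy model
    coordinator = max(active)
    return active, coordinator

ring = {"order": [1, 2, 3, 4], "alive": {1: False, 2: True, 3: True, 4: False}}
print(ring_election(ring, starter=2))         # ([2, 3], 3): P3 is elected
```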
Question 18 - Distributed Mutual Exclusion Algorithm a. Ricart-Agrawala algorithm b. Token ring algorithm
There are two basic approaches to distributed mutual exclusion:
1. Non-token-based: each process freely and equally competes for the right to use the shared resource; requests are arbitrated by a central control site or by distributed agreement.
2. Token-based: a logical token representing the access right to the shared resource is passed in a regulated fashion among the processes; whoever holds the token is allowed to enter the critical section.
Ricart-Agrawala algorithm
The Ricart-Agrawala Algorithm is an algorithm for mutual exclusion on a distributed system.
Terminology
- A site is any computing device which is running the Ricart-Agrawala algorithm.
- The requesting site is the site which is requesting entry into the critical section.
- A receiving site is every other site which receives the request from the requesting site.
Algorithm
Requesting site:
- Sends a request message to all sites. This message includes the site's name and the current timestamp of the system according to its logical clock (which is assumed to be synchronized with the other sites).
Receiving site:
- Upon reception of a request message, it immediately sends a timestamped reply message if and only if the receiving process is not currently interested in the critical section, OR the receiving process has a lower priority (usually this means having a later timestamp).
- Otherwise, the receiving process defers the reply message. This means that a reply will be sent only after the receiving process has finished using the critical section itself.
Critical section:
- The requesting site enters its critical section only after receiving all reply messages.
- Upon exiting the critical section, the site sends all deferred reply messages.
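A toy sketch of the request/reply/defer rules, with message passing simulated by direct method calls; the RASite class and the (timestamp, site id) priority tuple follow the description above but are otherwise illustrative.

```python
class RASite:
    """Toy Ricart-Agrawala site; only the request/reply/defer logic is shown."""
    def __init__(self, sid):
        self.sid = sid
        self.clock = 0
        self.requesting = False
        self.my_request = None        # (timestamp, sid) of our outstanding request
        self.deferred = []            # sites whose replies we postponed
        self.replies_pending = 0

    def request_cs(self, others):
        self.clock += 1
        self.requesting = True
        self.my_request = (self.clock, self.sid)
        self.replies_pending = len(others)
        for site in others:
            site.on_request(self, self.my_request)

    def on_request(self, sender, their_request):
        self.clock = max(self.clock, their_request[0]) + 1
        # Reply immediately unless we are requesting and have priority
        # (an earlier timestamp, ties broken by site id).
        if self.requesting and self.my_request < their_request:
            self.deferred.append(sender)      # defer the reply
        else:
            sender.on_reply()

    def on_reply(self):
        self.replies_pending -= 1             # enter the CS once this reaches 0

    def release_cs(self):
        self.requesting = False
        for site in self.deferred:
            site.on_reply()                   # send the deferred replies
        self.deferred.clear()

a, b = RASite(1), RASite(2)
a.request_cs([b]); b.request_cs([a])
print(a.replies_pending, b.replies_pending)   # 0 1 -> A may enter, B must wait
a.release_cs()
print(b.replies_pending)                      # 0 -> now B may enter
```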
Problems:
- The algorithm is expensive in terms of message traffic; it requires 2(n-1) messages for entering a CS: (n-1) requests and (n-1) replies.
- The failure of any process involved makes progress impossible if no special recovery measures are taken.
The logical ring topology is created by giving each process the address of one other process which is its neighbour in the clockwise direction. The logical ring topology is unrelated to the physical interconnections between the computers.
The algorithm 1. The token is initially given to one process. 2. The token is passed from one process to its neighbour round the ring.
3. When a process requires entry to the CS, it waits until it receives the token from its neighbour and retains it; it then enters the CS; after leaving the CS it passes the token to its neighbour in the clockwise direction.
4. When a process receives the token but does not require entry to the critical section, it immediately passes the token on along the ring.
It can take from 1 to n-1 messages to obtain the token. Messages are sent around the ring even when no process requires the token, which places additional load on the network. The algorithm works well in heavily loaded situations, when there is a high probability that the process which gets the token wants to enter the CS. It works poorly in lightly loaded cases. If a process fails, no progress can be made until a reconfiguration is applied to extract the process from the ring. If the process holding the token fails, a unique process has to be picked which will regenerate the token and pass it along the ring; an election algorithm can be used for this.
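A sketch of the token circulating around the ring; wants_cs is a hypothetical map saying which processes currently want the critical section, and "messages" are just log entries.

```python
import itertools

def token_ring(processes, wants_cs, rounds=1):
    """Token-based mutual exclusion on a logical ring: the token circulates
    clockwise, and a process holds it only while in its critical section."""
    log = []
    for pid in itertools.islice(itertools.cycle(processes), rounds * len(processes)):
        # the token arrives at pid
        if wants_cs.get(pid, False):
            log.append(f"P{pid} enters CS")   # retain the token, enter the CS
            log.append(f"P{pid} leaves CS")   # then pass it on
        # otherwise the token is passed on immediately
        log.append(f"P{pid} passes token")
    return log

print(token_ring([1, 2, 3, 4], {3: True}))
# ['P1 passes token', 'P2 passes token', 'P3 enters CS', 'P3 leaves CS', ...]
```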
QP-Sep2010. Distinguish between Strict and sequential consistency with an example for each
Data-Centric Consistency Models
A contract between a (distributed) data store and processes, in which the data store specifies precisely what the results of read and write operations are in the presence of concurrency. A data store is a distributed collection of storages accessible to clients:
Strong consistency models (models not using synchronization operations), in which operations on shared data are synchronized:
1. Strict consistency (related to absolute global time)
2. Linearizability (atomicity)
3. Sequential consistency (what we are used to - serializability)
4. Causal consistency (maintains only causal relations)
5. FIFO consistency (maintains only individual ordering)
Weak consistency models (models with synchronization operations), in which synchronization occurs only when shared data is locked and unlocked:
1. General weak consistency
2. Release consistency
3. Entry consistency
Observation: the weaker the consistency model, the easier it is to build a scalable solution.
Observations
1. Unfortunately, strict consistency is impossible to implement in a distributed system.
2. If a data item is changed, all subsequent reads performed on that data return the new value, no matter how soon after the change the reads are done, and no matter which processes are doing the reading and where they are located. For example, if P1 writes x := a, then a read of x by P2 even a nanosecond later must return a, which presupposes absolute global time.
A data store is said to be sequentially consistent when it satisfies the following condition: The result of any execution is the same as if the (read and write) operations by all processes on the data store were executed in some sequential order, and the operations of each individual process appear in this sequence in the order specified by its program.
Observations
1. When processes run concurrently on possibly different machines, any valid interleaving of read and write operations is acceptable behaviour.
2. All processes see the same interleaving of executions.
3. Nothing is said about time.
4. A process sees the writes from all processes but only its own reads.
Example: if P1 performs W(x)a and P2 performs W(x)b, it is sequentially consistent for P3 and P4 both to observe b then a (or both a then b), but not for P3 to observe a then b while P4 observes b then a, since then they would not agree on a single interleaving.
3. WWW: caches all over the place, but there need be no guarantee that you are reading the most recent version of a page.
Important: client-centric consistency provides guarantees for a single client concerning the consistency of accesses to a data store by that client; no guarantees are given concerning concurrent accesses by different clients.
Monotonic-Read Consistency
Example 1: Automatically reading your personal calendar updates from different servers. Monotonic Reads guarantees that the user sees all updates, no matter from which server the automatic reading takes place. Example 2: Reading (not modifying) incoming mail while you are on the move. Each time you connect to a different e-mail server, that server fetches (at least) all the updates from the server you previously visited.
Monotonic-Write Consistency
1. Example 1: Updating a program at server S2, and ensuring that all components on which compilation and linking depends, are also placed at S2. 2. Example 2: Maintaining versions of replicated files in the correct order everywhere (propagate the previous version to the server where the newest version is installed).
Question 21 - Replica placement, 3 types of replicas: a. Client driven, b. Server driven, c. Permanent
Replica placement model: we consider objects (and don't worry whether they contain just data or code, or both) and distinguish different processes, where a process is capable of hosting a replica of an object:
- Permanent replicas: processes/machines that always have a replica (i.e., the initial set of replicas).
- Server-initiated replicas: processes that can dynamically host a replica on request of another server in the data store.
- Client-initiated replicas: processes that can dynamically host a replica on request of a client (client cache).
Server-Initiated Replicas
Keep track of access counts per file, aggregated by considering the server closest to the requesting clients:
- If the number of accesses drops below threshold D: drop the file.
- If the number of accesses exceeds threshold R: replicate the file.
- If the number of accesses lies between D and R: migrate the file.
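A sketch of the threshold decision described above; the threshold values and the returned action strings are made up for the example.

```python
def replica_action(access_count, drop_threshold_d, replication_threshold_r):
    """Server-initiated replication decision: thresholds D < R per file."""
    if access_count < drop_threshold_d:
        return "drop replica"
    if access_count > replication_threshold_r:
        return "replicate file to a server closer to the clients"
    return "consider migrating the file"      # between D and R

for count in (3, 40, 400):
    print(count, "->", replica_action(count, drop_threshold_d=10, replication_threshold_r=100))
```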
Update Propagation
Important design issues in update propagation:
1. Propagate only a notification/invalidation of the update (often used for caches).
2. Transfer the data from one copy to another (distributed databases).
3. Propagate the update operation to the other copies (also called active replication).
Observation: no single approach is the best; the choice depends highly on the available bandwidth and the read-to-write ratio at the replicas.
Pushing updates: a server-initiated approach, in which the update is propagated regardless of whether the target asked for it or not.
Observation: we can dynamically switch between pulling and pushing using leases: a contract in which the server promises to push updates to the client until the lease expires.
Issue: make the lease expiration time dependent on the system's behaviour (adaptive leases):
- Age-based leases: an object that hasn't changed for a long time will probably not change in the near future, so provide a long-lasting lease.
- Renewal-frequency-based leases: the more often a client requests a specific object, the longer the expiration time for that client (for that object) will be.
- State-based leases: the more loaded a server is, the shorter the expiration times become.
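An illustrative lease-length policy combining the three heuristics above; the base duration and the scaling factors are invented for the example and are not from any particular system.

```python
import time

def lease_duration(kind, *, base=60.0, last_modified=None, request_rate=None,
                   server_load=None, now=None):
    """Return a lease length (seconds) under one of the three adaptive policies."""
    now = now if now is not None else time.time()
    if kind == "age":        # object stable for a long time -> longer lease
        age = now - last_modified
        return min(base * (1 + age / 3600.0), 24 * 3600)
    if kind == "renewal":    # frequently requested by this client -> longer lease
        return base * (1 + request_rate)
    if kind == "state":      # heavily loaded server -> shorter lease
        return base / (1 + server_load)
    raise ValueError(kind)

print(lease_duration("age", last_modified=time.time() - 7200))  # older object, longer lease
print(lease_duration("renewal", request_rate=5))                # 360.0
print(lease_duration("state", server_load=9))                   # 6.0
```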
Primary-based remote-write protocol with a fixed server to which all read and write operations are forwarded.
Primary-backup protocol in which the primary migrates to the process wanting to perform an update.
A system is said to fail when it cannot meet its promises. A failure is brought about by the existence of errors in the system. The cause of an error is a fault.
Failure Masking and Replication
By organizing a fault-tolerant group of processes, we can protect a single vulnerable process. There are two approaches to arranging the replication of the group:
1. Primary-backup protocols: a primary coordinates all write operations. If it fails, the others hold an election to replace the primary.
2. Replicated-write protocols: active replication as well as quorum-based protocols; this corresponds to a flat group.
A system is said to be k-fault-tolerant if it can survive faults in k components and still meet its specifications. For fail-silent components, k+1 components are enough to be k-fault-tolerant. For Byzantine failures, at least 2k+1 components are needed to achieve k fault tolerance. This requires atomic multicasting: all requests arrive at all servers in the same order.
25. Five classes of failure in an RPC system, with solutions: a. Scalable reliable multicasting b. Hierarchical and nonhierarchical scalable feedback control
Fault_Tolerance.pdf 26-34 Slides
RPC Semantics and Failures
The RPC mechanism works well as long as both the client and server function perfectly. Five classes of RPC failure can be identified:
1. The client cannot locate the server, so no request can be sent.
2. The client's request to the server is lost, so no response is returned by the server to the waiting client.
3. The server crashes after receiving the request, and the service request is left acknowledged, but undone.
4. The server's reply is lost on its way to the client; the service has completed, but the results never arrive at the client.
5. The client crashes after sending its request, and the server sends a reply to a newly restarted client that may not be expecting it.
a) The normal case. b) Crash after service execution. c) Crash before service execution.
The Five Classes of Failure (2)
An appropriate exception-handling mechanism can deal with a missing server. However, such technologies tend to be very language-specific, and they also tend to be non-transparent (which is a big DS no-no). Lost request messages can be dealt with easily using timeouts: if no ACK arrives in time, the message is resent. Of course, the server needs to be able to deal with the possibility of duplicate requests.
The Five Classes of Failure (3)
Server crashes are dealt with by implementing one of three possible implementation philosophies:
1. At-least-once semantics: a guarantee is given that the RPC occurred at least once, but (also) possibly more than once.
2. At-most-once semantics: a guarantee is given that the RPC occurred at most once, but possibly not at all.
3. No semantics: nothing is guaranteed, and clients and servers take their chances!
It has proved difficult to provide exactly-once semantics.
Server Crashes (1)
Remote operation: print some text and (when done) send a completion message. Three events can happen at the server:
1. Send the completion message (M),
2. Print the text (P),
3. Crash (C).
Server Crashes (2)
These three events can occur in six different orderings:
1. M P C: a crash occurs after sending the completion message and printing the text.
2. M C (P): a crash happens after sending the completion message, but before the text could be printed.
3. P M C: a crash occurs after printing the text and sending the completion message.
4. P C (M): the text was printed, after which a crash occurs before the completion message could be sent.
5. C (P M): a crash happens before the server could do anything.
6. C (M P): a crash happens before the server could do anything.
Different combinations of client and server strategies in the presence of server crashes.
The Five Classes of Failure (4)
Lost replies are difficult to deal with. Why was there no reply? Is the server dead, slow, or did the reply just go missing? A request that can be repeated any number of times without any nasty side effects is said to be idempotent (for example, a read of a static web page is idempotent). Non-idempotent requests (for example, the electronic transfer of funds) are a little harder to deal with. A common solution is to employ unique sequence numbers. Another technique is the inclusion of additional bits in a retransmission to identify it as such to the server.
The Five Classes of Failure (5)
When a client crashes, and an old reply arrives later, such a reply is known as an orphan. Four orphan solutions have been proposed:
1. Extermination: the orphan is simply killed off.
2. Reincarnation: each client session has an epoch associated with it, making orphans easy to spot.
3. Gentle reincarnation: when a new epoch is identified, an attempt is made to locate the request's owner; otherwise the orphan is killed.
4. Expiration: if the RPC cannot be completed within a standard amount of time, it is assumed to have expired.
In practice, however, none of these methods are desirable for dealing with orphans. Research continues...
SRM: Scalable Reliable Multicasting
Receivers never acknowledge successful delivery. Only missing messages are reported. NACKs are multicast to all group members. This allows other members to suppress their feedback, if necessary. To avoid retransmission clashes, each member is required to wait a random delay prior to NACKing.
Nonhierarchical Feedback Control
Several receivers have scheduled a request for retransmission, but the first retransmission request leads to the suppression of others Hierarchical Feedback Control
The essence of hierarchical reliable multicasting. Each local coordinator forwards the message to its children and later handles retransmission requests.
The logical organization of a distributed system to distinguish between message receipt and message delivery.
The principle of virtual synchronous multicast.
Message Ordering (1)
Four different orderings are distinguished:
1. Unordered multicasts
2. FIFO-ordered multicasts
3. Causally-ordered multicasts
4. Totally-ordered multicasts
Three communicating processes in the same group. The ordering of events per process is shown along the vertical axis.
Four processes in the same group with two different senders, and a possible delivery order of messages under FIFO-ordered multicasting.
Implementing Virtual Synchrony
(a) Process 4 notices that process 7 has crashed and sends a view change
(b) Process 6 sends out all its unstable messages, followed by a flush message
(c) Process 6 installs the new view when it has received a flush message from everyone else
Overview of CORBA services:
- Collection: facilities for grouping objects into lists, queues, sets, etc.
- Query: facilities for querying collections of objects in a declarative manner.
- Concurrency: facilities to allow concurrent access to shared objects.
- Transaction: flat and nested transactions on method calls over multiple objects.
- Event: facilities for asynchronous communication through events.
- Notification: advanced facilities for event-based asynchronous communication.
- Externalization: facilities for marshaling and unmarshaling of objects.
- Life cycle: facilities for creation, deletion, copying, and moving of objects.
- Licensing: facilities for attaching a license to an object.
- Naming: facilities for systemwide naming of objects.
- Property: facilities for associating (attribute, value) pairs with objects.
- Trading: facilities to publish and find the services an object has to offer.
- Persistence: facilities for persistently storing objects.
- Relationship: facilities for expressing relationships between objects.
- Security: mechanisms for secure channels, authorization, and auditing.
- Time: provides the current time within specified error margins.
Messaging and Interoperability
CORBA GIOP message types:
- Request (originator: client): contains an invocation request.
- Reply (originator: server): contains the response to an invocation.
- LocateRequest: contains a request on the exact location of an object.
- LocateReply: contains location information on an object.
- CancelRequest: indicates that the client no longer expects a reply.
- CloseConnection: indication that the connection will be closed.
- MessageError: contains information on an error.
- Fragment: part (fragment) of a larger message.
The general organization of ActiveX, OLE, and COM.
DCOM services, compared with the corresponding CORBA services (CORBA service: DCOM/COM+ service):
- Collection: ActiveX Data Objects
- Query: None
- Concurrency: Thread concurrency
- Transaction: COM+ Automatic Transactions
- Event: COM+ Events
- Notification: COM+ Events
- Externalization: Marshaling utilities
- Life cycle: Class factories, JIT activation
- Licensing: Special class factories
- Naming: Monikers
- Property: None
- Trading: None
- Persistence: Structured storage
- Relationship: None
- Security: Authorization
- Time: None
The related Windows 2000 services mentioned include the Distributed Transaction Coordinator, Active Directory, database access, SSL, and Kerberos.
The general organization of a local object for distributed shared objects in Globe.
Possible implementations of CORBA-style services in Globe:
- Collection: separate object that holds references to other objects.
- Concurrency: each object implements its own concurrency control strategy.
- Transaction: separate object representing a transaction manager.
- Event/Notification: separate object per group of events (as in DCOM).
- Externalization: each object implements its own marshaling routines.
- Life cycle: separate class objects combined with per-object implementations.
- Licensing: implemented by each object separately.
- Naming: separate service, implemented by a collection of naming objects.
- Property/Trading: separate service, implemented by a collection of directory objects.
- Persistence: implemented on a per-object basis.
- Security: implemented per object, combined with (local) security services.
- Fault tolerance: implemented per object, combined with fault-tolerant services.
Overview of possible Globe implementations of typical distributed-systems services. (This answer is not very specific, so please refer to chap-09.pdf as well.)
Question 31 - NFS Architecture (basic), file system operations supported OR Explain the basic NFS architecture for unix systems. Also list any eight file system operations supported by NFS
NFS:
- An industry standard for file sharing on local networks since the 1980s.
- An open standard with clear and simple interfaces.
- Supports many of the design requirements already mentioned: transparency, heterogeneity, efficiency, fault tolerance.
Unix implementation advantages:
- Binary-code compatible: no need to recompile applications. Standard system calls that access remote files can be routed through the NFS client module by the kernel.
- Shared cache of recently used blocks at the client.
- Kernel-level server can access i-nodes and file blocks directly (but a privileged (root) application program could do almost the same).
File system operations supported by NFS:
1. Create: create a regular file.
2. Rename: change the name of a file.
3. Mkdir: create a subdirectory under a given directory.
4. Rmdir: remove an empty subdirectory from a directory.
5. Open: open a file.
6. Close: close a file.
7. Read: read the data contained in a file.
8. Write: write data to a file.
- Soft mount: if a file request fails, the NFS client reports an error to the process on the client machine requesting the file access.
- Hard mount: the program accessing a file on an NFS-mounted file system will hang when the server crashes.
Replica servers:
- NFS version 4 supports replication.
- Entire file systems must be replicated.
- An FS_LOCATION attribute is kept for each file.
- How servers are replicated is implementation specific.
(Incomplete)
QP-Sep2010 - With reference to CODA file system, explain communication, process and server replication
Coda File System
Descendant of CMU's Andrew File System (AFS).
AFS design for scalability:
Whole-file serving:
on opening a file, the entire file is transferred to client
Whole-file caching:
persistent cache contains most recently used files on that computer.
Observations:
- shared files are updated infrequently
- the working set of a single user typically fits into the cache on the local machine
- these are file access patterns; what about transactional data (databases)?
Coda/AFS Architecture:
Small number of dedicated Vice file servers. A much larger collection of Virtue workstations gives users and processes access to the file system. Coda provides a globally shared name space.
CODA: Communication
Interprocess communication using RPC2
(http://www.coda.cs.cmu.edu/doc/html/rpc2_manual.html)
RPC2 provides reliable RPC over UDP.
Support for side effects:
RPC connections may be associated with Side-Effects to allow application-specific network optimizations to be performed. An example is the use of a specialized protocol for bulk transfer of large files. Detailed information pertinent to each type of side effect is specified in a Side Effect Descriptor. Adding support for a new type of side effect is analogous to adding a new device driver in Unix. To allow this extensibility, the RPC code has hooks at various points where side-effect routines will be called. Global tables contain pointers to these side effect routines. The basic RPC code itself knows nothing about these side-effect routines.
Coda: Processes
Clear distinction between client and server processes Venus processes represent clients. Vice processes represent servers. All processes realized as collection of user-level threads. Additional low-level thread handles I/O operations (why?)
(Replication incomplete)
Central-server algorithm: a central server services read/write requests from clients by returning data items. Timeouts and sequence numbers can be employed for retransmitting requests which did not get responses. It is simple to implement, but the central server can become a bottleneck.
Migration Algorithm
For example, if a page fault occurs, check the memory map table. If the map table points to a remote page, migrate the page before mapping it into the requesting process's address space. Several processes can share a page at a node.
Locating a remote page:
- Use a server that tracks the page locations.
- Use hints maintained at the nodes; hints can direct the search for a page toward the node holding the page.
- Broadcast a query to locate a page.
36. List and explain load distribution algorithms in distributed systems and 4 components. 37. Sender/Receiver distributed algorithm 38. Adaptive load distribution algorithm
For these questions read 7-23 slides of load.pdf
Workstation model:
- Consists of several workstations (up to several thousand).
- Each user has a workstation at his disposal, which consists of a powerful processor, memory and a display.
- With the help of a DFS, users can access data regardless of its location.
- The ratio between the number of processors and the number of users is usually 1.
- Examples: Athena and Andrew.
Processor pool model:
- The ratio of processors to users is normally greater than 1.
- This model allocates one or more processors according to the user's needs.
- Once the processors complete their jobs, they return to the pool and await a new assignment.
- Amoeba is a combination of the processor pool model and the workstation model.
B) Issues in DS:
- Global knowledge
- Naming
- Scalability
- Compatibility
- Process synchronization
- Resource management
- Security
- Structuring
- Client-server model
Global knowledge: there is no global shared memory, no global clock, and message delays are unpredictable. This leads to an unpredictable global state and makes it difficult to order events (A sends to B, C sends to D: the events may be related).
Naming
Need for a name service: to identify objects (files, databases), users, services (RPCs). Replicated directories? : Updates may be a problem. Need for name to (IP) address resolution. Distributed directory: algorithms for update, search, ...
Scalability: system requirements should (ideally) increase linearly with the number of computer systems. This includes the overheads for message exchange in the algorithms used for file system updates, directory management, and so on.
Compatibility:
- Binary level: processor instruction-level compatibility.
- Execution level: the same source code can be compiled and executed.
- Protocol level: mechanisms for exchanging messages and information (e.g., directories) are mutually understandable.
Resource Management
Data/object management: handling the migration of files and memory values, to achieve a transparent view of the distributed system. Main issues: consistency, minimization of delays, etc.
Security: authentication and authorization.
Structuring:
- Monolithic kernel: not needed; e.g., full file management is not needed on diskless workstations.
- Collective kernel: distributed functionality on all systems; a microkernel plus a set of OS processes. The microkernel provides functionality for task, memory and processor management and runs on all systems. The OS processes are a set of tools, executed as needed.
- Object-oriented system: services as objects. Operations on the objects allow the encapsulated data to be manipulated. Object types: process, directory, file, etc.
44. Access protocol Security for bus and ring topology CDMA
Contention: depending on the network topology, a link may connect more than two sites in the computer network, and it is possible that several sites will want to transmit information over a link simultaneously. This difficulty occurs mainly in a ring or multi-access bus network. In this case, the transmitted information may become scrambled and must be discarded.
Several techniques have been developed to avoid repeated collisions, including collision detection, token passing, and message slots.
CSMA/CD: before transmitting a message over a link, a site must listen to determine whether another message is currently being transmitted over that link; this technique is called carrier sense with multiple access (CSMA). If the link is free, the site can start transmitting; otherwise it must wait (and continue to listen) until the link is free. If two or more sites begin transmitting at exactly the same time (each thinking that no other site is using the link), then they will register a collision detection (CD) and will stop transmitting.
Token passing: a unique message type, known as a token, continuously circulates in the system (usually a ring structure). A site that wants to transmit information must wait until the token arrives. It removes the token from the ring and begins to transmit its messages. When the site completes its round of message passing, it retransmits the token. This action, in turn, allows another site to receive and remove the token and to start its own message transmission. If the token gets lost, the system must detect the loss and generate a new token, usually by declaring an election to choose a unique site where the new token will be generated. A token-passing scheme has been adopted by the IBM and HP/Apollo systems. The benefit of a token-passing network is that performance is constant.
Message slots: a number of fixed-length message slots continuously circulate in the system (usually a ring structure). Each slot can hold a fixed-sized message and control information (such as what the source and destination are, and whether the slot is empty or full). A site that is ready to transmit must wait until an empty slot arrives. It then inserts its message into the slot, setting the appropriate control information. The slot with its message then continues around the network. When it arrives at a site, that site inspects the control information to determine whether the slot contains a message for this site. If not, the site re-circulates the slot and message; otherwise, it removes the message, resetting the control information to indicate that the slot is empty.
The send primitive has two parameters: a message and its destination. The receive primitive also has two parameters: the source of a message and a buffer for storing the message. An application of these primitives can be found in the client-server computation model.
Blocking vs. non-blocking primitives: in the standard message passing model, messages are copied three times: from the user buffer to the kernel buffer, from the kernel buffer on the sending computer to the kernel buffer on the receiving computer, and from the receiving kernel buffer to the user buffer. With a non-blocking primitive, the send primitive returns control to the user process immediately, while the receive primitive responds by signaling and providing a buffer into which the message is copied. A significant disadvantage of non-blocking primitives is that programming becomes difficult. In the unbuffered option, data is copied directly from one user buffer to another. With blocking primitives, the send primitive does not return control to the user program until the message has been sent (an unreliable blocking primitive) or until an acknowledgment has been received (a reliable blocking primitive). In both cases the user buffer can be reused.
Synchronous vs. asynchronous primitives: with synchronous primitives, a send primitive blocks until a corresponding receive primitive is executed at the receiving computer. With asynchronous primitives, messages are buffered, so a send primitive does not block even if there is no corresponding execution of a receive primitive.
Remote Procedure Call
A more natural way to communicate is through a procedure call:
- every language supports it;
- the semantics are well defined and understood;
- it is natural for programmers to use.
A programmer using such a model must handle the following details:
- pairing of responses with request messages;
- data representation;
- knowing the address of the remote server machine;
- taking care of communication and system failures.
The RPC mechanism is based on the observation that a procedure call is a well-known mechanism for transferring control and data within a program running on a single machine. On invoking a remote procedure, the calling process is suspended. Any parameters are passed to the remote machine where the procedure will execute. On completion, the results are passed back from the server to the client, which resumes execution as if it had called a local procedure.
RPC system components:
- Message module: the IPC module (Send/Receive/Reply), responsible for exchanging messages.
- Stub procedures (client and server stubs): a stub is a communications interface that implements the RPC protocol and specifies how messages are constructed and exchanged. Stubs are responsible for packing and unpacking of arguments and results (this is also referred to as marshaling). These procedures are automatically generated by stub generators or protocol compilers.
- Client stub: packs the arguments together with the procedure name or ID into a message, sends the message to the server, awaits the reply message, then unpacks the results and returns them to the client.
- Server stub: receives a request message, unpacks the arguments and calls the appropriate server procedure; when it returns, the stub packs the result and sends a reply message back to the client.
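A minimal sketch of what the stubs do, with JSON standing in for a real marshaling format and a local function call standing in for the network transport; the dispatch table and function names are illustrative.

```python
import json

def marshal(proc_name, args):
    """Client-stub side: pack the procedure name and arguments into a message."""
    return json.dumps({"proc": proc_name, "args": args}).encode()

def unmarshal(message):
    """Server-stub side: unpack the message back into a call description."""
    decoded = json.loads(message.decode())
    return decoded["proc"], decoded["args"]

# A trivial dispatch table standing in for the real remote procedures.
PROCEDURES = {"add": lambda a, b: a + b}

def server_stub(request_bytes):
    proc, args = unmarshal(request_bytes)           # unpack the arguments
    result = PROCEDURES[proc](*args)                # call the server procedure
    return json.dumps({"result": result}).encode()  # pack the reply

def client_stub(proc, *args):
    request = marshal(proc, list(args))             # pack and "send"
    reply = server_stub(request)                    # transport replaced by a local call
    return json.loads(reply.decode())["result"]     # unpack the result

print(client_stub("add", 2, 3))                     # 5
```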
Symmetric cryptosystem: DES. The Data Encryption Standard is used for symmetric cryptosystems. DES is designed to operate on 64-bit blocks of data. A block is transformed into an encrypted (64-bit) block of output in 16 rounds, where each round uses a different 48-bit key for encryption. Each of these 16 round keys is derived from a 56-bit master key.
Triple DES (3DES): security concerns over DES led to the creation of Triple DES. In 3DES, the plaintext is first encrypted using the key K1, then decrypted using the key K2, and finally encrypted once again using the key K3. This is called 3-key triple DES.
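A sketch of the E-D-E composition described above. The single-DES primitives are passed in as parameters because this sketch does not implement DES itself; the keyed-XOR stand-ins exist only to show that the composition round-trips and have none of DES's cryptographic properties.

```python
def triple_des_encrypt(block, k1, k2, k3, des_encrypt, des_decrypt):
    """E-D-E composition of 3-key triple DES: E(K3, D(K2, E(K1, block)))."""
    return des_encrypt(des_decrypt(des_encrypt(block, k1), k2), k3)

def triple_des_decrypt(block, k1, k2, k3, des_encrypt, des_decrypt):
    # Decryption applies the inverse operations in reverse order: D-E-D.
    return des_decrypt(des_encrypt(des_decrypt(block, k3), k2), k1)

# Stand-in "DES" (a keyed XOR) purely to demonstrate the composition.
def fake_encrypt(block, key):
    return block ^ key

def fake_decrypt(block, key):
    return block ^ key

c = triple_des_encrypt(0x1234, 0xA, 0xB, 0xC, fake_encrypt, fake_decrypt)
print(hex(triple_des_decrypt(c, 0xA, 0xB, 0xC, fake_encrypt, fake_decrypt)))  # 0x1234
```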
Appendix
Questions from old question papers (Test 1)
Introduction
1. With suitable examples, explain the fields of application of distributed systems.
2. What is a global state? Explain the mechanism of a distributed snapshot.
3. What are the goals of a DS? Explain briefly.
4. Explain the different system architectures of a DS.
5. Enumerate the fundamental characteristics required for a DS.
6. How does a DS operating system differ from a normal OS?
7. Discuss the various characteristics of a DS.
RPC
1. Describe remote procedure call (RPC) with an example.
2. What is a distributed object-based system? Discuss how an object-based system differs from a conventional RPC system.
3. Explain the mechanism of RPC with a diagram.
DNS
1. Write a note on i) DNS ii) X.500.
2. Explain the directory and discovery services of name services.
3. Discuss the problems raised by the use of aliases in a name service and indicate how, if at all, these may be overcome.
Model
1. Describe thread synchronization, thread scheduling and thread implementation in a distributed OS.
2. What is the fundamental model? Discuss its features.
3. Explain the architectural models of DS.
4. What is the architectural model? Discuss in brief the client-server model.
RMI
1. Describe RMI. Discuss the design issues of RMI.
Mobile
1. Write a note on i) Mobile code ii) Mobile agents.
2. Differentiate between mobile agents and mobile code.
Communication
1. Discuss in brief the client-server communication process.
2. Discuss the architecture of CORBA.
3. With a supporting diagram, explain the general organization of an internet search engine showing three different layers (UI layer, processing layer and data layer).