
CALVINCE HARST ABUTO

MUC-BIT-0001/2008 DISTRIBUTED SYSTEM ICS2403. Q. When a process requires a piece of non-local data, the system must include a mechanism to find and retrieve this data. Discuss the centralized and distributed manager approach to this problem.

I will discuss this in terms of centralized and distributed memory management.

1 Introduction
Distributed Shared Memory (DSM) is an abstraction used for sharing data between processes that do not share physical memory. Figure 1 shows the logical view and the physical situation in DSM. The logical view is one of several machines sharing a centralized memory. The physical situation, however, is quite different: the distributed machines have their own local memory and are connected to one another through an interconnection network. With the addition of appropriate software these individual machines are able to directly address memory in another machine's local address space. The overall objective in DSM is to reduce the access time for non-local memory to as close as possible to the access time of local memory.

Fig. 1. Logical view and physical situation of distributed shared memory (processors P connected by an interconnection network; physically distributed memory presenting a logically shared memory).

The advantages of DSM:
It increases the ease of programming by sparing programmers the concerns of message passing; it allows the use of algorithms and software written for shared memory multiprocessors on distributed systems; distributed machines scale to much larger numbers than multiprocessors, owing to the absence of hardware bottlenecks; and it enables the use of low-cost distributed memory machines.

2 Design choices
There are several design choices that are made when implementing Distributed Shared Memory. These are: Structure and Granularity of the Shared Memory; Coherence Protocols and Consistency Models; Synchronization; Data location and access; Heterogeneity; Scalability; Replacement Strategy; and Thrashing.

2.4 Data location and access
When a process requires a piece of non-local data, the system must include a mechanism to find and retrieve this data. If the data is not migrated or replicated this is trivial, since the data exists only in the central server or remains fixed. However, if the data is allowed to migrate or is replicated, there are several possible solutions to the location problem. These can be subdivided into: centralized approaches; and distributed manager approaches.
2.4.1 Centralized approaches
In these approaches there is a single centralized manager for the whole shared memory.
Monitor-like centralized manager approach, where a central memory manager acts like a monitor. It synchronizes all access to each piece of data, keeps track of all replicated copies of the data through the copy-set information, and has information about the owner of any page. Any machine requiring access to a page sends a request to the manager. The owner of a page is the machine that has write privileges on that page, and the copy set is the information regarding the location of all replicated pages on the network.
Improved centralized manager approach, where, as opposed to the monitor-like approach, access to data is no longer synchronized by the central manager. The central manager still maintains the copy set of the replicated pages and the ownership information. Thus, any machine requiring access to a page still sends a request to the manager, which has the information regarding the owner of that page.
2.4.2 Distributed manager approaches
The centralized approaches can cause a potential bottleneck. The following approaches provide a means to distribute the management tasks among the machines.
Fixed distributed manager approach, where each machine is given a predetermined subset of the pages to manage. A mapping function, say a hashing function, provides the mapping between pages and machines. A page fault causes the faulting machine to apply the mapping function to locate the machine on which the manager resides. The faulting machine can then get the location of the true page owner from the manager of that page.
Broadcast distributed manager approach, where each machine manages the pages it owns, and read and write requests cause a broadcast message to be sent to locate the owner of the required page. A write broadcast request results in the owner invalidating all pages in its copy set and itself and sending the page to the requesting machine. A read request causes the owner to send a copy of the page to the requesting machine and to add it to its copy set.

28 May 1997
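The fixed distributed manager's mapping function, described above, can be as simple as a modulo hash. The following Python sketch is illustrative only (function and parameter names are assumptions, not from the text):

```python
def manager_of(page_id: int, num_machines: int) -> int:
    """Map a page to the machine that manages it, via a fixed hash function."""
    return page_id % num_machines

# A faulting machine applies the same function to locate the manager,
# then asks that manager for the true owner of the page.
assert manager_of(17, 4) == 1   # page 17 is managed by machine 1 of 4
assert manager_of(8, 4) == 0    # page 8 is managed by machine 0
```

Because every machine applies the same deterministic function, no broadcast is needed to find a page's manager.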


Dynamic distributed manager approach, where a write request results in the ownership being transferred to the requesting machine. The copy-set information moves with the ownership. Each machine maintains a table with a variable probowner, the probable owner of each page, which provides a hint to the actual owner of the page. If probowner is not the actual owner, it provides the start of a sequence through which the true owner can be found. The probowner field is updated whenever an invalidation request is received through a broadcast message.
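Following the chain of probowner hints can be sketched as follows (a hypothetical Python illustration; the table names and layout are assumptions):

```python
def find_owner(page, start_machine, prob_owner, true_owner):
    """Follow probowner hints from start_machine until the real owner is reached."""
    current = start_machine
    while current != true_owner[page]:
        # Forward the request along this machine's hint for the page.
        current = prob_owner[current][page]
    return current

# Machine 0 thinks machine 1 owns page "p"; machine 1 points at machine 2,
# which really does own it, so the chain terminates there.
prob_owner = {0: {"p": 1}, 1: {"p": 2}, 2: {"p": 2}}
true_owner = {"p": "placeholder"}
true_owner["p"] = 2
assert find_owner("p", 0, prob_owner, true_owner) == 2
```

Each hop updates at most a hint, so requests converge on the true owner without a broadcast.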

Improvement using fewer broadcasts, where a reduced number of broadcasts is required compared to the previous two distributed approaches. The latter required a broadcast to be issued for every invalidation, updating the owner of the page. This approach still uses the probowner variable but only enforces a broadcast message updating probowner after every M page faults.


Distribution of copy sets, where copy sets are maintained on all machines which have a valid copy of the data. A read request can be satisfied by any machine with a valid copy, which then adds the requesting machine to its copy set. Invalidation messages are propagated in waves through the network, starting at the owner, which sends invalidation messages to its copy set, which in turn send invalidation messages to their copy sets.
2.5 Heterogeneity
Sharing data between heterogeneous machines is an important problem for distributed shared memory designers [Nitzberg et al. 94]. Data shared at the page level is not typed, hence accommodating the different data representations of different machines, languages or operating systems is a very difficult problem. The Mermaid approach mentioned in [Li et al. 88] is to allow only one type of data on an appropriately tagged page. The overhead of converting the data might be too high to make DSM on a heterogeneous system worth implementing [Tam et al. 90].
2.6 Scalability
One of the benefits of distributed systems [Nitzberg et al. 94], [Tanenbaum 95] is that they scale better than many tightly-coupled, shared-memory multiprocessors. However, this advantage can be lost if scalability is limited by bottlenecks. Just as the buses in tightly-coupled multiprocessor systems limit their scalability, so too do operations which require global information or distribute information globally in distributed systems, such as broadcast messages [Nitzberg et al. 94].
2.7 Replacement Strategy
Replacement strategies are required to decide which blocks on a machine should be replaced by blocks of data being copied or migrated to that machine. The replacement strategies used most often in DSM implementations are similar to those used in caching: least recently used, where the block which has been least recently used is replaced; or random replacement [Nitzberg et al. 94], [Tanenbaum 95].
2.8 Thrashing
This is a problem when non-replicated data is required by more than one process, or when replicated data is written often by one process and read often by other processes. Strategies to reduce thrashing, such as allowing replication, have to be implemented [Nitzberg et al. 94], [Tam et al. 90].
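The least-recently-used replacement of Section 2.7 can be sketched for a DSM node's local block store (a hypothetical Python illustration; class and method names are assumptions):

```python
from collections import OrderedDict

class BlockStore:
    """Fixed-capacity local block store with LRU replacement."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()          # block id -> data, oldest first

    def access(self, block_id, fetch):
        if block_id in self.blocks:
            self.blocks.move_to_end(block_id)    # mark as most recently used
        else:
            if len(self.blocks) >= self.capacity:
                self.blocks.popitem(last=False)  # evict the least recently used
            self.blocks[block_id] = fetch(block_id)  # block copied/migrated in
        return self.blocks[block_id]

store = BlockStore(capacity=2)
store.access("a", str.upper)
store.access("b", str.upper)
store.access("a", str.upper)   # touching "a" makes "b" the LRU block
store.access("c", str.upper)   # evicts "b"
assert "b" not in store.blocks and "a" in store.blocks
```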

3 Implementation
This section covers implementation issues, the approaches to implementation and the categories of current implementations. It covers, in more detail, the three categories Tanenbaum discusses in his book [Tanenbaum 95] and the implementations given as examples there.
3.1 Basic Schemes for Implementing DSM
There are four basic approaches to the implementation of DSM. Stumm and Zhou [Stumm et al. 1990] describe them as follows: Central Server; Migration; Read Replication; and Full Replication schemes.
3.1.1 Central Server Scheme
This is the simplest scheme for implementing DSM, depicted in Figure 2. The central server maintains the only copy of the data and controls all accesses to it. In fact, the central server carries out all operations on the data. Thus, a request to perform an operation upon a piece of data is sent to the central server, which receives the request, accesses the data and sends a response to the requesting machine. The advantages of this scheme are that it is easy to implement, controls all synchronization and avoids all consistency-related problems; however, it can introduce a considerable bottleneck to the system.
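The request/response cycle of the central server scheme might be sketched as follows (hypothetical Python; class and operation names are illustrative):

```python
class CentralServer:
    """Holds the only copy of all data and performs every operation on it."""
    def __init__(self):
        self.data = {}

    def handle(self, op, key, value=None):
        """Receive a request, access the data, and return a response."""
        if op == "write":
            self.data[key] = value
            return "ok"
        return self.data.get(key)   # "read"

server = CentralServer()
assert server.handle("write", "x", 42) == "ok"
assert server.handle("read", "x") == 42   # every access funnels through the server
```

Since clients never hold copies, there is nothing to keep consistent; the price is that the server serializes all traffic.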

Figure 2. Central Server scheme (clients send requests; the central server receives each request, accesses the data and sends a response).

3.1.2 Migration Scheme
In the Migration Scheme (Figure 3), as in the Central Server Scheme, only one copy of the data is maintained on the network; however, control of the memory, and the memory itself, are now distributed across the network. A process requiring access to a non-local piece of data sends a request to the machine holding that piece of data; that machine sends a copy of the data to the requesting machine and invalidates its own copy.
This scheme has the advantage of being able to be incorporated into the existing virtual memory system of the local operating system; like the central server scheme it has no consistency problems, and synchronized access is implicit. However, it has the disadvantage of possibly causing thrashing when more than one process requires the same piece of data.
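A minimal sketch of the migration step, under assumed machine and block names:

```python
class Machine:
    def __init__(self, name):
        self.name = name
        self.memory = {}                 # blocks currently held by this machine

    def request_block(self, block_id, holder):
        """Pull the single copy of a block from its current holder."""
        data = holder.memory.pop(block_id)   # holder sends block and invalidates it
        self.memory[block_id] = data         # the block now lives here
        return data

m1, m2 = Machine("m1"), Machine("m2")
m1.memory["blk"] = b"payload"
m2.request_block("blk", m1)
assert "blk" in m2.memory and "blk" not in m1.memory
```

The `pop` models the invalidation: exactly one copy exists at any time, which is why two machines contending for the same block can thrash.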

Figure 3. Migration scheme (the requester sends a request for a non-local block; the holder receives the request and sends the block; the requester receives it and accesses the data).

3.1.3 Read Replication Scheme

The read replication scheme (Figure 4) allows multiple read-only copies of a piece of data and a single read/write copy of the data over the network. A process requiring write access to a piece of data sends a request to the process which currently has write access to the data. All existing read-only copies of the data are invalidated before the access is granted and the requesting process can alter the data. The advantage this scheme offers is that multiple processes can now have read access to the same piece of data, making read operations less expensive; however, it can increase the cost of write operations, since multiple copies have to be invalidated. Thus, this scheme is indicated in applications where the number of reads far exceeds the number of writes.
3.1.4 Full Replication Scheme
The full replication scheme (Figure 5) allows multiple readers and writers. One method of keeping the multiple copies of the data consistent is to implement a global sequencer which attaches a sequence number to each write operation, allowing the system to maintain sequential consistency. This scheme reduces the cost of data migration and invalidation when a write is requested, but introduces the problem of maintaining consistency.

Figure 4. Read Replication scheme (a write request causes the owner to invalidate the read-only copies in its copy set and send the block; the requester receives the block and updates its local data).

Figure 5. Full Replication scheme (data to be written is sent to the sequencer, which adds a sequence number and multicasts the update; each replica receives the update and updates its local data).
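The global sequencer at the heart of the full replication scheme can be sketched as follows (a hypothetical Python illustration; class names and the in-process "multicast" are simplifications):

```python
import itertools

class Replica:
    def __init__(self):
        self.store, self.last_seq = {}, -1

    def apply(self, seq, key, value):
        assert seq == self.last_seq + 1      # updates arrive in global sequence order
        self.store[key] = value
        self.last_seq = seq

class Sequencer:
    """Stamps every write with a sequence number and multicasts it to all replicas."""
    def __init__(self, replicas):
        self.counter = itertools.count()
        self.replicas = replicas

    def write(self, key, value):
        seq = next(self.counter)             # attach the global sequence number
        for replica in self.replicas:        # "multicast" the stamped update
            replica.apply(seq, key, value)

replicas = [Replica(), Replica()]
seq = Sequencer(replicas)
seq.write("x", 1)
seq.write("x", 2)
assert all(r.store["x"] == 2 for r in replicas)
```

Because every replica applies writes in the same numbered order, all copies pass through the same sequence of states, which is what sequential consistency requires.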

3.2 Implementation Categories
Generally DSM implementations can be divided into various categories. Nitzberg and Lo [Nitzberg et al. 94] categorize implementations into:
Hardware Implementations, which extend traditional caching techniques to scalable architectures, i.e. implement DSM with the addition of specialized hardware to the system;
Operating System and Library Based Implementations, which achieve sharing and coherence through virtual-memory-management mechanisms; and
Compiler Implementations, where shared accesses are automatically converted into synchronization and coherence primitives.
These implementation categories are similar to those used by [Coulouris et al. 93]:
Hardware-based Implementations, which rely on specialized hardware to handle load and store instructions applied to addresses in DSM, and communicate with remote memory modules;
Page-based Implementations, which implement DSM as a region of virtual memory occupying the same address range in the address space of every participating process; and
Library-based Implementations, where DSM is implemented through communication between instances of the language run-time.
Tanenbaum [Tanenbaum 95] includes a figure showing all shared memory machines in relation to one another, as in Figure 6. The three types of DSM shown there, managed by MMU (Memory Management Unit, a hardware device), managed by OS, and managed by language runtime system, match the categories given by both [Nitzberg et al. 94] and [Coulouris et al. 93]. However, in his book [Tanenbaum 95] Tanenbaum concentrates on software-based systems, since he considers that hardware-based solutions do not constitute true DSM.
Figure 6. The spectrum of shared memory machines [Tanenbaum 95]. (The original figure ranges from tightly coupled machines with remote access in hardware to loosely coupled machines with remote access in software: single-bus and switched multiprocessors with hardware-controlled caching, e.g. the Sequent, Firefly, Dash and Alewife, transferring cache blocks; NUMA machines managed by the MMU, e.g. Cm* and the Butterfly, transferring blocks and pages; page-based DSM managed by the OS, e.g. Ivy and Mirage, transferring pages; and shared-variable and object-based DSM managed by the language runtime system, e.g. Munin, Midway, Linda and Orca, transferring data structures and objects.)

The software-based solutions discussed in [Tanenbaum 95] are essentially divided into the same areas as the software implementations listed by the other authors above, with the addition of shared-variable techniques. These are an extension of the page-based solutions, where a suitably annotated section containing the shared variables is shared. Tanenbaum's categories follow: Page-Based Techniques; Shared Variable Techniques; and Object Based Techniques.


3.2.1 Page-Based Techniques (IVY type, Dash)
The syntax of memory access in this type of DSM is the same as that in a shared memory multiprocessor. Variables are directly referenced unless they are shared by more than one process, in which case they have to be protected explicitly by locks. Page-based DSM is an attempt to emulate a multiprocessor cache. The total address space is subdivided into equal-sized chunks which are spread over the machines in the system. A request by a process to access a non-local piece of memory results in a page fault; a trap occurs and the DSM software fetches the required page of memory and restarts the instruction.
In page-based DSM a decision has to be made whether to replicate pages or to maintain only one copy of any page and move it around the network. In the latter case a situation can arise where a page is moved backwards and forwards between two or more machines which share data on the same page and are accessing it often; this can drastically increase network traffic and reduce performance. Replication of pages can reduce the traffic; however, consistency must then be maintained between the replicated pages.
Some of the consistency models based on those used for cache consistency can be used in distributed systems using page-based DSM. The weakening of consistency models can improve performance.
The granularity of the pages has to be decided before implementation. A substantial overhead in the transportation of data across the network is the setup time; hence the larger the page, the cheaper it is to transport across the network [Tanenbaum 95]. Another advantage of large pages is that processes are less likely to require more than one page. However, large pages can cause false sharing, where two processes use two unrelated variables on the same page [Nitzberg et al. 94]. This can result in the page moving backwards and forwards between the two machines unnecessarily. A solution for this is for the compiler to anticipate false sharing and locate the variables appropriately in the address space. Smaller pages may prevent false sharing but increase the chance that more than one process will require the pages. Page-based distributed shared memory is a simple, familiar, well-understood model which is easily implemented and can run existing multiprocessor programs. However, the biggest issue with this implementation is that it can exhibit poor performance, because of the network traffic generated by false sharing and strict consistency protocols [Tanenbaum 95].
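The page-fault path described above can be sketched as follows (hypothetical Python; the 4 KB page size, class and parameter names are assumptions):

```python
PAGE_SIZE = 4096   # granularity chosen before implementation

class PagedDSM:
    def __init__(self, fetch_remote):
        self.local_pages = {}
        self.fetch_remote = fetch_remote     # the network fetch: the expensive step

    def read(self, address):
        page_no, offset = divmod(address, PAGE_SIZE)
        if page_no not in self.local_pages:              # a "page fault": trap
            self.local_pages[page_no] = self.fetch_remote(page_no)
        return self.local_pages[page_no][offset]         # restart the access

dsm = PagedDSM(fetch_remote=lambda n: bytes(PAGE_SIZE))  # remote pages of zeros
assert dsm.read(2 * PAGE_SIZE + 7) == 0
assert 2 in dsm.local_pages   # the whole page was brought over, not one byte
```

The last line is the source of false sharing: a single byte access pulls the entire page, including any unrelated variables that happen to share it.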
Tanenbaum gives IVY-type implementations as examples of page-based solutions. IVY, as described in [Hellwagner 1990] and [Li et al. 89], is regarded as the pioneering work on DSM. Memory mapping managers are used to map between the addresses, the local memories and the distributed memory. The unit of granularity is the page. The memory coherence protocol used in IVY is derived from the Berkeley cache coherence protocol [Tam et al. 90]. It uses page ownership and invalidation. IVY was developed in an attempt to explore whether DSM was feasible; Hellwagner [Hellwagner 1990] states that this was shown even when the DSM was implemented in software.
3.2.2 Shared Variable Techniques (Munin, Midway)
In shared variable techniques only the variables and data structures required by more than one process are shared. The problems associated with this technique are very similar to those of maintaining a distributed database which only contains shared variables [Tanenbaum 95].
In the current implementations of the shared-variable technique the shared variables are identified as type shared [Tanenbaum 95]. Synchronization for mutual exclusion is achieved using special synchronization variables. This makes synchronization the responsibility of the programmer.
The replication of the shared variables brings with it the problem of how to maintain consistency. While updating a page requires the rewriting of the whole page, in the shared-variable implementation an update algorithm can be used to update individually controlled variables [Tanenbaum 95]. Nevertheless, a consistency protocol has to be decided upon when the system is being implemented.
This approach is a step towards ordering shared memory in a more structured way than page-based systems [Tanenbaum 95]. However, programmers must still provide information about which variables are shared and control access to shared variables through semaphores and locks, which makes programming more difficult, and it is possible for the programmer to compromise consistency.
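The programming model, in which the programmer supplies explicit locks for variables declared shared, can be illustrated with ordinary threads (a local sketch of the discipline, not a DSM implementation):

```python
from threading import Lock, Thread

counter = 0            # imagine this declared as a shared variable
counter_lock = Lock()  # its associated synchronization variable

def worker():
    global counter
    for _ in range(1000):
        with counter_lock:   # programmer-supplied mutual exclusion
            counter += 1

threads = [Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
assert counter == 4000   # correct only because every access was locked
```

Forgetting the lock is exactly the kind of programmer error through which, as noted above, consistency can be compromised.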
Munin is given by Tanenbaum as an example of shared-variable based DSM. It is described in [Bennett et al. 90] and [Hellwagner 1990]. The address space in each machine is divided into shared and private address space. The latter contains the runtime memory coherence structures and the global shared memory map. The memory on each machine is viewed as a separate segment. Each data item in Munin is provided with a memory coherence mechanism suitable for the access it requires. The type of each data item is to be supplied by either the user or a smart compiler. A memory fault will cause the runtime system to check the object's type and call a suitable mechanism to handle that object type.
Midway [Bershad et al. 93] is another example, given by Tanenbaum, of the shared-variable technique. Bershad et al. write that Midway supports multiple consistency models in each program, which may all be active at the same time, i.e. processor consistency, release consistency, or entry consistency. The implementation of Midway is made up of three main components [Bershad et al. 93]: a set of keywords and function calls used to annotate a parallel program; a compiler which will generate and maintain reference information; and a runtime system to implement several consistency models. Tanenbaum [Tanenbaum 95] says programs in Midway are basically conventional programs written in C, C++ or ML, with certain additional information provided by the programmer.
3.2.3 Object Based Techniques (Orca, Linda)
An object can be defined as a programmer-defined encapsulated data structure which comprises data and methods [Tanenbaum 95]. In object-based DSM, memory can be conceptualized as an abstract space filled with objects. Processes on multiple machines share these objects. The management and location of objects are handled at the level of the object operation by the operating system. Thus the programmer does not have to worry about synchronization, as this is handled implicitly in the object definition. One of the issues to be decided before implementation is whether to allow replication. If it is not allowed, the data can be thought of as immutable and all access must be through the methods associated with the only copy of each object. This may lead to poor performance. Objects can instead be allowed to migrate or be replicated. Migration can improve performance by moving objects to where they are needed. Replication can improve performance even further, but consistency has to be taken into consideration.
Tanenbaum [Tanenbaum 95] cites Orca and Linda as examples of object-based distributed shared memory. The shared data in Orca are encapsulated in data objects, which are instances of user-defined data types. The authors of the Orca language say the key idea in Orca is to access shared data structures through higher-level operations. Thus, the programmer is able to define operations to manipulate the data structures. Only these operations can be carried out upon the encapsulated data objects. The parallel processes are created through the explicit creation of sequential processes that execute in parallel with one another [Bal et al. 92]. Linda [Carriero et al. 86] is a language which consists of four simple operators which are added to a host language to turn it into a parallel programming language. Linda processes communicate through a globally shared collection of ordered tuples; these tuples exist in a global abstract tuple space. Despite Tanenbaum's apparent enthusiasm for the object-based model, the authors of Linda say the object-based model is powerful and attractive, but utterly irrelevant to parallelism [Carriero et al. 86], [Tanenbaum 95].
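Linda's four operators are out, in, rd and eval. The first three can be imitated with a minimal, hypothetical Python sketch of a tuple space (eval, which spawns processes, is omitted; None in a pattern acts as a wildcard):

```python
tuple_space = []   # the globally shared collection of ordered tuples

def out(*tup):
    """Deposit a tuple into the tuple space."""
    tuple_space.append(tup)

def _match(tup, pattern):
    return len(tup) == len(pattern) and all(
        p is None or p == f for p, f in zip(pattern, tup))

def in_(*pattern):
    """Withdraw the first matching tuple (Linda's `in`)."""
    for tup in tuple_space:
        if _match(tup, pattern):
            tuple_space.remove(tup)
            return tup

def rd(*pattern):
    """Read the first matching tuple without removing it."""
    return next(t for t in tuple_space if _match(t, pattern))

out("job", 1)
out("job", 2)
assert rd("job", None) == ("job", 1)    # read, tuple stays
assert in_("job", 1) == ("job", 1)      # withdraw, tuple removed
assert tuple_space == [("job", 2)]
```

In real Linda, in and rd block until a matching tuple appears, which is how processes synchronize through the tuple space.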

4 Conclusion
Since one of the main goals of distributed systems is transparency, achieving it when DSM is implemented is only possible if the use of the shared memory is completely invisible. Hellwagner [Hellwagner 1990] says that the overall objective in DSM is to reduce the access time for non-local memory to as close as possible to the access time of local memory. Thus, the major research push appears to be in the area of the reduction of access times for the distributed memory. This has been looked at from the perspective of various consistency models, data location and access methods, and the granularity and structure of the shared data. All of these are related to a reduction in the number of messages being sent between the distributed machines, since this is seen as the major overhead. Centralized systems monitoring the shared data and controlling the access have been implemented and have been largely rejected because they can cause bottlenecks and require a large number of messages between the nodes and the central server. The move to distributed control of the shared memory and the replication of shared data has brought with it many problems related to the maintenance of coherence. The relaxation of consistency models has led to a reduction in the number of messages but a complication of the programming model, particularly with the addition of explicit synchronization primitives to identify and control access to shared data. Further research could be done in this area, particularly with reference to the use of distributed synchronization algorithms and the use of lazy synchronization techniques. In the latter, a lock on a critical region is not released by a process until requested by another process.

5 Acknowledgements
I would like to thank my supervisor Professor Andrzej Goscinski and all the members of the RHODOS group for their assistance in this paper.

6 Bibliography
[Adve et al. 91] Sarita Adve, Mark Hill, Barton Miller, Robert Netzer, Detecting Data Races on Weak Memory Systems, Proceedings 18th Annual International Symposium on Computer Architecture, May 1991.


[Bal et al. 92] H. Bal, M. Kaashoek, A. Tanenbaum, Orca: A Language for Parallel Programming of Distributed Systems, IEEE Transactions on Software Engineering, Vol. 18, No. 3, March 1992.
[Bennett et al. 90] J. Bennett, J. Carter, W. Zwaenepoel, Munin: Distributed Shared Memory Based on Type-Specific Memory Coherence, Proceedings Second ACM Symposium on Principles and Practice of Parallel Programming, ACM, pp 596-615, 1990.
[Bershad et al. 91] Brian N. Bershad, Matthew J. Zekauskas, Shared Memory Parallel Programming with Entry Consistency for Distributed Memory Multiprocessors, CMU Technical Report CMU-CS-91-170, September 1991.


[Bershad et al. 93] B. Bershad, M. Zekauskas, W. Sawdon, The Midway Distributed Shared Memory System, Proceedings IEEE COMPCON Conference, IEEE, pp 528-537, 1993.
[Carriero et al. 86] Nicholas Carriero, David Gelernter, The S/Net's Linda Kernel, ACM Transactions on Computer Systems, Vol. 4, No. 2, May 1986, pp 110-129.


[Coulouris et al. 93] G. F. Coulouris, J. Dollimore, T. Kindberg, Distributed Systems: Concepts and Design, Addison-Wesley Publishing Company.


[Hellwagner 1990] H. Hellwagner, A Survey of Virtually Shared Memory Systems, TUM-19056, Technische Universitat Munchen, Dec. 1990.
[Jul et al. 88] E. Jul, H. Levy, N. Hutchinson, A. Black, Fine-Grained Mobility in the Emerald System, ACM Transactions on Computer Systems, Vol. 6, No. 1, February 1988, pp 109-133.
[Lamport 79] L. Lamport, How to Make a Multiprocessor Computer that Correctly Executes Multiprocess Programs, IEEE Transactions on Computers, Vol. C-28, September 1979, pp 690-691.
[Levelt et al. 92] W.G. Levelt, M.F. Kaashoek, H.E. Bal, and A.S. Tanenbaum, A Comparison of Two Paradigms for Distributed Shared Memory, Software--Practice and Experience, Vol. 22, Nov. 1992, pp 985-1010.
[Li et al. 89] K. Li, P. Hudak, Memory Coherence in Shared Virtual Memory Systems, ACM Transactions on Computer Systems, Vol. 7, No. 4, pp 321-359, November 1989.
[Li et al. 88] Kai Li, Michael Stumm, David Wortman, Shared Virtual Memory Accommodating Heterogeneity, Technical Report CS-TR-210-89, February 1989.
[Libes 85] Don Libes, User-Level Shared Variables, Proceedings, Tenth USENIX Conference, Summer 1985.
[Mosberger 94] D. Mosberger, Memory Consistency Models, Tech. Report TR 93/11, Dept. of Computer Science, Univ. of Arizona, 1993.
[Nitzberg et al. 94] B. Nitzberg, V. Lo, Distributed Shared Memory: A Survey of Issues and Algorithms, in Casavant, T.L. and Singhal, M. (eds), Readings in Distributed Computing Systems, IEEE Press, 1994, pp 375-386.
[Ramachandran et al. 91] U. Ramachandran, M. Khalidi, An Implementation of Distributed Shared Memory, Software--Practice and Experience, Vol. 21(5), John Wiley and Sons, Ltd., May 1991.
[Sinha 93] Himansu Shekhar Sinha, Mermera: Non-Coherent Distributed Shared Memory for Parallel Computing, PhD Thesis, Boston University, Boston, April 1993.
[Stumm et al. 1990] M. Stumm, S. Zhou, Algorithms Implementing Distributed Shared Memory, IEEE Computer, Vol. 23, No. 5, May 1990, pp 54-64.
[Tam et al. 90] M. Tam, J. Smith, D. Farber, A Survey of Distributed Shared Memory Systems, ACM SIGOPS, June 1990.
[Tanenbaum 95] A. Tanenbaum, Distributed Operating Systems, Prentice Hall, 1995.
[Zekauskas et al. 94] Matthew J. Zekauskas, Wayne A. Sawdon and Brian N. Bershad, Software Write Detection for a Distributed Shared Memory, Proceedings of the First Symposium on Operating Systems Design and Implementation (OSDI), 1994.
