
Chapter 4 - Naming: Distributed Systems (IT 441)

This document discusses naming in distributed systems. Naming is fundamental as it allows resources like files, users, and services to be identified and located. There are three key points: 1. Names are mapped to identifiers and addresses to locate resources. Names are human-friendly while addresses specify locations. Naming services resolve names to addresses. 2. Name spaces organize the structure of names in a graph with directories and leaf nodes. Directories group related names. 3. Resolution maps names to identifiers and addresses using bindings stored in name servers. This allows resources to be found regardless of location changes.

Uploaded by

hiwot kebede

Distributed Systems

(IT 441)

Chapter 4 - Naming
Objectives of the Chapter

 we discuss how
 human-friendly names are organized and implemented;
 names are used to locate mobile entities;
 names that are no longer used are removed, also called garbage
collection
Which one is easier for humans, and which one for machines?
Why?
 74.125.237.83 or google.com
 128.250.1.22 or distributed systems website
 128.250.1.25 or Prof. Buyya
 Disk 4, Sector 2, block 5 OR /usr/raj/hello.c

 3
3
4. Introduction
 In a distributed system, names are used to refer to a wide variety of
resources such as:
 Computers, services, remote objects, and files, as well as users.
 Naming is a fundamental issue in DS design as it facilitates
communication and resource sharing.
 A name in the form of URL is needed to access a specific web
page.
 Processes cannot share particular resources managed by a
computer system unless they can name them consistently
 Users cannot communicate with one another via a DS unless
they can name one another, e.g., with email addresses.
 Names are not the only useful means of identification: descriptive
attributes are another.

What are Naming Services?
 How do Naming Services facilitate communication and resource
sharing?
 A URL makes it possible to locate a resource exposed on
the Web.
 e.g., abc.net.au suggests that it is likely to be an Australian entity
 Consistent and uniform naming helps processes in a
distributed system to interoperate and manage resources.
 e.g., commercial organizations use .com; non-profit organizations use
.org
 Users refer to each other by means of their names (e.g., email addresses)
rather than their system ids
 Naming Services are not only useful to locate resources but
also to gather additional information about them such as
attributes

What are Naming Services?
 Definition
In a Distributed System, a Naming Service is a specific
service whose aim is to provide a consistent and uniform
naming of resources, thus allowing other programs or
services to locate them and obtain the required
metadata for interacting with them.

 Key benefits
 Resource localization
 Uniform naming
 Device independent address (e.g., you can move domain
name/web site from one server to another server seamlessly).
The role of names and name services

 Resources are accessed using an identifier or reference


 An identifier can be stored in variables and retrieved from tables quickly
 Identifier includes or can be transformed to an address for an object
 E.g. NFS file handle, CORBA remote object reference
 A name is a human-readable value (usually a string) that can be resolved to an
identifier or address
 Internet domain name, file pathname, process number
 E.g., /etc/passwd, http://www.cdk5.net/
 For many purposes, names are preferable to identifiers
 because the binding of the named resource to a physical location is deferred
and can be changed
 because they are more meaningful to users
 Resource names are resolved by name services
 to give identifiers and other useful attributes

Role of Names and Naming Services
- Name Resolution

 66.102.11.10
4

 Client
name IP attributes
www.google.com
www.hotmail.com
……..

 Naming
 100.109.23.
Service 104

Cont..
 names play an important role to:
 share resources
 uniquely identify entities
 refer to locations
 etc.
 an important issue is that a name can be resolved to the entity
it refers to
 to resolve names, it is necessary to implement a naming
system
 in a distributed system, the implementation of a naming
system is itself often distributed, unlike in nondistributed
systems
 efficiency and scalability of the naming system are the main
issues

4.1 Naming Entities
 Names, Identifiers, and Addresses
 a name in a distributed system is a string of bits or
characters that is used to refer to an entity
 an entity is anything; e.g., resources such as hosts, printers,
disks, files, objects, processes, users, ...
 entities can be operated on; e.g., a resource such as a printer
offers an interface containing operations for printing a
document, requesting the status of a job, ...
 to operate on an entity, it is necessary to access it through
its access point, itself an entity (special)

Identity

 A name that uniquely identifies an entity

 E.g., your SSN (ID No.)

 Identifier properties:
 An identifier refers to at most one entity

 An identifier always refers to the same entity (i.e., it is never reused)


 identifiers allow us to unambiguously refer to an entity

Internet Centric View
 Addresses:
 Says how to reach an object; it has location semantics
associated with it
 Usually, a format easy to process by computers
 Name:
 Does not have any location semantics associated to it
 Usually, a format easier to understand/read/remember by
people
 Examples:
 IP address: 169.229.131.109
 Name: arachne.berkeley.edu

Naming Systems

 Flat Naming
 Resolves identifiers to addresses

 Structured Naming
 Resolves structured human-friendly names to addresses

 Attribute-based Naming
 Resolves descriptive names to addresses

Name Service
 Name space: defines the set of possible names and their relationships
 Hierarchical (e.g., Unix and Windows file names)
 Flat
 Bindings: the mapping between names and values (e.g., addresses or
other names)
 Bindings can be implemented by using tables
 Resolution: procedure that, when invoked with a name, returns the
corresponding value
 Name server: specific implementation of a resolution mechanism that
is available on the network and that can be queried by sending
messages
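The notions of binding and resolution can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: the class `NameService` and its methods are hypothetical names, the table is kept in memory rather than behind a network interface, and the bindings reuse host names and addresses that appear elsewhere in these slides.

```python
class NameService:
    """Toy name service: a flat table of bindings (hypothetical API)."""

    def __init__(self):
        # Bindings: name -> value (an address or another name).
        self.bindings = {}

    def bind(self, name, value):
        """Add or replace the binding for a name."""
        self.bindings[name] = value

    def resolve(self, name):
        """Return the value bound to the name, or None if it is unbound."""
        return self.bindings.get(name)


service = NameService()
service.bind("arachne.berkeley.edu", "169.229.131.109")
service.bind("www.berkeley.edu", "169.229.131.109")  # two names, one address

address = service.resolve("arachne.berkeley.edu")
unknown = service.resolve("no.such.name")
print(address, unknown)
```

A real name server would expose `resolve` through messages on the network; the table lookup itself stays the same.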

Binding and Resolution in the Internet

 In general there are multiple mappings

 Host name: arachne.berkeley.edu
↓ DNS resolution
 IP address: 169.229.131.109
↓ ARP (Address Resolution Protocol)
 Ethernet MAC address: 12.34.56.78.90.12


Mapping

 Multiple names can map onto the same address


 Example: www.berkeley.edu and arachne.berkeley.edu map
to the same machine (i.e., the same IP address)

 One name can map onto multiple addresses


 Example: www.yahoo.com can be mapped to multiple
machines

 access point
 the name of an access point is called an address (such as
IP address and port number as used by the transport layer)
 the address of the access point of an entity is also referred
to as the address of the entity
 an entity can have more than one access point (similar to
accessing an individual through different telephone
numbers)
 an entity may change its access point in the course of time
(e.g., a mobile computer getting a new IP address as it
moves)
 an address is a special kind of name

 it refers to at most one entity


 each entity is referred to by at most one address, even when
replicated such as in Web pages
 an entity may change its access point, or an access point
may be reassigned to a different entity (like telephone
numbers in offices)
 separating the name of an entity from its address is
easier and more flexible; such a name is called location
independent

 Examples
 name of an FTP server (entity)

 URL of the FTP server


 address of the FTP server
 IP number:port number
 the address of the FTP server may change

4.2 Name Spaces and Name Resolution
 names in a distributed system are organized into a name space
 a name space is generally organized as a labeled, directed
graph with two types of nodes
 leaf node: represents the named entity and stores
information such as its address or the state of that entity
 directory node: a special entity that has a number of outgoing
edges, each labeled with a name

a general naming graph with a single root node


 each node in a naming graph is considered as another entity
with an identifier
 a directory node stores a table in which an outgoing edge is
represented as a pair (edge label, node identifier), called a
directory table
 each path in a naming graph can be referred to by the
sequence of labels corresponding to the edges of the path
and the first node in the path, such as
N:<label1, label2, ..., labeln>, where N refers to the first
node in the path
 such a sequence is called a path name
 if the first node is the root of the naming graph, it is called an
absolute path name; otherwise it is a relative path name
 instead of the path name n0:<home, steen, mbox>, we often
use its string representation /home/steen/mbox
 there may also be several paths leading to the same node,
e.g., node n5 can be represented as /keys or
/home/steen/keys
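Path name resolution over directory tables can be sketched as follows. This is a minimal sketch, not a real file system: only n0 and n5 are named in the text, so the other node identifiers (n1, n4, n7) and the leaf contents are assumptions made for illustration.

```python
# Directory tables: node identifier -> {edge label: node identifier}.
# Only n0 and n5 come from the text; the other identifiers are made up.
directory_tables = {
    "n0": {"home": "n1", "keys": "n5"},
    "n1": {"steen": "n4"},
    "n4": {"mbox": "n7", "keys": "n5"},
}

# Leaf nodes store information about the entity (placeholder strings here).
leaf_contents = {"n5": "contents of keys", "n7": "contents of mbox"}


def resolve(start, labels):
    """Follow the path name start:<labels...> and return the node reached."""
    node = start
    for label in labels:
        table = directory_tables.get(node)
        if table is None or label not in table:
            raise KeyError(f"cannot resolve {label!r} at node {node}")
        node = table[label]
    return node


def resolve_absolute(path, root="n0"):
    """Resolve a string such as /home/steen/mbox starting at the root."""
    labels = [part for part in path.split("/") if part]
    return resolve(root, labels)


mbox = resolve_absolute("/home/steen/mbox")
keys_a = resolve_absolute("/keys")
keys_b = resolve_absolute("/home/steen/keys")
relative = resolve("n1", ["steen", "keys"])  # relative path name n1:<steen, keys>
print(mbox, keys_a, keys_b, relative)
```

Both /keys and /home/steen/keys end at the same node because two directory tables hold an edge to n5.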
 Name Resolution
 given a path name, the process of looking up the information stored
in the node it refers to is called name resolution; it consists of
finding the address when the name is given (by following
the path)
 Linking and Mounting
o Linking: giving another name for the same entity (an alias)
e.g., environment variables in UNIX such as HOME that
refer to the home directory of a user
 two types of links (or two ways to implement an alias):
o hard link: to allow multiple absolute path names to
refer to the same node in a naming graph
e.g., in the previous graph, there are two different path
names for node n5: /keys and /home/steen/keys

[Figure: a hard link in the naming graph]
o symbolic link: representing an entity by a leaf node and
instead of storing the address or state of the entity, the
node stores an absolute path name

the concept of a symbolic link explained in a naming graph

 when resolving an absolute path name that leads to such a
node (e.g., /home/steen/keys, which leads to node n6), name resolution
first returns the path name stored in the node (/keys), at
which point it continues by resolving that new path
name
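The two-step behaviour of a symbolic link can be sketched as below. The sketch assumes that n6 stores the path name /keys as in the example; the intermediate node identifiers (n1, n4), the limit `MAX_LINKS`, and the simplification that only the final node of a path may be a link are assumptions made for illustration.

```python
# Directory tables: node identifier -> {edge label: node identifier}.
directory_tables = {
    "n0": {"home": "n1", "keys": "n5"},
    "n1": {"steen": "n4"},
    "n4": {"keys": "n6"},
}

# n6 is a leaf node that stores an absolute path name instead of an address.
symbolic_links = {"n6": "/keys"}

MAX_LINKS = 8  # hypothetical guard against links that point at each other


def resolve(path, root="n0", depth=0):
    """Resolve an absolute path name, following a symbolic link at its end."""
    if depth > MAX_LINKS:
        raise RuntimeError("too many symbolic links")
    node = root
    for label in path.strip("/").split("/"):
        node = directory_tables[node][label]
    if node in symbolic_links:
        # Resolution restarts with the path name stored in the link node.
        return resolve(symbolic_links[node], root, depth + 1)
    return node


via_link = resolve("/home/steen/keys")
direct = resolve("/keys")
print(via_link, direct)
```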
 so far name resolution was discussed as taking place
within a single name space
 name resolution can also be used to merge different name
spaces in a transparent way
 two methods: mounting and adding a new root node and
making the existing root nodes its children
1. Mounting
 as an example, consider a mounted file system, which
can be generalized to other name spaces as well
 let a directory node store the identifier of a directory node from a
different (foreign) name space
 the directory node storing the node identifier is called a
mount point
 the directory node in the foreign name space is called a
mounting point, normally the root of a name space
 during name resolution, the mounting point is looked up
and resolution proceeds by accessing its directory table
 consider a collection of name spaces distributed across
different machines (each name space implemented by a
different server)
 to mount a foreign name space in a DS, the following are at
least required
 the name of an access protocol (for communication)
 the name of the server
 the name of the mounting point
 each of these names needs to be resolved
 to the implementation of the protocol
 to an address where the server can be reached
 to a node identifier in the foreign name space
 the three names can be listed as a URL

 example: Sun’s Network File System (NFS) is a distributed file
system with a protocol that describes how a client can access
a file stored on a (remote) NFS file server
 an NFS URL may look like nfs://flits.cs.vu.nl/home/steen
- nfs is the name of the access protocol
- flits.cs.vu.nl is a server name to be resolved using DNS
- /home/steen is resolved by the server
 e.g., the subdirectory /remote includes mount points for
foreign name spaces on the client machine
 a directory node named /remote/vu is used to store
nfs://flits.cs.vu.nl/home/steen
 consider /remote/vu/mbox
 this name is resolved by starting at the root directory on
the client’s machine until node /remote/vu, which returns
the URL nfs://flits.cs.vu.nl/home/steen
 this leads the client machine to contact flits.cs.vu.nl
using the NFS protocol
 then the file mbox is read in the directory /home/steen
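The resolution of /remote/vu/mbox can be sketched with Python's standard `urllib.parse` module. The sketch only splits the names; it does not contact any server, and the helper names `split_at_mount_point` and `plan_remote_lookup` are hypothetical.

```python
from urllib.parse import urlsplit

# Mount point in the client's local name space (taken from the example).
mount_points = {"/remote/vu": "nfs://flits.cs.vu.nl/home/steen"}


def split_at_mount_point(path):
    """Return (mount URL, remaining path) for a path below a mount point."""
    for mount_point, url in mount_points.items():
        if path == mount_point or path.startswith(mount_point + "/"):
            return url, path[len(mount_point):]
    return None, path


def plan_remote_lookup(path):
    """Describe which protocol, server, and remote path a lookup needs."""
    url, rest = split_at_mount_point(path)
    if url is None:
        return None  # purely local name, nothing to mount
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,          # resolved to a protocol implementation
        "server": parts.hostname,          # resolved to an address, e.g. via DNS
        "remote_path": parts.path + rest,  # resolved by the remote server
    }


plan = plan_remote_lookup("/remote/vu/mbox")
local = plan_remote_lookup("/tmp/notes.txt")
print(plan, local)
```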
Linking and Mounting
[Figure: name space A contains a mount point that stores the protocol, the server, and the mounting point in name space B]

Mounting remote name spaces through a specific access protocol (e.g., NFS) to merge different name spaces
2. Add a new root node and make the existing root nodes its
children
 a method followed in GNS (Global Name Service by DEC)
 problem: existing names need to be changed
e.g., the absolute path name /home/steen has now changed
to a relative path name and corresponds to the absolute
path name /vu/home/steen
 hence the system must expand n0:/home/steen to
/vu/home/steen without the awareness of users
 this requires storing a mapping table (with entries such as
n0 → vu) when a new root node is added
 merging thousands of name spaces may lead to
performance problems

A different approach to merge name spaces (with
scalability problems)

[Figure: organization of the DEC Global Name Service, showing the new root node and its mapping table]

Names in GNS always include the id of the node from where resolution should start
 The Implementation of a Name Space
 a name space forms the heart of a naming service
 a naming service allows users and processes to add,
remove, and lookup names
 a naming service is implemented by name servers
 for a distributed system on a single LAN, a single server
might suffice; for a large-scale distributed system the
implementation of a name space is distributed over multiple
name servers
 Name Space Distribution
 in large scale distributed systems, it is necessary to
distribute the name service over multiple name servers,
usually organized hierarchically
 a name service can be partitioned into logical layers
 the following three layers can be distinguished

 global layer
 formed by highest level nodes (root node and nodes close
to it or its children)
 nodes on this layer are characterized by their stability, i.e.,
directory tables are rarely changed
 they may represent organizations, groups of
organizations, ..., where names are stored in the name
space
 administrational layer
 groups of entities that belong to the same organization or
administrational unit, e.g., departments
 relatively stable
 managerial layer
 nodes that may change regularly, e.g., nodes representing
hosts of a LAN, shared files such as libraries or binaries,
...
 nodes are managed not only by system administrators, but
also by end users
an example partitioning of the DNS name space, including Internet-accessible
files, into three layers
 the name space is divided into nonoverlapping parts, called
zones in DNS
 a zone is a part of the name space that is implemented by a
separate name server
 some requirements of servers at different layers
 performance (responsiveness to lookups), availability (failure
rate), etc.
 high availability is critical for the global layer, since name
resolution cannot proceed beyond the failing server; it is also
important at the administrational layer for clients in the same
organization
 performance is very important in the lowest layer, where users
expect immediate answers; in the higher layers, results of lookups
can be cached and reused because those layers are relatively stable
 availability and performance may be enhanced by client-side caching (global and
administrational layers, since names do not change often)
and replication; these create implementation problems since
they may introduce inconsistency (see Chapter 6)
Item                            | Global    | Administrational | Managerial
Geographical scale of network   | Worldwide | Organization     | Department
Total number of nodes           | Few       | Many             | Vast numbers
Responsiveness to lookups       | Seconds   | Milliseconds     | Immediate
Update propagation              | Lazy      | Immediate        | Immediate
Availability requirement        | Very high | High             | Low
Number of replicas              | Many      | None or few      | None
Is client-side caching applied? | Yes       | Yes              | Sometimes

a comparison between name servers for implementing nodes from a large-scale
name space partitioned into a global layer, an administrational
layer, and a managerial layer
Simple DNS Example
Host whistler.cs.cmu.edu wants the IP
address of www.berkeley.edu
1. Contacts its local DNS server,
mango.srv.cs.cmu.edu
2. mango.srv.cs.cmu.edu contacts the
root name server, if necessary
3. Root name server contacts the
authoritative name server,
ns1.berkeley.edu, if necessary

[Figure: the requesting host whistler.cs.cmu.edu, the local name server mango.srv.cs.cmu.edu, the root name server, and the authoritative name server ns1.berkeley.edu for www.berkeley.edu, connected by messages numbered 1 to 6]
 Implementation of Name Resolution
 recall that name resolution consists of finding the address
when the name is given
 assume that name servers are not replicated and that no
client-side caches are allowed
 each client has access to a local name resolver, responsible
for ensuring that the name resolution process is carried out
 e.g., assume the path name
root:<nl, vu, cs, ftp, pub, globe, index.txt>
is to be resolved
or using a URL notation, this path name would correspond
to ftp://ftp.cs.vu.nl/pub/globe/index.txt
 two ways of implementing name resolution
 iterative name resolution
 recursive name resolution

 Iterative
 a name resolver hands over the complete name to the root name
server
 the root server will resolve the name as far as it can and return the
result to the client; each layer resolves as much as it can and
returns the address of the next name server
 at the minimum it can resolve the first level and sends the name of
the first level name server to the client
 the client calls the first level name server, then the second, ..., until
it finds the address of the entity

the principle of iterative name resolution
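Iterative resolution of the path name <nl, vu, cs, ftp> can be sketched as a loop run by the client's resolver. The server names, the final address, and the rule that each server resolves exactly one label are placeholders chosen for illustration; no real name servers are contacted.

```python
# Each simulated name server resolves one label and names the next server.
# Server names and the final address are placeholders, not real endpoints.
servers = {
    "root": {"nl": "server-nl"},
    "server-nl": {"vu": "server-vu"},
    "server-vu": {"cs": "server-cs"},
    "server-cs": {"ftp": "address-of-ftp"},
}


def ask(server, labels):
    """One server resolves as much as it can: here, exactly one label."""
    head, rest = labels[0], labels[1:]
    return servers[server][head], rest


def resolve_iteratively(labels):
    """The client's resolver contacts each server in turn."""
    server, contacted = "root", []
    while labels:
        contacted.append(server)
        server, labels = ask(server, labels)
    # After the last label, the value returned is the entity's address.
    return server, contacted


address, contacted = resolve_iteratively(["nl", "vu", "cs", "ftp"])
print(address, contacted)
```

Every request in `contacted` is issued by the client itself, which is what distinguishes this scheme from recursive resolution.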


 Recursive
 a name resolver hands over the whole name to the root name
server
 the root server will try to resolve the name and if it can’t, it
requests the first level name server to resolve it and to return
the address
 the first level will do the same thing recursively

the principle of recursive name resolution


Advantages and drawbacks
 recursive name resolution puts a higher performance
demand on each name server; hence name servers in the
global layer support only iterative name resolution
 caching is more effective with recursive name resolution;
each name server gradually learns the address of each
name server responsible for implementing lower-level
nodes; eventually lookup operations can be handled
efficiently
Server for node | Should resolve | Looks up | Passes to child | Receives and caches           | Returns to requester
cs              | <ftp>          | #<ftp>   | --              | --                            | #<ftp>
vu              | <cs,ftp>       | #<cs>    | <ftp>           | #<ftp>                        | #<cs>, #<cs,ftp>
nl              | <vu,cs,ftp>    | #<vu>    | <cs,ftp>        | #<cs>, #<cs,ftp>              | #<vu>, #<vu,cs>, #<vu,cs,ftp>
root            | <nl,vu,cs,ftp> | #<nl>    | <vu,cs,ftp>     | #<vu>, #<vu,cs>, #<vu,cs,ftp> | #<nl>, #<nl,vu>, #<nl,vu,cs>, #<nl,vu,cs,ftp>

recursive name resolution of <nl, vu, cs, ftp>; name servers cache
intermediate results for subsequent lookups
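The caching behaviour of recursive resolution can be sketched as follows. The class `NameServer`, its methods, and the address strings are hypothetical; the sketch only mirrors the "receives and caches" and "returns to requester" columns of the table, keyed by path instead of the #<...> notation.

```python
class NameServer:
    """Simulated name server that resolves recursively and caches results."""

    def __init__(self, name):
        self.name = name
        self.table = {}  # label -> (address, child server or None)
        self.cache = {}  # path (tuple of labels) -> address learned from children

    def add(self, label, address, child=None):
        self.table[label] = (address, child)

    def resolve(self, labels):
        """Return {path: address} for every prefix of labels seen from here."""
        head, rest = labels[0], labels[1:]
        address, child = self.table[head]
        result = {(head,): address}
        if rest:
            learned = child.resolve(rest)  # the child does the remaining work
            self.cache.update(learned)     # "receives and caches"
            for path, addr in learned.items():
                result[(head,) + path] = addr
        return result                      # "returns to requester"


cs, vu, nl, root = (NameServer(n) for n in ("cs", "vu", "nl", "root"))
cs.add("ftp", "addr(ftp)")
vu.add("cs", "addr(cs)", cs)
nl.add("vu", "addr(vu)", vu)
root.add("nl", "addr(nl)", nl)

answer = root.resolve(["nl", "vu", "cs", "ftp"])
print(answer[("nl", "vu", "cs", "ftp")], sorted(root.cache))
```

After one lookup, the root already holds the addresses of the servers below nl, which is why caching is said to be more effective in the recursive scheme.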
 communication costs may be reduced in recursive name
resolution

the comparison between recursive and iterative name resolution with
respect to communication costs; assume the client is in Ethiopia and
the name servers in the Netherlands
 Summary
Method    | Advantage(s)
Recursive | Less communication cost; caching is more effective
Iterative | Less performance demand on name servers
 Example 1 - The Domain Name System (DNS)
 one of the largest distributed naming services is the
Internet DNS
 it is used for looking up host addresses and mail servers
 hierarchical, defined in an inverted tree structure with the
root at the top
 the tree can have only 128 levels

 Label
 each node has a label, a string with a maximum of 63
characters (case insensitive)
 the root label is null
 children of a node must have different names (to guarantee
uniqueness)

 Domain Name
 each node has a domain name
 a full domain name is a sequence of labels separated by dots (the last
character is a dot)
 domain names are read from the node up to the root
 full path names must not exceed 255 characters
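The rules for labels and full domain names can be checked with a short function. This sketch applies only the limits listed on these slides (63 characters per label, 255 characters per full name, case-insensitive labels, trailing dot); the function names are hypothetical and real DNS software applies further rules.

```python
MAX_LABEL_LENGTH = 63   # maximum characters per label
MAX_NAME_LENGTH = 255   # maximum characters per full domain name


def split_domain_name(name):
    """Split a full domain name into lower-case labels, checking the limits."""
    if len(name) > MAX_NAME_LENGTH:
        raise ValueError("domain name longer than 255 characters")
    if not name.endswith("."):
        raise ValueError("a full domain name ends with a dot (the null root label)")
    # The name "." alone denotes the root and has no labels.
    labels = name[:-1].split(".") if len(name) > 1 else []
    for label in labels:
        if not 1 <= len(label) <= MAX_LABEL_LENGTH:
            raise ValueError(f"bad label length: {label!r}")
    return [label.lower() for label in labels]


def same_name(a, b):
    """Labels are case insensitive, so compare the lower-case forms."""
    return split_domain_name(a) == split_domain_name(b)


labels = split_domain_name("ftp.cs.vu.nl.")
print(labels, same_name("FTP.cs.VU.nl.", "ftp.cs.vu.nl."))
```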
4.3 Locating Mobile Entities
 the naming services discussed so far are used for naming
entities that have fixed locations
 they are not well suited for supporting name-to-address
mappings that change regularly as is the case in mobile
entities
 mobility could be within the same domain or to a different
domain
 e.g. 1; an ftp server called ftp.cs.vu.nl is moved to a new
machine (but within the same domain)
 update only the DNS database of the name server for
cs.vu.nl; lookups are not affected

 e.g. 2; ftp.cs.vu.nl is moved to a machine named
ftp.cs.unisa.edu.au, which is in a completely different domain
 two solutions to allow users to continue to access the server
 record the address of the new machine in the DNS
database for cs.vu.nl; lookup operations are not affected;
but if ftp.cs.vu.nl moves once again to a different machine,
the database must be updated, making operations on
nodes at the managerial layer less efficient
 record the name of the new machine, instead of its
address, in the DNS database, making ftp.cs.vu.nl a
symbolic link
 lookup operations become less efficient (2 step process)
 but a further movement needs only a local update (make
ftp.cs.unisa.edu.au a symbolic link)
 but there will be another step added for the lookup
operation
 hence, both approaches have drawbacks

 the problem with traditional naming services is that they
maintain a direct mapping between human-friendly names and
the addresses of entities
 each time a name or an address changes, the mapping must
also change
 a better solution is to separate naming from locating entities
by introducing identifiers (an identifier never changes, each entity
has exactly one identifier, and an identifier is never assigned to
a different entity)
 a naming service is used to look up an identifier; it gets a
name as input and returns an identifier as output
 the identifier (obtained from a naming service) can be stored
locally since it does not change
 locating an entity is handled by a location service; it gets an
identifier as input and returns the current address of the
identified entity as output

a) direct, single level mapping between names and addresses
b) two-level mapping using identifiers
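The two-level mapping can be sketched with two tables. This is a minimal sketch: the use of `uuid` to generate identifiers, the function names, and the address strings are assumptions made for illustration only.

```python
import uuid

naming_service = {}    # name -> identifier (stable, can be cached)
location_service = {}  # identifier -> current address (changes on every move)


def register(name, address):
    """Create an identifier for a new entity and record both mappings."""
    identifier = uuid.uuid4().hex
    naming_service[name] = identifier
    location_service[identifier] = address
    return identifier


def move(identifier, new_address):
    """Moving an entity touches only the location service."""
    location_service[identifier] = new_address


def locate(name):
    """Name -> identifier -> current address."""
    return location_service[naming_service[name]]


entity_id = register("ftp.cs.vu.nl", "address-A")
cached_id = naming_service["ftp.cs.vu.nl"]  # a client may store this locally
move(entity_id, "address-B")
print(locate("ftp.cs.vu.nl"), location_service[cached_id])
```

The identifier cached before the move is still valid afterwards; only the second lookup returns a different result.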

 Location Service
 two solutions for LANs: Broadcasting and Multicasting, and
Forwarding Pointers
1. Broadcasting and Multicasting
 a computer that wants to access another computer for
which it knows its IP address, broadcasts this address
 the owner responds by sending its Ethernet address
 used by ARP (Address Resolution Protocol) in the Internet
to find the data link address (MAC address) of a machine
 broadcasting is inefficient when the network grows
(wastage of bandwidth and too much interruption to other
machines)
 multicasting is better when the network grows - send only
to a restricted group of hosts
 multicasting can also be used to locate the nearest replica
- choose the one whose reply comes in first

2. Forwarding Pointers
 when an entity moves from A to B, it leaves behind a
reference to its new location
 advantage
 simple: as soon as an entity has been located using a
traditional naming service, the chain of forwarding
pointers can be followed to find its current address
 drawbacks
 the chain can be too long - locating becomes expensive
 all the intermediary locations in a chain have to maintain
their pointers; vulnerability if links are broken
 hence, making sure that chains are short and that
forwarding pointers are robust is an important issue
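Following a chain of forwarding pointers, and one possible way of keeping chains short, can be sketched as below. The locations A to D are placeholders, and redirecting every visited pointer to the final location is just one illustrative shortening strategy, not a prescribed one.

```python
# location -> next location; the entity currently lives where the chain ends.
forwarding = {"A": "B", "B": "C", "C": "D"}


def locate(start):
    """Follow forwarding pointers from start; return (location, hops)."""
    visited = [start]
    while visited[-1] in forwarding:
        nxt = forwarding[visited[-1]]
        if nxt in visited:
            raise RuntimeError("cycle in forwarding pointers")
        visited.append(nxt)
    return visited[-1], len(visited) - 1


def locate_and_shorten(start):
    """Locate the entity, then point every visited location straight at it."""
    location, hops = locate(start)
    node = start
    while node != location:
        nxt = forwarding[node]
        forwarding[node] = location
        node = nxt
    return location, hops


first = locate_and_shorten("A")  # walks the full chain once
second = locate("A")             # now a single hop
print(first, second)
```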

 Home-Based Approaches
 the previous approaches have scalability problems
 a home location keeps track of the current location of an
entity; often it is the place where an entity was created
 it is a two-tiered approach
 an example of its use is Mobile IP
 each mobile host uses a fixed IP address
 all communication to that IP address is initially directly
sent to the host’s home agent located on the LAN
corresponding to the network address contained in the
mobile host’s IP address
 whenever the mobile host moves to another network, it
requests a temporary address in the new network and
registers the new address with the home agent
 when the home agent receives a message for the mobile
host, it forwards it to the host’s new address and also informs the
sender of the host’s current location for sending further
packets
home-based approach: the principle of Mobile IP

 problems:
 creates communication latency
 the host is unreachable if the home no longer exists
(permanently changed); the solution is to register the home
at a traditional name service
 Hierarchical Approaches
 a generalization of the two-tiered approach into multiple
layers
 a network is divided into a collection of domains, similar to
DNS
 a single top-level domain spans the entire network
 each domain can be subdivided into multiple, smaller
domains
 the lowest-level domain is called a leaf domain; typically a
LAN
 each domain D has an associated directory node dir(D) that
keeps track of the entities in that domain leading to a tree of
directory nodes
 the root (directory) node knows about all entities
hierarchical organization of a location service into domains, each having an
associated directory node

an example of storing information of an entity having two addresses in
different leaf domains

 Pointer Caching
 caching is effective only if the cached data rarely change
 since a mobile entity changes its address regularly, it is not
advisable to cache its address; instead we can cache the
pointers in higher level domains since they don’t change
frequently
 if D is the smallest domain in which a mobile entity moves
regularly, then a lookup operation can start at dir(D); hence
cache dir(D)

caching a reference to a directory node of the lowest-level domain in
which an entity will reside most of the time
4.4 Removing Unreferenced Entities
 when an entity is no longer referenced (by naming and location
services), it must be removed
 facilities for automatically removing unreferenced entities are
called distributed garbage collectors
 The Problem of Unreferenced Objects
 consider remote objects
 an object can be accessed only if there is a remote reference
to it
 an object for which there is no remote reference must be
removed
 but there could be two objects, each storing a reference to
the other, that are otherwise not referenced at all; this can be
generalized to two or more objects forming a cycle of
objects referring only to each other
 such objects must be detected and removed

 this can be modeled by a graph, where each node represents
an object
 there are special objects, such as system wide services and
users, which need not be referenced themselves, called the
root set
 the hollow nodes represent objects that are not directly or
indirectly referenced by objects in the root set; such objects
must be removed

an example of a graph representing objects containing references to each other


 Reference Counting
 Simple Reference Counting
 in uniprocessor systems, checking whether an object can
be deleted simply amounts to counting the references to that object
 increment a counter when a reference to an object is
created, decrement the counter when a reference is
removed; the object is removed when the counter reaches
zero
 Problems in simple reference counting for distributed
systems
 with unreliable communication, an acknowledgement
message to increment or decrement a counter may be
lost, leading the sender to retransmit

the problem of maintaining a proper reference count in the presence of
unreliable communication

 a mechanism of detecting duplicate messages is required

 another problem occurs when copying a remote reference to
another process
 P1 passes a reference to P2, but the object is as yet unaware of
the new reference; then P1 removes its own reference before
P2 contacts the object, creating a race condition
 solution: let P1 first inform the object that it is passing a
reference to P2

a) copying a reference to another process and incrementing the counter too late
b) a solution
 the race condition can be avoided if the counter is never
incremented, it is only decremented
 this is done using weighted reference counting in which
each object has a fixed total weight
 when an object is created, the total weight is stored in its
skeleton along with a partial weight, initialized to the total
weight

the initial assignment of weights in weighted reference counting

 when a new remote reference is created, half of the partial
weight is assigned to the new proxy

weight assignment when creating a new reference

 when a remote reference is duplicated, half of the partial
weight of the proxy is assigned to the new proxy

weight assignment when copying a reference


 when a reference is destroyed, a decrement message is sent
to the object’s skeleton, which subtracts the partial weight of
the removed reference from its total weight
 the object can be removed once all the weight handed out to
remote references has been returned
(assuming that no messages are lost or duplicated)
 main drawback: only a limited number of references can be
created (a partial weight that has dropped to one cannot be halved again) - read the book for
suggested solutions
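Weighted reference counting can be sketched as follows. The bookkeeping here is an assumption made for illustration: the skeleton's total weight always equals its own partial weight plus the weights held by all proxies, a destroyed proxy's weight is subtracted from the total, and no remote references remain when the total equals the skeleton's partial weight. The class and method names and the initial weight of 128 are hypothetical.

```python
class Skeleton:
    """Server-side stub holding the total and the remaining partial weight."""

    def __init__(self, total_weight=128):
        self.total_weight = total_weight
        self.partial_weight = total_weight

    def create_reference(self):
        """Hand half of the skeleton's partial weight to a new proxy."""
        if self.partial_weight < 2:
            raise RuntimeError("no weight left to hand out")
        half = self.partial_weight // 2
        self.partial_weight -= half
        return Proxy(self, half)

    def decrement(self, weight):
        """Handle a decrement message from a destroyed proxy."""
        self.total_weight -= weight

    def has_no_remote_references(self):
        # True when no weight is held by proxies (also true before the
        # first reference is created).
        return self.total_weight == self.partial_weight


class Proxy:
    """Client-side reference carrying part of the weight."""

    def __init__(self, skeleton, weight):
        self.skeleton = skeleton
        self.weight = weight

    def copy(self):
        """Duplicate the reference without sending the skeleton a message."""
        if self.weight < 2:
            raise RuntimeError("weight cannot be halved any further")
        half = self.weight // 2
        self.weight -= half
        return Proxy(self.skeleton, half)

    def destroy(self):
        self.skeleton.decrement(self.weight)
        self.weight = 0


skeleton = Skeleton()
p1 = skeleton.create_reference()  # p1 holds 64, skeleton keeps 64
p2 = p1.copy()                    # p1 holds 32, p2 holds 32
p1.destroy()
still_referenced = not skeleton.has_no_remote_references()
p2.destroy()
print(still_referenced, skeleton.has_no_remote_references())
```

Copying a reference only splits the weight held by the proxy, so the counter at the skeleton is never incremented.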
 Reference Listing
 let the skeleton keep track of the proxies that have a
reference to it; the object can be destroyed when the list is
empty
 adding a proxy that is already in the list and removing a
proxy that is not in the list have no effect; i.e.,
adding and removing proxies are idempotent operations -
unlike increment and decrement
 hence duplicate messages do no harm and reliable communication is not required
 adding and removing a reference must still be acknowledged by
the skeleton
 drawback: may scale badly if the list grows

 an operation is said to be idempotent if it can be repeated
without affecting the end result
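The idempotence of reference listing can be sketched with a set of proxy identifiers. The class, its method names, and the "ack" return value are hypothetical; retransmitted messages are modelled simply as repeated calls.

```python
class Skeleton:
    """Keeps a reference list: the set of proxies that refer to the object."""

    def __init__(self):
        self.proxies = set()

    def add_proxy(self, proxy_id):
        self.proxies.add(proxy_id)      # no effect if already listed
        return "ack"

    def remove_proxy(self, proxy_id):
        self.proxies.discard(proxy_id)  # no effect if not listed
        return "ack"

    def can_be_destroyed(self):
        return not self.proxies


skeleton = Skeleton()
skeleton.add_proxy("p1")
skeleton.add_proxy("p1")     # retransmission after a lost acknowledgement
skeleton.add_proxy("p2")
listed = sorted(skeleton.proxies)

skeleton.remove_proxy("p1")
skeleton.remove_proxy("p1")  # duplicate removal changes nothing
skeleton.remove_proxy("p2")
print(listed, skeleton.can_be_destroyed())
```

A plain counter would have reached 3 after the three add messages, which is why increment and decrement are not idempotent.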

 Identifying Unreachable Entities
 some entities can’t be reached from the root set and must be
removed; but the garbage collection methods discussed so
far fail to locate these entities
 we need methods by which all entities can be traced and those
that cannot be reached from the root set removed;
such methods are called tracing-based garbage collection

 Naive Tracing in Distributed Systems
 in a uniprocessor system mark-and-sweep collectors are
used
 they use two phases
 mark phase: trace all entities from the root set and mark
them (such as recording the entity in a table)
 sweep phase: search for those that are not marked and
remove them
 drawback: to ensure that the reachability graph remains the
same, all executing programs need to be stopped
temporarily while execution is switched to garbage
collection; this scenario is called stop-the-world and is not
desirable for distributed garbage collectors
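The two phases can be sketched on a small reference graph held in one process. The graph, the node names, and the single root are made up for illustration; x and y form a cycle that refers only to itself, the case that reference counting cannot detect.

```python
# object -> objects it holds references to
references = {
    "root": ["a", "b"],
    "a": ["c"],
    "b": [],
    "c": ["a"],
    "x": ["y"],  # x and y reference only each other
    "y": ["x"],
}
root_set = ["root"]


def mark(graph, roots):
    """Mark phase: collect everything reachable from the root set."""
    marked = set()
    stack = list(roots)
    while stack:
        node = stack.pop()
        if node in marked:
            continue
        marked.add(node)
        stack.extend(graph.get(node, []))
    return marked


def sweep(graph, marked):
    """Sweep phase: everything that was not marked is garbage."""
    return sorted(node for node in graph if node not in marked)


garbage = sweep(references, mark(references, root_set))
print(garbage)
```

The sketch assumes that `references` does not change between the two phases, which is exactly the stop-the-world requirement described above.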

Thank You!!
