HARAMAYA UNIVERSITY
COLLEGE OF COMPUTING AND INFORMATICS
DEPARTMENT OF SOFTWARE ENGINEERING
FUNDAMENTALS OF DISTRIBUTED SYSTEMS
CHAPTER 5
NAMING SYSTEM
Compiled By: Gizachew B.
2 Contents
Introduction
Names, Identifiers and Addresses
Name Resolution
Naming Systems
Flat Naming
Structured Naming
Attribute-based Naming
3 Introduction (1)
Naming is fundamental in distributed system design.
Names are used to denote entities in a distributed system.
Any process that requires access to a specific resource must possess a name or an identifier for it.
Examples: file names, URLs, and domain names.
To operate on an entity, we need to access it at an access point.
Access points are entities that are named by means of an address.
Processes cannot share particular resources managed by a computer system unless they can name
them consistently.
Users cannot communicate with one another via a distributed system unless they can name one
another.
4 Introduction (2)
A name may also be thought of as a logical object that identifies a physical object to which it is bound from among
a collection of physical objects.
Therefore, the correspondence between names and objects is the relation of binding logical and physical objects for
the purpose of object identification.
The OSI Reference Model Definition of Name:
"a linguistic construct which corresponds to an object in some universe of discourse."
It covers:
o Data items which identify objects by their location. Such names are called addresses.
o It also covers names which are assigned to objects. Such names are called titles.
5 Introduction (3)
A naming system comprises two important mechanisms.
1. The naming mechanism: a facility that enables users and programs to assign character-string names to objects
and to use those names to refer to the objects.
2. The locating mechanism: a facility, integral to the naming facility, that maps an object’s name to the
object’s location in a distributed system.
The naming and locating facilities jointly form a naming system that hides the details of how and where an object is
actually located in the network.
6 Introduction (4)
In a distributed system, the implementation of a naming system is itself often distributed across multiple
machines.
How this distribution is done plays a key role in the efficiency and scalability of the naming system.
In a distributed system, names are used to refer to a wide variety of resources such as computers, services,
remote objects and files, as well as to users.
The entities named can be of many types, and they may be managed by different services.
Names are used to communicate and share resources, to uniquely identify entities, to refer to locations, and so on.
To resolve names, it is necessary to implement a naming system.
The difference between naming in distributed systems and non-distributed systems lies in the way naming
systems are implemented.
7 Characteristics of Good Naming System
Location transparency: the name of an object must not reveal, or depend on, the physical location of the object,
directly or indirectly.
Location independence: the name of an object need not be changed when the object’s location changes. A user must
be able to access the object irrespective of the node from which it is accessed.
Scalability: a change in the scale of the system should not require any change in the naming or locating mechanisms.
Uniform naming convention: use the same naming mechanism for all types of objects to reduce the complexity
of the design.
Allow multiple user-defined names for the same object: the naming system must be flexible enough to assign
multiple user-defined names to the same object. A user should be able to change or delete his or her own
user-defined name without affecting the names given by other users.
8 Naming Entities ………. (1)
Names, Identifiers, and Addresses
A name in a distributed system is a string of bits or characters that is used to refer to an entity.
An entity can be practically anything,
e.g., resources such as hosts, printers, disks, files, objects, processes, users, ...
To operate on an entity, it is necessary to access it, for which we need an access point.
The name of an access point is called an address.
The address of an access point of an entity is also simply called an address of that entity.
An entity can offer more than one access point.
As a comparison, a telephone can be viewed as an access point of a person, whereas the telephone
number corresponds to an address.
9 Naming Entities ………. (2)
Indeed, many people nowadays have several telephone numbers, each number corresponding to a
point where they can be reached.
In a distributed system, a typical example of an access point is a host running a specific server,
with its address formed by the combination of, for example, an IP address and port number (i.e.,
the server's transport-level address).
An address is thus just a special kind of name: it refers to an access point of an entity.
Because an access point is tightly associated with an entity, it would seem convenient to use the
address of an access point as a regular name for the associated entity.
10 Naming Entities ………. (3)
It is common to regularly reorganize a distributed system, so that a specific server is now running
on a different host than previously.
The old machine on which the server used to be running may be reassigned to a completely
different server.
In other words, an entity may easily change an access point, or an access point may be reassigned
to a different entity.
If an address is used to refer to an entity, we will have an invalid reference the instant the access
point changes or is reassigned to another entity.
Therefore, it is much better to let a service be known by a separate name independent of the
address of the associated server.
11 Naming Entities ………. (4)
Identity
In addition to addresses, there are other types of names that deserve special treatment,
such as names that are used to uniquely identify an entity.
A true identifier is a name that has the following properties:
1. An identifier refers to at most one entity.
2. Each entity is referred to by at most one identifier.
3. An identifier always refers to the same entity (i.e., it is never reused).
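The three properties can be illustrated with a small registry. This is only a sketch under stated assumptions: the class `IdentifierRegistry`, its method names, and the entity names are hypothetical, and random UUIDs are one possible way (not the only one) to obtain identifiers that are never reused.

```python
import uuid

class IdentifierRegistry:
    """Hypothetical registry that hands out true identifiers."""

    def __init__(self):
        self._entity_of = {}  # identifier -> entity; bindings are never removed
        self._id_of = {}      # entity -> identifier

    def register(self, entity):
        # Property 2: an entity that already has an identifier keeps it.
        if entity in self._id_of:
            return self._id_of[entity]
        # Properties 1 and 3: a fresh random UUID is, with overwhelming
        # probability, unused, and it is never handed out again because
        # bindings are never deleted.
        identifier = uuid.uuid4().hex
        self._entity_of[identifier] = entity
        self._id_of[entity] = identifier
        return identifier

    def lookup(self, identifier):
        # Returns the single entity bound to the identifier, or None.
        return self._entity_of.get(identifier)

registry = IdentifierRegistry()
printer_id = registry.register("printer-A")  # made-up entity name
print(registry.lookup(printer_id))
```

Registering the same entity twice returns the same identifier, while a different entity always receives a different one.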
By using identifiers, it becomes much easier to clearly refer to an entity.
12 Naming Entities ………. (5)
Addresses and identifiers are two important types of names that are each used for very different purposes.
In many computer systems, addresses and identifiers are represented in machine-readable form only, that
is, in the form of bit strings.
For example, an Ethernet address is essentially a random string of 48 bits. Likewise, memory addresses are typically
represented as 32-bit or 64-bit strings.
Another important type of name is that which is tailored to be used by humans, also referred to as
human-friendly names.
In contrast to addresses and identifiers, a human-friendly name is generally represented as a character
string.
These names appear in many different forms.
13 Naming Entities ………. (6)
Having names, identifiers, and addresses brings us to the central theme of the chapter:
How do we resolve names and identifiers to addresses?
Before we go into various solutions, it is important to understand the close link between name
resolution in distributed systems and message routing.
In principle, a naming system maintains a name-to-address binding which in its simplest form is
just a table of (name, address) pairs.
However, in distributed systems spanning large networks and for which many resources are named,
a centralized table is not going to work.
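In its simplest form, the (name, address) table can be sketched as a dictionary. The service names, IP addresses, and port numbers below are made up for illustration, and the helper functions `resolve` and `rebind` are hypothetical.

```python
# Hypothetical centralized table of (name, address) pairs.
# An address here is an (IP address, port) pair; all values are made up.
name_table = {
    "printer-3rd-floor": ("10.0.0.17", 9100),
    "file-server": ("10.0.0.5", 2049),
}

def resolve(name):
    """Return the address bound to name, or None if the name is unbound."""
    return name_table.get(name)

def rebind(name, new_address):
    """Update the binding when the entity moves to another access point."""
    name_table[name] = new_address

print(resolve("file-server"))
rebind("file-server", ("10.0.0.42", 2049))  # the server moved to another host
print(resolve("file-server"))
```

Clients keep using the name "file-server" after the move; only the binding changes. A single table like this becomes a bottleneck once many resources across a large network must be named, which is why the table itself is distributed in practice.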
14 Naming Entities ………. (7)
Address:
Says how to reach an object; it has location semantics associated with it.
Usually in a format that is easy for computers to process.
Name:
Does not have any location semantics associated with it.
Usually in a format that is easier for people to understand, read, and remember.
Examples:
Address (IP): 74.125.31.99 // 31.13.67.35
Name: www.google.com // www.facebook.com
15 Human and System-Oriented Names
Human-Oriented Names:
A name defined by the users.
Is a set of characters that is meaningful to users.
Independent of the physical location of the object.
Also called high-level names because they can be easily remembered by their users.
System-Oriented Names:
A name generated automatically by the system.
Is a bit pattern of fixed size that can be easily manipulated and stored by machines.
Also called unique identifiers or low-level names because they cannot be easily remembered by their users.
16 Name Resolution …. (1)
Name resolution is the process of mapping an object's name to the object's properties, such as its location.
A name is resolved when it is translated into data about the named resource or object, often in order to invoke an
action upon it.
The association between the name and an object is called binding.
In general, names are bound to attributes of the named object, rather than the implementation of the objects
themselves.
An attribute is the value of a property associated with an object.
A key attribute of an entity that is usually relevant in a distributed system is its address.
An important issue with naming is that a name can be resolved to the entity it refers to.
Name resolution thus allows a process to access the named entity.
17 Name Resolution …. (2)
“In computer systems, name resolution refers to the retrieval of the underlying numeric values corresponding to
computer hostnames, account user names, group names, and other named entities.” (Wikipedia)
How do we resolve names and identifiers to addresses?
Before we go into various solutions, it is important to understand the close link between name resolution in
distributed systems and message routing.
In principle, a naming system maintains a name-to-address binding which in its simplest form is just a table of
(name, address) pairs.
However, in distributed systems spanning large networks and for which many resources are named, a centralized
table is not going to work.
18 Name Resolution …. (3)
Instead, a name is decomposed into several parts, such as ftp.cs.vu.nl, and name resolution takes place
through a recursive lookup of those parts.
For example, a client that needs to know the address of the FTP server named ftp.cs.vu.nl would first resolve
nl to find the server NS(nl) responsible for names that end with nl, after which the rest of the name is passed to
server NS(nl).
This server may then resolve the name vu to the server NS(vu.nl) responsible for names that end with vu.nl, which
can further handle the remaining name ftp.cs.
Eventually, this leads to routing the name resolution request as:
NS(.) → NS(nl) → NS(vu.nl) → address of ftp.cs.vu.nl
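The lookup chain for ftp.cs.vu.nl can be sketched with one small table per name server. This is a simplified illustration, not how real DNS servers are implemented: the `NAME_SERVERS` tables, the `resolve` function, and the final address (taken from a range reserved for documentation) are all assumptions.

```python
# One hypothetical table per name server. A table maps either a label to the
# next name server, or a remaining name to its (made-up) address.
NAME_SERVERS = {
    "NS(.)": {"nl": "NS(nl)"},
    "NS(nl)": {"vu": "NS(vu.nl)"},
    "NS(vu.nl)": {"ftp.cs": "192.0.2.10"},
}

def resolve(name):
    """Resolve name right to left; return (address, servers visited)."""
    labels = name.split(".")
    server = "NS(.)"
    path = [server]
    while labels:
        table = NAME_SERVERS[server]
        remaining = ".".join(labels)
        if remaining in table:
            # This server can handle the rest of the name itself.
            return table[remaining], path
        # Otherwise resolve the rightmost label and pass on the rest.
        server = table[labels.pop()]
        path.append(server)
    raise KeyError(name)

address, path = resolve("ftp.cs.vu.nl")
print(" -> ".join(path), "->", address)
```

Each server only needs to know about the names directly below it, so no single machine has to hold the complete table.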
19 Naming Systems
Flat Naming
Structured Naming
Attribute based Naming
20 Naming System: Flat Naming … (1)
Flat naming is an unstructured naming format. Identifiers are just random bit strings.
A flat name does not contain any information on how to locate the access point of its associated entity.
The simplest name space is a flat name space, where names are character strings.
Names defined in a flat name space are called primitive names.
Different mechanisms are available to locate an entity given only its identifier.
Therefore, flat names are suitable either for small name spaces having names for only a few
objects or for system-oriented names that need not be meaningful to users.
In the following, we will take a look at how flat names can be resolved, or, equivalently, how we can
locate an entity when given only its identifier.
21 Naming System: Flat Naming … (2)
Flat names (identifiers) can be used to identify entities, that is, resources in a
distributed system.
The key aspect of names in a distributed system is how they can be resolved (mapped to the entity
they refer to).
Different approaches have been proposed for flat name resolution in distributed systems.
Some of these approaches include:
o Simple solutions
Broadcast/Multicast
Forwarding Pointers
o Home-based approach
22 Naming System: Flat Naming … (3)
Simple solutions to flat name (identifier) resolution:
Broadcasting and multicasting, together with forwarding pointers, are considered simple solutions
to flat name resolution in distributed systems.
In this section we examine these two approaches to locating entities by their identifiers:
the broadcasting and multicasting technique and the forwarding pointers technique.
Each technique has its own advantages and drawbacks, depending on the scale of the network in
which it is applied.
23 Naming System: Flat Naming … (4)
Broadcasting Approach
Consider a distributed system built on a computer network that offers efficient broadcasting facilities.
Typically, such facilities are offered by local-area networks in which all machines are connected to a single cable
or a logical equivalent.
Locating an entity in such an environment is simple: a message containing the identifier of the entity is
broadcast to each machine and each machine is requested to check whether it has that entity.
This principle is used in the Internet Address Resolution Protocol (ARP) to find the data-link (physical)
address of a machine when given only its IP address.
In essence, a machine broadcasts a packet on the local network asking who is the owner of a given IP address.
When the message arrives at a machine, the receiver checks whether it should listen to the requested IP address.
Only the machines that can offer an access point for the entity send a reply message containing the address of
that access point.
24 Naming System: Flat Naming … (5)
Broadcasting Approach…….
Broadcasting becomes inefficient when the network grows.
Not only is network bandwidth wasted by request messages, but, more seriously, too many hosts
may be interrupted by requests they cannot answer.
One possible solution is to switch to multicasting, by which only a restricted group of hosts receives
the request.
For example, Ethernet networks support data-link level multicasting directly in hardware.
25 Naming System: Flat Naming … (6)
Example: Address Resolution Protocol (ARP)
Resolve an IP address to a MAC address
In this system,
o The IP address is the identifier of the entity
o The MAC address is the address of the access point
Request (broadcast): "Who has the address 192.168.0.1?"
Reply: "I am 192.168.0.1. My MAC address is 02:AB:4A:3C:59:85"
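The broadcast lookup can be simulated in a few lines. This is a sketch of the principle only, not of the real ARP packet format: the `Machine` class and `broadcast_lookup` function are hypothetical, and the second and third machines use made-up addresses.

```python
class Machine:
    """Hypothetical host on a local-area network."""

    def __init__(self, ip, mac):
        self.ip = ip
        self.mac = mac
        self.interrupts = 0  # number of broadcast requests this host inspected

    def on_broadcast(self, wanted_ip):
        # Every host is interrupted, but only the owner of the IP replies.
        self.interrupts += 1
        return self.mac if wanted_ip == self.ip else None

def broadcast_lookup(lan, wanted_ip):
    """Ask every machine on the LAN; return the owner's MAC address or None."""
    replies = [machine.on_broadcast(wanted_ip) for machine in lan]
    answers = [mac for mac in replies if mac is not None]
    return answers[0] if answers else None

lan = [
    Machine("192.168.0.1", "02:AB:4A:3C:59:85"),
    Machine("192.168.0.2", "02:AB:4A:3C:59:86"),  # made-up
    Machine("192.168.0.3", "02:AB:4A:3C:59:87"),  # made-up
]
print(broadcast_lookup(lan, "192.168.0.1"))
```

The `interrupts` counter grows on every machine for every request, including requests the machine cannot answer, which is the cost that makes broadcasting inefficient in large networks.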
26 Naming System: Flat Naming … (7)
Multicasting Approach
Multicasting can also be used to locate entities in point-to-point networks.
For example, the Internet supports network-level multicasting by allowing hosts to join a specific
multicast group.
Such groups are identified by a multicast address.
When a host sends a message to a multicast address, the network layer provides a best-effort service
to deliver that message to all group members.
A multicast address can be used as a general location service for multiple entities.
27 Naming System: Flat Naming … (8)
Multicasting Approach…….
For example, consider an organization where each employee has his or her own laptop.
When such a computer connects to the locally available network,
it is dynamically assigned an IP address.
In addition, it joins a specific multicast group.
When a process wants to locate computer A, it sends a "where is A?" request to the multicast group.
If A is connected, it responds with its current IP address.
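The laptop scenario can be sketched as follows. The classes `Laptop` and `MulticastGroup` are hypothetical, delivery is simulated by a plain loop rather than by real network-level multicasting, and the IP addresses are made up.

```python
class Laptop:
    """Hypothetical mobile computer with a dynamically assigned IP address."""

    def __init__(self, name, ip):
        self.name = name
        self.ip = ip

    def on_request(self, wanted_name):
        # Only the requested computer answers with its current IP address.
        return self.ip if wanted_name == self.name else None

class MulticastGroup:
    """Simulated multicast group: only members receive requests."""

    def __init__(self):
        self.members = []

    def join(self, host):
        self.members.append(host)

    def leave(self, host):
        self.members.remove(host)

    def where_is(self, name):
        for host in self.members:
            reply = host.on_request(name)
            if reply is not None:
                return reply
        return None  # the computer is not connected

group = MulticastGroup()
laptop_a = Laptop("A", "10.1.0.23")  # made-up address assigned on connect
group.join(laptop_a)
group.join(Laptop("B", "10.1.0.57"))
print(group.where_is("A"))
```

Unlike broadcasting, hosts outside the group are never interrupted by the request.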
28 Naming System: Flat Naming … (9)
Multicasting Approach…….
Another way to use a multicast address is to associate it with a replicated entity, and to use
multicasting to locate the nearest replica.
When sending a request to the multicast address, each replica responds with its current (normal) IP
address.
A crude way to select the nearest replica is to choose the one whose reply comes in first.
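Choosing the replica whose reply arrives first can be sketched as below. The function `nearest_replica`, the arrival times, and the IP addresses are made up for illustration; a first reply indicates low delay at that moment, not necessarily geographic proximity.

```python
def nearest_replica(replies):
    """Pick the replica whose reply arrived first.

    replies: list of (arrival_time_ms, ip_address) tuples.
    """
    if not replies:
        return None
    arrival_time, ip_address = min(replies, key=lambda reply: reply[0])
    return ip_address

# Made-up replies from three replicas of the same entity.
replies = [(42.0, "198.51.100.7"), (8.5, "192.0.2.20"), (130.2, "203.0.113.9")]
print(nearest_replica(replies))  # prints 192.0.2.20
```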
Hence, as discussed so far, broadcast and multicast techniques can be used to locate an entity or
resource in a distributed system from a given flat name.
29 Naming System: Flat Naming … (10)
Forward Pointers Approach
Another popular approach to locating mobile entities is to make use of forwarding pointers.
The principle is simple: when an entity moves from A to B, it leaves behind in A a reference to its new
location at B.
The main advantage of this approach is its simplicity: as soon as an entity has been located, for example by
using a traditional naming service, a client can look up the current address by following the chain of forwarding
pointers.
To better understand how forwarding pointers work, consider their use with respect to remote objects: objects
that can be accessed by means of a remote procedure call (RPC).
For example, each forwarding pointer can be implemented as a (client stub, server stub) pair.
A server stub contains either a local reference to the actual object or a local reference to a remote client stub for
that object.
31 Naming System: Flat Naming … (12)
Forward Pointers Approach…
To short-cut a chain of (client stub, server stub) pairs, an object invocation carries the identification of the client
stub from where that invocation was initiated.
A client-stub identification consists of the client's transport-level address, combined with a locally generated
number to identify that stub.
When the invocation reaches the object at its current location, a response is sent back to the client stub where
the invocation was initiated (often without going back up the chain).
The current location is piggybacked with this response, and the client stub adjusts its companion server stub to
the one in the object's current location.
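Following a chain of forwarding pointers and then short-cutting it can be sketched as below. This is a simplified model under stated assumptions: the classes `Location` and `ClientReference` and the function `move` are hypothetical, and plain object references stand in for the (client stub, server stub) pairs.

```python
class Location:
    """Hypothetical host that either holds the object or a forwarding pointer."""

    def __init__(self, name):
        self.name = name
        self.has_object = False
        self.forward_to = None  # pointer left behind when the object moves away

def move(src, dst):
    """Move the object from src to dst, leaving a forwarding pointer in src."""
    src.has_object = False
    src.forward_to = dst
    dst.has_object = True
    dst.forward_to = None

class ClientReference:
    """Client-side reference that remembers where it last found the object."""

    def __init__(self, location):
        self.location = location

    def invoke(self):
        hops = 0
        current = self.location
        while not current.has_object:
            if current.forward_to is None:
                raise LookupError("broken chain: forwarding pointer lost")
            current = current.forward_to
            hops += 1
        # Short-cut: the response carries the current location, so the
        # client points directly at it for later invocations.
        self.location = current
        return current.name, hops

a, b, c = Location("A"), Location("B"), Location("C")
a.has_object = True
ref = ClientReference(a)
move(a, b)
move(b, c)
print(ref.invoke())  # follows A -> B -> C
print(ref.invoke())  # goes to C directly after the short-cut
```

The first invocation walks the whole chain, whereas the second one needs no forwarding at all.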
33 Naming System: Flat Naming … (14)
Forward Pointers Approach..
There are also a number of important drawbacks of this approach.
First, if no special measures are taken, a chain for a highly mobile entity can become so long that locating that
entity is prohibitively expensive.
Second, all intermediate locations in a chain will have to maintain their part of the chain of forwarding pointers
as long as needed.
A third drawback is the vulnerability to broken links; when any forwarding pointer is lost (for any reason) the
entity can no longer be reached.
An important issue is, therefore, to keep chains relatively short, and to ensure that forwarding pointers are
robust.
34 Naming System: Flat Naming … (15)
Home Based Approach
The simple solutions to locate entities from their identifiers discussed so far have the following limitations:
The use of broadcasting and forwarding pointers imposes scalability problems.
Broadcasting or multicasting is difficult to implement efficiently in large scale networks whereas long chains of
forwarding pointers introduce performance problems and are susceptible to broken links.
Another approach to locating resources (objects or entities) in a distributed system from their flat names
(identifiers) is the home-based approach.
A popular approach to supporting mobile entities in large-scale networks is to introduce a home location, which
keeps track of the current location of an entity.
Special techniques may be applied to safeguard against network or process failures.
35 Naming System: Flat Naming … (16)
Home Based Approach…
In practice, the home location is often chosen to be the place where an entity was created.
The home-based approach is used as a fall-back mechanism for location services based on forwarding pointers.
Another example where the home-based approach is followed is Mobile IP.
Each mobile host uses a fixed IP address. All communication to that IP address is initially directed to the mobile
host's home agent.
This home agent is located on the local-area network corresponding to the network address contained in the
mobile host's IP address.
Whenever the mobile host moves to another network, it requests a temporary address that it can use for
communication.
This care-of address is registered at the home agent.
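The role of the home agent can be sketched as a table from fixed home addresses to care-of addresses. This is an illustration of the idea only, not the Mobile IP protocol itself: the class `HomeAgent`, its method names, and the addresses (taken from ranges reserved for documentation) are assumptions.

```python
class HomeAgent:
    """Hypothetical home agent that tracks where its mobile hosts are."""

    def __init__(self):
        self.care_of = {}  # fixed home address -> current care-of address

    def register(self, home_address, care_of_address):
        # Called by the mobile host after it obtains a temporary address.
        self.care_of[home_address] = care_of_address

    def deregister(self, home_address):
        # Called when the mobile host returns to its home network.
        self.care_of.pop(home_address, None)

    def deliver_to(self, home_address):
        """Return the address to which a packet for home_address is sent."""
        return self.care_of.get(home_address, home_address)

agent = HomeAgent()
print(agent.deliver_to("192.0.2.15"))         # host is at home
agent.register("192.0.2.15", "198.51.100.8")  # host moved to another network
print(agent.deliver_to("192.0.2.15"))         # forwarded to the care-of address
```

Senders always use the fixed address; only the home agent needs to learn about each move.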
37 Naming System: Flat Naming … (18)
Home Based Approach…
A drawback of the home-based approach is the use of a fixed home location.
First, it must be ensured that the home location always exists.
Otherwise, contacting the entity will become impossible.
Second, an entity may decide to move permanently to a completely different part of the network than where its
home is located. In that case, it would have been better if the home could have moved along with the host.
A solution to this problem is to register the home at a traditional naming service and to let a client first look up
the location of the home.
Because the home location can be assumed to be relatively stable, that location can be effectively cached after it
has been looked up.
38 Naming System: Structured Naming … (1)
Flat names or identifiers are good for machines (because they are bit strings), but are generally not very
convenient for humans to use.
As an alternative, naming systems generally support structured names that are composed from simple,
human-readable names. Not only file naming, but also host naming on the Internet follows this approach.
In this section, we concentrate on structured names and the way that these names are resolved to addresses.
Names are commonly organized into what is called a name space.
Name spaces for structured names can be represented as a labeled, directed graph with two types of nodes.
A leaf node represents a named entity and has no outgoing edges.
39 Naming System: Structured Naming … (2)
A leaf node stores information on the entity it is representing.
For example, it can store the entity's address so that a client can access it.
Alternatively, it can store the state of that entity, as in the case of file systems, in which a leaf node actually
contains the complete file it is representing.
In contrast to a leaf node, a directory node has a number of outgoing edges, each labeled with a name.
Each node in a naming graph is considered as yet another entity in a distributed system, and, in particular, has
an associated identifier.
A directory node stores a table in which an outgoing edge is denoted as a pair (edge label, node identifier).
Such a table is called a directory table.
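A naming graph with directory tables can be sketched using dictionaries. The node identifiers (`n0` to `n4`), the edge labels, and the function `resolve_path` are hypothetical; a directory node is modelled as a table of (edge label, node identifier) pairs and a leaf node as a string holding the entity's state.

```python
# Hypothetical naming graph: node identifier -> directory table or leaf data.
nodes = {
    "n0": {"home": "n1", "etc": "n2"},  # root directory node
    "n1": {"alice": "n3"},
    "n2": {},                           # empty directory
    "n3": {"notes.txt": "n4"},
    "n4": "contents of notes.txt",      # leaf node stores the entity's state
}

def resolve_path(path, root="n0"):
    """Follow the edge labels in path and return the final node identifier."""
    node_id = root
    for label in [part for part in path.split("/") if part]:
        table = nodes[node_id]
        if not isinstance(table, dict):
            raise NotADirectoryError(label)  # a leaf has no outgoing edges
        node_id = table[label]               # KeyError if the label is unknown
    return node_id

leaf = resolve_path("/home/alice/notes.txt")
print(leaf, "->", nodes[leaf])
```

Resolution looks up one edge label per directory table, so each step needs only the table of the node reached so far.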
41 Naming System: Structured Naming … (4)
Basic issue: distribute name resolution process and name space management across multiple machines, by
distributing nodes of the naming graph.
Consider a structured (hierarchical) naming graph with three key levels:
Global level – high-level directory nodes; jointly managed by different administrations
Administrational level – mid-level directory nodes grouped so that each group can be assigned to a separate
administration
Managerial level – low-level directory nodes within a single administration; main issue is effectively mapping
directory nodes to local name servers
42 Naming System: Structured Naming … (5)
Example: Domain Name System (DNS)
44 Assignment-4
Attribute Based Naming