Chapter 4 Naming

Chapter 4 discusses naming in distributed systems, defining names as strings that refer to entities like resources and processes. It covers various naming systems, including flat, structured, and attribute-based naming, highlighting the importance of access points and identifiers for unambiguous entity reference. The chapter also explores name resolution techniques, including broadcasting, multicasting, and hierarchical approaches, emphasizing the need for efficient and flexible naming solutions.

Uploaded by

Debela Adane

Chapter 4 - Naming

4.1 Introduction
 a name in a distributed system is a string of bits or characters
that is used to refer to an entity
 an entity is anything; e.g., resources such as hosts, printers,
disks, files, objects, processes, users, Web pages,
newsgroups, mailboxes, network connections, ...
 entities can be operated on
 e.g., a resource such as a printer offers an interface
containing operations for printing a document, requesting
the status of a job, etc.
 a network connection may provide operations for sending
and receiving data, setting quality of service parameters, etc.
 to operate on an entity, it is necessary to access it through its
access point, which is itself a special kind of entity

 access point
 the name of an access point is called an address
(such as IP address and port number as used by the
transport layer)
 the address of the access point of an entity is also
referred to as the address of the entity
 an entity may change its access point in the course of
time (e.g., a mobile computer getting a new IP
address as it moves)
 an address is a special kind of name
 it refers to at most one entity
 each entity is referred to by at most one address, even
when replicated, such as in Web pages
 keeping the name of an entity separate from its address
is easier and more flexible; such a name is
called location independent
 a true identifier is a name with the following properties
 it refers to at most one entity
 each entity is referred to by at most one identifier
 it always refers to the same entity (never reused)
 identifiers allow us to unambiguously refer to an entity
 examples
 name of an FTP server (entity), e.g., the URL of the FTP server
 address of the FTP server, e.g., IP number:port number
 the address of the FTP server may change
 there are three classes of naming systems: flat naming,
structured naming, and attribute-based naming

4.2 Flat Naming
 a flat name is a sequence of characters without structure, much
like a human name
 difficult to be used in a large system since it must be centrally
controlled to avoid duplication
 moreover, it does not contain any information on how to
locate the access point of its associated entity
 how are flat names resolved (or how can an entity be located when
a flat name is given)?
 name resolution: mapping a name to an address or an
address to a name is called name-address resolution
 possible solutions: simple solutions, home-based
approaches, and hierarchical approaches

1. Simple Solutions
 two solutions (for LANs only): Broadcasting and Multicasting, and
Forwarding Pointers

a. Broadcasting and Multicasting
 broadcast a message containing the identifier of an
entity; only machines that can offer an access point for
the entity send a reply
 e.g., ARP (Address Resolution Protocol) in the Internet to
find the data link address (MAC address) of a machine
 a computer that wants to access another computer whose
IP address it knows broadcasts this
address
 the owner responds by sending its Ethernet address
 broadcasting is inefficient when the network grows
(it wastes bandwidth and interrupts too many other
machines)
 multicasting is better when the network grows - send only
to a restricted group of hosts
 multicasting can also be used to locate the nearest replica
- choose the one whose reply comes in first
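
The ARP-style exchange described above can be sketched as a small simulation. This is only an illustration, not how ARP is actually implemented: the Host class, the function name, and all addresses are made up for the example, and the LAN is modeled as a plain list.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Host:
    ip: str   # address the requester already knows
    mac: str  # data link address the requester wants to learn

    def on_broadcast(self, wanted_ip: str) -> Optional[str]:
        # Every host receives the request, but only the owner replies.
        return self.mac if self.ip == wanted_ip else None


def resolve_by_broadcast(lan: List[Host], wanted_ip: str) -> Tuple[Optional[str], int]:
    """Return (MAC address or None, number of hosts interrupted)."""
    replies = [host.on_broadcast(wanted_ip) for host in lan]
    answers = [reply for reply in replies if reply is not None]
    return (answers[0] if answers else None), len(replies)


# Hypothetical LAN; the addresses are invented for illustration.
lan = [
    Host("192.0.2.10", "00:00:5e:00:53:10"),
    Host("192.0.2.11", "00:00:5e:00:53:11"),
    Host("192.0.2.12", "00:00:5e:00:53:12"),
]
mac, interrupted = resolve_by_broadcast(lan, "192.0.2.11")
print(mac, interrupted)  # 00:00:5e:00:53:11 3
```

The second value returned shows why broadcasting scales poorly: every host on the LAN is interrupted, although only one of them can answer.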

b. Forwarding Pointers
 how to look for mobile entities
 when an entity moves from A to B, it leaves behind a
reference to its new location
 advantage
 simple: as soon as the entity's original location is found using a
traditional naming service, the chain of forwarding
pointers can be used to find the current address
 drawbacks
 the chain can be too long - locating becomes
expensive
 all the intermediary locations in a chain have to maintain
their pointers
 vulnerability if links are broken
 hence, making sure that chains are short and that
forwarding pointers are robust is an important issue
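
A chain of forwarding pointers can be sketched as a dictionary from an old location to the next one. The location names, the function names, and the hop limit are assumptions for illustration. The chain-shortening step is one possible way to keep chains short and is not prescribed by the text above.

```python
from typing import Dict, Tuple


def follow_chain(pointers: Dict[str, str], start: str, max_hops: int = 100) -> Tuple[str, int]:
    """Follow forwarding pointers from start; return (current location, hops)."""
    location, hops = start, 0
    while location in pointers:
        if hops >= max_hops:
            raise RuntimeError("chain too long or cyclic")
        location = pointers[location]
        hops += 1
    return location, hops


def shorten_chain(pointers: Dict[str, str], start: str) -> str:
    """Redirect every pointer on the chain straight to the current location."""
    current, _ = follow_chain(pointers, start)
    location = start
    while location in pointers:
        next_location = pointers[location]
        pointers[location] = current
        location = next_location
    return current


# The entity moved A -> B -> C -> D and left a pointer behind at each step.
pointers = {"A": "B", "B": "C", "C": "D"}
print(follow_chain(pointers, "A"))  # ('D', 3)
shorten_chain(pointers, "A")
print(follow_chain(pointers, "A"))  # ('D', 1)
```

Note that this sketch cannot detect a broken link: if an intermediate location loses its pointer, the lookup simply stops there and reports a stale location.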
2. Home-Based Approaches
 broadcasting and multicasting have scalability problems;
performance and broken links are problems in
forwarding pointers
 a home location keeps track of the current location of an
entity; often it is the place where an entity was created
 it is a two-tiered approach
 an example where it is used is Mobile IP
 each mobile host uses a fixed IP address
 all communication to that IP address is initially directed
to the host’s home agent, located on the LAN
corresponding to the network address contained in the
mobile host’s IP address
 whenever the mobile host moves to another network, it
requests a temporary address in the new network
(called a care-of address) and informs the home agent
of the new address
 when the home agent receives a message for the mobile host
(from a correspondent agent) it forwards it to the host’s new
address (if it has moved) and also informs the sender of the
host’s current location for sending subsequent packets

home-based approach: the principle of Mobile IP


 problems:
 creates communication latency (Triangle routing:
correspondent-home network-mobile)
 the home location must always exist; the host is
unreachable if the home no longer exists (e.g., it has
permanently changed); the solution is to register the home at a
traditional name service and let a client first look up the
location of the home
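
The home-based scheme can be sketched as a home agent that maps a fixed home address to the current care-of address. The class and method names and the addresses are invented for this illustration; real Mobile IP involves tunnelling and registration messages that are left out here.

```python
from typing import Dict, Optional, Tuple


class HomeAgent:
    """Keeps track of where each mobile host currently is (a simplification)."""

    def __init__(self) -> None:
        # fixed home address -> temporary care-of address
        self.care_of: Dict[str, str] = {}

    def register(self, home_address: str, care_of_address: str) -> None:
        # Called by the mobile host after it obtains an address in a new network.
        self.care_of[home_address] = care_of_address

    def deliver(self, home_address: str, payload: str) -> Tuple[str, str, Optional[str]]:
        """Return (address the packet is sent to, payload, location hint for the sender)."""
        current = self.care_of.get(home_address)
        if current is None:
            return home_address, payload, None   # host is at home
        return current, payload, current         # forwarded; sender learns the new location


agent = HomeAgent()
print(agent.deliver("10.0.0.5", "hello"))   # ('10.0.0.5', 'hello', None)
agent.register("10.0.0.5", "172.16.3.9")    # host moved to another network
print(agent.deliver("10.0.0.5", "hello"))   # ('172.16.3.9', 'hello', '172.16.3.9')
```

The first packet after a move still travels through the home network (the triangle route); the hint returned to the sender is what lets later packets go directly to the care-of address.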

3. Hierarchical Approaches
 a generalization of the two-tiered approach into multiple
layers
 a network is divided into a collection of domains, similar
to DNS
 a single top-level domain spans the entire network
 each domain can be subdivided into multiple, smaller
domains
 the lowest-level domain is called a leaf domain; typically a
LAN
 each domain D has an associated directory node dir(D)
that keeps track of the entities in that domain, leading to a
tree of directory nodes
 the root (directory) node knows about all entities

hierarchical organization of a location service into domains, each having an
associated directory node

 each entity is represented by a location record in the
directory node dir(D) to keep track of its whereabouts
 a location record for an entity in a leaf domain contains the
entity’s current address; all other high-level domains will
have only pointers to this address; this means the root node
will store only pointers to all entities
 an entity may have multiple addresses, for instance, if it is
replicated; a higher level domain containing the two
subdomains where the entity has addresses will have two
pointers

an example of storing information of an entity having two addresses in
different leaf domains D1 and D2

 example of a look up operation
 a client (in Domain D) would like to locate an entity E

looking up a location in a hierarchically organized location service
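
A lookup in such a tree of directory nodes can be sketched as follows. The sketch assumes the usual scheme: the request climbs from the client's leaf domain until a directory node has a location record for the entity, then follows pointers down to a leaf. The class and function names are invented, and when several pointers exist the first one is simply taken.

```python
from typing import Dict, List, Optional, Union


class DirNode:
    def __init__(self, name: str, parent: Optional["DirNode"] = None) -> None:
        self.name = name
        self.parent = parent
        # entity -> address (leaf domain) or list of child directory nodes (pointers)
        self.records: Dict[str, Union[str, List["DirNode"]]] = {}


def insert(leaf: DirNode, entity: str, address: str) -> None:
    """Store the address in the leaf and a pointer in every ancestor."""
    leaf.records[entity] = address
    child, node = leaf, leaf.parent
    while node is not None:
        pointers = node.records.setdefault(entity, [])
        if child not in pointers:
            pointers.append(child)
        child, node = node, node.parent


def lookup(start: DirNode, entity: str) -> Optional[str]:
    node: Optional[DirNode] = start
    while node is not None and entity not in node.records:
        node = node.parent                  # climb towards the root
    if node is None:
        return None                         # even the root does not know the entity
    record = node.records[entity]
    while not isinstance(record, str):
        record = record[0].records[entity]  # follow a pointer downwards
    return record


root = DirNode("root")
mid = DirNode("M", root)
d1, d2, d3 = DirNode("D1", mid), DirNode("D2", mid), DirNode("D3", root)
insert(d1, "E", "address-in-D1")
insert(d2, "E", "address-in-D2")
print(lookup(d3, "E"))  # address-in-D1
print(lookup(d2, "E"))  # address-in-D2
```

A client in D2 finds the entity without leaving its own leaf domain, whereas a client in D3 has to go up to the root first; this locality is the main attraction of the hierarchical approach.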


4.3 Structured Naming
 flat names are not convenient for humans
 Name Spaces
 names are organized into a name space
 each name is made of several parts; the first may define
the nature of the organization, the second the name,
the third departments, ...
 the authority to assign and control the name spaces can be
decentralized, with a central authority assigning only the
first two parts
 a name space is generally organized as a labeled, directed
graph with two types of nodes
 leaf node: represents the named entity and stores
information such as its address or the state of that entity
 directory node: a special entity that has a number of
outgoing edges, each labeled with a name
 each node in a naming graph is considered as another entity
with an identifier
a general naming graph with a single root node, n0
 a directory node stores a table, called a directory table, in
which each outgoing edge is represented as a pair
(edge label, node identifier)
 each path in a naming graph can be referred to by the
sequence of labels corresponding to the edges of the path
and the first node in the path, such as
N:<label-1, label-2, ..., label-n>, where N refers to the first
node in the path
 such a sequence is called a path name
 if the first node is the root of the naming graph, it is called an
absolute path name; otherwise it is a relative path name
 instead of the path name n0:<home, steen, mbox>, we often
use its string representation /home/steen/mbox
 there may also be several paths leading to the same node,
e.g., node n5 can be represented as /keys or
/home/steen/keys
 although the above naming graph is a directed acyclic graph (a
node can have more than one incoming edge but the graph is not
permitted to have a cycle), the common way is to use a tree
(hierarchical) with a single root (as is used in file systems)
 in a tree structure, each node except the root has exactly
one incoming edge; the root has no incoming edges
 each node also has exactly one associated (absolute) path
name
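
Directory tables and path name resolution can be sketched with nested dictionaries. Only n0, n5 and the labels home, steen, mbox and keys come from the text above; the other node identifiers (n1, n4, n7) and the function names are assumptions made for the example.

```python
from typing import Dict, List

# Directory tables: node identifier -> {edge label: node identifier}.
# Nodes that do not appear as keys are leaf nodes.
directory_tables: Dict[str, Dict[str, str]] = {
    "n0": {"home": "n1", "keys": "n5"},
    "n1": {"steen": "n4"},
    "n4": {"mbox": "n7", "keys": "n5"},
}


def resolve(start: str, labels: List[str]) -> str:
    """Resolve the path name start:<label-1, ..., label-n> to a node identifier."""
    node = start
    for label in labels:
        table = directory_tables.get(node)
        if table is None:
            raise ValueError(f"{node} is not a directory node")
        if label not in table:
            raise KeyError(f"no edge labeled {label!r} leaves {node}")
        node = table[label]
    return node


def resolve_string(path: str, root: str = "n0") -> str:
    """Resolve an absolute path name given in its string representation."""
    return resolve(root, [part for part in path.split("/") if part])


print(resolve("n0", ["home", "steen", "mbox"]))  # n7
print(resolve_string("/keys"))                   # n5
print(resolve_string("/home/steen/keys"))        # n5
```

The last two calls return the same node, which is exactly the situation that is possible in a directed acyclic graph but not in a tree.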

 e.g., file naming in UNIX file system
 a directory node represents a directory and a leaf node
represents a file
 there is a single root directory, represented in the naming
graph by the root node
 the file system is laid out as a contiguous series of blocks from a logical disk
 the boot block is used to load the operating system
 the superblock contains information on the entire file
system such as its size, etc.
 inodes are referred to by an index number, starting at
number zero, which is for the inode representing the root
directory
 given the index number of an inode, it is possible to access
its associated file

Name Resolution
 given a path name, the process of looking up the information
stored in the node referred to by that name is called name
resolution; it consists of finding the address when the name
is given (by following the path)
 knowing how and where to start name resolution is referred
to as a closure mechanism; e.g., in the UNIX file system, the
inode of the root directory is known in advance
 Linking
 Linking: giving another name for the same entity (an alias);
e.g., environment variables in UNIX such as HOME, which
refers to the home directory of a user
 two types of links (or two ways to implement an alias):
hard link and symbolic link
 hard link: to allow multiple absolute path names to
refer to the same node in a naming graph
e.g., in the previous graph, there are two different path
names for node n5: /keys and /home/steen/keys
 symbolic link: representing an entity by a leaf node and
instead of storing the address or state of the entity, the
node stores an absolute path name

the concept of a symbolic link explained in a naming graph

 when resolving an absolute path name that leads to such a
node (e.g., /home/steen/keys, which leads to node n6), name
resolution returns the path name stored in the node
(/keys), at which point it continues by resolving that
new path name (which again relies on the closure mechanism
to know where to start)
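
Resolution through a symbolic link can be sketched by letting a leaf node store a path name instead of an address. Node n6 and the paths /home/steen/keys and /keys come from the text above; the identifiers n1 and n4, the function name, and the limit on nested links are assumptions.

```python
from typing import Dict

directory_tables: Dict[str, Dict[str, str]] = {
    "n0": {"home": "n1", "keys": "n5"},
    "n1": {"steen": "n4"},
    "n4": {"keys": "n6"},
}
# Leaf nodes that store an absolute path name rather than an address or state.
symlinks: Dict[str, str] = {"n6": "/keys"}

MAX_LINKS = 8  # arbitrary guard against links that point at each other


def resolve(path: str, root: str = "n0", depth: int = 0) -> str:
    if depth > MAX_LINKS:
        raise RuntimeError("too many levels of symbolic links")
    node = root
    for label in (part for part in path.split("/") if part):
        node = directory_tables[node][label]
        if node in symlinks:
            # Restart resolution at the root with the stored path name.
            node = resolve(symlinks[node], root, depth + 1)
    return node


print(resolve("/home/steen/keys"))  # n5
print(resolve("/keys"))             # n5
```

In contrast to a hard link, the directory table of n4 here does not point at n5 at all; the alias exists only because n6 stores the string /keys.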
 The Implementation of a Name Space
 a name space forms the heart of a naming service
 a naming service allows users and processes to add,
remove, and lookup names
 a naming service is implemented by name servers
 for a distributed system on a single LAN, a single server
might suffice; for a large-scale distributed system the
implementation of a name space is distributed over multiple
name servers

 Name Space Distribution


 in large scale distributed systems, it is necessary to
distribute the name service over multiple name servers,
usually organized hierarchically
 a name service can be partitioned into logical layers
 the following three layers can be distinguished (according to
Cheriton and Mann)
 global layer
 formed by highest level nodes (root node and nodes close
to it or its children)
 nodes on this layer are characterized by their stability, i.e.,
directory tables are rarely changed
 they may represent organizations, groups of
organizations, ..., whose names are stored in the name
space
 administrational layer
 groups of entities that belong to the same organization or
administrational unit, e.g., departments
 relatively stable
 managerial layer
 nodes that may change regularly, e.g., nodes representing
hosts of a LAN, shared files such as libraries or binaries, ...
 nodes are managed not only by system administrators,
but also by end users
an example partitioning of the DNS name space, including Internet-
accessible files, into three layers
 the name space is divided into nonoverlapping parts, called
zones in DNS
 a zone is a part of the name space that is implemented by a
separate name server
 some requirements of servers at different layers: performance
(responsiveness to lookups), availability (failure rate), etc.
 high availability is critical for the global layer, since name
resolution cannot proceed beyond the failing server; it is
also important at the administrational layer for clients in the
same organization
 performance is very important in the lowest (managerial) layer,
where lookups should be answered immediately; in the higher
layers, results of lookups can be cached and reused because
of their relative stability
 availability and performance may be enhanced by client-side
caching (for the global and administrational layers, since names
do not change often) and replication; both create implementation
problems since they may introduce inconsistency (see Chapter 6)

Item                              Global      Administrational   Managerial
Geographical scale of network     Worldwide   Organization       Department
Total number of nodes             Few         Many               Vast numbers
Responsiveness to lookups         Seconds     Milliseconds       Immediate
Update propagation                Lazy        Immediate          Immediate
Availability requirement          Very high   High               Low
Number of replicas                Many        None or few        None
Is client-side caching applied?   Yes         Yes                Sometimes

a comparison between name servers for implementing nodes from a large-
scale name space partitioned into a global layer, an administrational
layer, and a managerial layer

 Implementation of Name Resolution
 recall that name resolution consists of finding the address
when the name is given
 assume that name servers are not replicated and that no
client-side caches are allowed
 each client has access to a local name resolver, responsible
for ensuring that the name resolution process is carried out
 e.g., assume the path name
root:<nl, vu, cs, ftp, pub, globe, index.txt>
is to be resolved; in URL notation, this path name corresponds
to ftp://ftp.cs.vu.nl/pub/globe/index.txt
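
One common way for the local name resolver to carry this out is iterative name resolution, in which the resolver contacts one name server after another. The text above does not say which scheme is used, so the sketch below is an assumption; the server names and the function name are invented, and each server is modeled as resolving exactly one label.

```python
from typing import Dict, List, Tuple

# Hypothetical name servers: server name -> {label: server responsible for that node}.
servers: Dict[str, Dict[str, str]] = {
    "root-server": {"nl": "nl-server"},
    "nl-server": {"vu": "vu-server"},
    "vu-server": {"cs": "cs-server"},
    "cs-server": {"ftp": "ftp-server"},
}


def resolve_iteratively(labels: List[str]) -> Tuple[str, List[str], List[str]]:
    """Return (last server reached, labels still unresolved, servers contacted)."""
    server = "root-server"
    remaining = list(labels)
    contacted: List[str] = []
    while remaining and server in servers:
        contacted.append(server)               # the resolver itself asks each server
        next_server = servers[server].get(remaining[0])
        if next_server is None:
            break                              # this server does not know the label
        remaining.pop(0)
        server = next_server
    return server, remaining, contacted


path = ["nl", "vu", "cs", "ftp", "pub", "globe", "index.txt"]
server, rest, contacted = resolve_iteratively(path)
print(server)     # ftp-server
print(rest)       # ['pub', 'globe', 'index.txt']
print(contacted)  # ['root-server', 'nl-server', 'vu-server', 'cs-server']
```

In this sketch the labels that remain (pub, globe, index.txt) would be handed to the FTP server itself, which resolves them within its own file system.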

4.4 Attribute-Based Naming
 flat naming: provides a unique and location-independent way
of referring to entities
 structured naming: also provides a unique and location-
independent way of referring to entities, as well as human-
friendly names
 but neither allows searching for entities by giving a
description of an entity
 in attribute-based naming, each entity is assumed to have a
collection of attributes that say something about the entity
 a user can then search for an entity by specifying
(attribute, value) pairs; this is known as attribute-based naming
 Directory Services
 attribute-based naming systems are also called directory
services whereas systems that support structured naming
are called naming systems

 how are resources described? one possibility is to use RDF
(Resource Description Framework), which uses triples
consisting of a subject, a predicate, and an object
 e.g., (Person, name, Alice) to describe a resource Person
whose name is Alice
 or in e-mail systems, we can use sender, recipient, subject,
etc. for searching
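
Searching by (attribute, value) pairs can be sketched as a filter over a list of attribute dictionaries. The function name and the sample messages are invented for illustration; a real directory service would use indexes rather than a linear scan.

```python
from typing import Dict, List

Entity = Dict[str, str]


def search(directory: List[Entity], **wanted: str) -> List[Entity]:
    """Return every entity whose attributes match all given (attribute, value) pairs."""
    return [
        entity
        for entity in directory
        if all(entity.get(attribute) == value for attribute, value in wanted.items())
    ]


# Hypothetical mailbox described by sender, recipient and subject attributes.
mailbox: List[Entity] = [
    {"sender": "alice@example.org", "recipient": "bob@example.org", "subject": "naming"},
    {"sender": "carol@example.org", "recipient": "bob@example.org", "subject": "naming"},
    {"sender": "alice@example.org", "recipient": "bob@example.org", "subject": "lunch"},
]

hits = search(mailbox, sender="alice@example.org", subject="naming")
print(len(hits))  # 1
```

Unlike a flat or structured name, the query does not identify one entity: it describes a set, and the same entity can be found through many different combinations of attributes.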
