Chapter 5 Naming

Chapter 5 discusses the importance of naming systems in distributed environments for resource sharing and entity identification. It outlines different types of names, identifiers, and addresses, and explains various naming systems including flat, structured, and hierarchical naming approaches. The chapter also covers name resolution processes and the implementation of naming services across distributed systems.

Uploaded by

yosefdemeke08
Copyright: © All Rights Reserved

Chapter 5 - Naming

Introduction
 names play an important role to:
 share resources
 uniquely identify entities
 refer to locations
 etc.
 an important issue is that a name can be resolved to the entity
it refers to
 to resolve names, it is necessary to implement a naming
system
 in a distributed system, the implementation of a naming
system is itself often distributed, unlike in nondistributed
systems
 efficiency and scalability of the naming system are the main
issues
5.1 Names, Identifiers, and Addresses
 a name in a distributed system is a string of bits or characters
that is used to refer to an entity
 an entity is anything; e.g., resources such as hosts, printers,
disks, files, objects, processes, users, Web pages, ...
 entities can be operated on; e.g., a resource such as a printer
offers an interface containing operations for printing a
document, requesting the status of a job, ...
 to operate on an entity, it is necessary to access it through its
access point, which is itself a (special) entity

 access point
 the name of an access point is called an address (such as the
IP address and port number used by the transport layer)
 the address of the access point of an entity is also referred
to as the address of the entity
 an entity can have more than one access point (similar to
reaching an individual through different telephone numbers)
 an entity may change its access point in the course of time
(e.g., a mobile computer getting a new IP address as it moves)

 an address is a special kind of name
 it refers to an access point of an entity
 an entity may have more than one address, e.g., when it is
replicated, as Web pages often are
 an entity may change its access point, or an access point
may be reassigned to a different entity (like telephone
numbers in offices)
 keeping the name of an entity separate from its address is
easier and more flexible, since the name stays valid when the
address changes; such a name is called location independent
 there are also other types of names that uniquely identify an
entity; an identifier is a name with the following properties
 it refers to at most one entity
 each entity is referred to by at most one identifier
 it always refers to the same entity (it is never reused)
 identifiers allow us to refer to an entity unambiguously

 example: an FTP server (the entity)
 its name is the URL of the FTP server
 its address is the IP address and port number of the server
 the address of the FTP server may change while its name
stays the same
 there are three classes of naming systems: flat naming,
structured naming, and attribute-based naming

5.2 Flat Naming
 a flat name is a sequence of characters without structure,
much like a human name
 flat names are difficult to use in a large system, since they
must be centrally controlled to avoid duplication
 how are flat names resolved?
 mapping a name to an address, or an address to a name, is
called name-address resolution
 possible solutions: simple solutions, home-based approaches,
and hierarchical approaches

1. Simple Solutions
 two solutions for LANs: broadcasting and multicasting,
and forwarding pointers
a. Broadcasting and Multicasting
 a computer that wants to reach another computer whose IP
address it knows broadcasts a request containing that address
 the owner of the address responds by sending its Ethernet
address
 this is how ARP (Address Resolution Protocol) finds the
data link address (MAC address) of a machine in the Internet
 broadcasting becomes inefficient as the network grows (it
wastes bandwidth and interrupts too many other machines)
 multicasting scales better, since a request is sent only to a
restricted group of hosts
 multicasting can also be used to locate the nearest replica:
choose the one whose reply comes in first

b. Forwarding Pointers
 how to locate mobile entities?
 when an entity moves from A to B, it leaves behind in A a
reference to its new location in B
 advantage
 simple: as soon as an entity has been located using a
traditional naming service, the chain of forwarding
pointers can be followed to find its current address
 drawbacks
 the chain can become too long, making locating expensive
 all the intermediate locations in a chain have to maintain
their pointers
 the chain is vulnerable to broken links
 hence, keeping chains short and forwarding pointers robust
is an important issue

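A minimal sketch of following and then shortening a chain of forwarding pointers. The table `pointers`, the location names A to D, and the hop limit are illustrative assumptions, not part of the chapter.

```python
# Hypothetical model: a location that the entity has left holds a
# forwarding pointer to the next location; the current location holds none.
def locate(pointers, start, max_hops=16):
    """Follow forwarding pointers from `start`; return (location, hops)."""
    here, hops = start, 0
    while here in pointers:              # a pointer means "the entity moved on"
        if hops >= max_hops:
            raise RuntimeError("chain too long or cyclic")
        here = pointers[here]
        hops += 1
    return here, hops

def shorten(pointers, start, current):
    """Redirect every pointer on the chain straight to `current`."""
    here = start
    while here in pointers and here != current:
        nxt = pointers[here]
        pointers[here] = current
        here = nxt

# The entity was created at A and moved A -> B -> C -> D.
pointers = {"A": "B", "B": "C", "C": "D"}
first = locate(pointers, "A")        # walks the whole chain
shorten(pointers, "A", first[0])
second = locate(pointers, "A")       # now a single hop
print(first, second)
```

Shortening the chain after a successful lookup addresses the "chain too long" drawback, but a lost intermediate pointer still breaks the chain in this model.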
2. Home-Based Approaches
 broadcasting and multicasting have scalability problems;
forwarding pointers suffer from performance problems and
broken links
 a home location keeps track of the current location of an
entity; often it is the place where the entity was created
 it is a two-tiered approach
 an example of its use is Mobile IP
 each mobile host uses a fixed IP address
 all communication to that IP address is initially directed
to the host’s home agent, located on the LAN corresponding
to the network address contained in the mobile host’s IP
address
 whenever the mobile host moves to another network, it
requests a temporary address in the new network (called a
care-of address) and registers it with the home agent
 when the home agent receives a message for the mobile
host, it forwards the message to the care-of address and also
informs the sender of the host’s current location, to be used
for subsequent packets

Figure: home-based approach: the principle of Mobile IP

 problems
 the extra step through the home increases communication
latency
 the host is unreachable if its home no longer exists (it has
changed permanently); the solution is to register the home
at a traditional name service

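The home agent's role can be sketched as a small lookup table. The class `HomeAgent`, its method names, and both addresses are hypothetical; real Mobile IP also involves tunneling and registration lifetimes, which are left out here.

```python
class HomeAgent:
    """Keeps the current care-of address of each mobile host it serves."""

    def __init__(self):
        self.care_of = {}                 # fixed home address -> care-of address

    def register(self, home_addr, care_of_addr):
        """Called by the mobile host after it obtains a care-of address."""
        self.care_of[home_addr] = care_of_addr

    def deliver(self, home_addr, packet):
        """Return (address the packet is forwarded to, packet)."""
        # If the host is away, forward to its care-of address; else deliver at home.
        target = self.care_of.get(home_addr, home_addr)
        return target, packet

agent = HomeAgent()
HOME = "10.0.0.7"                         # hypothetical fixed address of the host
at_home = agent.deliver(HOME, "hello")
agent.register(HOME, "192.0.2.44")        # hypothetical care-of address
away = agent.deliver(HOME, "hello again")
print(at_home, away)
```

The address returned by `deliver` is also what the home agent would report back to the sender, so that later packets can bypass the home.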
3. Hierarchical Approaches
 a generalization of the two-tiered approach into multiple
layers
 a network is divided into a collection of domains, similar
to DNS
 a single top-level domain spans the entire network
 each domain can be subdivided into multiple, smaller
domains
 the lowest-level domain is called a leaf domain; typically a
LAN
 each domain D has an associated directory node dir(D)
that keeps track of the entities in that domain leading to a
tree of directory nodes
 the root (directory) node knows about all entities

Figure: hierarchical organization of a location service into domains,
each having an associated directory node

 each entity is represented by a location record in the
directory node dir(D) to keep track of its whereabouts
 a location record for an entity in a leaf domain contains the
entity’s current address; all other high-level domains will
have only pointers to this address; this means the root node
will store only pointers to all entities
 an entity may have multiple addresses, for instance, if it is
replicated; a higher level domain containing the two
subdomains where the entity has addresses will have two
pointers

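Figure: an example of storing information of an entity having two
addresses in different leaf domains

 example of a lookup operation
 a client in domain D would like to locate an entity E
 the request goes to the directory node of the client's leaf
domain and is forwarded to the parent until a directory node
with a location record for E is found; from there the pointers
are followed down to the leaf domain that stores the address of E

Figure: looking up a location in a hierarchically organized
location service

 update operations (i.e., inserting and deleting addresses)
 read pages 194 - 195
 another solution is Distributed Hash Tables (DHT)
 read pages 188 - 191
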
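The lookup operation in a hierarchical location service can be sketched as a climb followed by a descent. The domain names, the tree shape, and the placeholder address `addr-of-E` are assumptions made for illustration.

```python
# Hypothetical domain tree: root -> {north, south}; north -> {n1, n2}; south -> {s1}.
parent = {"n1": "north", "n2": "north", "s1": "south",
          "north": "root", "south": "root", "root": None}

# Location records per directory node: entity -> child domain (a pointer),
# or, in a leaf domain, entity -> address.
records = {
    "root":  {"E": "south"},
    "south": {"E": "s1"},
    "s1":    {"E": "addr-of-E"},
    "north": {}, "n1": {}, "n2": {},
}

def lookup(entity, leaf):
    """Return (address, visited directory nodes) for a lookup issued in `leaf`."""
    node, visited = leaf, []
    # Phase 1: climb until some directory node has a record for the entity.
    while node is not None and entity not in records[node]:
        visited.append(node)
        node = parent[node]
    if node is None:
        raise KeyError(entity)
    # Phase 2: follow pointers down to the leaf domain holding the address.
    while True:
        visited.append(node)
        nxt = records[node][entity]
        if nxt not in records:            # not a directory node, so an address
            return nxt, visited
        node = nxt

print(lookup("E", "n1"))   # far away: climbs to the root first
print(lookup("E", "s1"))   # same leaf domain: answered locally
```

The second call shows the locality this design is after: a client in the entity's own leaf domain never involves the higher-level directory nodes.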
5.3 Structured Naming
 flat names are not convenient for humans
 Name Spaces
 names are organized into a name space
 each name is made of several parts; the first may define the
nature of the organization, the second the name, the third
departments, ...
 the authority to assign and control names can be
decentralized, with a central authority assigning only the
first two parts
 a name space is generally organized as a labeled, directed
graph with two types of nodes
 leaf node: represents the named entity and stores
information such as its address or the state of that entity
 directory node: a special entity that has a number of
outgoing edges, each labeled with a name
 each node in a naming graph is considered as another entity
with an identifier

Figure: a general naming graph with a single root node, n0

 a directory node stores a table in which an outgoing edge is
represented as a pair (edge label, node identifier), called a
directory table
 each path in a naming graph can be referred to by the
sequence of labels corresponding to the edges of the path
and the first node in the path, such as
N:<label-1, label-2, ..., label-n>, where N refers to the first
node in the path
 such a sequence is called a path name
 if the first node is the root of the naming graph, it is called an
absolute path name; otherwise it is a relative path name
 instead of the path name n0:<home, steen, mbox>, we often
use its string representation /home/steen/mbox
 there may also be several paths leading to the same node,
e.g., node n5 can be represented as /keys or
/home/steen/keys
 although the above naming graph is a directed acyclic graph
(a node can have more than one incoming edge, but cycles are
not permitted), the common way is to use a tree
(hierarchy) with a single root (as is used in file systems)
 in a tree structure, each node except the root has exactly
one incoming edge; the root has no incoming edges
 each node also has exactly one associated (absolute)
path name

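Directory tables and path names can be sketched as nested dictionaries. The node identifiers n0, n1, n4 and n5 follow the slides' naming; the exact edges and the identifier `mbox-node` are assumptions, since the figure is not reproduced here.

```python
# Directory tables: node identifier -> {edge label: node identifier}.
# Nodes without an entry (n5, mbox-node) are leaf nodes.
directory = {
    "n0": {"home": "n1", "keys": "n5"},
    "n1": {"steen": "n4"},
    "n4": {"keys": "n5", "mbox": "mbox-node"},
}

def resolve(start, labels):
    """Resolve the path name start:<label-1, ..., label-n> to a node identifier."""
    node = start
    for label in labels:
        table = directory.get(node)
        if table is None:
            raise NotADirectoryError(f"{node} is a leaf node")
        node = table[label]              # KeyError if the label is not in the table
    return node

def resolve_string(path):
    """Resolve an absolute path name in string form, e.g. /home/steen/mbox."""
    return resolve("n0", [part for part in path.split("/") if part])

print(resolve_string("/home/steen/mbox"))   # absolute path name
print(resolve("n4", ["keys"]))              # relative path name n4:<keys>
```

Both `/keys` and `/home/steen/keys` lead to n5 in this table, which is the two-paths-to-one-node situation described above.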
 Name Resolution
 given a path name, the process of looking up the information
stored in the node it refers to is called name resolution; it
consists of finding the address when the name is given (by
following the path)
 Linking and Mounting
 Linking: giving another name to the same entity (an alias),
e.g., environment variables in UNIX such as HOME, which
refers to the home directory of a user
 two types of links (or two ways to implement an alias):
 hard link: multiple absolute path names are allowed to refer
to the same node in a naming graph,
e.g., in the previous graph there are two different path
names for node n5: /keys and /home/steen/keys

 symbolic link: the entity is represented by a leaf node that,
instead of storing the address or state of the entity,
stores an absolute path name

Figure: the concept of a symbolic link explained in a naming graph

 when resolving an absolute path name that leads to such a
node (e.g., /home/steen/keys, which leads to node n6), name
resolution returns the path name stored in the node (/keys),
and then continues by resolving that new path name

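Resolution through a symbolic link can be sketched as a restart with the stored path name. The tables below are assumptions that mirror the n6 example; the sketch only handles a symbolic link at the end of a path, and the link limit is an illustrative safeguard against cycles.

```python
# Directory tables; n6 is a leaf node that stores a path name (a symbolic link).
directory = {
    "n0": {"home": "n1", "keys": "n5"},
    "n1": {"steen": "n4"},
    "n4": {"keys": "n6"},
}
symlink = {"n6": "/keys"}     # leaf node -> absolute path name it stores

def resolve(path, max_links=8):
    """Resolve an absolute path name, following symbolic links at its end."""
    node = "n0"
    for label in filter(None, path.split("/")):
        node = directory[node][label]
    if node in symlink:                       # continue with the stored path name
        if max_links == 0:
            raise RuntimeError("too many symbolic links")
        return resolve(symlink[node], max_links - 1)
    return node

print(resolve("/home/steen/keys"))
```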
 so far name resolution was discussed as taking place
within a single name space
 name resolution can also be used to merge different name
spaces in a transparent way
 the solution is to use mounting
Mounting
 as an example, consider a mounted file system, which
can be generalized to other name spaces as well
 let a directory node store the identifier of a directory node
from a different (foreign) name space
 the directory node storing the node identifier is called a
mount point
 the directory node in the foreign name space is called a
mounting point; it is normally the root of a name space
 during name resolution, the mounting point is looked up
and resolution proceeds by accessing its directory table

 consider a collection of name spaces distributed across
different machines (each name space implemented by a
different server)
 to mount a foreign name space in a distributed system, at
least the following are required
 the name of an access protocol (for communication)
 the name of the server
 the name of the mounting point in the foreign name space
 each of these names needs to be resolved
 the protocol name to an implementation of the protocol
 the server name to an address where the server can be reached
 the mounting point name to a node identifier in the foreign
name space
 the three names can be listed as a URL

 example: Sun’s Network File System (NFS) is a distributed file
system with a protocol that describes how a client can access
a file stored on a (remote) NFS file server
 an NFS URL may look like nfs://flits.cs.vu.nl/home/steen
- nfs names the access protocol, which is resolved to an
implementation of that protocol
- flits.cs.vu.nl is a server name to be resolved using DNS
- /home/steen is resolved by the server
 e.g., the subdirectory /remote on the client machine includes
mount points for foreign name spaces
 a directory node named /remote/vu is used to store
nfs://flits.cs.vu.nl/home/steen
 consider /remote/vu/mbox
 this name is resolved by starting at the root directory on
the client’s machine and proceeding until node /remote/vu,
which returns the URL nfs://flits.cs.vu.nl/home/steen
 this leads the client machine to contact flits.cs.vu.nl
using the NFS protocol
 then the file mbox is read in the directory /home/steen

Figure: mounting remote name spaces through a specific access protocol

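The resolution of /remote/vu/mbox can be sketched by splitting the stored URL into its three names. The `mounts` table and the function `resolve` are hypothetical; no server is contacted, the sketch only shows which protocol, server, and remote path the name breaks down into.

```python
from urllib.parse import urlsplit

# Mount points on the client: local path -> URL naming the foreign name space.
mounts = {"/remote/vu": "nfs://flits.cs.vu.nl/home/steen"}

def resolve(path):
    """Return (protocol, server, remote path), or ("local", None, path)
    if the path does not cross a mount point."""
    for mount_point, url in mounts.items():
        if path == mount_point or path.startswith(mount_point + "/"):
            parts = urlsplit(url)
            rest = path[len(mount_point):]          # e.g. "/mbox"
            return parts.scheme, parts.hostname, parts.path + rest
    return "local", None, path

print(resolve("/remote/vu/mbox"))
print(resolve("/etc/hosts"))
```

In a real system the server name would next be resolved through DNS and the remote path handed to the server over the named protocol.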
 distributed systems that allow mounting a remote file
system also allow commands to be executed on it
 example commands to access the file system
cd /remote/vu
ls -l
 the user does not have to worry about the details of the
actual access; the name space on the local machine and that
on the remote machine appear to form a single name space

 The Implementation of a Name Space
 a name space forms the heart of a naming service
 a naming service allows users and processes to add,
remove, and look up names
 a naming service is implemented by name servers
 for a distributed system on a single LAN, a single server
might suffice; for a large-scale distributed system the
implementation of a name space is distributed over multiple
name servers
 Name Space Distribution
 in large-scale distributed systems, it is necessary to
distribute the name service over multiple name servers,
usually organized hierarchically
 a name service can be partitioned into logical layers
 the following three layers can be distinguished (according to
Cheriton and Mann)

 global layer
 formed by the highest-level nodes (the root node and nodes
close to it, i.e., its children)
 nodes on this layer are characterized by their stability, i.e.,
directory tables are rarely changed
 they may represent organizations, groups of
organizations, ..., whose names are stored in the name
space
 administrational layer
 groups of entities that belong to the same organization or
administrational unit, e.g., departments
 relatively stable
 managerial layer
 nodes that may change regularly, e.g., nodes representing
hosts of a LAN, shared files such as libraries or binaries, ...
 nodes are managed not only by system administrators, but
also by end users

Figure: an example partitioning of the DNS name space, including
Internet-accessible files, into three layers

 the name space is divided into nonoverlapping parts, called
zones in DNS
 a zone is a part of the name space that is implemented by a
separate name server
 some requirements of servers at different layers
 performance (responsiveness to lookups), availability (failure
rate), etc.
 high availability is critical for the global layer, since name
resolution cannot proceed beyond a failing server; it is also
important at the administrational layer for clients in the same
organization
 performance is most critical in the lowest (managerial) layer,
where lookups should be answered immediately; in the higher
layers, results of lookups can be cached and reused thanks to
the relative stability of those layers
 performance and availability may be enhanced by client-side
caching (for the global and administrational layers, since names
do not change often) and by replication; both complicate the
implementation, since they may introduce inconsistencies (see
Chapter 7)

Item                             Global     Administrational  Managerial
Geographical scale of network    Worldwide  Organization      Department
Total number of nodes            Few        Many              Vast numbers
Responsiveness to lookups        Seconds    Milliseconds      Immediate
Update propagation               Lazy       Immediate         Immediate
Availability requirement         Very high  High              Low
Number of replicas               Many       None or few       None
Is client-side caching applied?  Yes        Yes               Sometimes

a comparison between name servers for implementing nodes from a
large-scale name space partitioned into a global layer, an
administrational layer, and a managerial layer

 Implementation of Name Resolution
 recall that name resolution consists of finding the address
when the name is given
 assume that name servers are not replicated and that no
client-side caches are allowed
 each client has access to a local name resolver, responsible
for ensuring that the name resolution process is carried out
 e.g., assume the path name
root:<nl, vu, cs, ftp, pub, globe, index.txt>
is to be resolved; in URL notation, this path name
corresponds to ftp://ftp.cs.vu.nl/pub/globe/index.txt

 Resolution
 mapping a name to an address or an address to a name is
called name-address resolution
 Resolver
 a host that needs to map an address to a name or a name
to an address calls a DNS client named a resolver
 the resolver accesses the closest DNS server with a
mapping request
 if the server has the information, it satisfies the resolver;
otherwise, it either refers the resolver to other servers
(called iterative resolution) or asks other servers to
provide the information (called recursive resolution)


 Iterative
 a name resolver hands over the complete name to the root
name server
 the root name server resolves the name as far as it can and
returns the result to the client; at the minimum it can resolve
the first level and send the address of the first-level name
server to the client
 the client calls the first-level name server, then the second, ...,
until it finds the address of the entity

Figure: the principle of iterative name resolution


 Recursive
 a name resolver hands over the whole name to the root name
server
 the root name server tries to resolve the name and, if it
can’t, requests the first-level name server to resolve it and
to return the address
 the first-level name server does the same thing recursively

Figure: the principle of recursive name resolution
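The two resolution styles can be contrasted in a short sketch. The server names (`ns-nl`, `ns-vu`, `ns-cs`) and the placeholder `addr-of-ftp` are assumptions, and each server is modelled as knowing only the next label of the path.

```python
# Hypothetical name servers: label -> next server, or the address at the end.
servers = {
    "root":  {"nl": "ns-nl"},
    "ns-nl": {"vu": "ns-vu"},
    "ns-vu": {"cs": "ns-cs"},
    "ns-cs": {"ftp": "addr-of-ftp"},
}

def iterative(labels):
    """The client contacts each server in turn.
    Returns (address, number of servers the client had to contact)."""
    server, contacted = "root", 0
    for label in labels:
        contacted += 1                      # one request/reply with this server
        server = servers[server][label]     # a referral, or finally the address
    return server, contacted

def recursive(labels, server="root"):
    """Each server resolves one label and asks the next server for the rest;
    the client itself only contacts the root."""
    nxt = servers[server][labels[0]]
    if len(labels) == 1:
        return nxt
    return recursive(labels[1:], nxt)

path = ["nl", "vu", "cs", "ftp"]
print(iterative(path))
print(recursive(path))
```

Both return the same address; the difference is who does the work, which is what the advantages and drawbacks below weigh up.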


 Advantages and drawbacks
 recursive name resolution puts a higher performance
demand on each name server; hence name servers in the
global layer support only iterative name resolution
 caching is more effective with recursive name resolution;
each name server gradually learns the address of each
name server responsible for implementing lower-level
nodes; eventually lookup operations can be handled
efficiently
Server    Should          Looks   Passes to    Receives       Returns to
for node  resolve         up      child        and caches     requester
cs        <ftp>           #<ftp>  --           --             #<ftp>
vu        <cs,ftp>        #<cs>   <ftp>        #<ftp>         #<cs>
                                                              #<cs,ftp>
nl        <vu,cs,ftp>     #<vu>   <cs,ftp>     #<cs>          #<vu>
                                               #<cs,ftp>      #<vu,cs>
                                                              #<vu,cs,ftp>
root      <nl,vu,cs,ftp>  #<nl>   <vu,cs,ftp>  #<vu>          #<nl>
                                               #<vu,cs>       #<nl,vu>
                                               #<vu,cs,ftp>   #<nl,vu,cs>
                                                              #<nl,vu,cs,ftp>

recursive name resolution of <nl, vu, cs, ftp>; name servers cache
intermediate results for subsequent lookups
 communication costs may be reduced in recursive name
resolution

Figure: the comparison between recursive and iterative name
resolution with respect to communication costs; assume the client
is in Ethiopia and the name servers in the Netherlands

 Summary
Method     Advantage(s)
Recursive  Less communication cost; caching is more effective
Iterative  Less performance demand on name servers

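The caching behaviour of recursive resolution of <nl, vu, cs, ftp> can be reproduced with a small sketch. The server chain, the `cache` dictionary, and the helper `addr` (which just writes an address in the #<...> notation) are illustrative assumptions.

```python
# Hypothetical chain of name servers, one per node of <nl, vu, cs, ftp>.
child = {"root": {"nl": "nl"}, "nl": {"vu": "vu"},
         "vu": {"cs": "cs"}, "cs": {"ftp": None}}
cache = {server: {} for server in child}

def addr(path):
    """Stand-in for an address, written #<...>."""
    return "#<" + ",".join(path) + ">"

def resolve(server, labels):
    """Recursively resolve `labels`; return {path: address} for the requester."""
    head, rest = labels[0], tuple(labels[1:])
    results = {(head,): addr((head,))}       # looked up by this server itself
    nxt = child[server][head]
    if rest and nxt is not None:
        learned = resolve(nxt, rest)         # pass the remainder to the child
        cache[server].update(learned)        # receive and cache
        for path in learned:                 # extend each result with `head`
            results[(head,) + path] = addr((head,) + path)
    return results

answer = resolve("root", ("nl", "vu", "cs", "ftp"))
print(sorted(answer.values()))
print(sorted(cache["nl"].values()))
```

After one lookup every server on the path holds the addresses of the lower-level nodes it passed on, which is why caching is more effective with recursive resolution.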
 Example - The Domain Name System (DNS)
 one of the largest distributed naming services is the
Internet DNS
 it is used for looking up host addresses and mail servers
 hierarchical, defined in an inverted tree structure with the
root at the top
 the tree can have at most 128 levels: level 0 (the root) to
level 127

 Label
 each node has a label, a string with a maximum of 63
characters (case insensitive)
 the root label is null
 children of a node must have different labels (to guarantee
uniqueness)
 Domain Name
 each node has a domain name
 a full domain name is a sequence of labels separated by dots;
since the root label is the null string, a full domain name
ends with a dot
 domain names are read from the node up to the root
 full path names must not exceed 255 characters

 Fully Qualified Domain Name (FQDN) or Absolute
 terminated by a null string
 contains the full name of a host, e.g., cs.aau.edu.et.
 usually the last dot is omitted for readability
 Partially Qualified Domain Name (PQDN) or Relative
 not terminated with a null string
 it starts from a node but does not reach the root
 used when the name to be resolved belongs to the same
site as the client (the resolver supplies the missing part,
called the suffix, to create an FQDN)

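A minimal sketch of how a resolver might complete a PQDN and check the length limits given above. The function `to_fqdn` and the default suffix are hypothetical, and the sketch treats every name without a trailing dot as partially qualified.

```python
MAX_LABEL, MAX_NAME = 63, 255          # limits stated in the slides

def to_fqdn(name, suffix="aau.edu.et."):
    """Return `name` as an FQDN; `suffix` is a hypothetical resolver default."""
    if not name.endswith("."):                  # PQDN: append the suffix
        name = name + "." + suffix
    labels = name[:-1].split(".")               # drop the final (root) dot
    if any(not 1 <= len(label) <= MAX_LABEL for label in labels):
        raise ValueError("each label must have 1 to 63 characters")
    if len(name) > MAX_NAME:
        raise ValueError("a domain name must not exceed 255 characters")
    return name.lower()                         # labels are case insensitive

print(to_fqdn("cs"))
print(to_fqdn("CS.AAU.edu.et."))
```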
 Domain
 a domain is a subtree of the domain name space
 the name of the domain is the domain name of the node at
the top of the subtree
 the Internet is divided into over 200 top-level domains,
each partitioned into subdomains, ...; the leaves represent
domains that have no subdomains; a leaf domain may
contain a single host or represent a company and contain
thousands of hosts

 Hierarchy of Name Servers
 storing the information contained in the domain name space
in a single computer is inefficient and unreliable
 distribute the information among many computers called
DNS servers
 there is a hierarchy of name servers as we have a hierarchy
of names

 Zone
 what a server is responsible for, or has authority over, is
called a zone; zones are nonoverlapping
 the server keeps a database called a zone file, which holds
all the information for every node under that domain
 it can divide its domain into subdomains and delegate part
of its authority to other servers

 Root Server
 a server whose zone consists of the whole tree
 it usually does not store the whole information about
domains but delegates its authority to other servers and
keeps references to those servers
 there are 13 logical root servers (named a to m), each
covering the whole domain name space and each replicated at
many sites around the world
 Primary and Secondary Servers
 a primary server is one that stores a file about the zone for
which it is an authority; it is responsible for creating,
maintaining, and updating the zone file
 a secondary server is one that transfers the complete
information about a zone from another server (primary or
secondary); it does not create or update the file
 this arrangement creates redundancy, so that if one
server fails, the other can still serve clients

 Types of Top-Level Domains
 two types: generic domains and country domains; there is a
third one called the inverse domain (used to map an address to
a name; we will not discuss it further)
 Generic Domains
 define registered hosts according to their generic
behaviour
Label   Description
com     Commercial organizations
edu     Educational institutions
gov     Government institutions
int     International organizations
mil     Military groups
net     Network support centers
org     Nonprofit organizations
 newly introduced top-level domains
Label   Description
aero    Airlines and aerospace companies
biz     Businesses or firms (similar to com)
coop    Cooperative business organizations
info    Information service providers
museum  Museums and other nonprofit organizations
name    Personal names (individuals)
pro     Professional individual organizations
 Country Domains
 include one entry for every country (as defined by ISO),
using two-character abbreviations

 the contents of a node are formed by a collection of resource
records; the important ones are the following
Type of record            Associated entity  Description
SOA (start of authority)  Zone               Holds information on the represented zone, such as
                                             an e-mail address of the system administrator
A (address)               Host               Contains an IP address of the host this node represents
MX (mail exchange)        Domain             Refers to a mail server to handle mail addressed to
                                             this node; it is a symbolic link; e.g. name of a mail
                                             server
SRV                       Domain             Refers to a server handling a specific service
NS (name server)          Zone               Refers to a name server that implements the
                                             represented zone
CNAME                     Node               Contains the canonical name of a host
PTR (pointer)             Host               Symbolic link with the primary name of the
                                             represented node
HINFO (host info)         Host               Holds information on the host this node represents,
                                             such as machine type and OS
TXT                       Any kind           Contains any entity-specific information considered
                                             useful

 cs.vu.nl represents the domain as well as the zone; it has 3 name
servers (star, top, solo) and 3 mail servers
 name server for this zone with 2 network addresses
 mail server
 Web server
 FTP server
 a single machine implementing Web server and FTP server
 laser printer
 inverse mapping

Figure: an excerpt from the DNS database for the zone cs.vu.nl

5.4 Attribute-Based Naming
 flat naming provides a unique and location-independent way
of referring to entities
 structured naming also provides a unique and
location-independent way of referring to entities, as well as
human-friendly names
 but neither allows searching for an entity by giving a
description of it
 each entity is assumed to have a collection of attributes that
say something about the entity
 a user can then search for an entity by specifying (attribute,
value) pairs; this is known as attribute-based naming
 Directory Services
 attribute-based naming systems are also called directory
services

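Searching by (attribute, value) pairs can be sketched in a few lines. The entities and their attributes below are made up for illustration; a real directory service would index the attributes rather than scan every entry.

```python
# Hypothetical directory: each entity is described by (attribute, value) pairs.
entities = [
    {"type": "printer", "color": "yes", "floor": "2"},
    {"type": "printer", "color": "no", "floor": "1"},
    {"type": "ftp server", "floor": "1"},
]

def search(**wanted):
    """Return every entity whose attributes match all the given pairs."""
    return [e for e in entities
            if all(e.get(attr) == value for attr, value in wanted.items())]

print(search(type="printer", floor="1"))
print(len(search(floor="1")))
```

The exhaustive scan over all entities is the reason attribute-based lookups are more expensive than resolving a structured name.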
 how are resources described? one possibility is to use RDF
(Resource Description Framework), which uses triplets
consisting of a subject, a predicate, and an object
 e.g., (Person, name, Alice) describes a resource Person
whose name is Alice
 Hierarchical Implementations: LDAP (Lightweight Directory
Access Protocol)
 distributed directory services are implemented by combining
structured naming with attribute-based naming
 e.g., Microsoft’s Active Directory service
 such systems rely on the Lightweight Directory Access
Protocol (LDAP), which is derived from OSI’s X.500
directory service
 an LDAP directory service consists of a number of records
called directory entries; each entry is made up of (attribute,
value) pairs, similar to a resource record in DNS; an attribute
can be single- or multiple-valued (e.g., Mail_Servers)

Attribute           Abbr.  Value
Country             C      NL
Locality            L      Amsterdam
Organization        O      Vrije Universiteit
OrganizationalUnit  OU     Comp. Sc.
CommonName          CN     Main server
Mail_Servers        --     137.37.20.3, 130.37.24.6, 137.37.20.10
FTP_Server          --     130.37.20.20
WWW_Server          --     130.37.20.20

a simple example of an LDAP directory entry using LDAP
naming conventions to identify the network addresses of
some servers

 the collection of all directory entries is called a Directory
Information Base (DIB)
 each record is uniquely named so that it can be looked up
 each naming attribute is called a Relative Distinguished
Name (RDN); the first 5 attributes of the entry above are
naming attributes
 a globally unique name is formed using abbreviations of
naming attributes, e.g.,
/C=NL/O=Vrije Universiteit/OU=Comp. Sc.
 this is similar to the DNS name nl.vu.cs
 listing RDNs in sequence leads to a hierarchy of the
collection of directory entries, called a Directory
Information Tree (DIT)
 a DIT forms the naming graph of an LDAP directory service
where each node represents a directory entry
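
Building a globally unique name from a sequence of RDNs can be sketched as follows. The helper names are hypothetical, and the slash-separated form follows the notation used in these slides; standard LDAP string names (RFC 4514) are instead comma-separated with the most specific RDN first.

```python
# Naming attributes (RDNs) of the directory entry shown earlier.
entry = [("C", "NL"), ("L", "Amsterdam"), ("O", "Vrije Universiteit"),
         ("OU", "Comp. Sc."), ("CN", "Main server")]

def distinguished_name(rdns):
    """Join (abbreviation, value) pairs into the /A=v/B=w form of the slides."""
    return "".join(f"/{attr}={value}" for attr, value in rdns)

def parse(dn):
    """Split a name in the /A=v/B=w form back into (abbreviation, value) pairs."""
    return [tuple(part.split("=", 1)) for part in dn.split("/") if part]

# The name from the slide uses the C, O and OU attributes only.
print(distinguished_name([entry[0], entry[2], entry[3]]))
print(parse("/C=NL/O=Vrije Universiteit"))
```

Each prefix of the RDN sequence names a node on the path from the root of the DIT to the entry, much as each prefix of a path name does in a naming graph.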

 node N corresponds to the directory entry shown earlier; it
also acts as a parent of other directory entries that have an
additional attribute, Host_Name; such entries may be used
to represent hosts

part of the directory information tree

Attribute           Value               Attribute           Value
Country             NL                  Country             NL
Locality            Amsterdam           Locality            Amsterdam
Organization        Vrije Universiteit  Organization        Vrije Universiteit
OrganizationalUnit  Comp. Sc.           OrganizationalUnit  Comp. Sc.
CommonName          Main server         CommonName          Main server
Host_Name           star                Host_Name           zephyr
Host_Address        192.31.231.42       Host_Address        137.37.20.10

two directory entries having Host_Name as RDN

 read pages 222 - 226 about Decentralized Implementations