
CH 5

Names play a critical role in distributed systems by uniquely identifying entities. There are two main types of names: flat names/identifiers which are random strings, and structured names which are human-readable. For flat names, challenges include mapping the name to an entity's address. Approaches include home-based naming, distributed hash tables (DHTs), and hierarchical location services. Structured names are organized in a name space represented by a graph, with leaf nodes storing entity information and directory nodes linking to other parts of the name space.

Uploaded by

gemchis dawo

Chapter 5- Naming

Mulugeta A.

1
Naming
 Names play a critical role in all computer systems
 Used to share resources
 Used to refer to locations
 To uniquely identify entities in a distributed system, and more
 A name in a distributed system is a string of bits or characters that
is used to refer to an entity.
 hosts, printers, disks, files, processes, web-pages
 An important issue in naming is that a name can be resolved to the
entity it refers to.
 Name resolution system thus allows a process to access the named entity.
 In a distributed system, the implementation of a naming system is often
distributed across multiple machines.
 How the distribution is done affects the performance of the naming system

2
Cont...
 Entities can be operated on.
 E.g. Printer provides operations for printing, pausing, cancelling etc.
 An entity such as a network connection may provide operations for sending and receiving
data, setting quality-of-service parameters, requesting the status, and so forth.
 To operate on an entity, it is necessary to access it, for which we need an access point.
 An access point is yet another, but special, kind of entity in a distributed system.
 The name of an access point is called an address
 E.g., for a host running a specific server, the address is formed by the combination of an IP address and a port
number
 An entity may have multiple access points, and its access point may change
 Thus, the address of an access point should not be used to name the entity
 For example, when a mobile computer moves to another location, it is often assigned a
different IP address than the one it had before.
 Likewise, when a person moves to another city or country, it is often necessary to change
telephone numbers as well.
 In a similar fashion, changing jobs or Internet Service Providers means changing your
e-mail address.

3
Cont...
 Therefore, what we need is a name for an entity that is independent
from its addresses
 i.e., a location-independent name
 A location-independent name for an entity E , is independent from the
addresses of the access points offered by E .
 Identifier
 Identifier is a name having the following properties:
 P1: Each identifier refers to at most one entity
 P2: Each entity is referred to by at most one identifier
 P3: An identifier always refers to the same entity (prohibits
reusing an identifier)

4
The Central Theme of Naming
 How to resolve names and identifiers to addresses
 In principle, a naming system maintains a name-to-address
binding in the form of a mapping table
 However, a centralized table in a large network is not going to
work
 The name resolution as well as the table is often
distributed across multiple machines
 E.g., www.xx.edu.et is divided into multiple parts, and several
DNS servers are used during the resolution
 We will discuss the resolution of the following kinds of names
 Flat names, structured names, attribute-based names

5
Flat Names (Identifiers)
 An identifier is often a string of random bits
 Does not contain any information on how to locate the access
point of its associated entity
 Problem:- given an essentially unstructured name (e.g., an
identifier), how can we locate its associated access point?
 Simple solutions (broadcasting)
 Home-based approaches
 Distributed Hash Tables (structured P2P)
 Hierarchical location service

6
Simple solutions
 Two simple solutions to locate the entity given an
identifier
 Broadcasting and multicasting (e.g., ARP)
 Broadcast the ID, requesting the entity to return its current address.
 Can never scale beyond local-area networks
 Requires all processes to listen to incoming location requests
 Forwarding pointers
 A popular approach to locate mobile entities
 When an entity moves, it leaves a pointer to where it went
 Update a client’s reference when present location is found
 Advantage:
 Dereferencing can be made transparent to client by following the
pointer chain

7
Forwarding Pointers
 Disadvantages
 Geographical scalability problems:
 Chain can be very long for highly mobile entities
 Long chains are not fault tolerant
 High latency when dereferencing
 Need chain reduction mechanisms
 Update client's reference when the most recent location is found
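The forwarding-pointer idea and the chain reduction step can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions: the names Location, move, dereference and ClientRef are hypothetical and do not come from the slides, and locations are plain in-memory objects rather than remote machines.

```python
class Location:
    """A place an entity can reside (hypothetical model, not a real API)."""
    def __init__(self, name):
        self.name = name
        self.has_entity = False
        self.forward = None  # forwarding pointer left behind when the entity moves on


def move(src, dst):
    """The entity moves from src to dst and leaves a pointer at src."""
    src.has_entity = False
    src.forward = dst
    dst.has_entity = True
    dst.forward = None


def dereference(start):
    """Follow the pointer chain; return (current location, number of hops)."""
    loc, hops = start, 0
    while not loc.has_entity:
        if loc.forward is None:
            # A lost pointer breaks the chain: long chains are not fault tolerant.
            raise LookupError("broken chain at " + loc.name)
        loc = loc.forward
        hops += 1
    return loc, hops


class ClientRef:
    """A client's reference to the entity."""
    def __init__(self, loc):
        self.loc = loc

    def resolve(self):
        current, hops = dereference(self.loc)
        self.loc = current  # chain reduction: remember the most recent location
        return current, hops
```

After three moves, the first call to resolve() follows three pointers; the second call needs none, because the client's reference was updated when the current location was found.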

8
Home-based solution
 Let the home keep track of where the entity is.
 Entity’s home address is registered at a naming service.
 In practice, the home location is often chosen to be the place where an
entity was created.
 The home registers the foreign address of the entity.
 Clients always contact the home first, and then continue with
the foreign location.
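A minimal sketch of the two-step home-based lookup, assuming a naming service that is just a dictionary from entity identifier to a home object. The class name Home, the function locate and the addresses used in the usage note are hypothetical and chosen only for illustration.

```python
class Home:
    """Home location: keeps track of where the entity currently is."""
    def __init__(self, home_address):
        self.home_address = home_address
        self.foreign_address = None  # set while the entity is away from home

    def register(self, foreign_address):
        """Called when the entity moves and obtains a foreign address."""
        self.foreign_address = foreign_address

    def current_address(self):
        return self.foreign_address or self.home_address


# Assumed naming service: entity identifier -> Home
naming_service = {}


def locate(entity_id):
    home = naming_service[entity_id]  # step 1: always contact the home first
    return home.current_address()     # step 2: continue with the foreign location
```

For example, after naming_service["laptop"] = Home("10.0.0.5"), locate("laptop") returns the home address until the entity calls register() with a foreign address. Every lookup still goes through the home, which is the source of the scalability problem discussed next.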

9
Home-Based Approaches

Figure 5-3. The principle of Mobile IP.


10
Problems With Home-Based Approaches
 Home address has to be supported for entity’s lifetime.
 Home address is fixed.
 Unnecessary burden when the entity permanently moves.
 Poor geographical scalability.
 Entity may be next to client.

11
Distributed Hash Table (DHT)- Chord
 In a DHT-based system,
 Each node has an m-bit random identifier
 Each entity has an m-bit random key
 An entity with key k is located on a node with the smallest identifier
that satisfies id >= k, denoted as succ(k)
 The main issue in DHT-based systems is to efficiently resolve
a key k to the address of succ(k).
 Two approaches:
 linear approach and finger table
 In the linear approach, each node p keeps track of its successor
succ(p + 1) as well as its predecessor pred(p)
 Whenever a node p receives a request to resolve key k, it simply
forwards the request to one of its two neighbours, unless p itself is responsible for k
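The definition of succ(k) and the linear approach can be sketched as follows. This is a simplified simulation under stated assumptions: the whole ring is available as a Python list (a real node only knows its two neighbours), requests are forwarded in the successor direction only, and the node identifiers in the usage note are chosen for illustration.

```python
from bisect import bisect_left


def succ(k, nodes, m):
    """Smallest node identifier id >= k on a ring of size 2**m (wraps around)."""
    ring = sorted(nodes)
    i = bisect_left(ring, k % (2 ** m))
    return ring[i % len(ring)]


def linear_lookup(start, k, nodes, m):
    """Resolve key k by passing the request from successor to successor.

    Returns the list of nodes visited; the last one is succ(k).
    """
    size = 2 ** m
    ring = sorted(nodes)
    p, path = start, [start]
    while True:
        i = ring.index(p)
        pred, nxt = ring[i - 1], ring[(i + 1) % len(ring)]
        # p is responsible for k if k lies in the ring interval (pred(p), p]
        if pred == p or 0 < (k - pred) % size <= (p - pred) % size:
            return path
        p = nxt
        path.append(p)
```

With m = 5 and nodes [1, 4, 9, 11, 14, 18, 20, 21, 28], linear_lookup(1, 12, nodes, 5) visits 1, 4, 9, 11, 14. The number of hops grows linearly with the number of nodes, which is what the finger table is meant to avoid.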
12
Finger table
 Each node p maintains a finger table $FT_p[\,]$ with at most m
entries:
 $FT_p[i] = succ(p + 2^{i-1})$
 Note: $FT_p[i]$ points to the first node succeeding p by at least $2^{i-1}$
 To look up a key k, node p forwards the request to node q
with index j in the finger table satisfying
 $q = FT_p[j] \le k < FT_p[j+1]$
 If $p < k < FT_p[1]$, the request is forwarded to $FT_p[1]$ instead.
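The finger-table rule can be checked with a small simulation. This is a sketch, not a full Chord implementation: the ring is a global list, joins and failures are ignored, and a node does not consult its predecessor. The 5-bit ring and the node set in the usage note are assumptions chosen for illustration.

```python
from bisect import bisect_left


def succ(k, nodes, m):
    """Smallest node identifier id >= k on a ring of size 2**m (wraps around)."""
    ring = sorted(nodes)
    i = bisect_left(ring, k % (2 ** m))
    return ring[i % len(ring)]


def finger_table(p, nodes, m):
    """FT_p[i] = succ(p + 2**(i-1)) for i = 1..m (stored 0-based in the list)."""
    return [succ(p + 2 ** (i - 1), nodes, m) for i in range(1, m + 1)]


def lookup(start, k, nodes, m):
    """Resolve key k starting at node `start`; return the nodes visited."""
    size = 2 ** m
    p, path = start, [start]
    while True:
        ft = finger_table(p, nodes, m)
        dk = (k - p) % size                # clockwise distance from p to k
        if dk == 0 or ft[0] == p:          # p holds k, or p is the only node
            return path
        if dk <= (ft[0] - p) % size:       # p < k <= FT_p[1]: successor is succ(k)
            path.append(ft[0])
            return path
        # Otherwise pick the farthest finger that does not overshoot k,
        # i.e. FT_p[j] <= k < FT_p[j+1] in ring order.
        p = max((f for f in ft if 0 < (f - p) % size <= dk),
                key=lambda f: (f - p) % size)
        path.append(p)
```

With m = 5 and nodes [1, 4, 9, 11, 14, 18, 20, 21, 28], lookup(1, 26, nodes, 5) returns [1, 18, 20, 21, 28] and lookup(28, 12, nodes, 5) returns [28, 4, 9, 11, 14]. Each hop roughly halves the remaining distance, so the number of hops grows logarithmically with the number of nodes.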

13
Distributed Hash Tables

Figure 5-4. Resolving key 26 from node 1 and key 12 from node 28 in a Chord system.
14
Exploiting Network Proximity
 Problem:
 The logical organization of nodes in the overlay may lead to
erratic message transfers in the underlying Internet
 Node k and node succ(k + 1) may be very far apart.
 Solutions:
 Topology-aware node assignment: When assigning an ID to a
node, make sure that nodes close in the ID space are also close
in the network.
 Proximity routing: Maintain more than one possible successor,
and forward to the closest.
 Proximity neighbour selection: When there is a choice of
selecting who your neighbour will be, pick the closest one.

15
Hierarchical Location Services
 The basic idea is to build a large-scale search tree for which the underlying
network is divided into hierarchical domains.
 Each domain is represented by a separate directory node.
 Leaf domains typically correspond to a local-area network or a cell.
 The root (directory) node knows about all the entities.
 Each entity currently in a domain D is represented by a location record in the
directory node dir(D); the record holds either the entity’s current address or a pointer.

16
Hierarchical Approaches- organization
 The address of an entity E is stored in a leaf or intermediate node.
 Intermediate nodes contain a pointer to a child if the sub-tree rooted at the child
stores an address of the entity.
 An entity may have multiple addresses (e.g., if it is replicated).

An example of storing information of an entity having two addresses in different leaf domains.
17
Hierarchical Approaches - Lookup Operation
 Start lookup at local leaf node.
 If the node knows about entity E, follow the downward pointer; otherwise go up to the parent.
 The upward lookup stops at the root at the latest, since the root knows about all entities.

Looking up a location in a hierarchically organized location service.


18
Hierarchical Approaches – Insert Operation
 Consider an entity E that has created a replica in leaf domain D for which it
needs to insert its address.
 [A]. An insert request is forwarded to the first node that knows about entity E.
 [B]. A chain of forwarding pointers to the leaf node is created.
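The lookup and insert operations of a hierarchical location service can be sketched with a small tree of directory nodes. This is a simplified model under stated assumptions: addresses are stored only in leaf nodes, location records are plain dictionary entries, and the names DirNode, insert and lookup are hypothetical.

```python
class DirNode:
    """Directory node dir(D) of a domain D (hypothetical model)."""
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        # entity -> address (str, in a leaf) or set of child nodes (pointers)
        self.records = {}


def insert(leaf, entity, address):
    # [A] Forward the request upward to the first node that knows the entity
    #     (or to the root if no node knows it yet).
    path = [leaf]
    node = leaf
    while node.parent is not None:
        node = node.parent
        path.append(node)
        if entity in node.records:
            break
    # [B] Create a chain of pointers from that node down to the leaf.
    for i in range(len(path) - 1, 0, -1):
        path[i].records.setdefault(entity, set()).add(path[i - 1])
    leaf.records[entity] = address


def lookup(leaf, entity):
    # Go up until a node knows the entity; give up above the root.
    node = leaf
    while entity not in node.records:
        if node.parent is None:
            return None
        node = node.parent
    # Follow downward pointers until a leaf holding an address is reached.
    while not isinstance(node.records[entity], str):
        node = next(iter(node.records[entity]))
    return node.records[entity]
```

A lookup that starts close to a replica never reaches the root, which is how the scheme exploits locality; a lookup for an unknown entity climbs all the way to the root and returns None.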

19
Structured Naming
 Flat names are not convenient for humans to use
 As a result, naming systems often support structured
names that
 Are composed from simple, human-readable names
 E.g., file names, Internet domain names
 Structured names are often organized into what is called a
name space
 Holds a collection of valid names recognized by a particular
service.
 Examples: Phone numbers, DNS, URL etc
 It is represented by a labelled, directed graph with two types of
nodes, leaf node and directory node
20
Cont...
 A leaf node represents a named entity and has the property
that it has no outgoing edges.
 A leaf node generally stores information on the entity it is
representing–for example, its address–so that a client can
access it.
 A directory node has a number of outgoing edges, each
labeled with a name.
 Each node in a naming graph is considered as yet another entity in
a distributed system, and, in particular, has an associated identifier.
 A directory node stores a table in which an outgoing edge is
represented as a pair (node identifier, edge label). Such a
table is called a directory table.
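A naming graph with directory tables can be sketched as a dictionary keyed by node identifier. The sketch below is illustrative only: the node identifiers, the leaf contents and the helper resolve are assumptions, and each directory table is stored as a mapping from edge label to node identifier rather than as a list of pairs.

```python
# node identifier -> directory table {edge label: node identifier}
#                    or, for a leaf node, the stored information (a string here)
GRAPH = {
    "n0": {"home": "n1", "keys": "n5"},     # root directory node
    "n1": {"steen": "n4"},
    "n4": {"mbox": "n6", "keys": "n5"},
    "n5": "address of the keys file",       # leaf node
    "n6": "address of the mbox file",       # leaf node
}


def resolve(path, root="n0"):
    """Follow the edge labels of an absolute path name; return a node identifier."""
    node = root
    for label in [part for part in path.split("/") if part]:
        table = GRAPH[node]
        if not isinstance(table, dict):
            raise NotADirectoryError(node + " is a leaf node")
        node = table[label]  # KeyError if the label is not in the directory table
    return node
```

Here resolve("/home/steen/mbox") walks n0, n1, n4 and ends at n6, and GRAPH["n6"] is the information stored in that leaf. Both /keys and /home/steen/keys resolve to n5 because two directory tables contain an edge to the same node.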
21
Cont...
 A general naming graph with a single root node.

Note:- A directory node contains a (directory) table of (node identifier, edge label) pairs.
22
Name Resolution
 Name spaces offer a convenient mechanism for storing and retrieving information
about entities by means of names.
 More generally, given a path name, it should be possible to look up any information
stored in the node referred to by that name.
 The process of looking up a name is called name resolution.
 To resolve a name we need a directory node.
 Problem:
 How do we actually find that (initial) node?
 Solution:
 Closure mechanism: the mechanism that selects the implicit context (the initial node) from which to start name
resolution; in other words, knowing how and where to start.
 Examples:
- www.xx.edu.et: start at a DNS name server
- /home/steen/mbox: start at the local file server
 Start from a well-known root directory, or start from the home directory
23
Linking and Mounting
 Aliases are commonly used in a name space
 An alias is another name for the same entity.
 In naming graphs, there are basically two different ways to implement an alias:
a hard link and a symbolic link
 Hard Link
 Multiple absolute path names refer to the same node in a naming graph.
 A name is resolved by following a specific path in the naming graph from one node to
another.
 For example, in the naming graph shown earlier, node n5 can be referred to by two
different path names: /keys and /home/steen/keys.
 Symbolic Link
 The entity is represented by a leaf node, say N, but instead of storing the address or
state of that entity, the node stores an absolute path name.
 For example, as shown in the following figure, the path name
/home/steen/keys refers to a node containing the absolute path name
/keys, which makes it a symbolic link to node n5.
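The symbolic-link case can be sketched by letting a leaf node store a path name instead of an address. This is an illustrative model only: the Symlink class, the node identifier n6 and the limit on the number of links followed are assumptions, and only a symbolic link at the end of a path is handled.

```python
class Symlink:
    """Leaf content that stores an absolute path name instead of an address."""
    def __init__(self, target):
        self.target = target


GRAPH = {
    "n0": {"home": "n1", "keys": "n5"},  # root directory node
    "n1": {"steen": "n4"},
    "n4": {"keys": "n6"},
    "n5": "address of the keys file",    # the real entity
    "n6": Symlink("/keys"),              # symbolic link to n5
}


def resolve(path, root="n0", max_links=8):
    """Resolve an absolute path name, following a trailing symbolic link."""
    node = root
    for label in [part for part in path.split("/") if part]:
        node = GRAPH[node][label]
    content = GRAPH[node]
    if isinstance(content, Symlink):
        if max_links == 0:
            raise RecursionError("too many symbolic links")
        # Restart resolution with the stored path name.
        return resolve(content.target, root, max_links - 1)
    return node
```

Both resolve("/keys") and resolve("/home/steen/keys") return n5, but the second needs two resolution passes. With a hard link the second path would reach n5 directly.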
24
Linking and Mounting

The concept of a symbolic link explained in a naming graph.

25
Linking and Mounting
 A collection of name spaces can be distributed across
different machines.
 Each name space is implemented by a separate server
 To mount a foreign name space in a distributed system
requires at least the following information:
 The name of an access protocol.
 The name of the server.
 The name of the mounting point in the foreign name space.

26
Linking and Mounting
 The organization of a file system on the client machine is partly shown in the
following figure.
 The root directory has a number of user-defined entries, including a subdirectory
called /remote.
 This subdirectory is intended to include mount points for foreign name spaces
such as the user’s home directory at VU University.
 To this end, a directory node named /remote/vu is used to store the URL
nfs://its.cs.vu.nl/home/steen.
 Now consider the name /remote/vu/mbox.
 This name is resolved by starting in the root directory on the client’s machine and
continues until the node /remote/vu is reached.
 The process of name resolution then continues by returning the URL
nfs://its.cs.vu.nl/home/steen, in turn leading the client machine to contact the file server
its.cs.vu.nl by means of the NFS protocol, and to subsequently access directory
/home/steen.
 Name resolution can then be continued by reading the file named mbox in that directory,
after which the resolution process stops.
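The resolution of /remote/vu/mbox can be sketched as follows. This is only an illustration of how the three pieces of mount information (access protocol, server, mount point in the foreign name space) are recovered from the stored URL. The nested-dictionary name space and the function resolve are assumptions, and no NFS request is actually made.

```python
from urllib.parse import urlsplit

# Assumed local name space: nested dicts are directory nodes; a string stored
# in a directory entry marks a mount point and holds the URL of the foreign
# name space.
LOCAL_ROOT = {"remote": {"vu": "nfs://its.cs.vu.nl/home/steen"}}


def resolve(path):
    """Return (protocol, server, path on that server) for an absolute path name."""
    node = LOCAL_ROOT
    labels = [part for part in path.split("/") if part]
    for i, label in enumerate(labels):
        node = node[label]
        if isinstance(node, str):              # mount point reached
            url = urlsplit(node)
            rest = "/".join(labels[i + 1:])    # part still to be resolved remotely
            remote_path = url.path.rstrip("/") + ("/" + rest if rest else "")
            return url.scheme, url.hostname, remote_path
    return "local", None, path
```

For /remote/vu/mbox the local part of the resolution ends at /remote/vu, and the result tells the client to contact its.cs.vu.nl with the NFS protocol and continue with /home/steen/mbox there.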
27
Linking and Mounting
 Mounting remote name spaces through a specific access protocol.

28
Implementation of a Name Space
 A name space is often implemented by name servers
 In a LAN, a single name server is often good enough
 In a large-scale distributed system, the implementation of a name space is
often distributed over multiple name servers
 A name space for large-scale distributed systems is often
organized hierarchically
 Global layer
 often stable, represents organizations or groups of organizations
 Administrational layer
 Represents groups of entities in a single organization or admin. unit
 Managerial layer
 Nodes often change frequently, e.g., hosts in a local network
 May be managed by system administrators or end users
29
Name Space Distribution

An example partitioning of the DNS name space, including Internet-accessible files, into three layers.

30
Name Space Distribution

A comparison between name servers for implementing nodes from a large-scale name space
partitioned into a global layer, an administrational layer, and a managerial layer.

31
Implementation of Name Resolution
 Assume that the (absolute) path name
 root: <nl, vu, cs, ftp> is to be resolved.
 Using URL notation, this path would correspond to ftp://ftp.cs.vu.nl
 Two ways to implement name resolution
 Iterative name resolution
 The client’s name resolver hands over the complete name to the root name server.
 The root resolves the name as far as it can and returns the intermediate result (i.e., the address of the next name server in the
hierarchy) to the client’s name resolver.
 The resolver then contacts that server, and the process continues iteratively until the complete path is
resolved.
 Recursive name resolution
 Instead of returning each intermediate result to the client, a name server passes it on to the next name server in the hierarchy.
 This process continues recursively until the complete path is resolved, and the final result is returned to the client.
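The difference between the two schemes can be sketched with a toy set of name servers. Everything here is hypothetical: the server names, the final address string and the table layout are assumptions, and each server is handed a single label, whereas a real server receives the remaining part of the name.

```python
# Hypothetical name servers: server -> {label: next server or final address}
SERVERS = {
    "root":  {"nl": "ns-nl"},
    "ns-nl": {"vu": "ns-vu"},
    "ns-vu": {"cs": "ns-cs"},
    "ns-cs": {"ftp": "address-of-ftp-host"},
}


def iterative(path):
    """The client's resolver contacts every server itself.

    Returns (result, number of request/reply exchanges made by the client).
    """
    server, exchanges = "root", 0
    for label in path:
        server = SERVERS[server][label]  # the reply goes back to the client
        exchanges += 1
    return server, exchanges


def recursive(path, server="root"):
    """Each server passes the rest of the name on to the next server.

    The client makes a single request to the root and gets the final result.
    """
    result = SERVERS[server][path[0]]
    if len(path) == 1:
        return result
    return recursive(path[1:], result)   # server-to-server call
```

For the path ["nl", "vu", "cs", "ftp"] both functions return the same address. The iterative client takes part in four exchanges, while the recursive client takes part in one and leaves the remaining work to the name servers.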
32
Implementation of Name Resolution
 The principle of iterative name resolution.

33
Implementation of Name Resolution
The principle of recursive name resolution.

34
Iterative Vs Recursive Resolution

 Recursive name resolution of <nl, vu, cs, ftp>.


 Name servers cache intermediate results for subsequent lookups.
35
Iterative Vs Recursive Resolution

The comparison between recursive and iterative name resolution with respect to
communication costs.

36
Domain name system
 DNS is one of the largest distributed naming services in use today.
 Used for looking up host addresses and mail servers.
 The DNS name space is hierarchically organized as a rooted tree.
 A sub-tree is called a domain;
 a path name to its root node is called a domain name.
 In practice, the DNS name space can be divided into a global layer and
an administrational layer.
 The managerial layer, which is generally formed by local file systems, is
formally not part of DNS.
 A node in the DNS name space often represents several entities at the same
time, e.g., a domain and a zone.

 The content of a node is formed by a collection of resource records.

37
The DNS Name Space
 The most important types of resource records forming the contents of nodes
in the DNS name space.

38
Attribute Based Naming
 Attribute-based naming names and looks up entities by
means of their attributes
 Also known as a directory service (yellow pages)
 Attribute-based naming
 Each entity is associated with a collection of attributes
 The naming system provides one or multiple entities that match
a user’s description
 Problem
 Lookup operations can be extremely expensive,
 because the requested attribute values must be matched against the actual attribute
values
 In principle, this is done by inspecting all entities
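The cost of attribute-based lookup is easy to see in a naive sketch that inspects every entity. The directory contents and the function search are hypothetical and serve only to illustrate the exhaustive scan.

```python
# Hypothetical directory: each entity is a collection of (attribute, value) pairs.
DIRECTORY = [
    {"name": "printer-1", "type": "printer", "color": True,  "floor": 2},
    {"name": "printer-2", "type": "printer", "color": False, "floor": 2},
    {"name": "disk-7",    "type": "disk",    "floor": 1},
]


def search(directory, **wanted):
    """Return all entities whose attributes match every requested value.

    Every entity is inspected, so the cost grows linearly with the
    number of entities in the directory.
    """
    return [entity for entity in directory
            if all(entity.get(attr) == value for attr, value in wanted.items())]
```

For example, search(DIRECTORY, type="printer", floor=2) returns both printers, and adding color=True narrows the result to printer-1. A query may match one entity, several, or none.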

39
End of Lecture 5

40
