Distributed Computing: Unit-1
A distributed computer system consists of multiple software components that are on multiple computers, but
run as a single system. The computers that are in a distributed system can be physically close together and
connected by a local network, or they can be geographically distant and connected by a wide area network. A
distributed system can consist of any number of possible configurations, such as mainframes, personal
computers, workstations, minicomputers, and so on. The goal of distributed computing is to make such a
network work as a single computer.
Distributed computing systems can run on hardware that is provided by many vendors, and can use a variety
of standards-based software components. Such systems are independent of the underlying software. They can
run on various operating systems, and can use various communications protocols. Some hardware might use
UNIX or Linux as the operating system, while other hardware might use Windows operating systems. For
intermachine communications, this hardware can use SNA or TCP/IP on Ethernet or Token Ring.
We can organize software to run on distributed systems by separating functions into two parts: clients and
servers.
1. Architectural Models
An architectural model of a distributed system:
1. it simplifies and abstracts the functions of the individual components.
2. it considers:
‣ the placement of the components across a network of computers (seeking to define useful patterns for the
distribution of data and workload).
‣ the interrelationships between the components (i.e., their functional roles and the patterns of communication
between them).
Client processes interact with individual server processes in separate host computers in order to access the
shared resources that they manage.
• Mobile code: a variation in which program code is transferred to another computer and run there (Java applets are an example).
2. Fundamental models
Fundamental models are concerned with a more formal description of the properties that are common in all of
the architectural models.
A model contains only the essential ingredients that we need to consider in order to understand and reason
about some aspects of a system’s behaviour.
Failure model: the correct operation of a distributed system is threatened whenever a fault occurs in any of the
computers on which it runs or in the network that connects them.
• In a distributed system both processes and communication channels may fail (that is, they may depart from
what is considered to be correct or desirable behavior).
• The failure model defines the ways in which failures may occur in order to provide an understanding of the
effects of failures.
‣ Omission failures: a process or communication channel fails to perform actions that it is supposed to do.
‣ Arbitrary failures: the term arbitrary describes the worst possible failure semantics, in which any type of
error may occur.
• Arbitrary failure of a process: the process arbitrarily omits intended processing steps or takes unintended
processing steps.
• Communication channel arbitrary failures: message contents may be corrupted or non-existent messages
may be delivered or real messages may be delivered more than once.
‣ Timing failures: applicable in synchronous distributed systems, where time limits are set on process
execution time, message delivery time and clock drift rate.
Security model: the openness of distributed systems exposes them to attack by both external and internal
agents.
• The security of a distributed system can be achieved by securing the processes and the channels used
for their interactions and by protecting the objects that they encapsulate against unauthorized access.
Designing a distributed system is not easy or straightforward. A number of challenges need to
be overcome in order to get the ideal system. The major challenges in distributed systems are listed below:
1. Heterogeneity:
The Internet enables users to access services and run applications over a heterogeneous collection of
computers and networks. Heterogeneity (that is, variety and difference) applies to all of the following:
Middleware : The term middleware applies to a software layer that provides a programming abstraction as well
as masking the heterogeneity of the underlying networks, hardware, operating systems and programming
languages. Most middleware is implemented over the Internet protocols, which themselves mask the
differences of the underlying networks, but all middleware deals with the differences in operating systems
and hardware.
Heterogeneity and mobile code : The term mobile code is used to refer to program code that can be
transferred from one computer to another and run at the destination – Java applets are an example. Code
suitable for running on one computer is not necessarily suitable for running on another because executable
programs are normally specific both to the instruction set and to the host operating system.
2. Transparency:
Transparency is defined as the concealment from the user and the application programmer of the separation of
components in a distributed system, so that the system is perceived as a whole rather than as a collection of
independent components. In other words, distributed systems designers must hide the complexity of the
systems as much as they can. Some terms of transparency in distributed systems are:
Access - Hide differences in data representation and how a resource is accessed.
Concurrency - Hide that a resource may be shared by several competitive users.
Failure - Hide the failure and recovery of a resource.
Location - Hide where a resource is located.
Migration - Hide that a resource may move to another location.
Persistence - Hide whether a (software) resource is in memory or on disk.
Relocation - Hide that a resource may be moved to another location while in use.
Replication - Hide that a resource may be copied in several places.
3. Openness
The openness of a computer system is the characteristic that determines whether the system can be extended
and reimplemented in various ways. The openness of distributed systems is determined primarily by the
degree to which new resource-sharing services can be added and be made available for use by a variety of
client programs. If the well-defined interfaces for a system are published, it is easier for developers to add new
features or replace sub-systems in the future. Example: Twitter and Facebook publish APIs that let developers
build their own software on top of these services.
4. Concurrency
Both services and applications provide resources that can be shared by clients in a distributed system. There
is therefore a possibility that several clients will attempt to access a shared resource at the same time. For an
object to be safe in a concurrent environment, its operations must be synchronized in such a way that its data
remains consistent. This can be achieved by standard techniques such as semaphores, which are used in
most operating systems.
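For example, the following minimal Java sketch (class and method names are hypothetical) shows the same idea at the language level: synchronizing an object's operations so that its data remains consistent when several clients use it at the same time.

// A shared resource made safe for concurrent use: synchronized ensures
// that only one thread executes an operation on the object at a time.
public class SharedCounter {
    private int value = 0; // shared data that must remain consistent

    public synchronized void increment() {
        value++;
    }

    public synchronized int get() {
        return value;
    }
}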
5. Security
Many of the information resources that are made available and maintained in distributed systems have a high
intrinsic value to their users. Their security is therefore of considerable importance. Security for information
resources has three components:
Confidentiality - protection against disclosure to unauthorized individuals.
Integrity - protection against alteration or corruption.
Availability for the authorized - protection against interference with the means to access the resources.
6. Scalability
Distributed systems must be scalable as the number of users increases. A system is said to be scalable if it can
handle the addition of users and resources without suffering a noticeable loss of performance or increase in
administrative complexity. Scalability has 3 dimensions: size (more users and resources can be added), geography
(users and resources may lie far apart), and administration (the system may span many independent administrative
organizations).
7. Failure Handling
Computer systems sometimes fail. When faults occur in hardware or software, programs may produce
incorrect results or may stop before they have completed the intended computation. The handling of failures is
particularly difficult.
One of the following methods can be used to enable any two computers to exchange binary data values:
• The values are converted to an agreed external format before transmission and converted to the local form
on receipt; if the two computers are known to be the same type, the conversion to external format can be
omitted.
• The values are transmitted in the sender’s format, together with an indication of the format used, and the
recipient converts the values if necessary.
Note, however, that bytes themselves are never altered during transmission. To support RMI or RPC, any data
type that can be passed as an argument or returned as a result must be able to be flattened and the primitive
data values represented in an agreed format. An agreed standard for the representation of data structures
and primitive values is called an external data representation.
Marshalling is the process of assembling a collection of data items into a form suitable for transmission in a
message; unmarshalling is its reverse: the generation of primitive values from their external data representation
and the rebuilding of the data structures.
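As an illustration only, the sketch below uses Java object serialization to stand in for an agreed external representation such as XDR or CORBA CDR (the class and field names are made up): it flattens a data structure to bytes and rebuilds it.

import java.io.*;

// Marshalling: assemble a data structure into a byte form suitable for a
// message. Unmarshalling: regenerate the values and rebuild the structure.
class Person implements Serializable {
    String name;
    int year;
    Person(String name, int year) { this.name = name; this.year = year; }
}

public class MarshalDemo {
    public static void main(String[] args) throws Exception {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        ObjectOutputStream out = new ObjectOutputStream(bytes);
        out.writeObject(new Person("Smith", 1984)); // marshal
        out.flush();

        ObjectInputStream in =
            new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()));
        Person p = (Person) in.readObject();        // unmarshal
        System.out.println(p.name + " " + p.year);
    }
}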
Remote Procedure Call (RPC) is a protocol that enables one program to request a service from a program
located in another computer on a network without having to understand the network's details. A procedure call
is also sometimes known as a function call or a subroutine call.
RPC uses the client-server model. The requesting program is a client and the service providing program is the
server. Like a regular or local procedure call, an RPC is a synchronous operation requiring the requesting
program to be suspended until the results of the remote procedure are returned. However, the use of
lightweight processes or threads that share the same address space allows multiple RPCs to be performed
concurrently.
Remote Method Invocation (RMI) is an API which allows an object to invoke a method on an object that exists
in another address space, which could be on the same machine or on a remote machine. Through RMI, an
object running in a JVM on one computer (the client side) can invoke methods on an object present in another
JVM (the server side). RMI creates a public remote server object that enables client- and server-side
communication through simple method calls on the server object.
Working of RMI
The communication between client and server is handled by using two intermediate objects: Stub object (on
client side) and Skeleton object (on server side).
Stub Object
The stub object on the client machine builds an information block and sends this information to the server. The
block consists of an identifier of the remote object to be used, the name of the method to be invoked, and the
parameters to be passed to the remote JVM.
Skeleton Object
The skeleton object passes the request from the stub object to the remote object. It performs the following tasks :
● It calls the desired method on the real object present on the server.
● It forwards the parameters received from the stub object to the method.
Differences between RMI and RPC :
1. Approach:
RMI uses an object-oriented paradigm where the user needs to know the object and the method of the object
he needs to invoke.
RPC doesn't deal with objects. Rather, it calls specific subroutines that are already established.
2. Working:
With RPC, you get a procedure call that looks pretty much like a local call. RPC handles the complexities
involved with passing the call from local to the remote computer.
RMI does the very same thing, but RMI passes an object and the method that is being called.
RMI = RPC + Object-orientation
3. Better one:
RMI is a better approach than RPC, especially for larger programs, as it yields cleaner code in which it is
easier to identify what went wrong.
1. Exactly-once semantics – The call executes exactly once, no matter what. These are the ideal semantics,
but they are difficult to guarantee: the system must detect duplicate requests and must also ensure, despite
server crashes, that the call is eventually executed. Like at-most-once semantics, they work for
non-idempotent operations.
2. Maybe semantics – the caller cannot determine whether or not the remote method has been executed.
3. At-least-once semantics – The call executes at least once as long as the server machine does not fail.
These semantics require very little overhead and are easy to implement. The client machine continues to send
call requests to the server machine until it gets an acknowledgement. If one or more acknowledgements are
lost, the server may execute the call multiple times. This approach works only if the requested operation is
idempotent, that is, multiple invocations of it return the same result. Servers that implement only idempotent
operations must be stateless, that is, must not change global state in response to client requests. Thus, RPC
systems that support these semantics rely on the design of stateless servers.
4. At-most-once semantics - RMI implements "at most once" semantics (some other distributed systems
provide "at least once" semantics). If a method returns normally, it is guaranteed to have executed once.
However, if an exception such as a java.rmi.MarshalException occurs, the caller doesn't know whether the
exception occurred when transmitting the call or when returning the result, i.e. the caller cannot determine
whether the remote method executed. In this case, the client may wish to attempt the invocation again. To facilitate this,
the implementation of a remote method should be idempotent where possible, which means that the operation
can be executed multiple times with the same effect as executing once. Even a method such as an account
withdrawal can be made idempotent by passing an operation ID with the invocation: the remote object ignores
invocations for operation IDs that have already been performed.
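A minimal sketch of this operation-ID technique (the interface and class names are hypothetical; a real remote object would normally also extend UnicastRemoteObject):

import java.rmi.Remote;
import java.rmi.RemoteException;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

interface Account extends Remote {
    void withdraw(String operationId, long amount) throws RemoteException;
}

class AccountImpl implements Account {
    private long balance = 1000;
    // operation IDs that have already been performed
    private final Set<String> done = ConcurrentHashMap.newKeySet();

    public synchronized void withdraw(String operationId, long amount) {
        if (!done.add(operationId))
            return; // duplicate invocation of a completed operation: ignore
        balance -= amount;
    }
}

With this, a client that sees a MarshalException can safely retry withdraw with the same operation ID: executing it twice has the same effect as executing it once.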
The first thing to do is to create an interface which will provide the description of the methods that can be
invoked by remote clients. This interface should extend the Remote interface, and the method prototypes within
the interface should throw RemoteException, as in the sketch below.
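A minimal sketch of such an interface, consistent with the SearchQuery implementation class compiled in Step 3 below (the interface and method names are assumptions):

import java.rmi.Remote;
import java.rmi.RemoteException;

// Step 1: the remote interface extends Remote; each remote method
// declares RemoteException.
public interface Search extends Remote {
    String query(String search) throws RemoteException;
}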
The next step is to implement the remote interface. To implement it, the class should extend the
UnicastRemoteObject class of the java.rmi.server package. Also, a default constructor needs to be created that
throws java.rmi.RemoteException from its parent constructor, as sketched below.
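A sketch of the implementation class, reconstructed around the "return result;" fragment that survives in these notes (the method body is an assumption):

import java.rmi.RemoteException;
import java.rmi.server.UnicastRemoteObject;

public class SearchQuery extends UnicastRemoteObject implements Search {
    // default constructor throws RemoteException from its parent constructor
    SearchQuery() throws RemoteException {
        super();
    }

    // the remote method that clients invoke through the stub
    public String query(String search) throws RemoteException {
        String result;
        if (search.equals("Reflection in Java"))
            result = "Found";
        else
            result = "Not Found";
        return result;
    }
}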
Step 3 : Creating Stub and Skeleton objects from the implementation class using rmic. The rmic tool is used to
invoke the rmi compiler that creates the Stub and Skeleton objects. Its prototype is rmic classname. For the
above program the following command needs to be executed at the command prompt :
rmic SearchQuery
Step 4 : Start the registry service by issuing the following command at the command prompt: start rmiregistry
Step 5 : Create the server application program and execute it on a separate command prompt.
● The server program uses createRegistry method of LocateRegistry class to create rmiregistry within the
server JVM with the port number passed as argument.
● The rebind method of Naming class is used to bind the remote object to the new name.
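A sketch of the server program (the port number and the bound name are placeholders; they must match what the client looks up):

import java.rmi.Naming;
import java.rmi.registry.LocateRegistry;

public class SearchServer {
    public static void main(String[] args) {
        try {
            Search obj = new SearchQuery();

            // create the rmiregistry within the server JVM on port 1900
            LocateRegistry.createRegistry(1900);

            // bind the remote object to a name clients can look up
            Naming.rebind("rmi://localhost:1900/search", obj);
        } catch (Exception e) {
            System.out.println("Exception: " + e);
        }
    }
}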
Step 6: Create and execute the client application program
The last step is to create the client application program and execute it on a separate command prompt. The
lookup method of Naming class is used to get the reference of the Stub object.
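A matching client sketch (the lookup URL must be the one the server bound; the names here are placeholders):

import java.rmi.Naming;

public class ClientRequest {
    public static void main(String[] args) {
        try {
            // lookup returns a reference to the Stub object
            Search access =
                (Search) Naming.lookup("rmi://localhost:1900/search");
            String answer = access.query("Reflection in Java");
            System.out.println("Article on Reflection in Java: " + answer);
        } catch (Exception e) {
            System.out.println("Exception: " + e);
        }
    }
}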
The idea of distributed objects is an extension of the concept of remote procedure calls. In a remote procedure
call system, code is executed remotely via a remote procedure call. The unit of distribution is the procedure /
function /method (used as synonyms). So the client has to import a client stub (either manually as in RPC or
automatically as in RMI) to allow it to connect to the server offering the remote procedure. In a system for
distributed objects, the unit of distribution is the object. That is, a client imports a "something" (in Java's JINI
system, it is called a proxy) which allows the client access to the remote object as if it were part of the original
client program (as with RPC and RMI, more or less transparently).
Distributed event-based systems extend the local event model by allowing multiple objects at different
locations to be notified of events taking place at an object.
They have two characteristics: they are heterogeneous and asynchronous. Such systems can be useful, for
example, in a stock dealing room.
Publish-subscribe paradigm:
Publisher sends notifications.
Subscriber registers interest to receive notifications.
Object of interest: where events happen, change of state as a result of its operations being invoked.
Events: occurs in the object of interest.
Notification: an object containing information about an event.
Observer objects: decouple an object of interest from its subscribers.
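A minimal, purely local Java sketch of these roles (all names are hypothetical; in a distributed event-based system the subscribers would be remote and notifications would be delivered asynchronously):

import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Notification: an object containing information about an event.
class Notification {
    final String event;
    Notification(String event) { this.event = event; }
}

interface Subscriber {
    void onNotification(Notification n); // delivered to registered subscribers
}

// Observer object: decouples the object of interest from its subscribers.
class Observer {
    private final List<Subscriber> subscribers = new CopyOnWriteArrayList<>();

    void subscribe(Subscriber s) { subscribers.add(s); } // register interest

    // called when an event occurs in the object of interest
    void publish(Notification n) {
        for (Subscriber s : subscribers) s.onNotification(n);
    }
}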
A Server having a service to offer exports an interface for it. Exporting an interface registers it with the
system so that clients can use it.
A Client must import an (exported) interface before communication can begin
A distributed file system enables programs to store and access remote files exactly as they do local ones,
allowing users to access files from any computer on a network. The performance and reliability experienced for
access to files stored at a server should be comparable to that for files stored on local disks.
The design of large-scale wide area read-write file storage systems poses problems of load balancing,
reliability, availability and security.
a.) File Server Architecture :- File server architecture provides access to files by structuring the file service
as three components:
1. Flat file service
2. Directory service
3. Client module
The relevant modules and their relationships are shown in the figure.
Operations :-
1. Read 2. Write 3. Create 4. Delete 5. GetAttributes 6. SetAttributes
Read(FileId, i, n) : Reads a sequence of items up to n from a file starting at item i.
Write(FileId, i, Data) : Write a sequence of Data to a file, starting at item i.
Create() : Creates a new file of length 0 and delivers a UFID for it.
Delete(FileId) :Removes the file from the file store.
GetAttributes(FileId) : Returns the attributes of the file.
SetAttributes(FileId, Attr) :Sets the file attributes.
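Expressed as a Java interface purely for illustration (FileId, Data, and Attr are stand-ins for the model's UFIDs, item sequences, and attribute records; this is not a real API):

class FileId {}                 // stands in for a UFID
class Data {}                   // a sequence of data items
class Attr {}                   // a file attribute record

interface FlatFileService {
    Data read(FileId file, int i, int n);      // up to n items, starting at item i
    void write(FileId file, int i, Data data); // write data, starting at item i
    FileId create();                           // new file of length 0; returns its UFID
    void delete(FileId file);
    Attr getAttributes(FileId file);
    void setAttributes(FileId file, Attr attr);
}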
Directory service :-
• Provides mapping between text names for the files and their UFIDs.
• Clients may obtain the UFID of a file by quoting its text name to directory service.
• Directory service supports functions to add new files to directories.
Operations :-
1. Lookup 2. AddName 3. UnName 4. GetNames
Lookup(Dir, Name) : Locates the text name in the directory and returns the relevant UFID. If Name is not in the
directory, throws an exception.
AddName(Dir, Name, File) : If Name is not in the directory, adds(Name,File) to the directory and updates the
file’s attribute record. If Name is already in the directory, throws an exception.
UnName(Dir, Name) :If Name is in the directory, the entry containing Name is removed from the directory. If
Name is not in the directory, throws an exception.
GetNames(Dir, Pattern):Returns all the text names in the directory that match the regular expression Pattern.
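And the directory service in the same illustrative style (the exception types are stand-ins for the "throws an exception" cases above):

import java.util.List;

class NameNotFound extends Exception {}
class NameAlreadyExists extends Exception {}

interface DirectoryService {
    FileId lookup(FileId dir, String name) throws NameNotFound;        // text name -> UFID
    void addName(FileId dir, String name, FileId file) throws NameAlreadyExists;
    void unName(FileId dir, String name) throws NameNotFound;
    List<String> getNames(FileId dir, String pattern);                 // names matching Pattern
}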
b.) Sun Network File System :- provides transparent, remote access to filesystems. Unlike many other
remote file system implementations under UNIX, NFS is designed to be easily portable to other operating
systems and machine architectures. It uses an External Data Representation (XDR) specification to describe
protocols in a machine and system independent way. NFS is implemented on top of a Remote Procedure Call
package (RPC) to help simplify protocol definition, implementation, and maintenance.
Design Goals -
(a) NFS was designed to simplify the sharing of filesystem resources in a network of non-homogeneous
machines.
(b) The overall design goals of NFS were: Machine and Operating System Independence. The protocols used
should be independent of UNIX so that an NFS server can supply files to many different types of clients. The
protocols should also be simple enough that they can be implemented on low-end machines like the PC.
(c) Crash Recovery - When clients can mount remote filesystems from many different servers it is very
important that clients and servers be able to recover easily from machine crashes and network problems.
(d) Transparent Access - to provide a system which allows programs to access remote files in exactly the
same way as local files, without special pathname parsing, libraries, or recompiling. Programs should not need
or be able to tell whether a file is remote or local.
e.) Mounting:-
The process of including a new filesystem.
/etc/exports has filesystems that can be mounted by others.
Clients use a modified mount command for remote filesystems.
Communicates with the mount process on the server in a mount protocol.
Hard-mounted
• user process is suspended until request is successful
• when server is not responding, request is retried until it's satisfied.
Soft-mounted
• if server fails, client returns failure after a small no. of retries.
• user process handles the failure
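For example, on a typical Linux client the two behaviours are selected with mount options (the server name and paths below are placeholders):

mount -t nfs -o hard server1:/export/home /mnt/home
mount -t nfs -o soft,retrans=3 server1:/export/home /mnt/home

With hard, a request is retried until the server responds; with soft, the client gives up after the configured number of retransmissions and returns an error to the user process.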
AutoFS
You can mount NFS file system resources by using a client-side service called automounting (or AutoFS), which enables a system to
automatically mount and unmount NFS resources whenever you access them. The resource remains mounted as long as you remain in
the directory and are using a file. If the resource is not accessed for a certain period of time, it is automatically unmounted.
AutoFS provides the following features:
● NFS resources don't need to be mounted when the system boots, which saves booting time.
● Users don't need to know the root password to mount and unmount NFS resources.
● Network traffic might be reduced, since NFS resources are only mounted when they are in use.
f.) Discovery service :- a directory service that registers the services provided in a spontaneous networking
environment. It provides an interface for automatically registering and de-registering services, as well as an interface
for clients to look up the services they require. Ex : a printer (or the service that manages it) may register its
attributes with the discovery service as follows :‘resourceClass = printer, type=laser, color=yes, resolution=600dpi,
location=room101,url=http://www.hotelNW.com/services/printer98’.
Jini Discovery Service : designed to be used for spontaneous networking. It is entirely Java-based: computers
communicate by means of RMI and can download code if necessary. The discovery-related components in a Jini
system are lookup services. A Jini service (such as a printing service) may be registered with many lookup services.
g.) Domain Name System (port, tree structure, name-to-IP resolution) :-
Port : 53
Domain Name System (DNS) tree structure :- the name space is organized as a tree, with the unnamed root at the
top, the top-level domains (com, org, edu, country codes, and so on) beneath it, and organizations' domains and
their subdomains below those.
Working of DNS :- a client's resolver sends the hostname to its local DNS server; if that server cannot answer from
its cache or its own zone data, the query is referred via the root and top-level-domain servers to the authoritative
server for the domain, and the resulting IP address is returned to the client.
h.) Name Services :- A name service stores a collection of one or more sets of bindings between textual names
and attributes for objects. It provides facilities for resource location, email addressing and authentication. As the
naming database grows from small to large scale, the structure of the name space may change, and the service
should accommodate this; in particular, the name space may change over time to reflect changes in organizational
structures. It is difficult to maintain complete consistency between all copies of a database entry.
Namespace = collection of all valid names recognised by a service with – a syntax for specifying names, and – rules
for resolving names (e.g., left to right)
Name Resolution
The way these hostnames are resolved to their mapped IP addresses is called Domain Name
Resolution. On almost all operating systems, whether they be Apple, Linux, Unix, Netware, or
Windows, the majority of resolutions from domain names to IP addresses are done through a
procedure called DNS.
Two inherent limitations of distributed systems are: lack of global clock and lack of shared memory. This has
two important implications. First, due to the absence of any system-wide clock that is equally accessible to all
processes, the notion of common time does not exist in a distributed system; different processes may have
different notions of time. As a result, it is not always possible to determine the order in which two events on
different processes were executed. Second, since processes in a distributed system do not share common
memory, it is not possible for an individual process to obtain an up-to-date state of the entire system.
Example (the bully election algorithm):
The group consists of eight processes, numbered from 0 to 7.
● Previously process 7 was the coordinator, but it has just crashed.
● Process 4 is the first one to notice this, so it sends ELECTION messages to all the processes higher than it,
namely 5, 6, and 7, as shown in Figure (a).
● Processes 5 and 6 both respond with OK, as shown in Figure (b).
● Upon getting the first of these responses, 4 knows that its job is over. It knows that one of these bigwigs will
take over and become the coordinator. It just sits back and waits to see who the winner will be.
● In Figure (c), both 5 and 6 hold elections, each one only sending messages to those processes higher than
itself.
● In Figure (d), process 6 tells 5 that it will take over.
● At this point, 6 knows that 7 is dead and that it (6) is the winner.
● If there is state information to collect from disk or elsewhere to pick up where the old coordinator left off, 6
must now do what is needed.
● When it is ready to take over, 6 announces this by sending a COORDINATOR message to all running
processes.
● When 4 gets this message, it can now continue with the operation it was trying to do when it discovered that
7 was dead, but using 6 as the coordinator this time.
● In this way, the failure of 7 is handled and the work can continue.
● If process 7 is ever restarted, it will just send all the others a COORDINATOR message and bully them into
submission.
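A compact sketch of the election step that process 4 performs above (the messaging calls are assumed to be supplied by the runtime; this is not a complete implementation):

// Bully election: send ELECTION to all higher-numbered processes; if any
// of them answers OK, one of them takes over; if none answers, announce
// yourself as COORDINATOR to every other process.
class BullyProcess {
    final int id;        // this process's number (e.g., 4)
    final int[] group;   // all process numbers in the group (0..7)

    BullyProcess(int id, int[] group) { this.id = id; this.group = group; }

    void startElection() {
        boolean higherAlive = false;
        for (int p : group)
            if (p > id && sendElectionAndAwaitOk(p))
                higherAlive = true;              // a bigwig will take over
        if (!higherAlive)
            for (int p : group)
                if (p != id) sendCoordinator(p); // nobody higher: I win
        // otherwise: sit back and wait for the winner's COORDINATOR message
    }

    // returns true if process p replies OK before a timeout (stubbed here)
    boolean sendElectionAndAwaitOk(int p) { return false; }
    void sendCoordinator(int p) { }
}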
● We can solve the clock-synchronization problem with a time server using a relative time correction C, which
can be calculated as:
C = ((T2 − T1) + (T3 − T4)) / 2
● The way this works is that the client sends a packet with T1 recorded to the time server. The time
server will record the receipt time of the packet T2. When the response is sent, the time server will write
its current time T3 to the response. When the client receives the response packet, it will record T4 from
its local clock.
● When the value of C is worked out, the client can correct its local clock.
● The client must be careful. If the value of C is positive, then C can be added to the software clock.
● If the value of C is negative, then the client must artificially decrease the amount of milliseconds added
to its software clock until the offset is cleared.
● It is always inadvisable to cause the clock to go backwards. Most software that relies on time will not
react well to this.
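A small worked example of the computation (timestamps in milliseconds; the values are made up for illustration):

public class ClockCorrection {
    public static void main(String[] args) {
        long t1 = 1000; // client sends request (client clock)
        long t2 = 1510; // server receives request (server clock)
        long t3 = 1511; // server sends reply (server clock)
        long t4 = 1022; // client receives reply (client clock)

        long c = ((t2 - t1) + (t3 - t4)) / 2; // (510 + 489) / 2 = 499 ms
        System.out.println("correction C = " + c + " ms");
        // C > 0: the client may add C to its software clock.
        // C < 0: slow the software clock down gradually; never step it back.
    }
}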
Physical Clocks
● Computer Timer : an integrated circuit that contains a precise machined quartz crystal. When kept
under tension the quartz crystal oscillates at a well-defined frequency.
● Clock Tick : after a predefined number of oscillations, the timer will generate a clock tick. This clock tick
generates a hardware interrupt that causes the computer’s operating system to enter a special routine
in which it can update the software clock and run the process scheduler.
If p is the maximum drift rate of the timer, the software clock C is within specification when: 1 − p ≤ dC/dT ≤ 1 + p
The ring election algorithm : In Fig. 3-13 we see what happens if two processes, 2 and 5, discover simultaneously that the previous coordinator,
process 7, has crashed. Each of these builds an ELECTION message and starts circulating it. Eventually, both messages will
go all the way around, and both 2 and 5 will convert them into COORDINATOR messages, with exactly the same members
and in the same order. When both have gone around again, both will be removed. It does no harm to have extra messages
circulating; at most it wastes a little bandwidth.
• In Lamport's distributed mutual exclusion algorithm, on receiving a RELEASE message from process i, i's
request is removed from the local request queue.
Peterson Algorithm : is a concurrent programming algorithm for mutual exclusion that allows two or more
processes to share a single-use resource without conflict, using only shared memory for communication. The
algorithm uses two variables, flag and turn. A flag[n] value of true indicates that the process n wants to enter the
critical section. Entrance to the critical section is granted for process P0 if P1 does not want to enter its critical
section or if P1 has given priority to P0 by setting turn to 0.
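A two-process Java sketch of the algorithm (AtomicIntegerArray is used so that accesses to flag have the volatile visibility the algorithm needs under the Java memory model):

import java.util.concurrent.atomic.AtomicIntegerArray;

public class Peterson {
    private final AtomicIntegerArray flag = new AtomicIntegerArray(2); // 1 = wants in
    private volatile int turn;

    public void lock(int me) {          // me is 0 or 1
        int other = 1 - me;
        flag.set(me, 1);                // flag[me] = true: I want to enter
        turn = other;                   // give priority to the other process
        while (flag.get(other) == 1 && turn == other) {
            // busy-wait: the other process wants in and has priority
        }
    }

    public void unlock(int me) {
        flag.set(me, 0);                // leave the critical section
    }
}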
Clock drift : Clock drift refers to several related phenomena where a clock does not run at exactly the same
rate as a reference clock. That is, after some time the clock "drifts apart" or gradually desynchronizes from
the other clock.
(CORBA) is an architecture and specification for creating, distributing, and managing distributed program
objects in a network. It allows programs at different locations and developed by different vendors to
communicate in a network through an "interface broker." CORBA was developed by an association of vendors
through the Object Management Group (OMG), which currently includes over 500 member companies. Both
International Organization for Standardization (ISO) and X/Open have sanctioned CORBA as the standard
architecture for distributed objects (which are also known as components).
CORBA 3 is the latest level.
The essential concept in CORBA is the Object Request Broker (ORB). ORB support in a network of clients and
servers on different computers means that a client program (which may itself be an object) can request
services from a server program or object without having to understand where the server is in a distributed
network or what the interface to the server program looks like. To make requests or responses between
ORBs, programs use the General Inter-ORB Protocol (GIOP) and, for the Internet, its Internet Inter-ORB
Protocol (IIOP). IIOP maps GIOP requests and replies to the Internet's Transmission Control Protocol (TCP)
layer in each computer.
Distributed Computing Environment (DCE), a distributed programming architecture that preceded the trend
toward OOP and CORBA, is currently used by a number of large companies. DCE will perhaps continue to
exist along with CORBA and there will be "bridges" between the two.
Nested Transactions
* Structured as an inverted-root tree of transactions.
* The outermost transaction is the top-level transaction. Others are sub-transactions.
* a sub-transaction is atomic to its parent transaction.
* Sub-transactions at the same level can run concurrently.
* Each sub-transaction can fail independently of its parent and of the other sub-transactions.
* More robust : Sub-transactions can commit or abort independently. For example, a transaction to deliver a
mail message to a list of recipients.
A deadlock is a condition in a system where a set of processes (or threads) have requests for resources that
can never be satisfied. Essentially, a process cannot proceed because it needs to obtain a resource held by
another process but it itself is holding a resource that the other process needs.
When one process requests a resource that another process holds, timestamp-based deadlock-prevention
schemes resolve the conflict in one of three ways:
○ Wait,
○ Die (the requester aborts itself), or
○ Wound (the holder is aborted)
The Wait-Die algorithm:
Allow wait only if waiting process is older.
Since timestamps increase in any chain of waiting processes, cycles are impossible.
The Wound-Wait algorithm preempts the younger process. When the younger process re-requests the resource,
it has to wait for the older process to finish. This is the better of the two algorithms.
Note: To avoid starvation, a process should not be assigned a new timestamp each time it restarts.
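The two rules side by side, as a sketch (smaller timestamp = older transaction; names are hypothetical):

class TimestampDeadlockPrevention {
    // Wait-Die: an older requester waits; a younger requester dies (aborts
    // and restarts later, keeping its original timestamp to avoid starvation).
    static String waitDie(long requesterTs, long holderTs) {
        return (requesterTs < holderTs) ? "WAIT" : "DIE";
    }

    // Wound-Wait: an older requester wounds (preempts) the younger holder;
    // a younger requester simply waits for the older one to finish.
    static String woundWait(long requesterTs, long holderTs) {
        return (requesterTs < holderTs) ? "WOUND" : "WAIT";
    }

    public static void main(String[] args) {
        System.out.println(waitDie(10, 20));   // older asks younger: WAIT
        System.out.println(waitDie(20, 10));   // younger asks older: DIE
        System.out.println(woundWait(10, 20)); // older asks younger: WOUND
        System.out.println(woundWait(20, 10)); // younger asks older: WAIT
    }
}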
In distributed deadlock detection by edge chasing, a blocked process sends a probe message to the process
holding the resource it needs. On receiving the message:
• if the process itself is waiting on a resource, it updates the sending and destination fields of the
message and forwards it to the resource holder.
• If it is waiting on multiple resources, a message is sent to each process holding the resources. This
process continues as long as processes are waiting for resources. If the originator gets a message and
sees its own process number in the blocked field of the message, it knows that a cycle has been taken
and deadlock exists. In this case, some process (transaction) will have to die. The sender may choose
to commit suicide and abort itself or an election algorithm may be used to determine an alternate victim.
Lock Hierarchies
The idea of a lock hierarchy is to assign a numeric level to every mutex in the system, and then consistently
follow two simple rules:
● Rule 1: While holding a lock on a mutex at level N, you may only acquire new locks on mutexes at
lower levels <N.
● Rule 2: Multiple locks at the same level must be acquired at the same time, which means we need a
"lock-multiple" operation such as lock( mut1, mut2, mut3, ... ). This operation internally makes sure it
always takes the requested locks in some consistent global order, as sketched below.
If the entire program follows these rules, then there can be no deadlock among the mutex acquire operations,
because no two pieces of code can ever try to acquire two mutexes a and b in opposite orders.
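A sketch of the lock-multiple operation from Rule 2 (the numeric ids are the hypothetical levels assigned to each mutex): sorting by the global id before locking guarantees one consistent acquisition order everywhere.

import java.util.Arrays;
import java.util.concurrent.locks.ReentrantLock;

class LevelLock {
    final int id; // the numeric level/order assigned to this mutex
    final ReentrantLock lock = new ReentrantLock();
    LevelLock(int id) { this.id = id; }
}

class LockMultiple {
    // acquire all requested locks in ascending id order
    static void lockAll(LevelLock... locks) {
        LevelLock[] sorted = locks.clone();
        Arrays.sort(sorted, (a, b) -> Integer.compare(a.id, b.id));
        for (LevelLock l : sorted) l.lock.lock();
    }

    static void unlockAll(LevelLock... locks) {
        for (LevelLock l : locks) l.lock.unlock();
    }
}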
e.) Two-version locking – allows more concurrency by postponing write locks till commit time:
– read operations are allowed while a write operation is being performed
– a write operation is done on a tentative version of the data items
– a read operation is done on the committed version
– three types of locks: read, write, and commit locks
Hierarchic locks
– allow mixed-granularity locking by building a hierarchy of locks
– the owner of a lock has explicit access to that node in the hierarchy and implicit access to its children
– introduce an additional type of lock: intention-to-read/write
– before a child node is granted a read/write lock, an intention-to-read/write lock is set on the parent node