Chapter Three
Distributed system Lecture Note

Uploaded by habtamu molla
Naming and Threads

Chapter 3
Introduction
 A name in a distributed system is a string of
bits or characters that is used to refer to an
entity.
 To operate on an entity, it is necessary to
access it, for which we need an access point.
 An access point is a special kind of entity in a
distributed system.
 An entity can offer more than one access
point (e.g., one person owning several phones).
 An entity may change its access points in
the course of time (e.g., moving to another
country).
Cont..
 An address is a special kind of name: it
refers to an access point of an entity.
 There are other types of names that are
used to uniquely identify an entity,
referred to as identifiers.
 A true identifier is a name that has the
following properties
– 1. An identifier refers to at most one entity.
– 2. Each entity is referred to by at most one
identifier.
– 3. An identifier always refers to the same
entity.
Name Spaces

A general naming graph with a single root node


Name Spaces

The general organization of the UNIX file system, implemented on logically contiguous disk blocks
Linking and Mounting

The concept of a symbolic link in a naming graph


Linking and Mounting

Mounting remote name spaces through a specific process protocol


Name Space Distribution

An example partitioning of the DNS name space,


including Internet-accessible files, into three layers
Name Space Distribution

Item                            | Global    | Administrational | Managerial
Geographical scale of network   | Worldwide | Organization     | Department
Total number of nodes           | Few       | Many             | Vast numbers
Responsiveness to lookups       | Seconds   | Milliseconds     | Immediate
Update propagation              | Lazy      | Immediate        | Immediate
Number of replicas              | Many      | None or few      | None
Is client-side caching applied? | Yes       | Yes              | Sometimes

A comparison between name servers for implementing nodes from a large-scale name space partitioned into a global layer, an administrational layer, and a managerial layer
Implementation of Name Resolution

The principle of iterative name resolution


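The iterative scheme can be simulated in a few lines: a minimal Java sketch in which each nested map stands in for one name server's table, and the client itself walks from the root downward, handing over the remaining path components at each step. The server tables and the address string are hypothetical, not part of any real DNS API.

```java
import java.util.Map;

public class IterativeResolver {
    // Hypothetical name-server tables: each server maps a label either to
    // another table (a child name server) or to a final address string.
    public static final Map<String, Object> CS   = Map.of("ftp", "addr:ftp-host");
    public static final Map<String, Object> VU   = Map.of("cs", CS);
    public static final Map<String, Object> NL   = Map.of("vu", VU);
    public static final Map<String, Object> ROOT = Map.of("nl", NL);

    // Iterative resolution: the client contacts each server in turn;
    // each lookup step below simulates one such round trip.
    @SuppressWarnings("unchecked")
    public static String resolve(Map<String, Object> root, String... path) {
        Object current = root;
        for (String label : path) {
            current = ((Map<String, Object>) current).get(label);
        }
        return (String) current;
    }

    public static void main(String[] args) {
        System.out.println(resolve(ROOT, "nl", "vu", "cs", "ftp"));  // prints addr:ftp-host
    }
}
```

In the recursive variant, the same walk would happen inside the servers rather than in the client loop.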
Implementation of Name Resolution

The principle of recursive name resolution


Implementation of Name Resolution

Server for node | Should resolve | Looks up | Passes to child | Receives and caches           | Returns to requester
cs              | <ftp>          | #<ftp>   | --              | --                            | #<ftp>
vu              | <cs,ftp>       | #<cs>    | <ftp>           | #<ftp>                        | #<cs>, #<cs,ftp>
nl              | <vu,cs,ftp>    | #<vu>    | <cs,ftp>        | #<cs>, #<cs,ftp>              | #<vu>, #<vu,cs>, #<vu,cs,ftp>
root            | <nl,vu,cs,ftp> | #<nl>    | <vu,cs,ftp>     | #<vu>, #<vu,cs>, #<vu,cs,ftp> | #<nl>, #<nl,vu>, #<nl,vu,cs>, #<nl,vu,cs,ftp>

Recursive name resolution of <nl, vu, cs, ftp>. Name servers


cache intermediate results for subsequent lookups
Implementation of Name Resolution

The comparison between recursive and iterative name


resolution with respect to communication costs
The DNS Name Space
Type of record | Associated entity | Description
SOA            | Zone              | Holds information on the represented zone
A              | Host              | Contains an IP address of the host this node represents
MX             | Domain            | Refers to a mail server to handle mail addressed to this node
SRV            | Domain            | Refers to a server handling a specific service
NS             | Zone              | Refers to a name server that implements the represented zone
CNAME          | Node              | Symbolic link with the primary name of the represented node
PTR            | Host              | Contains the canonical name of a host
HINFO          | Host              | Holds information on the host this node represents
TXT            | Any kind          | Contains any entity-specific information considered useful

The most important types of resource records forming


the contents of nodes in the DNS name space
Home-Based Approaches

The principle of Mobile IP


Hierarchical Approaches

Hierarchical organization of a location service into


domains, each having an associated directory node
Java RMI
 Java Remote Method Invocation (Java
RMI) is an extension of the Java object
model to support distributed objects
• Methods of remote Java objects can be invoked from other Java
virtual machines, possibly on different hosts
 RMI uses object serialization to marshal
and unmarshal
• Any serializable object can be used as
parameter or method return
RMI Architecture & Components
 The remote reference module is responsible for
providing addressing for the proxy (stub) object
 The proxy implements the stub and provides
transparency to the client
 The communication module is responsible for
networking
 The dispatcher selects the proper skeleton and
forwards the message to it
 The skeleton un-marshals the request and calls the
remote object
Cont..
RMI Architecture and Components
Cont..
RMI Steps
RMI Implementation
Cont..
Cont..
Processes and Threads
 To execute a program, the OS creates a
number of virtual processors.
 Process: program in execution
 Execution environment:
• an address space
• higher level resources
• CPU context
 Expensive creation
 Expensive context switch (may also require
memory swap)
 The solution:
• split a process into threads
Cont..
 Threads: no attempt is made to provide
concurrency transparency
 Threads execute in the same address space:
threads of the same process share the same
execution environment
 A thread context is only the CPU context
(plus some information for thread management)
 No performance overhead, but
 Harder to use: threads require extra effort to
protect against each other
More on process creation
Two issues
 Creation of an execution environment
(and an initial thread within it)
• Share
• Copy
• Copy-on-Write
 Choice of target host (in distributed
systems)
• Transfer Policy (local or remote host)
• Location Policy (to which node to transfer)
• Load sharing (sender vs receiver initiated)
Introduction: Processes vs Threads
 Creating a new thread within an existing
process is cheaper than creating a
process (~10-20 times)
Traditional Unix process
 Child processes are created from a parent
process using the fork system call.
Drawbacks:
• fork is expensive: memory is copied from the parent
to its children.
• Inter-process communication is needed to pass
information from parent to children and vice versa
Cont..
 Switching to a different thread within the
same process is cheaper than switching
between threads belonging to different
processes (5-50 times).
 Threads within a process may share data
and other resources conveniently and
efficiently compared with separate
processes.
 Threads within a process are not
protected from one another.
Cont..
Thread implementation
Threads can be implemented:
 in the OS kernel (Win NT, Solaris, Mach)
 at user level (e.g. by a thread library: C
threads, pthreads), or in the language Java
 Advantages:
 lightweight - no system calls
 modifiable scheduler
 low cost enables more threads to be employed
 Disadvantages
 not pre-emptive: the scheduler cannot forcibly reschedule a
running thread
 cannot exploit multiple processors
 a blocking system call (e.g., on a page fault) blocks the process
and thus all of its threads

User-space solution
 User-level threads have nothing to do with the
kernel, so all operations can be completely handled
within a single process
 Implementations can therefore be extremely
efficient.
 All services provided by the kernel are
done on behalf of the process in which a
thread resides
 In practice we want to use threads when
there are many external events: threads should
block on a per-event basis, but if the kernel
cannot distinguish threads it must block the whole process.
Kernel solution
 The whole idea is to have the kernel contain
the implementation of a thread package.
 Operations that block a thread are no
longer a problem: the kernel schedules
another available thread within the same
process.
 Handling external events is simple: the
kernel (which catches all events) schedules
the thread associated with the event.
 The big problem is the loss of efficiency due
to the fact that each thread operation
requires a trap to the kernel.
Conclusion: mix user-level and kernel-level
Cont..
 Lightweight processes (LWPs) can execute
user-level threads.
 An LWP runs in the context of a single (heavy-weight)
process, alongside a user-level thread package (with
operations to create, destroy, and synchronize threads)
 Threads are assigned to LWPs (hidden from the
programmer)
 When an LWP is created (through a system call) it gets
its own stack and executes a scheduling routine (that
searches for a thread to execute)
 If there are many LWPs, each executes the scheduler; they share
a thread table (with the current set of threads), and
synchronization among them is done in user space.
Cont.. (LWP)
 When a thread calls a blocking user-level operation
(e.g., blocks on a mutex or a condition variable), it calls
the scheduling routine, when another runnable thread
is found, a context switch is made to that thread which
is then bound to the same LWP (the LWP that is
executing the thread need not be informed)
 When a user-level thread does a blocking system call,
the LWP that is executing that thread blocks. The
thread remains bound to the LWP.
 The kernel can simply schedule another LWP having a
runnable thread bound to it.
 Note that this thread can switch to any other runnable
thread currently in user space.
 When there are no threads to schedule, an LWP may
remain idle, and may even be removed (destroyed) by
the kernel.
Cont..
Multithreaded Servers
Why threads?
 In a single-threaded process, whenever
a blocking system call is
executed, the process as a whole is
blocked
 executing a program on a multiprocessor
system (assign each thread to a different
CPU)
Example
 Each request takes on average 2msecs of
processing and 8 msecs of I/O delay (no
caching)
 Maximum server throughput (measured as
client requests handled per sec)
 Single thread
– Turnaround time for each request: 2 + 8 = 10
msecs
– Throughput: 100 req/sec
 Two threads (if disk requests are serialized)
– Turnaround time for each request: 8 msecs
– Throughput: 125 req/sec
Cont..
 A pool of “worker” threads to process the
requests.
 One I/O (dispatcher) thread: receives
requests from a collection of ports and
places them on a shared request queue
for retrieval by the workers.
Cont..
Improve Performance
 Starting a thread to handle an incoming
request is much cheaper than starting a
new process
 Having a single-threaded server prohibits
simply scaling the server to a
multiprocessor system
 Hide network latency by reacting to the next
request while the previous one is being
answered
Cont..
Better Structure
 Most servers have high I/O demands.
Using simple, well-understood blocking
calls simplifies the overall structure.
 Multithreaded programs tend to be
smaller and easier to understand due to
simplified flow of control.
Multithreaded Clients
Multithreaded Web client
 Web browser scans an incoming HTML page,
and finds that more files need to be fetched
 Each file is fetched by a separate thread, each
doing a (blocking) HTTP request
 As files come in, the browser displays them
Multiple RPCs
 A client does several RPCs at the same time,
each one by a different thread
 It then waits until all results have been returned.
 Note: if RPCs are to different servers, we may
have a linear speed-up compared to doing RPCs
one after the other
Thread Functions
Thread(ThreadGroup group, Runnable target,
String name)
• Creates a new thread in the SUSPENDED state
setPriority(int newPriority), getPriority()
• Set and return the thread’s priority.
run(): A thread executes the run() method of its
target object
start(): Change the state of the thread from
SUSPENDED to RUNNABLE.
sleep(int millisecs): Cause the thread to enter
the SUSPENDED state for the specified time.
yield(): Enter the READY state and invoke the
scheduler.
destroy(): Destroy the thread.
Java thread lifetimes
 New thread created in the same JVM as
its creator in the SUSPENDED state
 start() makes it runnable
• It executes the run() method of an object
designated in its constructor
 A thread ends its life when it returns
from run() or when its destroy() method
is called
 Execute on top of the OS
Servers
 Basic model: A server is a process that
waits for incoming service requests at a
specific transport address.
 Iterative server: the server itself
handles the request.
 Concurrent server: does not handle
the request itself, but passes it to a
separate thread or another process, and
then immediately waits for the next
request.
Endpoints
 Clients send requests to an endpoint (port)
on the machine where the server is running.

One server per endpoint


Cont..
 Stateless server: does not keep information of
the state of its clients and can change its own
state without informing its clients (e.g., a web
server)
Examples:
 Don’t record whether a file has been opened (simply close it
again after access)
 Don’t keep track of your clients
 Clients and servers are completely independent
 State inconsistencies due to client or server
crashes are reduced
 Possible loss of performance because, e.g., a
server cannot anticipate client behaviour (think
of prefetching file blocks)
Cont..
 Stateful server: maintains information
about its clients
Examples:
 Record that a file has been opened, so that
prefetching can be done
 Knows which data a client has cached, and
allows clients to keep local copies of shared
data
 The performance of Stateful servers can
be extremely high, provided clients are
allowed to keep local copies.
 Cookies?
Code Migration
Reasons for Migrating Code
Performance
• Load balancing
• Process data close to where they reside
• Parallelism (e.g., web search)
Flexibility/dynamic configuration
• Dynamically downloading client-side software
Models for Code Migration
What is moved?
The three segments of a process:
 Code segment: the part that contains
the set of instructions that make up the
program that is being executed.
 Resource segment: references to
external resources needed by the process
(e.g., files, devices, other processes)
 Execution segment: the current
execution state of the process (program
counter, stack, etc)
Cont..
 Weak mobility: move only the code
segment (plus perhaps some
initialization data)
 Always start from its initial state
 Example: Java applets
 code shipping (push) vs. code fetching (pull)
 Strong mobility: move also the
execution segment
 The process resumes execution from where
it was left off
 Harder to implement
Cont..
Who initiates the movement?
 Sender-initiated: migration is initiated
at the machine where the code currently
resides or is being executed
 Example: uploading programs, sending
programs across the Internet
 simpler to implement
 Receiver-initiated: the initiative for
migration is taken by the target machine
 Example: Java Applets
The End

Thanks
