
Distributed Systems

Chapter 3 - Processes

Introduction

 communication takes place between processes
 a process is a program in execution
 from the OS perspective, the management and scheduling of
processes is important
 other important issues arise in distributed systems
 multithreading to enhance performance
 how clients and servers are organized
 process or code migration to achieve scalability and to
dynamically configure clients and servers

Process vs. Thread
 Process: unit of resource allocation
- Resources, privileges, etc.
 Thread: unit of execution
- program counter, stack pointer, registers
 Each process has one or more threads
 Each thread belongs to exactly one process
 Processes
- Inter-process communication is expensive: it requires a
context switch
- Secure: one process cannot corrupt another process

 Threads
 Inter-thread communication is cheap: threads can use
process memory and may not need a context switch
 Not secure: a thread can write to the memory used by
another thread
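This cheap, shared-memory communication between threads can be sketched in Python; the counter and worker names are illustrative only:

```python
import threading

# Threads in one process share the same address space, so they can
# communicate through an ordinary variable; a lock guards concurrent updates.
counter = 0
lock = threading.Lock()

def worker(increments):
    global counter
    for _ in range(increments):
        with lock:                # without the lock, updates could be lost
            counter += 1

threads = [threading.Thread(target=worker, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 4000: every thread updated the same shared variable
```

No message passing or kernel-mediated IPC is needed here, which is exactly why inter-thread communication is cheaper than inter-process communication.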

User Level vs. Kernel Level Threads

 User level: use a user-level thread package; totally
transparent to the OS
 Light-weight
 If a thread blocks, all threads in the process block
 Kernel level: threads are scheduled by OS
 A thread blocking won’t affect other threads in the
same process
 Can take advantage of multi-processors
 Still requires context switch, but cheaper than
process context switching

3.1 Threads and their Implementation
 threads can be used in both distributed and nondistributed
systems
 Threads in Nondistributed Systems
 a process has an address space (containing program text
and data) and a single thread of control, as well as other
resources such as open files, child processes, accounting
information, etc.
one process with three threads


 each thread has its own program counter, registers, stack,
and state; but all threads of a process share address space,
global variables and other resources such as open files, etc.

 Threads allow multiple executions to take place in the same
process environment, called multithreading
 Thread Usage – Why do we need threads?
 e.g., a word processor has different parts; parts for
 interacting with the user
 formatting the page as soon as changes are made
 timed saving (for auto recovery)
 spelling and grammar checking, etc.
1. Simplifying the programming model: since many
activities are going on at once
2. They are easier to create and destroy than processes
since they do not have any resources attached to them
3. Performance improves by overlapping activities if there is
too much I/O; i.e., to avoid blocking when waiting for
input or doing calculations, say in a spreadsheet
4. Real parallelism is possible in a multiprocessor system
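The overlapping of activities (point 3) can be sketched in Python with a background thread doing timed saves while the main thread stays free; the interval and the "snapshot" stand-in are made up for illustration:

```python
import threading
import time

# A background thread performs periodic "auto-saves" while the main
# thread remains responsive, mirroring the word-processor example above.
saves = []
stop = threading.Event()

def autosave(interval):
    while not stop.wait(interval):   # wake up every `interval` seconds
        saves.append("snapshot")     # stand-in for writing the document to disk

saver = threading.Thread(target=autosave, args=(0.05,), daemon=True)
saver.start()

time.sleep(0.3)   # the "user" keeps editing in the meantime
stop.set()
saver.join()

print(len(saves) > 0)  # True: at least one auto-save ran in the background
```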
 Thread Implementation
 threads are usually provided in the form of a thread
package
 the package contains operations to create and destroy a
thread, operations on synchronization variables such as
condition variables
 two approaches of constructing a thread package
a. construct a thread library that is executed entirely in user
mode (the OS is not aware of threads)
 cheap to create and destroy threads; just allocate and
free memory
 context switching can be done using a few instructions;
store and reload only CPU register values
 disadvantage: invocation of a blocking system call will block
the entire process to which the thread belongs and all
other threads in that process
b. implement them in the OS’s kernel
 let the kernel be aware of threads and schedule them
 expensive for thread operations such as creation and
deletion since each requires a system call
 solution: use a hybrid form of user-level and kernel-level
threads, called lightweight process (LWP)
 an LWP runs in the context of a single (heavy-weight) process,
and there can be several LWPs per process
 the system also offers a user-level thread package for some
operations such as creating and destroying threads, for
thread synchronization (mutexes and condition variables)
 the thread package can be shared by multiple LWPs

combining kernel-level lightweight processes and user-level threads


 Threads in Distributed Systems
 Multithreaded Clients
 consider a Web browser; fetching each part of a page
can be done by a separate thread, each opening its
own TCP/IP connection to the server or to separate,
replicated servers
 each can display the results as it gets its part of the page
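A minimal multithreaded-client sketch in Python; fetch() here is an assumed stand-in for a real HTTP request over its own TCP/IP connection:

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Each "part" of a page is fetched by its own thread; fetch() just
# sleeps to stand in for a network round trip.
def fetch(part):
    time.sleep(0.1)               # simulated network delay
    return f"{part} downloaded"

parts = ["text", "image1", "image2", "stylesheet"]

start = time.time()
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(fetch, parts))
elapsed = time.time() - start

print(results)
print(elapsed < 0.4)   # True: the four 0.1 s fetches overlapped instead of summing
```

With a single thread the four fetches would take about 0.4 s end to end; with one thread per part they complete in roughly the time of the slowest fetch.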
 Multithreaded Servers
 servers can be constructed in three ways
a. single-threaded process
 it gets a request, examines it, carries it out to completion
before getting the next request
 the server is idle while waiting for a disk read, i.e., system
calls are blocking

b. threads
 threads are more important for implementing servers
 e.g., a file server
 the dispatcher thread reads incoming requests for file
operations from clients and passes each to an idle worker
thread
 the worker thread performs a blocking disk read; in
which case another thread may continue, say the
dispatcher or another worker thread
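The dispatcher/worker organization can be sketched with a thread-safe queue in Python; the file names and the "disk read" are illustrative:

```python
import queue
import threading

# The dispatcher pushes incoming requests onto a shared queue; idle
# workers pick them up, so one worker blocked on "disk I/O" never
# stalls the dispatcher or the other workers.
requests = queue.Queue()
results = []
results_lock = threading.Lock()

def worker():
    while True:
        req = requests.get()
        if req is None:                    # sentinel: no more work
            break
        data = f"contents of {req}"        # stand-in for a blocking disk read
        with results_lock:
            results.append(data)
        requests.task_done()

workers = [threading.Thread(target=worker) for _ in range(3)]
for w in workers:
    w.start()

for name in ["a.txt", "b.txt", "c.txt", "d.txt"]:   # the dispatcher's loop
    requests.put(name)
requests.join()                  # wait until all requests are served

for _ in workers:                # one sentinel per worker to shut them down
    requests.put(None)
for w in workers:
    w.join()

print(sorted(results))
```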

a multithreaded server organized in a dispatcher/worker model


c. finite-state machine
 if threads are not available
 it gets a request, examines it, tries to fulfill the request
from cache, else sends a request to the file system; but
instead of blocking it records the state of the current
request and proceeds to the next request
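A Python sketch of the finite-state-machine organization, with the file-system reply modeled as a plain callback; all names are illustrative:

```python
# A single thread never blocks: it answers from cache when it can,
# and otherwise records the request's state in a table and moves on,
# finishing the request when the "file system" replies.
cache = {"a.txt": "cached contents"}
pending = {}          # request id -> recorded state of an in-flight request
completed = {}

def handle(req_id, filename):
    if filename in cache:                  # fast path: fulfill from cache
        completed[req_id] = cache[filename]
    else:                                  # record state instead of blocking
        pending[req_id] = {"file": filename}

def on_fs_reply(req_id, data):             # invoked when the read finishes
    state = pending.pop(req_id)
    cache[state["file"]] = data
    completed[req_id] = data

handle(1, "a.txt")                 # served immediately from cache
handle(2, "b.txt")                 # parked in the pending table
on_fs_reply(2, "disk contents")    # simulated file-system completion

print(completed)
```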
 Summary

Model                     Characteristics
Single-threaded process   No parallelism, blocking system calls
Threads                   Parallelism, blocking system calls
Finite-state machine      Parallelism, nonblocking system calls

three ways to construct a server

3.3 Servers and Design Issues
3.3.1 General Design Issues
 How to organize servers?
 Where do clients contact a server?
 Whether and how a server can be interrupted
 Whether or not the server is stateless
a. How to organize servers?
 Iterative server
 the server itself handles the request and returns the
result
 Concurrent server
 it passes a request to a separate process or thread and
waits for the next incoming request; e.g., a
multithreaded server; or by forking a new process as
is done in Unix
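The contrast can be sketched in Python; handle() is a trivial stand-in for real request processing:

```python
import threading

# The iterative server handles each request itself; the concurrent
# server hands every request to a fresh thread and immediately goes
# back to accepting the next one.
def handle(request, log):
    log.append(f"done: {request}")

def iterative_server(requests):
    log = []
    for r in requests:
        handle(r, log)            # the server itself does all the work
    return log

def concurrent_server(requests):
    log, threads = [], []
    for r in requests:
        t = threading.Thread(target=handle, args=(r, log))
        t.start()                 # pass the request on, keep accepting
        threads.append(t)
    for t in threads:             # wait here only so we can inspect the log
        t.join()
    return log

print(sorted(iterative_server(["r1", "r2"])))
print(sorted(concurrent_server(["r1", "r2"])))
```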
b. Where do clients contact a server?
 using endpoints or ports at the machine where the server
is running; each server listens to a specific endpoint
 how do clients know the endpoint of a service?
 globally assign endpoints for well-known services; e.g.
FTP is on TCP port 21, HTTP is on TCP port 80
 for services that do not require preassigned endpoints,
they can be dynamically assigned by the local OS
 IANA (Internet Assigned Numbers Authority) Ranges
 IANA divided the port numbers into three ranges

 Well-known ports: assigned and controlled by IANA
for standard services, e.g., DNS uses port 53
 Registered ports: are not assigned and controlled by IANA;
can only be registered with IANA to prevent duplication e.g.,
MySQL uses port 3306
 Dynamic or ephemeral ports: neither controlled nor
registered by IANA
 how can the client know this endpoint? two approaches
i. have a daemon running and listening to a well-known
endpoint; it keeps track of all endpoints of services on the
collocated server
 the client will first contact the daemon which provides it
with the endpoint, and then the client contacts the
specific server
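The daemon approach can be sketched as a registry object in Python; the service names and port numbers are invented for illustration:

```python
# The daemon sits on one well-known endpoint and keeps a table mapping
# service -> endpoint; clients ask it first, then contact the server
# directly on the endpoint it returned.
DAEMON_PORT = 1111                 # the single endpoint every client knows

class Daemon:
    def __init__(self):
        self.table = {}
    def register(self, service, endpoint):   # servers register at startup
        self.table[service] = endpoint
    def lookup(self, service):               # clients query before connecting
        return self.table.get(service)

daemon = Daemon()
daemon.register("file-server", 2001)
daemon.register("print-server", 2002)

endpoint = daemon.lookup("file-server")  # step 1: ask the daemon
print(endpoint)                          # 2001; step 2 contacts that port directly
```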

Client-to-server binding using a daemon
ii. use a superserver (as in UNIX) that listens to all endpoints
and then forks a process to take care of each incoming request;
this avoids having many servers running simultaneously with
most of them idle

Client-to-server binding using a superserver


c. Whether and how a server can be interrupted
 for instance, a user may want to interrupt a file transfer,
maybe because it was the wrong file
 let the client exit the client application; this will break the
connection to the server; the server will tear down the
connection assuming that the client had crashed
d. Whether or not the server is stateless
 a stateless server does not keep information on the state
of its clients; for instance a Web server
 soft state: a server promises to maintain state for a
limited time; e.g., to keep a client informed about
updates; after the time expires, the client has to poll

 a stateful server maintains information about its clients;
for instance, a file server that allows a client to keep a
local copy of a file on which it can perform update operations
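A Python sketch contrasting the two designs; the file contents and client ids are illustrative:

```python
# The stateless server keeps no per-client bookkeeping; the stateful
# one remembers which files each client holds a local copy of, as a
# file server might, so it knows exactly whom to notify on updates.
FILES = {"a.txt": "v1"}

def stateless_read(filename):
    return FILES[filename]             # no state about any client is kept

class StatefulFileServer:
    def __init__(self):
        self.client_copies = {}        # client id -> files it caches locally
    def open(self, client, filename):
        self.client_copies.setdefault(client, set()).add(filename)
        return FILES[filename]
    def update(self, filename, data):
        FILES[filename] = data
        # knowing who holds copies lets the server notify exactly those clients
        return [c for c, files in self.client_copies.items() if filename in files]

server = StatefulFileServer()
server.open("client-1", "a.txt")
print(stateless_read("a.txt"))        # v1
print(server.update("a.txt", "v2"))   # clients to notify: ['client-1']
```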

3.3.2 Server Clusters


 a server cluster is a collection of machines connected
through a network (normally a LAN with high bandwidth and
low latency) where each machine runs one or more servers
 it is logically organized into three tiers

the general organization of a three-tiered server cluster

 Distributed Servers
 the problem with a server cluster is that when the logical switch
(single access point) fails, the cluster becomes unavailable
 hence, several access points can be provided where the
addresses are publicly available leading to a distributed
server
 e.g., the DNS can return several addresses for the same
host name

3.4 Code Migration
 so far, communication was concerned with passing data
 we may pass programs, even while running and in
heterogeneous systems
 code migration involves moving data as well: when a
program migrates while running, its status, pending signals,
and other parts of its environment such as the stack and the
program counter also have to be moved

 Reasons for Migrating Code
 to improve performance; move processes from heavily-
loaded to lightly-loaded machines (load balancing)
 to reduce communication: move a client application that
performs many database operations to a server if the
database resides on the server; then send only results to the
client
 to exploit parallelism (for nonparallel programs): e.g., copies
of a mobile program (called a crawler in the context of search
engines) moving from site to site searching the Web

 to have flexibility by dynamically configuring distributed
systems: instead of deciding in advance which parts of a
multitiered client-server application are to run where

the principle of dynamically configuring a client to communicate with a
server; the client first fetches the necessary software, and then
invokes the server
 Models for Code Migration
 a process consists of three segments: code segment (set of
instructions), resource segment (references to external
resources such as files, printers, ...), and execution segment
(to store the current execution state of a process such as
private data, the stack, the program counter)
 Weak Mobility
 transfer only the code segment and perhaps some
initialization data; in this case a program always starts
from its initial state, e.g., Java Applets
 execution can be by the target process (in its own
address space like in Java Applets) or by a separate
process
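Weak mobility can be sketched in Python by shipping only source text and starting it from scratch at the receiver; exec() here is an assumed stand-in for transferring and loading the code segment:

```python
# Only the code segment (source text) travels, and execution always
# starts from the program's initial state at the receiver, as with
# Java Applets. The "network" here is just a string handoff.
code_segment = """
def crunch(x):
    return x * x

result = crunch(7)
"""

def receive_and_run(source):
    namespace = {}            # a fresh address space: no migrated state
    exec(source, namespace)   # run the shipped program from the beginning
    return namespace["result"]

print(receive_and_run(code_segment))  # 49
```

Note what is missing compared with strong mobility: no stack, no program counter, no partial results cross the wire, so a half-finished computation cannot be resumed, only restarted.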

 Strong Mobility
 transfer code and execution segments; helps to migrate a
process in execution
 can also be supported by remote cloning; having an
exact copy of the original process and running on a
different machine; executed in parallel to the original
process; UNIX does this by forking a child process
 migration can be
 sender-initiated: by the machine where the code resides or
is currently running; e.g., uploading programs to a
server; may require authentication or that the client is a
registered one
 receiver-initiated: by the target machine; e.g., Java
Applets; easier to implement

 Summary of models of code migration

alternatives for code migration

 Migration and Local Resources
 how to migrate the resource segment
 it is not always possible to move a resource; e.g., a reference to a
TCP port held by a process to communicate with other
processes
 Types of Process-to-Resource Bindings
 Binding by identifier (the strongest): a resource is referred to
by its identifier; e.g., a URL referring to a Web page, or an FTP
server referred to by its Internet (IP) address
 Binding by value (weaker): when only the value of a
resource is needed; in this case another resource can
provide the same value; e.g., standard libraries of
programming languages such as C or Java which are
normally locally available, but their location in the file
system may vary from site to site
 Binding by type (weakest): a process needs a resource of a
specific type; reference to local devices, such as monitors,
printers, ...
 in migrating code, the above bindings cannot change, but the
references to resources can
 how can a reference be changed? it depends on whether the
resource can be moved along with the code, i.e., on the
resource-to-machine binding
 Types of Resource-to-Machine Bindings
 Unattached Resources: can be easily moved with the
migrating program (such as data files associated with the
program)
 Fastened Resources: such as local databases and complete
Web sites; moving or copying may be possible, but very
costly
 Fixed Resources: intimately bound to a specific machine or
environment such as local devices and cannot be moved
 we have nine combinations to consider

                               Resource-to-machine binding
                               Unattached       Fastened         Fixed
Process-to-    By identifier   MV (or GR)       GR (or MV)       GR
resource       By value        CP (or MV, GR)   GR (or CP)       GR
binding        By type         RB (or GR, CP)   RB (or GR, CP)   RB (or GR)

actions to be taken with respect to the references to local
resources when migrating code to another machine

 GR: Establish a global system-wide reference
 MV: Move the resource
 CP: Copy the value of the resource
 RB: Rebind the process to a locally available resource
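The table above can be read as a lookup from the two binding kinds to the preferred action; for brevity this sketch keeps only the first choice in each cell and drops the parenthesized alternatives:

```python
# Given the process-to-resource binding and the resource-to-machine
# binding, return the preferred action from the table.
ACTIONS = {
    ("identifier", "unattached"): "MV",
    ("identifier", "fastened"):   "GR",
    ("identifier", "fixed"):      "GR",
    ("value",      "unattached"): "CP",
    ("value",      "fastened"):   "GR",
    ("value",      "fixed"):      "GR",
    ("type",       "unattached"): "RB",
    ("type",       "fastened"):   "RB",
    ("type",       "fixed"):      "RB",
}

def migration_action(process_binding, machine_binding):
    return ACTIONS[(process_binding, machine_binding)]

print(migration_action("identifier", "fixed"))   # GR: use a global reference
print(migration_action("type", "unattached"))    # RB: rebind to a local resource
```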

 Exercise: for each of the nine combinations, give example
resources

 Migration in Heterogeneous Systems
 distributed systems are constructed on a heterogeneous
collection of platforms, each with its own OS and machine
architecture
 heterogeneity problems are similar to those of portability
 easier in some languages
 for scripting languages the source code is interpreted
 for Java an intermediary code is generated by the
compiler for a virtual machine
 in weak mobility
 since there is no runtime information, compile the source
code for each potential platform
 in strong mobility
 it is difficult to transfer the execution segment since it
may contain platform-dependent information such as register
values; read the book about possible solutions
