
Advanced Operating System
Professor Mangal Sain
Lecture 12

Advanced Topics
Lecture 12 – Part 1

Network and Distributed Systems


CHAPTER OBJECTIVES
 Explain the advantages of networked and
distributed systems
 Provide a high-level overview of the networks
that interconnect distributed systems
 Define the roles and types of distributed systems
in use today
 Discuss issues concerning the design of
distributed file systems
OVERVIEW
 A distributed system is a collection
of loosely coupled nodes
interconnected by a communications
network
 Nodes variously called processors,
computers, machines, hosts
 Site is location of the machine, node
refers to specific system
 Generally a server has a resource a
client node at a different site wants to
use
OVERVIEW (CONT.)
 Nodes may exist in a client-server, peer-to-
peer, or hybrid configuration.
 In client-server configuration, server has a resource
that a client would like to use
 In peer-to-peer configuration, each node shares
equal responsibilities and can act as both clients
and servers
 Communication over a network occurs through
message passing
 All higher-level functions of a standalone system
can be expanded to encompass a distributed system
REASONS FOR DISTRIBUTED SYSTEMS
 Resource sharing
 Sharing files or printing at remote sites
 Processing information in a distributed database
 Using remote specialized hardware devices such as
graphics processing units (GPUs)
 Computation speedup
 Distribute subcomputations among various sites to
run concurrently
 Load balancing – moving jobs to more lightly-
loaded sites
 Reliability
 Detect and recover from site failure, function
transfer, reintegrate failed site
NETWORK STRUCTURE
 Local-Area Network (LAN) – designed to cover small geographical area
 Consists of multiple computers (workstations, laptops, mobile devices), peripherals (printers, storage arrays), and routers providing access to other networks
 Ethernet and/or Wireless (WiFi) most common way to construct LANs
 Ethernet defined by standard IEEE 802.3 with speeds
typically varying from 10Mbps to over 10Gbps
 WiFi defined by standard IEEE 802.11 with speeds typically
varying from 11Mbps to over 400Mbps.
 Both standards constantly evolving
LOCAL-AREA NETWORK (LAN)
NETWORK STRUCTURE (CONT.)
 Wide-Area Network (WAN) – links geographically
separated sites
 Point-to-point connections via links
 Telephone lines, leased (dedicated data) lines, optical cable, microwave
links, radio waves, and satellite channels
 Implemented via routers to direct traffic from one network to
another
 Internet – a WAN that enables hosts worldwide to communicate
 Speeds vary
 Many backbone providers have speeds at 40-100Gbps
 Local Internet Service Providers (ISPs) may be slower
 WAN links constantly being upgraded
 WANs and LANs interconnect, similar to cell phone network:
 Cell phones use radio waves to cell towers
 Towers connect to other towers and hubs
WIDE-AREA NETWORK (WAN)
NAMING AND NAME RESOLUTION
 Each computer system in the network has a
unique name
 Each process in a given system has a unique
name (process-id)
 Identify processes on remote systems by
<host-name, identifier> pair
 Domain name system (DNS) – specifies the
naming structure of the hosts, as well as
name to address resolution (Internet)
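As an illustrative sketch, DNS-backed name-to-address resolution can be exercised from Python through the operating system's resolver (`socket.getaddrinfo`); the host name used here is just an example, and the exact addresses returned depend on the local configuration:

```python
import socket

# Resolve a host name to IP addresses using the system resolver
# (which typically consults DNS and the local hosts file).
def resolve(host: str) -> list[str]:
    infos = socket.getaddrinfo(host, None)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # the address is the first element of sockaddr.
    return sorted({info[4][0] for info in infos})

print(resolve("localhost"))  # e.g. ['127.0.0.1', '::1']
```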
COMMUNICATION PROTOCOL
The communication network is partitioned into multiple layers:
 Layer 1: Physical layer – handles the mechanical
and electrical details of the physical transmission of a
bit stream
 Layer 2: Data-link layer – handles the frames, or
fixed-length parts of packets, including any error
detection and recovery that occurred in the physical
layer
 Layer 3: Network layer – provides connections and
routes packets in the communication network,
including handling the address of outgoing packets,
decoding the address of incoming packets, and
maintaining routing information for proper response
to changing load levels
COMMUNICATION PROTOCOL (CONT.)
 Layer 4: Transport layer – responsible for low-
level network access and for message transfer
between clients, including partitioning messages
into packets, maintaining packet order,
controlling flow, and generating physical
addresses
 Layer 5: Session layer – implements sessions,
or process-to-process communications protocols
 Layer 6: Presentation layer – resolves the
differences in formats among the various sites in
the network, including character conversions, and
half duplex/full duplex (echoing)
 Layer 7: Application layer – interacts directly
with the users, deals with file transfer, remote-
login protocols and electronic mail, as well as
schemas for distributed databases
OSI NETWORK MODEL
Logical communication between two computers, with the three lowest-
level layers implemented in hardware
OSI PROTOCOL STACK
OSI NETWORK MESSAGE
THE OSI MODEL
 The OSI model formalizes some of the earlier
work done in network protocols but was
developed in the late 1970s and is currently not
in widespread use
 The most widely adopted protocol stack is the
TCP/IP model, which has been adopted by
virtually all Internet sites
 The TCP/IP protocol stack has fewer layers than
the OSI model. Theoretically, because it
combines several functions in each layer, it is
more difficult to implement but more efficient
than OSI networking
 The relationship between the OSI and TCP/IP
models is shown in the next slide
THE OSI AND TCP/IP PROTOCOL STACKS
TCP/IP EXAMPLE
 Every host has a name and an associated IP
address (host-id)
 Hierarchical and segmented
 Sending system checks routing tables and locates
a router to send packet
 Router uses segmented network part of host-id to
determine where to transfer packet
 This may repeat among multiple routers
 Destination system receives the packet
 Packet may be complete message, or it may need to be
reassembled into larger message spanning multiple
packets
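The router's use of the segmented network part of the address can be sketched with Python's `ipaddress` module; the routing-table entries and router names below are hypothetical:

```python
import ipaddress

# A toy routing table: (destination network, next hop).
# Networks and next-hop names here are illustrative, not real routes.
routes = [
    (ipaddress.ip_network("10.0.0.0/8"), "router-A"),
    (ipaddress.ip_network("10.1.0.0/16"), "router-B"),
    (ipaddress.ip_network("0.0.0.0/0"), "default-gateway"),
]

def next_hop(dest: str) -> str:
    addr = ipaddress.ip_address(dest)
    # Longest-prefix match: among networks containing the address,
    # pick the most specific one (longest prefix).
    matches = [(net, hop) for net, hop in routes if addr in net]
    return max(matches, key=lambda m: m[0].prefixlen)[1]

print(next_hop("10.1.2.3"))   # router-B (the more specific /16 wins)
print(next_hop("8.8.8.8"))    # default-gateway
```

Real routers perform this longest-prefix match in hardware, but the lookup logic is the same.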
TCP/IP EXAMPLE (CONT.)
 Within a network, how does a packet move from
sender (host or router) to receiver?
 Every Ethernet/WiFi device has a medium access
control (MAC) address
 Two devices on same LAN communicate via MAC
address
 If a system needs to send data to another system, it
needs to discover the IP to MAC address mapping
 Uses address resolution protocol (ARP)
 A broadcast uses a special network address to signal
that all hosts should receive and process the packet
 Not forwarded by routers to different networks
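A toy sketch of the ARP idea, with the LAN broadcast simulated by a dictionary of known hosts (all addresses below are made up):

```python
# A toy ARP-style cache mapping IP addresses to MAC addresses.
# Real ARP broadcasts a who-has request on the LAN; here the
# "network" is simulated by a dictionary of hypothetical hosts.
known_hosts = {
    "192.168.1.10": "aa:bb:cc:dd:ee:01",
    "192.168.1.20": "aa:bb:cc:dd:ee:02",
}

arp_cache: dict[str, str] = {}

def lookup_mac(ip: str) -> "str | None":
    if ip in arp_cache:              # cache hit: no broadcast needed
        return arp_cache[ip]
    mac = known_hosts.get(ip)        # simulated broadcast + reply
    if mac is not None:
        arp_cache[ip] = mac          # remember the mapping for next time
    return mac

print(lookup_mac("192.168.1.10"))    # aa:bb:cc:dd:ee:01
```

Caching the reply is the key design point: later packets to the same host skip the broadcast entirely.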
Lecture 12 – Part 2

Network and Distributed Systems


TRANSPORT PROTOCOLS UDP AND TCP
 Once a host with a specific IP address receives a packet, it
must somehow pass it to the correct waiting process
 Transport protocols TCP and UDP identify receiving and
sending processes through the use of a port number
 Allows host with single IP address to have multiple server/client
processes sending/receiving packets
 Well-known port numbers are used for many services
 FTP – port 21
 ssh – port 22
 SMTP – port 25
 HTTP – port 80
 Transport protocol can be simple or can add reliability to
network packet stream
USER DATAGRAM PROTOCOL
 UDP is unreliable – bare-bones extension to IP
with addition of port number
 Since there are no guarantees of delivery in the lower
network (IP) layer, packets may become lost
 Packets may also be received out of order
 UDP is also connectionless – no connection is established at the beginning of a transmission to set up state
 Also no connection tear-down at the end of transmission
 UDP packets are also called datagrams
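A minimal UDP exchange over the loopback interface, using Python's standard socket API as a sketch (port 0 asks the OS for any free port):

```python
import socket

# Two UDP sockets on the loopback interface. UDP needs no
# connection setup: the sender simply addresses each datagram.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))          # port 0: let the OS pick
port = receiver.getsockname()[1]

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"hello", ("127.0.0.1", port))

data, addr = receiver.recvfrom(1024)
print(data)    # b'hello' – loopback delivery is dependable in practice,
               # but UDP itself makes no delivery or ordering guarantee
sender.close()
receiver.close()
```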
UDP DROPPED PACKET EXAMPLE
TRANSMISSION CONTROL PROTOCOL
 TCP is both reliable and connection-oriented
 In addition to port number, TCP provides abstraction
to allow in-order, uninterrupted byte-stream across
an unreliable network
 Whenever host sends packet, the receiver must send an
acknowledgement packet (ACK). If ACK not received
before a timer expires, sender will resend.
 Sequence numbers in TCP header allow receiver to put
packets in order and notice missing packets
 Connections are initiated with series of control packets
called a three-way handshake
 Connections also closed with series of control packets
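The handshake and byte-stream abstraction can be seen in a minimal loopback TCP exchange in Python; the kernel transparently handles the ACKs, sequence numbers, and retransmission described above:

```python
import socket

# A minimal TCP exchange on loopback. connect() triggers the
# three-way handshake; the kernel then manages acknowledgements,
# ordering, and retransmission for the byte stream.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
port = server.getsockname()[1]

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(("127.0.0.1", port))      # three-way handshake happens here
conn, _ = server.accept()

client.sendall(b"hello over tcp")
received = conn.recv(1024)
print(received)                          # b'hello over tcp'

client.close()
conn.close()
server.close()                           # close() triggers connection teardown
```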
TCP DATA TRANSFER SCENARIO
TRANSMISSION CONTROL PROTOCOL (CONT.)
 Receiver can send a cumulative ACK to acknowledge a series of packets
 Sender can also send multiple packets before waiting for ACKs
 Takes advantage of network throughput
 Flow of packets regulated through flow control and
congestion control
 Flow control – prevents sender from overrunning
capacity of receiver
 Congestion control – approximates congestion of the
network to slow down or speed up packet sending rate
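The cumulative-ACK bookkeeping can be modeled with a toy function (this is not real TCP, just the idea that one ACK covers every in-order packet received so far):

```python
# Toy model of cumulative acknowledgement: the receiver reports the
# next sequence number it expects, implicitly acknowledging all
# packets with smaller sequence numbers.
def cumulative_ack(received_seqs: set) -> int:
    """Return the next in-order sequence number the receiver expects."""
    seq = 0
    while seq in received_seqs:
        seq += 1
    return seq   # one ACK acknowledges every packet numbered < seq

# Packets 0-2 and 4 arrived; packet 3 was lost in transit.
print(cumulative_ack({0, 1, 2, 4}))   # 3: packets 0-2 covered; 4 waits for 3
```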
NETWORK-ORIENTED OPERATING SYSTEMS
 Two main types
 Network Operating Systems
 Users are aware of multiplicity of machines
 Distributed Operating Systems
 Users not aware of multiplicity of machines
NETWORK OPERATING SYSTEMS
 Users are aware of multiplicity of machines
 Access to resources of various machines is done explicitly by:
 Remote logging into the appropriate remote machine (ssh)
 ssh kristen.cs.yale.edu
 Transferring data from remote machines to local machines, via the File Transfer Protocol (FTP) mechanism
 Upload, download, access, or share files through cloud storage
 Users must change paradigms – establish a session, give network-based commands, use a web browser
 More difficult for users
DISTRIBUTED OPERATING SYSTEMS
 Users not aware of multiplicity of machines
 Access to remote resources similar to access to local
resources
 Data Migration – transfer data by transferring
entire file, or transferring only those portions of the
file necessary for the immediate task
 Computation Migration – transfer the
computation, rather than the data, across the
system
 Via remote procedure calls (RPCs)
 Via messaging system
DISTRIBUTED OPERATING SYSTEMS (CONT.)
 Process Migration – execute an entire process, or
parts of it, at different sites
 Load balancing – distribute processes across network
to even the workload
 Computation speedup – subprocesses can run
concurrently on different sites
 Hardware preference – process execution may require
specialized processor
 Software preference – required software may be
available at only a particular site
 Data access – run process remotely, rather than
transfer all data locally
 Consider the World Wide Web
DESIGN ISSUES OF DISTRIBUTED SYSTEMS
 We investigate three design questions:
 Robustness – Can the distributed system withstand
failures?
 Transparency – Can the distributed system be
transparent to the user both in terms of where files are
stored and user mobility?
 Scalability – Can the distributed system be scalable to
allow addition of more computation power, storage, or
users?
ROBUSTNESS
 Hardware failures can include failure of a link,
failure of a site, and loss of a message.
 A fault-tolerant system can tolerate a certain
level of failure
 Degree of fault tolerance depends on design of
system and the specific fault
 The more fault tolerance, the better!
 Involves failure detection, reconfiguration, and recovery
FAILURE DETECTION
 Detecting hardware failure is difficult
 To detect a link failure, a heartbeat protocol can
be used
 Assume Site A and Site B have established a link
 At fixed intervals, each site will exchange an I-am-up
message indicating that they are up and running
 If Site A does not receive a message within the
fixed interval, it assumes either (a) the other site
is not up or (b) the message was lost
 Site A can now send an Are-you-up? message to
Site B
 If Site A does not receive a reply, it can repeat the
message or try an alternate route to Site B
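The heartbeat bookkeeping described above can be sketched in a few lines; the class and method names are invented for illustration:

```python
# Toy heartbeat monitor: each site records when it last heard an
# I-am-up message from a peer; silence past the interval raises
# suspicion (the peer may be down, or the message may be lost --
# the monitor cannot tell which).
class HeartbeatMonitor:
    def __init__(self, interval: float):
        self.interval = interval
        self.last_heard: dict = {}

    def i_am_up(self, site: str, now: float) -> None:
        self.last_heard[site] = now

    def suspect_failure(self, site: str, now: float) -> bool:
        last = self.last_heard.get(site)
        return last is None or now - last > self.interval

monitor = HeartbeatMonitor(interval=5.0)
monitor.i_am_up("site-B", now=0.0)
print(monitor.suspect_failure("site-B", now=3.0))   # False: within interval
print(monitor.suspect_failure("site-B", now=9.0))   # True: B silent too long
```

Time is passed in explicitly so the logic stays deterministic; a real implementation would read a clock and send Are-you-up? probes before concluding anything.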
FAILURE DETECTION (CONT.)
 If Site A does not ultimately receive a reply
from Site B, it concludes some type of failure
has occurred
 Types of failures:
- Site B is down
- The direct link between A and B is down
- The alternate link from A to B is down
- The message has been lost
 However, Site A cannot determine exactly why
the failure has occurred
RECONFIGURATION AND RECOVERY
 When Site A determines a failure has occurred,
it must reconfigure the system:
 If the link from A to B has failed, this must be
broadcast to every site in the system
 If a site has failed, every other site must also be
notified indicating that the services offered by the
failed site are no longer available
 When the link or the site becomes available
again, this information must again be
broadcast to all other sites
TRANSPARENCY
 The distributed system should appear as a conventional, centralized system to the user
 User interface should not distinguish between local and remote resources
 Example: NFS
 User mobility allows users to log into any machine in the environment and see their own environment
 Example: LDAP plus desktop virtualization
SCALABILITY
 As demands increase, the system should easily accept the addition of new resources to accommodate the increased demand
 Reacts gracefully to increased load
 Adding more resources may generate additional indirect load on other resources if not careful
 Data compression or deduplication can cut down on storage and network resources used
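How deduplication saves storage can be sketched with a toy content-addressed block store (the class and its interface are invented for illustration):

```python
import hashlib

# Toy block-level deduplication: identical blocks are stored once,
# keyed by a content hash, so duplicated data costs no extra space.
class DedupStore:
    def __init__(self):
        self.blocks: dict = {}

    def put(self, block: bytes) -> str:
        key = hashlib.sha256(block).hexdigest()
        self.blocks.setdefault(key, block)   # store only if not seen before
        return key

store = DedupStore()
k1 = store.put(b"same data")
k2 = store.put(b"same data")      # duplicate: no new block stored
k3 = store.put(b"other data")
print(len(store.blocks))          # 2 blocks stored for 3 puts
```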
Lecture 12 – Part 3

Network and Distributed Systems


DISTRIBUTED FILE SYSTEM
 Distributed file system (DFS) – a file system whose clients, servers, and storage devices are dispersed among the machines of a distributed system
 Should appear to its clients as a conventional, centralized file system
 Key distinguishing feature is management of dispersed storage devices
DISTRIBUTED FILE SYSTEM (CONT.)
 Service – software entity running on one or more machines and providing a particular type of function to a priori unknown clients
 Server – service software running on a single machine
 Client – process that can invoke a service using a set of operations that forms its client interface
 A client interface for a file service is formed by a set of primitive file operations (create, delete, read, write)
 Client interface of a DFS should be transparent; i.e., not distinguish between local and remote files
 Sometimes a lower-level inter-machine interface is needed for cross-machine interaction
DISTRIBUTED FILE SYSTEM (CONT.)
 Two widely-used architectural models include the client-server model and the cluster-based model
 Challenges include:
 Naming and transparency
 Remote file access
 Caching and cache consistency
CLIENT-SERVER DFS MODEL
 Server(s) store both files and metadata on attached storage
 Clients contact the server to request files
 Server responsible for authentication, checking file permissions, and delivering the file
 Changes client makes to file must be propagated back
to the server
 Popular examples include NFS and OpenAFS
 Design suffers from single point of failure if server
crashes
 Server presents a bottleneck for all requests of
data and metadata
 Could pose problems with scalability and bandwidth
CLIENT-SERVER DFS MODEL (CONT.)
CLUSTER-BASED DFS MODEL
 Built to be more fault-tolerant and scalable than client-server DFS
 Examples include the Google File System (GFS)
and Hadoop Distributed File System (HDFS)
 Clients connected to master metadata server and several
data servers that hold “chunks” (portions) of files
 Metadata server keeps mapping of which data servers hold
chunks of which files
 As well as hierarchical mapping of directories and files
 File chunks replicated n times
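The metadata server's chunk-to-server mapping can be sketched as follows; the chunk size, server names, and round-robin placement policy are all invented for illustration (GFS actually uses 64 MB chunks and a more sophisticated placement policy):

```python
# Toy metadata-server bookkeeping for a cluster-based DFS: each file
# is split into fixed-size chunks, and each chunk is placed on n
# data servers.
CHUNK_SIZE = 4          # tiny for demonstration; GFS uses 64 MB
SERVERS = ["ds1", "ds2", "ds3", "ds4"]   # hypothetical data servers

def place_chunks(data: bytes, n: int = 3) -> list:
    chunks = [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]
    table = []
    for idx, chunk in enumerate(chunks):
        # Round-robin replica placement across the data servers.
        replicas = [SERVERS[(idx + r) % len(SERVERS)] for r in range(n)]
        table.append({"chunk": idx, "size": len(chunk), "replicas": replicas})
    return table

for entry in place_chunks(b"hello world!"):
    print(entry)
```

With n = 3, any single data-server failure leaves two replicas of every chunk, which is what makes the design fault-tolerant.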
CLUSTER-BASED DFS MODEL (CONT.)
CLUSTER-BASED DFS MODEL (CONT.)
 GFS design was influenced by following observations:
 Hardware component failures are the norm rather than the
exception and should be routinely expected.
 Files stored on such a system are very large.
 Most files are changed by appending new data to the end rather
than overwriting existing data.
 Redesigning the applications and file system API increases
system flexibility
➢ Requires applications to be programmed specially with new API
 Modularized software layer MapReduce can sit on top of
GFS to carry out large-scale parallel computations while
utilizing benefits of GFS
 Hadoop framework also stackable and modularized
NAMING AND TRANSPARENCY
 Naming – mapping between logical and physical objects
 Multilevel mapping – abstraction of a file that hides the details of how and where on the disk the file is actually stored
 A transparent DFS hides the location where in
the network the file is stored
 For a file being replicated in several sites, the
mapping returns a set of the locations of this file’s
replicas; both the existence of multiple copies and
their location are hidden
NAMING STRUCTURES
 Location transparency – file name does not
reveal the file’s physical storage location
 Location independence – file name does not
need to be changed when the file’s physical storage
location changes
 In practice most DFSs use static, location-
transparent mapping for user-level names
 Some support file migration (e.g. OpenAFS)
 Hadoop supports file migration but without following
POSIX standards; hides information from clients
 Amazon S3 provides blocks of storage on demand via
APIs, placing storage dynamically and moving data as
necessary
NAMING SCHEMES
 Three approaches:
 Files named by combination of their host name
and local name; guarantees a unique system-wide
name. This naming scheme is neither location
transparent nor location independent.
 Attach remote directories to local directories,
giving the appearance of a coherent directory
tree; only previously mounted remote directories
can be accessed transparently
 A single global name structure spans all files in the system. If a server is unavailable, some arbitrary set of directories on different machines also becomes unavailable
REMOTE FILE ACCESS
 Consider a user who requests access to a
remote file. The server storing the file has been
located by the naming scheme, and now the
actual data transfer must take place.
 Remote-service mechanism is one transfer
approach.
 Requests for accesses are delivered to the server, the server machine performs the accesses, and the results are forwarded back to the user.
 One of the most common ways of implementing
remote service is the RPC paradigm
REMOTE FILE ACCESS (CONT.)
 Reduce network traffic by retaining recently
accessed disk blocks in a cache, so that repeated
accesses to the same information can be handled
locally
 If needed data not already cached, a copy of data is
brought from the server to the user
 Accesses are performed on the cached copy
 Files identified with one master copy residing at the
server machine, but copies of (parts of) the file are
scattered in different caches
 Cache-consistency problem – keeping the cached
copies consistent with the master file
 Could be called network virtual memory
CACHE LOCATION – DISK VS. MAIN MEMORY
 Advantages of disk caches
 More reliable
 Cached data kept on disk are still there during
recovery and don’t need to be fetched again
 Advantages of main-memory caches:
 Permit workstations to be diskless
 Data can be accessed more quickly
 Performance speedup in bigger memories
 Server caches (used to speed up disk I/O) are in main
memory regardless of where user caches are located;
using main-memory caches on the user machine
permits a single caching mechanism for servers and
users
CACHE UPDATE POLICY
 Write-through – write data through to disk as soon as
they are placed on any cache
 Reliable, but poor performance
 Delayed-write (write-back) – modifications are written
to the cache and then written through to the server later
 Write accesses complete quickly; some data may be overwritten
before they are written back, and so need never be written at all
 Poor reliability; unwritten data will be lost whenever a user
machine crashes
 Variation – scan cache at regular intervals and flush blocks that
have been modified since the last scan
 Variation – write-on-close, writes data back to the server
when the file is closed
 Best for files that are open for long periods and frequently modified
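The contrast between write-through and write-on-close can be sketched with a toy cache, using a dictionary to stand in for the remote server (all names are illustrative):

```python
# Toy illustration of write-through vs. write-on-close. The "server"
# dictionary stands in for the remote master copy.
class CachedFile:
    def __init__(self, server: dict, name: str, write_through: bool):
        self.server, self.name = server, name
        self.write_through = write_through
        self.cache = server.get(name, b"")

    def write(self, data: bytes) -> None:
        self.cache = data
        if self.write_through:       # propagate immediately: reliable, slower
            self.server[self.name] = data

    def close(self) -> None:         # write-on-close flushes delayed writes
        self.server[self.name] = self.cache

server: dict = {}
f = CachedFile(server, "a.txt", write_through=False)
f.write(b"v1")
print(server.get("a.txt"))   # None: delayed write not yet on the server
f.close()
print(server.get("a.txt"))   # b'v1': flushed when the file is closed
```

The gap between the two prints is exactly the reliability window: a crash before `close()` would lose `b"v1"` under the delayed-write policy.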
CONSISTENCY
 Is locally cached copy of the data consistent
with the master copy?
 Client-initiated approach
 Client initiates a validity check
 Server checks whether the local data are
consistent with the master copy
 Server-initiated approach
 Server records, for each client, the (parts of) files
it caches
 When server detects a potential inconsistency, it
must react
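The client-initiated validity check can be sketched with version numbers; the file names and version counters here are invented for illustration:

```python
# Toy client-initiated validity check: the client asks the server
# whether its cached version of a file is still current.
server_versions = {"a.txt": 3}          # master-copy versions (illustrative)

def is_cache_valid(name: str, cached_version: int) -> bool:
    return server_versions.get(name) == cached_version

print(is_cache_valid("a.txt", 3))   # True: cache matches the master copy
server_versions["a.txt"] += 1       # another client updates the master copy
print(is_cache_valid("a.txt", 3))   # False: this client must refetch
```

Real systems often use modification timestamps instead of counters, but the check is the same; the frequency of these checks determines how stale a cache is allowed to become.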
CONSISTENCY (CONT.)
 In cluster-based DFS, the cache-consistency issue is more complicated due to the presence of a metadata server and replicated file data chunks
 HDFS allows append-only write operations (no random writes) and a single file writer
 GFS allows random writes with concurrent writers
 Complicates write-consistency guarantees for GFS while simplifying them for HDFS