Study Guide: Distributed Systems - Processes
Key Concepts
Process: A program in execution, managed by the operating system.
It has its own context, including CPU registers, program counter,
memory maps, and open files. The OS creates a virtual processor for
each running program (process), so that concurrent processes are
isolated from one another and share the CPU transparently.
Thread: A lightweight unit of execution that runs within a process.
Multiple threads can exist within a single process; each has its own
program counter, registers, and stack, while all of them share the
process's address space and other resources such as open files.
Virtualization: The creation of a virtual version of something, such as
hardware, an operating system, or network resources. In distributed
systems, it provides abstraction and can enable the execution of
different environments on the same physical infrastructure.
Client/Server Organization: A fundamental architectural pattern
where a client process requests services from a server process. This
model underpins much of distributed computing.
Code Migration: The transfer of executable code from one machine
to another. This can be done for various reasons, such as improving
performance, accessing specific resources, or dynamically configuring
clients.
Process Context: The minimal set of information about a process that
the operating system needs to save and restore to allow the process to
be interrupted and later resumed correctly.
Virtual Processor: An abstraction created by the operating system
that gives each running program the illusion of having its own
dedicated CPU.
Context Switching: Saving the state of one process or thread and
loading the saved state of another, allowing the CPU to be shared
among multiple execution units.
Kernel-level Threads: Threads managed directly by the operating
system kernel.
User-level Threads: Threads managed by a user-level threads library
within a process, without direct kernel involvement for scheduling and
context switching.
Multithreaded Server: A server that uses multiple threads to handle
concurrent client requests, often following a dispatcher/worker model.
Virtual Machine Monitor (VMM) or Hypervisor: Software that
creates and manages virtual machines, allowing multiple operating
systems to run concurrently on a single physical machine.
Process Virtual Machine: A type of virtual machine that provides an
abstract instruction set for running a single process; a runtime
system interprets or emulates that instruction set (e.g., Java Virtual
Machine).
Networked User Interface: An interface that allows users to interact
with applications running on remote machines.
Distribution Transparency: The concealment from the user and the
application programmer of the separation of components in a
distributed system, making it appear as a single coherent system.
Daemon: A background process that waits for and services requests.
Superserver: A single server process that listens on the ports of
several services and, when a request arrives, starts or hands off to the
appropriate service handler (e.g., inetd on UNIX).
Server Cluster: A group of interconnected servers that work together
to provide a service, often for increased availability and scalability.
TCP Handoff: A technique used in server clusters where the front end
(switch) that accepts an incoming TCP connection hands it off to a
selected server in the cluster, which then processes the request and
replies to the client.
Distributed Server: A server whose components are spread across
multiple machines, often for performance or fault tolerance.
Slice (in PlanetLab): A set of virtual servers, each running on a
different PlanetLab node, that together give a user or service its own
execution environment without interfering with other slices.
Slice Authority (in PlanetLab): An entity responsible for creating
and managing slices on PlanetLab nodes.
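Illustrative sketch (not from the source material): the Thread and
Multithreaded Server entries above can be made concrete with a small
dispatcher/worker pool in Python. The names serve, worker, handle,
NUM_WORKERS, and STOP are assumptions made for this example, and the
"requests" are plain strings rather than network messages.

```python
import queue
import threading

NUM_WORKERS = 3      # assumed pool size for this sketch
STOP = object()      # sentinel that tells a worker thread to exit


def handle(payload):
    """Stand-in for real request processing (hypothetical)."""
    return payload.upper()


def worker(requests, results, lock):
    """Worker thread: process requests from the shared queue until STOP."""
    while True:
        item = requests.get()
        if item is STOP:
            break
        request_id, payload = item
        reply = handle(payload)
        # Threads share the process's memory, so the shared dict is
        # updated under a lock.
        with lock:
            results[request_id] = reply


def serve(incoming):
    """Dispatcher: pass each incoming request to the worker pool."""
    requests = queue.Queue()
    results = {}
    lock = threading.Lock()
    workers = [
        threading.Thread(target=worker, args=(requests, results, lock))
        for _ in range(NUM_WORKERS)
    ]
    for t in workers:
        t.start()
    for request_id, payload in enumerate(incoming):
        requests.put((request_id, payload))
    for _ in workers:
        requests.put(STOP)   # one sentinel per worker
    for t in workers:
        t.join()
    # Return replies in the order the requests arrived.
    return [results[i] for i in range(len(results))]


replies = serve(["read a.txt", "read b.txt", "stat c.txt", "read d.txt"])
print(replies)
```

The dispatcher never processes a request itself; it only queues work,
so a slow request occupies one worker while the others keep serving. A
real server would read requests from a network endpoint instead of a
list.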
Short-Answer Quiz
1. Define a process in the context of operating systems and distributed
systems. What key information is typically stored in a process context?
2. Explain the relationship between a process and a thread. Can a thread
exist independently of a process? Why or why not?
3. What is virtualization and what is its general role in distributed systems
as described in the source material?
4. Describe the client/server organization model. How does a client
typically interact with a server in this model?
5. What is code migration? Provide one reason why code migration might
be beneficial in a distributed system.
6. Briefly explain the difference between kernel-level threads and user-
level threads.
7. Describe the dispatcher/worker model in the context of multithreaded
servers. What is the role of the dispatcher and the workers?
8. What are the different levels of interfaces offered by computer systems
as discussed in the text regarding virtual machines? Provide one
example of each.
9. Explain the concept of distribution transparency. How might client-side
software contribute to achieving this?
10. Briefly describe the concept of a server cluster. What is one
advantage of using a server cluster?
Answer Key
1. A process is a program in execution, managed by the operating
system. The process context stores the minimal information needed to
interrupt and restart a process, including CPU register values, the
program counter, memory maps, open files, accounting information,
and privileges.
2. A thread is a smaller unit of execution that runs within a process; it
is not a separate subprocess. A thread cannot exist independently of a
process: it lives in the address space of its containing process and
uses that process's resources.
3. Virtualization is the creation of a virtual version of a resource. In
distributed systems, it provides abstraction, allowing different
operating systems and applications to run on shared infrastructure and
simplifying resource management.
4. In the client/server organization, a client process initiates a request for
a service, and a server process responds to that request by providing
the service. Communication typically involves sending messages
between the client and the server.
5. Code migration is the transfer of executable code from one machine to
another. It can be beneficial for reasons such as dynamically
configuring a client to communicate with a specific server by fetching
the necessary software.
6. Kernel-level threads are managed directly by the operating system
kernel, which handles their scheduling and context switching.
User-level threads are managed by a threads library within a process;
the kernel is typically unaware of them, so a blocking system call made
by one thread can block the entire process.
7. In the dispatcher/worker model for multithreaded servers, the
dispatcher thread receives incoming client requests and then assigns
or dispatches each request to a separate worker thread for processing.
This allows the server to handle multiple requests concurrently.
8. The text mentions four interfaces: the hardware/software interface of
machine instructions that any program can invoke; the privileged
hardware/software interface of instructions that only the OS may
invoke; system calls offered by the OS; and library calls forming an
API. An example of the first is an ordinary x86 instruction such as
ADD; of the second, a privileged x86 instruction such as HLT. System
calls like read() or write() are examples of OS interfaces. Library
functions such as printf(), declared in stdio.h, are part of an API.
9. Distribution transparency aims to hide the fact that a system's
components are located on different machines, presenting it to users
and applications as a single, unified entity. Client-side software can
contribute to this by handling communication details and presenting a
consistent interface regardless of the server's location or replication
status.
10. A server cluster is a group of interconnected servers that work
together as a single system to provide a service. One advantage of
using a server cluster is increased availability, as the failure of one
server may not disrupt the overall service if other servers in the cluster
can take over.
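Illustrative sketch (not from the source material): answers 9 and 10
can be combined in a small Python example in which client-side software
hides both the location of the servers and the failure of one of them.
The classes Replica, ReplicaDown, and ClientProxy are hypothetical, and
the "servers" are in-memory objects rather than remote machines.

```python
class ReplicaDown(Exception):
    """Raised by a replica that cannot serve requests (hypothetical)."""


class Replica:
    """In-memory stand-in for one server of a cluster."""

    def __init__(self, name, up=True):
        self.name = name
        self.up = up
        self.served = 0   # number of requests this replica answered

    def get(self, key):
        if not self.up:
            raise ReplicaDown(self.name)
        self.served += 1
        return f"value-of-{key}"


class ClientProxy:
    """Client-side stub: the application calls get() and never sees
    which replica answered or whether one of them failed."""

    def __init__(self, replicas):
        self._replicas = list(replicas)
        self._next = 0    # round-robin position

    def get(self, key):
        # Try each replica at most once, starting at the current position.
        for _ in range(len(self._replicas)):
            replica = self._replicas[self._next]
            self._next = (self._next + 1) % len(self._replicas)
            try:
                return replica.get(key)
            except ReplicaDown:
                continue  # mask the failure and try the next replica
        raise RuntimeError("no replica available")


s1, s2, s3 = Replica("s1", up=False), Replica("s2"), Replica("s3")
proxy = ClientProxy([s1, s2, s3])
print(proxy.get("x"))
```

The caller sees only proxy.get(); the failed replica s1 is skipped
without the application noticing, which is the availability benefit
named in answer 10. Only when every replica is down does the failure
become visible to the caller.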
Essay Format Questions
1. Discuss the trade-offs between using processes and threads in the
design of distributed systems. Consider factors such as resource
sharing, isolation, context switching overhead, and programming
complexity.
2. Explain the different types of virtualization presented in the source
material (process virtual machines and virtual machine monitors).
Compare and contrast their architectures and typical use cases in
distributed environments.
3. Analyze the role of client/server organization in distributed systems.
Discuss its advantages and disadvantages, and provide examples of
how it is used in different distributed applications.
4. Evaluate the concept of distribution transparency. Why is it a desirable
goal in many distributed systems? What are some challenges in
achieving complete transparency?
5. Based on the PlanetLab example, discuss the key management
challenges that arise when dealing with large-scale, heterogeneous
distributed systems owned by multiple organizations.
Glossary of Key Terms
API (Application Programming Interface): A set of routines,
protocols, and tools for building software applications. It specifies how
software components should interact.
Context: The complete state of a process or thread at a given point in
time, necessary for its execution.
Daemon: A background process that performs system-wide tasks
without direct user interaction.
Distributed System: A collection of independent computer systems
that cooperate to achieve a common goal.
Hypervisor (Virtual Machine Monitor - VMM): Software that
creates and runs virtual machines.
IPC (Inter-Process Communication): Mechanisms that allow
different processes to exchange data.
Kernel: The core of an operating system that has complete control
over the system's hardware.
Machine Instructions: Low-level commands that a computer's CPU
can directly execute.
Memory Map: A data structure that describes how memory is
organized and allocated to a process.
Operating System (OS): Software that manages computer hardware
and software resources and provides common services for computer
programs.
Process: An instance of a computer program that is being executed.
Program Counter: A processor register that indicates the address of
the next instruction to be executed.
Server: A computer program or device that provides a service to other
computer programs (clients).
Slice (in a distributed testbed): A virtualized portion of resources in
a shared infrastructure, isolated for a specific user or application.
System Call: A request from a user-level process to the operating
system kernel to perform a privileged operation.
Thread: A basic unit of CPU utilization; a lightweight unit of execution
within a process.
Virtual Machine (VM): A software-based emulation of a physical
computer system.
Virtualization: The act of creating a virtual (rather than actual)
version of something, such as a hardware platform, operating system,
storage device, or network resources.
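Illustrative sketch (not from the source material): the System Call,
API, and IPC entries above can be contrasted in a few lines of Python.
os.pipe(), os.write(), and os.read() are thin wrappers around operating
system calls, while the file object returned by os.fdopen() is a
library-level interface that buffers data in user space. As a
simplifying assumption, both ends of the pipe are held by one process
here; in real IPC they would belong to different processes.

```python
import os

# A pipe is a simple IPC channel: bytes written to one end can be read
# from the other.
read_fd, write_fd = os.pipe()

# System-call level: os.write() hands the bytes to the kernel directly.
os.write(write_fd, b"via system call\n")

# Library level: a buffered file object collects data in user space and
# issues the underlying write only when flushed or closed.
writer = os.fdopen(write_fd, "wb")    # takes ownership of write_fd
writer.write(b"via library call\n")   # may still sit in the buffer
writer.close()                        # flushes, then closes the descriptor

# With the write end closed, the reader sees end-of-file after the data.
with os.fdopen(read_fd, "rb") as reader:
    received = reader.read()

print(received.decode(), end="")
```

Both messages travel through the same kernel pipe; the difference lies
in which interface the program used to reach it.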