System Programming
(ELEC462)
Connecting to Processes Near and Far:
Servers and Sockets
Dukyun Nam
HPC Lab@KNU
Contents
● Introduction
● Four Types of Data Sources
● bc: a Unix Calculator
● popen: Making Processes Look Like Files
● Sockets: Connecting to Remote Processes
● Terminal (tty) and Pseudoterminal (pty)
● Summary
Introduction
● Ideas and Skills
○ The client/server model
○ Using pipes for two-way communication
○ Coroutines
○ The file/process similarity
○ Sockets: Why, What, How?
○ Network services
○ Using sockets for client/server programs
● System Calls and Functions
○ fdopen, popen, socket
○ bind, listen, accept, connect
Recap: Pipes
● Pipes can be used to pass data between processes
○ e.g., ls | wc -l
< Using a pipe to connect two processes >
● Creating and using pipes
< Process file descriptors after creating a pipe >
< Setting up a pipe to transfer data from a parent to a child >
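The parent-to-child setup can be sketched as follows. This is a minimal illustration, not the lecture's code: the helper name send_to_child is hypothetical, and the child reports the number of bytes it received through its exit status, so the sketch is only meaningful for messages shorter than 256 bytes.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: send msg from the parent to a forked child through
 * a pipe. The child counts the bytes it reads and returns that count as its
 * exit status. Returns the count, or -1 on error. */
int send_to_child(const char *msg)
{
    int fds[2];                     /* fds[0]: read end, fds[1]: write end */
    if (pipe(fds) == -1)
        return -1;

    pid_t pid = fork();             /* the child inherits both ends */
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {                 /* child: the reader */
        char buf[256];
        ssize_t n, total = 0;
        close(fds[1]);              /* keep it open and EOF never arrives */
        while ((n = read(fds[0], buf, sizeof buf)) > 0)
            total += n;
        close(fds[0]);
        _exit((int)(total & 0xff));
    }

    /* parent: the writer */
    close(fds[0]);
    size_t len = strlen(msg);
    ssize_t written = write(fds[1], msg, len);
    close(fds[1]);                  /* closing the write end gives the child EOF */

    int status;
    if (waitpid(pid, &status, 0) == -1 || written != (ssize_t)len)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Each process closes the end it does not use; the reader sees EOF only when every copy of the write end has been closed.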
Recap: Stream
● Stream
○ A sequence of data elements
○ A stream can be thought of as items on a conveyor belt being processed one at a time rather than in large batches
< Standard streams for input, output, and error >
● Byte stream
○ The data exchanged via pipes, FIFOs, and stream sockets is an undelimited byte stream
< Separating messages in a byte stream >
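Because the byte stream has no built-in message boundaries, the two sides have to agree on one. A minimal sketch, assuming a newline delimiter and a hypothetical helper named read_message (other conventions, such as a length prefix, work as well):

```c
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read one newline-terminated message from fd.
 * The message is stored in buf without the '\n' and is NUL-terminated;
 * bytes that do not fit are dropped. Returns the stored length, or -1 on
 * error or if EOF arrives before any byte of a message. */
ssize_t read_message(int fd, char *buf, size_t size)
{
    size_t len = 0;
    char c;
    ssize_t n;

    if (size == 0)
        return -1;

    /* One byte at a time: simple, but slow. Buffered stdio would be faster. */
    while ((n = read(fd, &c, 1)) == 1) {
        if (c == '\n') {            /* the delimiter ends the message */
            buf[len] = '\0';
            return (ssize_t)len;
        }
        if (len + 1 < size)
            buf[len++] = c;
    }

    buf[len] = '\0';
    if (n == -1 || len == 0)
        return -1;
    return (ssize_t)len;            /* unterminated last message at EOF */
}
```

Even if the sender writes two messages with a single write, the receiver recovers them one at a time.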
Four Types of Data Sources
● Unix presents one interface, even though data come from different types of
sources
○ (1)/(2) Disk/device files
■ Use open to connect
■ Use read and write to transfer data
○ (3) Pipes
■ Use pipe to create
■ Use fork to share
■ Use read and write to transfer data
○ (4) Sockets
■ Use socket, listen, and connect to connect
■ Use read and write to transfer data
■ Based on a client-server model
< One interface, different sources >
bc: A Unix/Linux Basic Calculator
● bc has variables, loops, and functions
○ Can handle very long numbers
■ The trailing backslashes indicate continuation
bc: A Unix/Linux Basic Calculator (cont.)
● bc does NOT do the arithmetic itself; it actually runs dc
○ dc: a stack-based desk calculator requiring the user to
(i) enter both values and then
(ii) specify the operation
■ e.g., to compute 2 + 2, enter 2 2 + p; dc prints 4
< bc and dc as coroutines >
○ How bc works:
■ Reads the expression from stdin and parses out the values and the operation
■ Sends the command sequence “2”, “2”, “+”, and “p” to dc via a pipe
■ Later reads the result through the pipe it attached to stdout of dc
■ Forwards that message to the user
○ How dc works:
■ Pushes the two received values onto its stack, applies the “+” operation,
and then prints to stdout the value on the top of the stack
Coding bc: pipe, fork, dup, exec
● Data connections in the kernel from user to bc and bc to dc
● Guidelines
○ Create two pipes: todc / fromdc
○ Create a process p1 to run dc (via fork)
■ bc: will run in the parent
○ In p1, redirect stdin and stdout to the pipes, and then exec dc
○ In the parent (bc),
■ Read and parse user input
■ Write commands to dc
■ Read response from dc
■ Send response to user
< bc, dc and kernel >
Tiny bc
● A simple version of bc
○ Uses sscanf to parse and speaks with dc through two pipes
● Program outline
○ a. Get two pipes
○ b. Fork (get another process)
○ c. In the child process (which becomes dc), connect stdin and stdout to the pipes, then execl dc
○ d. The parent (tinybc) reads user input, sends it to dc via one pipe, and reads the reply from the other
○ e. Then close the pipes; dc sees EOF and exits
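Steps a through e can be sketched as below. This is a rough illustration rather than the tinybc.c shown in the lecture: the helper names start_coprocess, ask_calc, and stop_coprocess are hypothetical, the program to run is passed as a parameter, and error handling is kept short.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run prog as a child whose stdin and stdout are pipes.
 * Returns the child's pid and stores buffered streams for talking to it,
 * or returns -1 on error. */
pid_t start_coprocess(const char *prog, FILE **to_child, FILE **from_child)
{
    int todc[2], fromdc[2];

    if (pipe(todc) == -1)                       /* a. get two pipes */
        return -1;
    if (pipe(fromdc) == -1) {
        close(todc[0]); close(todc[1]);
        return -1;
    }

    pid_t pid = fork();                         /* b. get another process */
    if (pid == -1) {
        close(todc[0]); close(todc[1]);
        close(fromdc[0]); close(fromdc[1]);
        return -1;
    }

    if (pid == 0) {                             /* c. the child becomes prog */
        if (dup2(todc[0], STDIN_FILENO) == -1 ||
            dup2(fromdc[1], STDOUT_FILENO) == -1)
            _exit(127);
        close(todc[0]); close(todc[1]);
        close(fromdc[0]); close(fromdc[1]);
        execlp(prog, prog, (char *)NULL);
        _exit(127);                             /* reached only if exec fails */
    }

    close(todc[0]);                             /* parent keeps the other ends */
    close(fromdc[1]);
    *to_child = fdopen(todc[1], "w");
    *from_child = fdopen(fromdc[0], "r");
    if (*to_child == NULL || *from_child == NULL)
        return -1;                              /* sketch: no cleanup here */
    return pid;
}

/* d. Send "a b op p" in dc's postfix syntax and read one line of reply. */
int ask_calc(FILE *to, FILE *from, int a, char op, int b,
             char *reply, int size)
{
    if (fprintf(to, "%d\n%d\n%c\np\n", a, b, op) < 0 || fflush(to) == EOF)
        return -1;
    return fgets(reply, size, from) != NULL ? 0 : -1;
}

/* e. Close the pipes so the child sees EOF, then collect its exit status. */
int stop_coprocess(pid_t pid, FILE *to, FILE *from)
{
    int status;
    fclose(to);
    fclose(from);
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return status;
}
```

With dc installed, start_coprocess("dc", &to, &from) followed by ask_calc(to, from, 2, '+', 2, line, sizeof line) should leave dc's answer in line. The fflush call matters: without it the request would sit in the stdio buffer and both processes would wait forever.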
Writing tiny bc
● tinybc.c
Writing tiny bc (cont.)
● tinybc.c - be_bc
< Figure: the todc and fromdc pipes between bc and dc >
Writing tiny bc (cont.)
● tinybc.c - be_bc
< Figure: be_bc writes to todc[1] and reads from fromdc[0] >
Writing tiny bc (cont.)
● tinybc.c - be_dc
< Figure: be_dc reads from in[0] (the todc pipe) and writes to out[1] (the fromdc pipe) >
Writing tiny bc (cont.)
● Execution
fdopen: Making File Descriptors Look Like Files
● fdopen: a library function
○ Works like fopen,
returning a FILE*
○ Takes a file descriptor not a filename as argument
○ Used when
■ You have a file descriptor but no filename
● cf. fopen, which you use when you know the filename
■ You want to convert the pipe connection into a FILE*
● So you can use standard, buffered I/O operations
● Notice how the tinybc.c code uses
○ fprintf and fgets to exchange data with dc through the pipes
○ This makes access to another process feel even more like access to a file
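A small sketch of the idea, assuming a hypothetical helper named open_pipe_streams: it creates a pipe and wraps each file descriptor in a FILE* with fdopen, so that fprintf and fgets can be used on it.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: create a pipe and turn both ends into buffered
 * streams. Returns 0 on success, -1 on error. */
int open_pipe_streams(FILE **reader, FILE **writer)
{
    int fds[2];

    if (pipe(fds) == -1)
        return -1;

    /* There is no filename to give to fopen, only file descriptors. */
    *reader = fdopen(fds[0], "r");
    *writer = fdopen(fds[1], "w");

    if (*reader == NULL || *writer == NULL) {
        /* fclose also closes the underlying fd of a stream that was made */
        if (*reader) fclose(*reader); else close(fds[0]);
        if (*writer) fclose(*writer); else close(fds[1]);
        return -1;
    }
    return 0;
}
```

The mode string given to fdopen has to be compatible with how the descriptor was opened: "r" for the read end and "w" for the write end. Since the stream is buffered, call fflush (or fclose) on the writer before expecting the reader to see the data.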
Lessons from bc/dc
● Client/Server Model
○ bc/dc: an example of the client/server model of a program design
● Bi-directional Communication
○ Requires one process to communicate with both stdin and stdout of another process
■ Traditionally, a pipe carries data in one direction only, so two pipes are needed
● Persistent service
○ Recall that on a shell each command creates a new process
○ bc keeps a single dc process running
○ bc uses that same instance of dc over and over again:
■ Sends dc commands in response to each line of user input
○ The bc/dc pair works as coroutines (cf. subroutines, which are entered and left through function calls)
■ Both “continue to run,” but control passes from one to another
■ e.g., parsing and printing => bc, computing => dc
popen: Making Processes Look Like Files
● fopen: a library function
○ Opens a buffered connection to a file
● popen: a library function
○ Opens a buffered connection to a process
What popen Does: Use Case
● Using popen to obtain a sorted list of current users
○ By running the command “who | sort”
What popen Does: Use Case (cont.)
● A stream opened with popen must be closed with pclose
○ NOT fclose
○ The process started by popen needs to be waited for
○ Unless it is waited for, that process becomes a zombie process
■ So its parent needs to retrieve its exit value
○ pclose calls wait
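A minimal usage sketch of popen together with pclose. The helper name count_output_lines is hypothetical; it reads a command's output like a file and counts the lines.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>

/* Hypothetical helper: run command through popen and count its output
 * lines. Returns the number of lines, or -1 if the command could not be
 * started or did not finish with exit status 0. */
int count_output_lines(const char *command)
{
    FILE *fp = popen(command, "r");     /* "r": we read the command's stdout */
    if (fp == NULL)
        return -1;

    int c, lines = 0;
    while ((c = getc(fp)) != EOF)       /* ordinary buffered I/O on a process */
        if (c == '\n')
            lines++;

    /* pclose, NOT fclose: it waits for the child, so no zombie is left.
     * Its return value is the wait status; 0 means a normal exit with 0. */
    if (pclose(fp) != 0)
        return -1;
    return lines;
}
```

For example, count_output_lines("who | sort") would return the number of lines that pipeline prints on the current machine.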
How Does popen Work? How to Write It?
● popen
○ Runs a program in a new process: use fork
○ Returns a connection to stdin or stdout of that program
■ Use pipe: for the connection
■ Use dup2(): for the redirection
■ Use fdopen: to make a file descriptor (fd) into a buffered stream
■ Use exec: to run any shell command in that process
● /bin/sh: can run any command line, including pipelines
● The “-c” option: tells the shell to run a command and then exit.
● e.g., sh -c “who | sort”: what does this do?
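Putting pipe, fork, dup2, fdopen, and exec together gives a simplified, read-only version of popen. This is a sketch under stated assumptions, not the popen.c from the lecture and not the C library's implementation: the names popen_read and pclose_read are hypothetical, and the caller has to keep the child's pid, which the real popen hides.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical read-only popen: run command via "sh -c" and return a
 * stream connected to its stdout. The child's pid is stored in *child. */
FILE *popen_read(const char *command, pid_t *child)
{
    int p[2];

    if (pipe(p) == -1)                       /* the connection */
        return NULL;

    pid_t pid = fork();                      /* the new process */
    if (pid == -1) {
        close(p[0]);
        close(p[1]);
        return NULL;
    }

    if (pid == 0) {                          /* child */
        close(p[0]);                         /* child does not read the pipe */
        if (p[1] != STDOUT_FILENO) {         /* the redirection */
            if (dup2(p[1], STDOUT_FILENO) == -1)
                _exit(127);
            close(p[1]);
        }
        execl("/bin/sh", "sh", "-c", command, (char *)NULL);
        _exit(127);                          /* reached only if exec fails */
    }

    close(p[1]);                             /* parent does not write */
    FILE *fp = fdopen(p[0], "r");            /* fd into a buffered stream */
    if (fp == NULL) {
        close(p[0]);
        waitpid(pid, NULL, 0);
        return NULL;
    }
    *child = pid;
    return fp;
}

/* Companion to popen_read: close the stream and wait for the child.
 * Returns the wait status, or -1 on error. */
int pclose_read(FILE *fp, pid_t child)
{
    int status;
    fclose(fp);
    if (waitpid(child, &status, 0) == -1)
        return -1;
    return status;
}
```

Because the command string goes to sh -c, anything the shell understands (pipelines, redirection, variables) can be used, which is convenient and also the reason such strings must never come from an untrusted source.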
How Does popen Work? How to Write It? (cont.)
● popen (Cont’d)
○ Flow chart and illustration for writing popen
< Reading from a shell command >
How Does popen Work? How to Write It? (cont.)
● popen.c
How Does popen Work? How to Write It? (cont.)
● popen.c (Cont’d)
Access to Data: Files, APIs, and Servers
● Method 1: Getting data (directly) from files
○ By reading from a file: e.g., who reads the utmp file
○ Not a perfect solution, as the client needs to know the file format and the specific
member names in its structures
Access to Data: Files, APIs, and Servers (cont.)
● Method 2: Getting data from functions
○ A library function can hide all the details behind a standard function
interface
○ Application programming interface (API)-based information services are not
always the right solution
○ Two methods for using system library functions
■ Static library (static linking): the program includes the actual function code
● but that copy may contain a bug or use out-of-date file formats
■ Dynamic library (shared library): the code is loaded at run time
● but the library may not be installed on a system, or may be the wrong version
Access to Data: Files, APIs, and Servers (cont.)
● Method 3: Getting data from processes; the bc/dc example
○ May require a network connection
○ Good for a client-server model
■ Server program can be written in any language: C, C++, Java, Perl, Python …
○ A server can run on a different machine from the client
■ How to connect to a process on a different machine?
■ Solution: IP address and port #
● What mechanism allows us to connect to a process on a different
computer?
Recall: What Pipes Can and Cannot Do
● Pros
○ Simple and easy to use; no network needed
○ Allow processes to send data to other processes as easily as they send data to
files
● Cons
○ Created in one process and shared by calling fork
■ Can only connect related processes
○ Can only connect processes on the SAME machine
■ What if you want to send your data to a remote host?
● Linux provides another method of IPC for remote connection: Sockets!
Sockets: Connecting to Remote Processes
● Allow processes to create pipelike connections
○ Not only to unrelated processes but also to processes on other machines
● We’ll study the basic ideas of sockets
● We’ll see how to use sockets to connect clients and servers on
different machines
< Connecting to a remote process >
An Analogy: “At the Tone, the Time Will Be…”
< A time service >
Four Important Concepts
● 1. Client and Server
○ Server: a program (rather than a machine) that provides “services”
■ Its process waits for a request, processes that request, and then loops back to take the next
request
○ Client: a program (rather than a machine) that requests “services”
■ Connects to and exchanges some data with the server, and then continues (its own task) and
later terminates
● Note that it does not loop
● 2. Protocol
○ The rules of interaction between the client and the server
○ In the time service, the protocol is simple:
■ If the client calls, the server answers, sends the time and then hangs up
Four Important Concepts (cont.)
● 3. Hostname and port
○ A server on the Internet is a process running on a machine, or host
■ The host has an assigned name (hostname) and an Internet Protocol (IP) address
■ The server process has a port number on that host
● Together, the host and the port determine the server's address, i.e., one end of the communication.
● e.g., cse.knu.ac.kr: cse is the hostname; the port number, 80, is implied (hidden)
● 4. Address family
○ A kind (family) of addresses used to identify a service
■ Telephone + ext. number (for telephone): could be denoted AF_PHONE
■ Street address + zip code (postal code) (for mailing): could be denoted AF_MAIL
■ Longitude + latitude (for GPS): could be denoted AF_GLOBAL
■ IP address + port number (for network connection): AF_INET
Lists of Services: Well-Known Ports
● Telephone analogy: 119 for emergencies, 112 for police,
114 for directory assistance, …
● How can we know what services are
available on a machine?
How Do We Write Time Server and Time Client?
● Six steps for our telephone-based time server
● Four steps for our telephone-based time client
Active and Passive Sockets
● Stream sockets are often distinguished as being
either active or passive:
○ By default, a socket that has been created using
socket() is active. An active socket can be used in
a connect() call to establish a connection to a
passive socket. This is referred to as performing an
active open.
○ A passive socket (also called a listening socket) is one
that has been marked to allow incoming connections
by calling listen(). Accepting an incoming
connection is referred to as performing a passive
open.
Working Principle of a Time Server
● Step 1: Ask kernel for a socket
○ A socket: a place from which calls can be made and a place to
which calls can be directed
● Step 2: Bind address to a socket.
○ Address is hostname and port
● Step 3: Allow incoming calls with queue size=1 on socket
○ A server accepts incoming calls.
● Step 4: Wait for/Accept a Call.
○ Once the socket is created, assigned an address, and set up to receive incoming calls,
the program is ready to go!
● Steps 5 and 6: Transfer Data and then Hang Up
Step 1: Ask kernel for a socket
● socket creates an endpoint for communication
and returns an identifier for that socket
○ Domain: the kind of communication system (e.g., the Internet)
○ Type: the type of data flow
■ SOCK_STREAM: a bidirectional, connection-oriented byte stream (e.g., TCP)
■ SOCK_DGRAM: connectionless datagrams (e.g., UDP)
○ Protocol: the protocol used within the network code in the kernel
■ cf. /etc/protocols
Step 2: Bind address to a socket
● bind assigns a network address to a socket
○ The Internet address family (AF_INET) uses host and port
■ We will use port 13000; port 13 is reserved for the real (daytime) time server
Step 3: Allow Incoming Calls with Queue Size=1 on Socket
● listen asks the kernel to allow the specified socket to receive incoming
calls.
○ Applied to SOCK_STREAM (not to SOCK_DGRAM)
○ Queue for incoming calls
○ Queue size=1 means a queue of one call
■ Maximum queue size depends on the socket implementation
Step 4: Wait for / Accept a Call
● accept suspends the current process until an incoming connection on
the specified socket is established
○ The socket has an address, consisting of a hostname and port number
○ Returns a file descriptor (fd) opened for reading and writing
■ This fd is connected to a file descriptor in the process that placed the call (the client)
The Remaining Steps
● Step 5: Transfer Data
○ The fd returned by accept is a regular file descriptor.
○ Use fdopen to make the fd into a buffered data stream for fprintf
● Step 6: Close Connection
○ The fd returned by accept should be closed with close
○ When one process closes one end,
■ The other end will see EOF for a data read (as seen in pipes)
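The six server steps can be sketched as below. This is an illustration under assumptions, not the timeserv.c shown in the lecture: the function names make_time_server and serve_one_call are hypothetical, and the socket is bound to INADDR_ANY (all local addresses) instead of looking up the host's own name.

```c
#define _POSIX_C_SOURCE 200809L
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <time.h>
#include <unistd.h>

/* Steps 1-3: get a socket, bind an address, allow incoming calls.
 * Returns the listening socket, or -1 on error. */
int make_time_server(unsigned short port)
{
    int sock = socket(AF_INET, SOCK_STREAM, 0);         /* step 1 */
    if (sock == -1)
        return -1;

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);   /* assumption: any local address */
    addr.sin_port = htons(port);                /* network byte order */

    if (bind(sock, (struct sockaddr *)&addr, sizeof addr) == -1 ||  /* step 2 */
        listen(sock, 1) == -1) {                                    /* step 3 */
        close(sock);
        return -1;
    }
    return sock;
}

/* Steps 4-6: wait for one call, send the time, hang up.
 * Returns 0 on success, -1 on error. */
int serve_one_call(int sock)
{
    int fd = accept(sock, NULL, NULL);          /* step 4: blocks until a call */
    if (fd == -1)
        return -1;

    FILE *fp = fdopen(fd, "w");                 /* step 5: buffered stream */
    if (fp == NULL) {
        close(fd);
        return -1;
    }
    time_t now = time(NULL);
    fprintf(fp, "The time here is %s", ctime(&now));   /* ctime adds '\n' */

    fclose(fp);                                 /* step 6: flushes and closes fd */
    return 0;
}
```

A server would call make_time_server(13000) once and then call serve_one_call in a loop. Passing port 0 to bind asks the kernel to pick any free port, which is handy for experiments.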
A Time Server: timeserv.c
A Time Server: timeserv.c (cont.)
A Time Server: timeserv.c (cont.)
Working Principle of a Time Client
● Step 1: Ask Kernel for a Socket
○ Needs a socket to connect to the network
○ Like a client needing a phone line in the phone network
● Step 2: Connect to Server
○ Uses the connect system call
● Step 3: Transfer Data
○ Reads one line from the server through the connected socket
● Step 4: Hang Up
○ Closes the file descriptor for the connected socket and exits
More Details about Step 2
● connect attempts to connect the socket specified by sockid to
the socket address pointed to by serv_addrp
○ If the attempt succeeds, connect returns 0
■ sockid then becomes a valid fd open for reading and writing
■ Data written to the fd are sent to the socket at the other end, and data written at the other end can be read from the fd
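Client steps 1 and 2 can be sketched as follows. This is not the timeclnt.c from the lecture: the function name connect_to_server is hypothetical, and getaddrinfo is used here for the hostname lookup as one common way to fill in the server address.

```c
#define _POSIX_C_SOURCE 200809L
#include <netdb.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: connect to host:port over TCP (AF_INET).
 * Returns a connected fd open for reading and writing, or -1 on error. */
int connect_to_server(const char *host, int port)
{
    struct addrinfo hints, *res, *p;
    char service[16];
    int sock = -1;

    snprintf(service, sizeof service, "%d", port);

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_INET;              /* IP address + port number */
    hints.ai_socktype = SOCK_STREAM;
    if (getaddrinfo(host, service, &hints, &res) != 0)
        return -1;

    /* A hostname may map to several addresses: try each one in turn. */
    for (p = res; p != NULL; p = p->ai_next) {
        sock = socket(p->ai_family, p->ai_socktype, p->ai_protocol); /* step 1 */
        if (sock == -1)
            continue;
        if (connect(sock, p->ai_addr, p->ai_addrlen) == 0)           /* step 2 */
            break;                          /* connected */
        close(sock);
        sock = -1;
    }

    freeaddrinfo(res);
    return sock;
}
```

After a successful return, the client can read the server's message from the fd with read (step 3) and then close it (step 4).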
A Time Client: timeclnt.c
Testing timeserv and timeclnt
● The server process runs on one machine
● A client process on another machine connects to the server over the
network
● Then the server sends data to the client by write
● The client receives that message by read
< Processes on different computers >
Testing timeserv and timeclnt (cont.)
● Execution
○ Server
■ $ ./timeserv &
○ Client
■ $ ./timeclnt localhost 1{last 4 digits of your student number}
Example of Another Server and Client: Remote ls
● Listing files on a remote computer
○ Could log in to that machine and run ls
○ Or could run a remote ls client instead
■ e.g., $ ./rls 155.230.100.100 /home/username
○ rls needs a server process running on the other machine
■ To receive the request, do the work, and return the result
○ The server runs on one computer
○ A client on another computer connects to the server
■ Sends the name of the requested directory: e.g., ‘/home/username’
○ The server sends back to the client a list of the files in the requested directory
○ The client displays the list by writing to stdout
● This two-process system provides access to directories on a different
machine!
Remote ls
● Three things are needed to implement the remote ls system
○ 1) A protocol
■ Consists of a request and a reply between a client and a server program
< A remote ls system >
○ 2) A client program
■ Sends a single line containing the name of the requested directory
■ Reads the list of files line by line until the server closes the connection
○ 3) A server program
■ Opens and reads that directory and sends the list of files back to the client
■ Closes the connection, which generates an EOF condition at the client
Remote ls (cont.)
● The client: rls
○ How this client differs from the time client
■ 1. Writing the directory name into the socket
■ 2. Entering a loop, copying data from the socket to stdout until EOF
● The loop uses a standard buffer size for efficiency
■ 3. Using write and read for data transfer (with the server)
Remote ls (cont.)
● The server: rlsd
○ Has to get a socket
■ Bind and listen, and then accept a call
○ Then reads the name of the requested directory from the socket
○ Lists the contents of that directory
■ Uses popen to read the output from the regular version of ls
● Notes
○ The server uses standard buffered streams for reading and writing
■ Use fgets to read the directory name from the client
○ After popen, it transfers data using getc/putc, as in a file copy
■ Actually copying data from one process to a process on another machine
Remote ls (cont.)
● Additional notes
○ The server must make sure that the string it receives
■ Does not overflow the input buffer
■ Does not overflow the buffer for the command
■ Does not contain special characters in the directory name
< Using popen(“ls”) to list remote directories >
○ popen: indeed too risky for network services
■ Because it passes a string to a shell
■ It’s a poor idea to write any server that passes strings to a shell!
■ Two reasons for using this example
● To show another use of popen
● To alert you to this danger
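One way to avoid the shell entirely is to read the directory inside the server process. The sketch below swaps in opendir/readdir for popen("ls"); it is not what the lecture's rlsd.c does, and the function name list_directory is hypothetical.

```c
#include <dirent.h>
#include <stdio.h>

/* Hypothetical replacement for popen("ls ..."): write the names in dir to
 * out, one per line, without involving a shell. Entries starting with '.'
 * are skipped, as plain ls does. Returns the number of names written,
 * or -1 if the directory cannot be opened. */
int list_directory(const char *dir, FILE *out)
{
    DIR *dp = opendir(dir);     /* dir is only a path here, never a command */
    if (dp == NULL)
        return -1;

    struct dirent *entry;
    int count = 0;
    while ((entry = readdir(dp)) != NULL) {
        if (entry->d_name[0] == '.')
            continue;
        fprintf(out, "%s\n", entry->d_name);
        count++;
    }

    closedir(dp);
    return count;
}
```

Unlike ls, this does not sort the names. The point of the design is that a string received from the network is never interpreted by a shell, so special characters in it cannot run commands.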
Writing remote ls: rlsd.c
Writing remote ls: rlsd.c (cont.)
Writing remote ls: rls.c
Execution of Remote ls
Software Daemons
● Many Linux server programs have names ending in the letter ‘d’
○ e.g., httpd, inetd, syslogd, atd, ntpd, sshd, …
○ ‘d’ stands for daemon
■ Daemon: a supernatural helper floating around, waiting to help you out
● Daemons provide a variety of services and perform system maintenance
○ Alerting, flushing, logging, printing, accepting network connections, …
○ Most daemon processes get started at boot time
■ By scripts in /etc/rcX.d, where X depends on the system
● These start the servers in the background, detached from any
terminal, to provide services
Terminal (tty) and Pseudoterminal (pty)
● How can we enable a user on one host to operate a terminal-oriented program on
another host connected via a network?
○ We can’t connect the standard input, output, and error of a terminal-oriented program
directly to a socket, because such a program expects to be connected to a terminal.
Terminal (tty) and Pseudoterminal (pty) (cont.)
● A pseudoterminal is a virtual device that provides an IPC channel
○ On one end of the channel is a program that expects to be connected to a
terminal device
○ On the other end is a program that drives the terminal-oriented program by
using the channel to send it input and read its output
Terminal (tty) and Pseudoterminal (pty) (cont.)
< How ssh uses a pseudoterminal >
Summary
● Some programs are written as separate processes that transfer data to each other
○ In the client/server (CS) model, a server process is responsible for processing or delivering data
● A CS system consists of a communication system and a protocol
○ Protocol: a set of rules for the structure of a conversation
○ Clients/servers can communicate through pipes or sockets
● popen can make any shell command into a server program.
○ Makes access to the server look like access to buffered files
Summary (cont.)
● A pipe: a connected pair of fds
● A socket: an unconnected communication endpoint (potential fd)
○ A client creates a communication link by connecting its socket to a server socket
● Sockets can connect processes on one machine to processes on another
○ Each socket is identified by an IP address and a port number
● Connections to pipes and sockets use fds
○ Fds provide programs with a single interface for communication with
different objects: files, devices, and other processes