Unit 3 Cna

Uploaded by Prem TV

CHAPTER 25 INTRODUCTION TO APPLICATION LAYER 823

25.2 CLIENT-SERVER PROGRAMMING


In a client-server paradigm, communication at the application layer is between two run-
ning application programs called processes: a client and a server. A client is a running
program that initiates the communication by sending a request; a server is another
application program that waits for a request from a client. The server handles the
request received from a client, prepares a result, and sends the result back to the client.
This definition of a server implies that a server must be running when a request from a
client arrives, but the client needs to be run only when it is needed. This means that if
we have two computers connected to each other somewhere, we can run a client pro-
cess on one of them and the server on the other. However, we need to be careful that the
server program is started before we start running the client program. In other words, the
lifetime of a server is infinite: it should be started and run forever, waiting for the cli-
ents. The lifetime of a client is finite: it normally sends a finite number of requests to
the corresponding server, receives the responses, and stops.

25.2.1 Application Programming Interface


How can a client process communicate with a server process? A computer program is nor-
mally written in a computer language with a predefined set of instructions that tells the
computer what to do. A computer language has a set of instructions for mathematical
operations, a set of instructions for string manipulation, a set of instructions for input/
output access, and so on. If we need a process to be able to communicate with another pro-
cess, we need a new set of instructions to tell the lowest four layers of the TCP/IP suite to
open the connection, send and receive data from the other end, and close the connection. A
set of instructions of this kind is normally referred to as an application programming
interface (API). An interface in programming is a set of instructions between two entities.
In this case, one of the entities is the process at the application layer and the other is the
operating system that encapsulates the first four layers of the TCP/IP protocol suite. In
other words, a computer manufacturer needs to build the first four layers of the suite in the
operating system and include an API. In this way, the processes running at the application
layer are able to communicate with the operating system when sending and receiving mes-
sages through the Internet. Several APIs have been designed for communication. Three
among them are common: socket interface, Transport Layer Interface (TLI), and
STREAM. In this chapter, we briefly discuss only socket interface, the most common one,
to give a general idea of network communication at the application layer.
The socket interface originated in the early 1980s at UC Berkeley as part of a UNIX environment.
The socket interface is a set of instructions that provides communication between
the application layer and the operating system, as shown in Figure 25.4. It is a set of
instructions that can be used by a process to communicate with another process.
The idea of sockets allows us to use the set of all instructions already designed in a
programming language for other sources and sinks. For example, in most computer lan-
guages, like C, C++, or Java, we have several instructions that can read and write data
to other sources and sinks such as a keyboard (a source), a monitor (a sink), or a file
(source and sink). We can use the same instructions to read from or write to sockets. In
other words, we are adding only new sources and sinks to the programming language
824 PART VI APPLICATION LAYER

Figure 25.4 Position of the socket interface
[Figure: at both the client site and the server site, the socket interface lies between the application layer and the operating system, which holds the transport, network, data-link, and physical layers.]

without changing the way we send data or receive data. Figure 25.5 shows the idea and
compares the sockets with other sources and sinks.

Figure 25.5 Sockets used the same way as other sources and sinks
[Figure: an application program reads from a keyboard (source), writes to a monitor (sink), and both reads from and writes to a file and a socket (each a sink and a source).]
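
The idea that a socket is just another source and sink can be sketched in a few lines. The following is a minimal illustration in Python (the book itself works in C and Java); the function name echo_through_socket is hypothetical, and socket.socketpair() is used only to obtain two connected sockets without involving a network.

```python
import socket


def echo_through_socket(text: str) -> str:
    """Write a line into one socket and read it back from its peer with file-style calls."""
    # socketpair() returns two sockets that are already connected to each other.
    left, right = socket.socketpair()
    with left, right:
        # makefile() wraps a socket in an ordinary file object.
        writer = left.makefile("w", encoding="utf-8")
        reader = right.makefile("r", encoding="utf-8")
        with writer, reader:
            writer.write(text + "\n")   # the same call we would use on a disk file
            writer.flush()              # push the buffered data into the socket
            return reader.readline().rstrip("\n")


if __name__ == "__main__":
    print(echo_through_socket("hello"))
```

The write, flush, and readline calls are the ones a program would use on a file; only the way the object is created differs.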

Sockets
Although a socket is supposed to behave like a terminal or a file, it is not a physical
entity like them; it is an abstraction. It is an object that is created and used by the appli-
cation program.
We can say that, as far as the application layer is concerned, communication
between a client process and a server process is communication between two sockets,
created at two ends, as shown in Figure 25.6. The client thinks that the socket is the
entity that receives the request and gives the response; the server thinks that the socket
is the one that has a request and needs the response. If we create two sockets, one at
each end, and define the source and destination addresses correctly, we can use the
available instructions to send and receive data. The rest is the responsibility of the oper-
ating system and the embedded TCP/IP protocol.

Figure 25.6 Use of sockets in process-to-process communication
[Figure: the client process and the server process, each at the application layer, exchange requests and responses through their own sockets; a logical connection links the two sockets.]

Socket Addresses
The interaction between a client and a server is two-way communication. In a two-way
communication, we need a pair of addresses: local (sender) and remote (receiver). The
local address in one direction is the remote address in the other direction and vice
versa. Since communication in the client-server paradigm is between two sockets, we
need a pair of socket addresses for communication: a local socket address and a
remote socket address. However, we need to define a socket address in terms of identi-
fiers used in the TCP/IP protocol suite.
A socket address should first define the computer on which a client or a server is
running. As we discussed in Chapter 18, a computer in the Internet is uniquely defined
by its IP address, a 32-bit integer in IPv4, the version assumed here. However, several client
or server processes may be running at the same time on a computer, which means that
we need another identifier to define the specific client or server involved in the commu-
nication. As we discussed in Chapter 24, an application program can be defined by a
port number, a 16-bit integer. This means that a socket address should be a combination
of an IP address and a port number as shown in Figure 25.7.

Figure 25.7 A socket address
[Figure: a socket address is a 32-bit IP address followed by a 16-bit port number.]

Since a socket defines the end-point of the communication, we can say that a
socket is identified by a pair of socket addresses, a local and a remote.
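
As a rough illustration of these two ideas, the Python sketch below first packs an IPv4 address and a port into the 6 bytes (32 + 16 bits) of Figure 25.7, and then connects two TCP sockets over the loopback interface to show that the local address of one end is the remote address of the other. The helper names pack_socket_address and socket_address_pair are hypothetical, and the loopback address 127.0.0.1 is an assumption made so the example needs no network.

```python
import socket
import struct


def pack_socket_address(ip: str, port: int) -> bytes:
    """Return the 32-bit IPv4 address followed by the 16-bit port, in network byte order."""
    return socket.inet_aton(ip) + struct.pack("!H", port)


def socket_address_pair():
    """Connect two TCP sockets and return each end's (local, remote) socket addresses."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    with listener:
        listener.bind(("127.0.0.1", 0))       # port 0: let the operating system choose
        listener.listen(1)
        client = socket.create_connection(listener.getsockname())
        server_side, _ = listener.accept()
        with client, server_side:
            client_pair = (client.getsockname(), client.getpeername())
            server_pair = (server_side.getsockname(), server_side.getpeername())
    return client_pair, server_pair


if __name__ == "__main__":
    print(pack_socket_address("127.0.0.1", 80).hex())
    print(socket_address_pair())
```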
Finding Socket Addresses
How can a client or a server find a pair of socket addresses for communication? The sit-
uation is different for each site.
Server Site
The server needs a local (server) and a remote (client) socket address for communication.
Local Socket Address The local (server) socket address is provided by the operating
system. The operating system knows the IP address of the computer on which the
server process is running. The port number of a server process, however, needs to be
assigned. If the server process is a standard one defined by the Internet authority, a port
number is already assigned to it. For example, the assigned port number for the Hypertext
Transfer Protocol (HTTP) is the integer 80, which should not be used by any other service.
We discussed these well-known port numbers in Chapter 24. If the server process is
not standard, the designer of the server process can choose a port number, in the range
defined by the Internet authority, and assign it to the process. When a server starts run-
ning, it knows the local socket address.
Remote Socket Address The remote socket address for a server is the socket address
of the client that makes the connection. Since the server can serve many clients, it does
not know beforehand the remote socket address for communication. The server can find
this socket address when a client tries to connect to the server. The client socket
address, which is contained in the request packet sent to the server, becomes the remote
socket address that is used for responding to the client. In other words, although the
local socket address for a server is fixed and used during its lifetime, the remote socket
address is changed in each interaction with a different client.
Client Site
The client also needs a local (client) and a remote (server) socket address for
communication.
Local Socket Address The local (client) socket address is also provided by the oper-
ating system. The operating system knows the IP address of the computer on which the
client is running. The port number is a 16-bit temporary integer that is
assigned to a client process each time the process needs to start the communication.
It must be chosen from a set of integers defined by the
Internet authority and called the ephemeral (temporary) port numbers, which we discussed
in Chapter 24. The operating system also needs to guarantee that the new
port number is not used by any other running client process, and it
needs to remember the port number to be able to redirect the response received from
the server process to the client process that sent the request.
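
The following Python sketch, offered as an illustration rather than as the book's code, shows both points with UDP on the loopback interface: the client never chooses a port, so the operating system assigns a temporary one when the first datagram is sent, and the server learns its remote socket address only from the arriving request. The function name one_request_response is hypothetical; the server binds to port 0 here only so the example cannot collide with a port already in use, whereas a real server would bind to its fixed, published port.

```python
import socket


def one_request_response():
    """Send one UDP request and one response; return what each side learned."""
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    with server, client:
        server.bind(("127.0.0.1", 0))           # assumption: any free port, for the demo
        server.settimeout(5)
        client.settimeout(5)
        client.sendto(b"request", server.getsockname())
        client_local = client.getsockname()     # port assigned by the operating system
        data, remote = server.recvfrom(1024)    # remote = client socket address
        server.sendto(b"response to " + data, remote)
        reply, _ = client.recvfrom(1024)
    return client_local, remote, reply


if __name__ == "__main__":
    print(one_request_response())
```

The port in client_local is the one the server sees in remote, which is how the response finds its way back to the right client process.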
Remote Socket Address Finding the remote (server) socket address for a client, how-
ever, needs more work. When a client process starts, it should know the socket address
of the server it wants to connect to. We will have two situations in this case.
❑ Sometimes, the user who starts the client process knows both the server port
number and IP address of the computer on which the server is running. This usu-
ally occurs in situations when we have written client and server applications and
we want to test them. For example, at the end of this chapter we write some simple
client and server programs and we test them using this approach. In this situation,
the programmer can provide these two pieces of information when running
the client program.
❑ Although each standard application has a well-known port number, most of the
time, we do not know the IP address. This happens in situations such as when we
need to contact a web page, send an e-mail to a friend, copy a file from a remote
site, and so on. In these situations, the server has a name, an identifier that
uniquely defines the server process. Examples of these identifiers are URLs,
such as www.xxx.yyy, or e-mail addresses, such as [email protected]. The client
process should now change this identifier (name) to the corresponding server
socket address. The client process normally knows the port number because it
should be a well-known port number, but the IP address can be obtained using
another client-server application called the Domain Name System (DNS). We
will discuss DNS in Chapter 26, but it is enough to know that it acts as a directory in
the Internet. Compare the situation with the telephone directory. We want to call
someone whose name we know but whose telephone number we do not; the number can be obtained
from the telephone directory. The telephone directory maps the name to the telephone
number; DNS maps the server name to the IP address of the computer running
that server.
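
The second situation can be sketched in Python with the standard resolver call socket.getaddrinfo, which consults the local hosts file and DNS. The function name resolve is hypothetical, and the name localhost is used in the usage example only because it resolves without network access; any server name could be passed instead.

```python
import socket


def resolve(host: str, port: int):
    """Map a server name and a well-known port to candidate (IP address, port) pairs."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    # Each entry ends with a socket address whose first two fields are IP and port.
    return sorted({(sockaddr[0], sockaddr[1]) for *_, sockaddr in infos})


if __name__ == "__main__":
    # The client knows the name and the well-known port; the resolver supplies the IP address.
    print(resolve("localhost", 80))
```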

25.2.2 Using Services of the Transport Layer


A pair of processes provide services to the users of the Internet, human or programs.
A pair of processes, however, need to use the services provided by the transport layer
for communication because there is no physical communication at the application
layer. As we discussed in Chapters 23 and 24, there are three common transport-layer
protocols in the TCP/IP suite: UDP, TCP, and SCTP. Most standard applications have
been designed to use the services of one of these protocols. When we write a new
application, we can decide which protocol we want to use. The choice of the transport-
layer protocol seriously affects the capability of the application processes. In this
section, we first discuss the services provided by each protocol to help understand
why a standard application uses it or which one we need to use if we decide to write
a new application.
UDP Protocol
UDP provides connectionless, unreliable, datagram service. Connectionless service
means that there is no logical connection between the two ends exchanging messages.
Each message is an independent entity encapsulated in a datagram. UDP does not see
any relation (connection) between consecutive datagrams coming from the same source
and going to the same destination.
UDP is not a reliable protocol. Although it may check that the data is not corrupted
during the transmission, it does not ask the sender to resend the corrupted or lost data-
gram. For some applications, UDP has an advantage: it is message-oriented. It gives
boundaries to the messages exchanged. An application program may be designed to use
UDP if it is sending small messages and simplicity and speed are more important for
the application than reliability. For example, some management and multimedia appli-
cations fit in this category.
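
Message orientation can be seen in a small Python sketch: two messages sent with UDP arrive as two separate datagrams, never merged into one. The function name send_two_datagrams is hypothetical, and the loopback interface is assumed so that, in practice, nothing is lost.

```python
import socket


def send_two_datagrams():
    """Send two UDP messages and return them as the receiver sees them."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    with receiver, sender:
        receiver.bind(("127.0.0.1", 0))
        receiver.settimeout(5)
        destination = receiver.getsockname()
        sender.sendto(b"first", destination)
        sender.sendto(b"second message", destination)
        # Each recvfrom() call returns exactly one datagram, with its boundaries intact.
        return [receiver.recvfrom(1024)[0] for _ in range(2)]


if __name__ == "__main__":
    print(send_two_datagrams())
```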

TCP Protocol
TCP provides connection-oriented, reliable, byte-stream service. TCP requires that two
ends first create a logical connection between themselves by exchanging some

Figure 25.9 Flow diagram for iterative UDP communication
[Figure: Server: start, create socket, bind socket, then in an infinite loop receive a request (blocking until a datagram arrives), handle the request and create the response, and send the response. Each client: start, create socket, send request, receive response (blocking until the datagram arrives), handle response, destroy socket, stop. The legend distinguishes empty, half-filled, and filled sockets.]
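
The flow in Figure 25.9 can be sketched in Python as follows. This is an illustration under stated assumptions, not the book's program: the names serve_udp, udp_client, and run_udp_demo are hypothetical, the "handling" is simply converting the request to uppercase, and the server loop is bounded by max_requests so the example terminates, whereas the figure shows an infinite loop.

```python
import socket
import threading


def serve_udp(sock: socket.socket, max_requests: int) -> None:
    """Iterative server loop: receive a request, handle it, send the response."""
    for _ in range(max_requests):                     # a real server would loop forever
        request, client_addr = sock.recvfrom(1024)    # blocks until a datagram arrives
        response = request.upper()                    # handle request, create response
        sock.sendto(response, client_addr)            # send response


def udp_client(server_addr, message: bytes) -> bytes:
    """Client side: create socket, send request, receive response, destroy socket."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(5)
        sock.sendto(message, server_addr)
        response, _ = sock.recvfrom(1024)             # blocks until the response arrives
        return response


def run_udp_demo():
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as server:
        server.bind(("127.0.0.1", 0))
        server.settimeout(5)
        worker = threading.Thread(target=serve_udp, args=(server, 3))
        worker.start()
        replies = [udp_client(server.getsockname(), m) for m in (b"a", b"bb", b"ccc")]
        worker.join()
    return replies


if __name__ == "__main__":
    print(run_udp_demo())
```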

25.2.4 Iterative Communication Using TCP


As we described before, TCP is a connection-oriented protocol. Before sending or
receiving data, a connection needs to be established between the client and the server.
After the connection is established, the two parties can send and receive chunks of
data as long as they have data to send. Although iterative communication using TCP
is not very common, we discuss it in this section because it is the simpler case.
Sockets Used in TCP
The TCP server uses two different sockets, one for connection establishment and the
other for data transfer. We call the first one the listen socket and the second the
socket. The reason for having two types of sockets is to separate the connection phase
from the data exchange phase. A server uses a listen socket to listen for a new client
trying to establish connection. After the connection is established, the server creates
a socket to exchange data with the client and finally to terminate the connection. The
client uses only one socket for both connection establishment and data exchange
(see Figure 25.10).

Figure 25.10 Sockets used in TCP communication
[Figure: the server's listen socket handles connection establishment with client 1 (step 1) and a new socket is created (step 2) for data transfer and termination with that client (step 3); steps 4 to 6 repeat the sequence for client 2. The legend distinguishes the listen socket from the socket.]

Flow Diagram
Figure 25.11 shows a simplified flow diagram for iterative communication using TCP.
There are multiple clients, but only one server. One client is served in each iteration of
the loop. The flow diagram is similar to the one for UDP, but there are differences
that we explain for each site.
Server Process
In Figure 25.11, the TCP server process, like the UDP server process, creates a socket
and binds it, but these two commands create the listen socket to be used only for the
connection establishment phase. The server process then calls the listen procedure, to
allow the operating system to start accepting the clients, completing the connection
phase, and putting them in the waiting list to be served.
The server process now starts a loop and serves the clients one by one. In each iter-
ation, the server process issues the accept procedure that removes one client from the
waiting list of the connected clients for serving. If the list is empty, the accept proce-
dure blocks until there is a client to be served. When the accept procedure returns, it
creates a new socket for data transfer. The server process now uses the client socket
address obtained during the connection establishment to fill the remote socket address
field in the newly created socket. At this time the client and server can exchange data.
Client Process
The client flow diagram is similar to the UDP version except that the client
data-transfer box needs to be defined for each specific case. We do so when we write a
specific program later.
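
As a hedged illustration of the server and client processes just described, the Python sketch below uses one listen socket for connection establishment and the separate socket returned by accept for data transfer. The names recv_all, serve_tcp, tcp_client, and run_tcp_demo are hypothetical, the "handling" is reversing the request bytes, and the loop is bounded so that the example ends.

```python
import socket
import threading


def recv_all(sock: socket.socket) -> bytes:
    """Read a byte stream until the peer signals that it has no more data."""
    chunks = []
    while True:
        chunk = sock.recv(1024)
        if not chunk:
            return b"".join(chunks)
        chunks.append(chunk)


def serve_tcp(listen_sock: socket.socket, max_clients: int) -> None:
    """Iterative TCP server: accept one connected client per iteration."""
    for _ in range(max_clients):                 # a real server would loop forever
        conn, client_addr = listen_sock.accept() # blocks; returns a new socket
        with conn:                               # data transfer, then termination
            request = recv_all(conn)
            conn.sendall(request[::-1])


def tcp_client(server_addr, message: bytes) -> bytes:
    """Client: one socket for both connection establishment and data exchange."""
    with socket.create_connection(server_addr, timeout=5) as sock:
        sock.sendall(message)
        sock.shutdown(socket.SHUT_WR)            # tell the server the request is complete
        return recv_all(sock)


def run_tcp_demo():
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listen_sock:
        listen_sock.bind(("127.0.0.1", 0))
        listen_sock.listen()                     # OS starts queueing connected clients
        worker = threading.Thread(target=serve_tcp, args=(listen_sock, 2))
        worker.start()
        replies = [tcp_client(listen_sock.getsockname(), m) for m in (b"abc", b"hello")]
        worker.join()
    return replies


if __name__ == "__main__":
    print(run_tcp_demo())
```

Because TCP delivers a byte stream without message boundaries, both sides read until the other end closes its sending direction instead of assuming one recv call returns one message.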

Figure 25.11 Flow diagram for iterative TCP communication
[Figure: Server: start, create socket, bind socket, listen, then in an infinite loop accept (blocking until a connection is established), create a socket for data transfer, handle the request and create the response, and destroy that socket at connection termination. Each client: start, create socket, connect (blocking until the connection is established), data transfer, handle response, destroy socket, stop. The legend distinguishes empty, half-filled, and filled sockets.]

25.2.5 Concurrent Communication


A concurrent server can process several client requests at the same time. This can be
done using the available provisions in the underlying programming language. In C, a
server can create several child processes, in which a child can handle a client. In Java,
threading allows several clients to be handled at once, each by its own thread. We do not discuss
concurrent server communication in this chapter, but we briefly discuss it on the book
website in the Extra Material section.
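
A thread-per-client server can be sketched in Python with the standard socketserver module; this is only an illustration of the idea, not the material from the book website. The names UpperCaseHandler and run_concurrent_demo are hypothetical. The demo opens all connections first and then talks to the clients in reverse order, which an iterative server would not be able to serve without the first client finishing.

```python
import socket
import socketserver
import threading


class UpperCaseHandler(socketserver.StreamRequestHandler):
    """Runs in its own thread for each connected client."""

    def handle(self):
        line = self.rfile.readline()       # wait for this client's request
        self.wfile.write(line.upper())     # send the response


def run_concurrent_demo(n_clients: int = 3):
    with socketserver.ThreadingTCPServer(("127.0.0.1", 0), UpperCaseHandler) as server:
        threading.Thread(target=server.serve_forever, daemon=True).start()
        address = server.server_address
        # All clients connect before any of them sends a request.
        conns = [socket.create_connection(address, timeout=5) for _ in range(n_clients)]
        replies = []
        for i in reversed(range(n_clients)):   # serve the last client first
            with conns[i] as conn, conn.makefile("rb") as reader:
                conn.sendall(f"client {i}\n".encode("ascii"))
                replies.append(reader.readline())
        server.shutdown()
    return replies


if __name__ == "__main__":
    print(run_concurrent_demo())
```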
CHAPTER 26

Standard Client-Server
Protocols

After introducing the application layer in the previous chapter, we discuss some
standard application-layer protocols in this chapter. During the lifetime of the
Internet, several client-server application programs have been developed. We do not
have to redefine them, but we need to understand what they do. For each application,
we also need to know the options available to us. The study of these applications and
the ways they provide different services can help us to create customized applications
in the future.
We have selected six standard application programs in this chapter. Some other
applications have been or will be discussed in other chapters. Dynamic Host Configuration
Protocol (DHCP) was discussed in Chapter 18 and Simple Network Management
Protocol (SNMP) will be discussed in Chapter 27.
This chapter is divided into six sections:
❑ The first section introduces the World Wide Web. It then discusses the HyperText
Transfer Protocol, the most common client-server application program used in
relation to the World Wide Web.
❑ The second section discusses the File Transfer Protocol, which is the standard
protocol provided by TCP/IP for copying a file from one host to another.
❑ The third section discusses electronic mail, which involves two protocols: SMTP
and POP. As we will see, the nature of this application is different from that of the
two previous applications. We need two different protocols to handle electronic
mail.
❑ The fourth section discusses TELNET, a general client-server program that allows
users to log in to a remote machine and use any application available on the remote
host.
❑ The fifth section discusses Secure Shell, which can be used as a secured TELNET,
but it can also provide a secure tunnel for other applications.
❑ The sixth section talks about the Domain Name System, which acts as the direc-
tory system in the Internet. It maps the name of an entity to its IP address.


26.1 WORLD WIDE WEB AND HTTP


In this section, we first introduce the World Wide Web (abbreviated WWW or Web).
We then discuss the HyperText Transfer Protocol (HTTP), the most common client-
server application program used in relation to the Web.

26.1.1 World Wide Web


The idea of the Web was first proposed by Tim Berners-Lee in 1989 at CERN†, the
European Organization for Nuclear Research, to allow several researchers at different
locations throughout Europe to access each other’s research. The commercial Web
started in the early 1990s.
The Web today is a repository of information in which the documents, called web
pages, are distributed all over the world and related documents are linked together. The
popularity and growth of the Web can be related to two terms in the above statement:
distributed and linked. Distribution allows the growth of the Web. Each web server in
the world can add a new web page to the repository and announce it to all Internet users
without overloading a few servers. Linking allows one web page to refer to another web
page stored in another server somewhere else in the world. The linking of web pages
was achieved using a concept called hypertext, which was introduced many years
before the advent of the Internet. The idea was to use a machine that automatically
retrieved another document stored in the system when a link to it appeared in the docu-
ment. The Web implemented this idea electronically to allow the linked document to be
retrieved when the link was clicked by the user. Today, the term hypertext, coined to
mean linked text documents, has been changed to hypermedia, to show that a web page
can be a text document, an image, an audio file, or a video file.
The purpose of the Web has gone beyond the simple retrieving of linked docu-
ments. Today, the Web is used to provide electronic shopping and gaming. One can use
the Web to listen to radio programs or view television programs whenever one desires
without being forced to listen to or view these programs when they are broadcast.
Architecture
The WWW today is a distributed client-server service, in which a client using a
browser can access a service using a server. The service provided is distributed
over many locations called sites. Each site holds one or more web pages, and each
web page can contain links to other web pages in the same or other
sites. In other words, a web page can be simple or composite. A simple web page has
no links to other web pages; a composite web page has one or more links to other web
pages. Each web page is a file with a name and address.

Example 26.1
Assume we need to retrieve a scientific document that contains one reference to another text file
and one reference to a large image. Figure 26.1 shows the situation.
The main document and the image are stored in two separate files (file A and file B) in the
same site; the referenced text file (file C) is stored in another site. Since we are dealing with three

† In French: Conseil Européen pour la Recherche Nucléaire


CHAPTER 26 STANDARD CLIENT-SERVER PROTOCOLS 873

Figure 26.1 Example 26.1
[Figure: the client exchanges three request/response pairs (steps 1 to 6) with two sites. Site I holds file A (original document) and file B (image); site II holds file C (referenced file).]

different files, we need three transactions if we want to see the whole document. The first transac-
tion (request/response) retrieves a copy of the main document (file A), which has references (point-
ers) to the second and third files. When a copy of the main document is retrieved and browsed, the
user can click on the reference to the image to invoke the second transaction and retrieve a copy of
the image (file B). If the user needs to see the contents of the referenced text file, she can click on its
reference (pointer) invoking the third transaction and retrieving a copy of file C. Note that although
files A and B are both stored in site I, they are independent files with different names and addresses.
Two transactions are needed to retrieve them. A very important point we need to remember is that
file A, file B, and file C in Example 26.1 are independent web pages, each with independent names
and addresses. Although references to file B or C are included in file A, it does not mean that each
of these files cannot be retrieved independently. A second user can retrieve file B with one transac-
tion. A third user can retrieve file C with one transaction.
Web Client (Browser)
A variety of vendors offer commercial browsers that interpret and display a web
page, and all of them use nearly the same architecture. Each browser usually consists
of three parts: a controller, client protocols, and interpreters (see Figure 26.2).

Figure 26.2 Browser
[Figure: a browser consists of a controller, client protocols (HTTP, FTP, SSH, SMTP), and interpreters (HTML, JavaScript, Java).]

The controller receives input from the keyboard or the mouse and uses the client
programs to access the document. After the document has been accessed, the controller
uses one of the interpreters to display the document on the screen. The client protocol
can be one of the protocols described later, such as HTTP or FTP. The interpreter can
be HTML, Java, or JavaScript, depending on the type of document. Some commercial
browsers include Internet Explorer, Netscape Navigator, and Firefox.
Web Server
The web page is stored at the server. Each time a request arrives, the corresponding
document is sent to the client. To improve efficiency, servers normally store requested
files in a cache in memory; memory is faster to access than a disk. A server can also
become more efficient through multithreading or multiprocessing. In this case, a server
can answer more than one request at a time. Some popular web servers include Apache
and Microsoft Internet Information Server.
Uniform Resource Locator (URL)
A web page, as a file, needs to have a unique identifier to distinguish it from other
web pages. To define a web page, we need three identifiers: host, port, and path.
However, before defining the web page, we need to tell the browser what client-
server application we want to use, which is called the protocol. This means we need
four identifiers to define the web page. The first is the type of vehicle to be used to
fetch the web page; the last three make up the combination that defines the destina-
tion object (web page).
❑ Protocol. The first identifier is the abbreviation for the client-server program that
we need in order to access the web page. Although most of the time the protocol is
HTTP (HyperText Transfer Protocol), which we will discuss shortly, we can also
use other protocols such as FTP (File Transfer Protocol).
❑ Host. The host identifier can be the IP address of the server or the unique name
given to the server. IP addresses can be defined in dotted decimal notation, as
described in Chapter 18 (such as 64.23.56.17); the name is normally the domain
name that uniquely defines the host, such as forouzan.com, which we discuss in
Domain Name System (DNS) later in this chapter.
❑ Port. The port, a 16-bit integer, is normally predefined for the client-server appli-
cation. For example, if the HTTP protocol is used for accessing the web page, the
well-known port number is 80. However, if a different port is used, the number can
be explicitly given.
❑ Path. The path identifies the location and the name of the file in the underlying
operating system. The format of this identifier normally depends on the operat-
ing system. In UNIX, a path is a set of directory names followed by the file
name, all separated by slashes. For example, /top/next/last/myfile is a path that
uniquely defines a file named myfile, stored in the directory last, which itself is
part of the directory next, which itself is under the directory top. In other words,
the path lists the directories from the top to the bottom, followed by the file
name.

To combine these four pieces together, the uniform resource locator (URL) has
been designed; it uses three different separators between the four pieces as shown
below:
protocol://host/path Used most of the time
protocol://host:port/path Used when port number is needed

Example 26.2
The URL http://www.mhhe.com/compsci/forouzan/ defines the web page related to one of the
authors of this book. The string www.mhhe.com is the name of the computer in the McGraw-Hill
company (the three letters www are part of the host name and are added to the commercial host).
The path is compsci/forouzan/, which defines Forouzan’s web page under the directory compsci
(computer science).
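
The four identifiers can be pulled out of a URL with Python's standard urllib.parse module, as in the sketch below. The function name url_identifiers and the DEFAULT_PORTS table are hypothetical helpers; the table simply fills in the well-known port when the URL does not give one. The URL in the usage example is built from the host and path named in Example 26.2, and the second URL with port 8080 is made up for illustration. Note that the parser reports the path with its leading slash.

```python
from urllib.parse import urlsplit

# Well-known ports used when the URL omits the port (assumption: only these two schemes).
DEFAULT_PORTS = {"http": 80, "ftp": 21}


def url_identifiers(url: str):
    """Split a URL into its protocol, host, port, and path."""
    parts = urlsplit(url)
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(parts.scheme)
    return parts.scheme, parts.hostname, port, parts.path


if __name__ == "__main__":
    print(url_identifiers("http://www.mhhe.com/compsci/forouzan/"))
    print(url_identifiers("http://www.mhhe.com:8080/compsci/forouzan/"))
```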
Web Documents
The documents in the WWW can be grouped into three broad categories: static, dynamic,
and active.
Static Documents
Static documents are fixed-content documents that are created and stored in a server.
The client can get a copy of the document only. In other words, the contents of the file
are determined when the file is created, not when it is used. Of course, the contents in
the server can be changed, but the user cannot change them. When a client accesses the
document, a copy of the document is sent. The user can then use a browser to see the
document. Static documents are prepared using one of several languages: HyperText
Markup Language (HTML), Extensible Markup Language (XML), Extensible Style
Language (XSL), and Extensible Hypertext Markup Language (XHTML). We discuss
these languages in Appendix C.
Dynamic Documents
A dynamic document is created by a web server whenever a browser requests the docu-
ment. When a request arrives, the web server runs an application program or a script that
creates the dynamic document. The server returns the result of the program or script as a
response to the browser that requested the document. Because a fresh document is created
for each request, the contents of a dynamic document may vary from one request to
another. A very simple example of a dynamic document is the retrieval of the time and
date from a server. Time and date are kinds of information that are dynamic in that they
change from moment to moment. The client can ask the server to run a program such as
the date program in UNIX and send the result of the program to the client. Although the
Common Gateway Interface (CGI) was used to retrieve a dynamic document in the past,
today’s options include scripting technologies such as Java Server Pages (JSP),
which uses the Java language for scripting, Active Server Pages (ASP), a Microsoft
product that uses Visual Basic language for scripting, and ColdFusion, which embeds
Structured Query Language (SQL) database queries in the HTML document.
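
The time-and-date example can be sketched with Python's standard http.server module. This is a toy illustration, not one of the technologies named above: the names TimeHandler and fetch_dynamic_document and the path /time are hypothetical, and the server runs on the loopback interface on a port chosen by the operating system.

```python
import threading
from datetime import datetime
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer


class TimeHandler(BaseHTTPRequestHandler):
    """Creates a fresh document (the current date and time) for every request."""

    def do_GET(self):
        body = datetime.now().isoformat().encode("ascii")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch_dynamic_document(times: int = 2):
    """Request the same document several times; the content is generated each time."""
    server = HTTPServer(("127.0.0.1", 0), TimeHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    documents = []
    try:
        for _ in range(times):
            conn = HTTPConnection("127.0.0.1", server.server_port, timeout=5)
            conn.request("GET", "/time")
            documents.append(conn.getresponse().read().decode("ascii"))
            conn.close()
    finally:
        server.shutdown()
        server.server_close()
    return documents


if __name__ == "__main__":
    print(fetch_dynamic_document())
```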
Active Documents
For many applications, we need a program or a script to be run at the client site. These are
called active documents. For example, suppose we want to run a program that creates
animated graphics on the screen or a program that interacts with the user. The program
definitely needs to be run at the client site where the animation or interaction takes place.
When a browser requests an active document, the server sends a copy of the document or
a script. The document is then run at the client (browser) site. One way to create an active
document is to use a Java applet, a program written in Java and stored on the server. It is compiled
and ready to be run, so the document is in bytecode (binary) format. Another way is to use
JavaScript; in this case the script itself is downloaded and run at the client site.

26.1.2 HyperText Transfer Protocol (HTTP)


The HyperText Transfer Protocol (HTTP) defines how client-server
programs can be written to retrieve web pages from the Web. An HTTP client sends a
request; an HTTP server returns a response. The server uses the port number 80; the cli-
ent uses a temporary port number. HTTP uses the services of TCP, which, as discussed
before, is a connection-oriented and reliable protocol. This means that, before any
transaction between the client and the server can take place, a connection needs to be
established between them. After the transaction, the connection should be terminated.
The client and server, however, do not need to worry about errors in the messages
exchanged or the loss of any message, because TCP is reliable and takes care of this
matter, as we saw in Chapter 24.
Nonpersistent versus Persistent Connections
As we discussed in the previous section, the hypertext concept embedded in web page
documents may require several requests and responses. If the web pages, objects to be
retrieved, are located on different servers, we do not have any other choice than to cre-
ate a new TCP connection for retrieving each object. However, if some of the objects
are located on the same server, we have two choices: to retrieve each object using a new
TCP connection or to make a TCP connection and retrieve them all. The first method is
referred to as a nonpersistent connection, the second as a persistent connection. HTTP,
prior to version 1.1, specified nonpersistent connections; persistent connections
are the default in version 1.1, although the user can change this default.
Nonpersistent Connections
In a nonpersistent connection, one TCP connection is made for each request/response.
The following lists the steps in this strategy:
1. The client opens a TCP connection and sends a request.
2. The server sends the response and closes the connection.
3. The client reads the data until it encounters an end-of-file marker; it then closes the
connection.
In this strategy, if a file contains links to N different pictures in different files (all
located on the same server), the connection must be opened and closed N + 1 times.
The nonpersistent strategy imposes high overhead on the server because the server
must allocate a new set of buffers for each of the N + 1 connections.
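The three steps of the nonpersistent strategy can be tried out on a single machine. The following is a minimal sketch, not a real HTTP implementation: it assumes a throwaway server on the loopback address, a placeholder request string, and hypothetical function names (serve_once, fetch_nonpersistent, fetch_each_separately). Each object is fetched over its own TCP connection, and the client detects the end of the data when the server closes the connection.

```python
import socket
import threading


def serve_once(listener, payload):
    """Step 2: accept one connection, read the request, send the response, close."""
    conn, _ = listener.accept()
    with conn:
        conn.recv(1024)        # read the (short) request
        conn.sendall(payload)  # send the response
    # Leaving the with-block closes the connection; the client sees end-of-file.


def fetch_nonpersistent(port, request):
    """Steps 1 and 3: open a connection, send a request, read until end-of-file."""
    with socket.create_connection(("127.0.0.1", port)) as sock:
        sock.sendall(request)
        chunks = []
        while True:
            data = sock.recv(1024)
            if not data:       # an empty read means the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)


def fetch_each_separately(objects):
    """Fetch every object over its own connection from a local test server."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0 lets the operating system pick a free port
    listener.listen()
    port = listener.getsockname()[1]
    results = []
    for payload in objects:
        worker = threading.Thread(target=serve_once, args=(listener, payload))
        worker.start()
        results.append(fetch_nonpersistent(port, b"GET /object HTTP/1.0\r\n\r\n"))
        worker.join()
    listener.close()
    return results


print(fetch_each_separately([b"text file", b"image"]))
```

Because a new connection is accepted for every object, the server-side work in serve_once is repeated N + 1 times for a file with N linked pictures.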

Example 26.3
Figure 26.3 shows an example of a nonpersistent connection. The client needs to access a file that
contains one link to an image. The text file and image are located on the same server. Here we
need two connections. For each connection, TCP requires at least three handshake messages to
CHAPTER 26 STANDARD CLIENT-SERVER PROTOCOLS 877

establish the connection, but the request can be sent with the third one. After the connection is
established, the object can be transferred. After receiving an object, another three handshake
messages are needed to terminate the connection, as we saw in Chapter 24. This means that the

Figure 26.3 Example 26.3
(Time-line diagram between client and server. Two connections are shown in sequence, one for the file and one for the image. Each connection consists of a three-way handshake, with the request carried on the third handshake message, followed by the response and a three-message termination.)

client and server are involved in two connection establishments and two connection terminations.
If the transaction involves retrieving 10 or 20 objects, the round trip times spent for these hand-
shakes add up to a big overhead. When we describe the client-server programming at the end of
the chapter, we will show that for each connection the client and server need to allocate extra
resources such as buffers and variables. This is another burden on both sites, but especially on the
server site.
Persistent Connections
HTTP version 1.1 specifies a persistent connection by default. In a persistent connec-
tion, the server leaves the connection open for more requests after sending a response.

The server can close the connection at the request of a client or if a time-out has been
reached. The sender usually sends the length of the data with each response. However,
there are some occasions when the sender does not know the length of the data. This is
the case when a document is created dynamically or actively. In these cases, the server
informs the client that the length is not known and closes the connection after sending
the data so the client knows that the end of the data has been reached. Time and
resources are saved using persistent connections. Only one set of buffers and variables
needs to be set for the connection at each site. The round trip time for connection estab-
lishment and connection termination is saved.
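The saving can be made concrete with a little arithmetic. The sketch below assumes, as in Example 26.3, three handshake messages to establish a connection and three to terminate it, and it assumes all objects are on the same server; the function name handshake_messages is illustrative.

```python
HANDSHAKES_TO_OPEN = 3   # messages needed to establish a TCP connection
HANDSHAKES_TO_CLOSE = 3  # messages needed to terminate it


def handshake_messages(objects, persistent):
    """Count handshake messages needed to retrieve `objects` items from one server."""
    connections = 1 if persistent else objects
    return connections * (HANDSHAKES_TO_OPEN + HANDSHAKES_TO_CLOSE)


# A file with one linked image is two objects.
print(handshake_messages(2, persistent=False))   # 12
print(handshake_messages(2, persistent=True))    # 6
# A file with 20 linked images is 21 objects.
print(handshake_messages(21, persistent=False))  # 126
print(handshake_messages(21, persistent=True))   # 6
```

The persistent count stays constant while the nonpersistent count grows linearly with the number of objects.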

Example 26.4
Figure 26.4 shows the same scenario as in Example 26.3, but using a persistent connection.
Only one connection establishment and connection termination is used, but the request for the
image is sent separately.

Figure 26.4 Example 26.4
(Time-line diagram between client and server. A single connection is shown: a three-way handshake with the request for the file carried on the third handshake message, the response with the file, a separate request for the image, the response with the image, and one three-message termination.)

Message Formats
The HTTP protocol defines the format of the request and response messages, as shown
in Figure 26.5. We have put the two formats next to each other for comparison. Each
message is made of four sections. The first section in the request message is called the
request line; the first section in the response message is called the status line. The other
three sections have the same names in the request and response messages. However, the

Figure 26.5 Formats of the request and response messages

Legend   sp: Space   cr: Carriage Return   lf: Line Feed

Request message
   Request line    Method sp URL sp Version cr lf
   Header lines    Header name : sp Value cr lf   (variable number of lines)
   Blank line      cr lf
   Body            (Present only in some messages)

Response message
   Status line     Version sp Status code sp Phrase cr lf
   Header lines    Header name : sp Value cr lf   (variable number of lines)
   Blank line      cr lf
   Body            (Present only in some messages)

similarities between these sections are only in the names; they may have different con-
tents. We discuss each message type separately.
Request Message
As we said before, the first line in a request message is called a request line. There are
three fields in this line separated by one space and terminated by two characters (car-
riage return and line feed) as shown in Figure 26.5. The fields are called method, URL,
and version.
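The structure of the request line can be checked with a few lines of code. This is a minimal sketch that assumes exactly one space between fields, as described above; the function name parse_request_line is illustrative.

```python
def parse_request_line(line):
    """Split a request line into its method, URL, and version fields."""
    if not line.endswith(b"\r\n"):
        raise ValueError("request line must end with carriage return and line feed")
    fields = line[:-2].decode("ascii").split(" ")
    if len(fields) != 3:
        raise ValueError("request line must have exactly three fields")
    method, url, version = fields
    return method, url, version


print(parse_request_line(b"GET /usr/bin/image1 HTTP/1.1\r\n"))
```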
The method field defines the request type. In version 1.1 of HTTP, several
methods are defined, as shown in Table 26.1. Most of the time, the client uses the
GET method to send a request. In this case, the body of the message is empty. The
HEAD method is used when the client needs only some information about the web
page from the server, such as the last time it was modified. It can also be used to test
the validity of a URL. The response message in this case has only the header section;
the body section is empty. The PUT method is the inverse of the GET method; it
allows the client to post a new web page on the server (if permitted). The POST
method is similar to the PUT method, but it is used to send some information to the
server to be added to the web page or to modify the web page. The TRACE method is
used for debugging; the client asks the server to echo back the request to check
whether the server is getting the requests. The DELETE method allows the client to
delete a web page on the server if the client has permission to do so. The CONNECT
method was originally defined as a reserved method; it may be used by proxy servers, as
discussed later. Finally, the OPTIONS method allows the client to ask about the
properties of a web page.
The second field, URL, was discussed earlier in the chapter. It defines the address
and name of the corresponding web page. The third field, version, gives the version of
the protocol; the version discussed here is 1.1.

Table 26.1 Methods


Method     Action
GET        Requests a document from the server
HEAD       Requests information about a document but not the document itself
PUT        Sends a document from the client to the server
POST       Sends some information from the client to the server
TRACE      Echoes the incoming request
DELETE     Removes the web page
CONNECT    Reserved
OPTIONS    Inquires about available options

After the request line, we can have zero or more request header lines. Each
header line sends additional information from the client to the server. For example,
the client can request that the document be sent in a special format. Each header line
has a header name, a colon, a space, and a header value (see Figure 26.5). Table 26.2
shows some header names commonly used in a request. The value field defines the
values associated with each header name. The list of values can be found in the corre-
sponding RFCs.
The body can be present in a request message. Usually, it contains the comment
to be sent or the file to be published on the website when the method is PUT or
POST.
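Putting the four sections together, a request message can be assembled as follows. This is a sketch under stated assumptions: the function name build_request is illustrative, headers are passed as a list of (name, value) pairs so that a header name can repeat, and the version is fixed at HTTP/1.1.

```python
def build_request(method, url, headers, body=b""):
    """Assemble request line, header lines, blank line, and optional body."""
    lines = [method + " " + url + " HTTP/1.1"]
    for name, value in headers:
        lines.append(name + ": " + value)   # header name, colon, space, value
    # Every line ends with CR LF; an extra CR LF forms the blank line.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


message = build_request(
    "GET",
    "/usr/bin/image1",
    [("Accept", "image/gif"), ("Accept", "image/jpeg")],
)
print(message.decode("ascii"))
```

With the GET method the body is left empty, so the message ends with the blank line.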

Table 26.2 Request header names


Header               Description
User-agent           Identifies the client program
Accept               Shows the media format the client can accept
Accept-charset       Shows the character set the client can handle
Accept-encoding      Shows the encoding scheme the client can handle
Accept-language      Shows the language the client can accept
Authorization        Carries the credentials that show what permissions the client has
Host                 Shows the host and port number of the server to which the request is addressed
Date                 Shows the current date
Upgrade              Specifies the preferred communication protocol
Cookie               Returns the cookie to the server (explained later)
If-Modified-Since    Requests the document only if it has been modified since the specified date

Response Message
The format of the response message is also shown in Figure 26.5. A response mes-
sage consists of a status line, header lines, a blank line, and sometimes a body. The
first line in a response message is called the status line. There are three fields in this
line separated by spaces and terminated by a carriage return and line feed. The first
field defines the version of HTTP protocol, currently 1.1. The status code field
defines the status of the request. It consists of three digits. Whereas the codes in the
100 range are only informational, the codes in the 200 range indicate a successful
request. The codes in the 300 range redirect the client to another URL, and the codes

in the 400 range indicate an error at the client site. Finally, the codes in the 500 range
indicate an error at the server site. The status phrase explains the status code in text
form.
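The five ranges of status codes map naturally onto the first digit of the code. A minimal sketch follows; the function name status_class and the label strings are illustrative.

```python
STATUS_CLASSES = {
    1: "informational",
    2: "successful request",
    3: "redirection to another URL",
    4: "error at the client site",
    5: "error at the server site",
}


def status_class(code):
    """Return the meaning of the range to which a three-digit status code belongs."""
    if not 100 <= code <= 599:
        raise ValueError("status code must be in the range 100 to 599")
    return STATUS_CLASSES[code // 100]


for code in (200, 304, 404, 500):
    print(code, status_class(code))
```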
After the status line, we can have zero or more response header lines. Each header
line sends additional information from the server to the client. For example, the sender
can send extra information about the document. Each header line has a header name, a
colon, a space, and a header value. We will show some header lines in the examples at
the end of this section. Table 26.3 shows some header names commonly used in a
response message.

Table 26.3 Response header names


Header              Description
Date                Shows the current date
Upgrade             Specifies the preferred communication protocol
Server              Gives information about the server
Set-Cookie          Asks the client to save a cookie
Content-Encoding    Specifies the encoding scheme
Content-Language    Specifies the language
Content-Length      Shows the length of the document
Content-Type        Specifies the media type
Location            Asks the client to send the request to another site
Accept-Ranges       Shows that the server will accept the requested byte-ranges
Last-modified       Gives the date and time of the last change

The body contains the document to be sent from the server to the client. The body
is present unless the response is an error message.
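A response message can be taken apart along the same four sections. The sketch below assumes the whole message is already in memory; the function name parse_response and the sample body are illustrative, while the server name is borrowed from the examples in this section.

```python
def parse_response(raw):
    """Split a response into version, status code, phrase, header list, and body."""
    head, _, body = raw.partition(b"\r\n\r\n")   # the blank line ends the headers
    status_line, *header_lines = head.decode("ascii").split("\r\n")
    version, code, phrase = status_line.split(" ", 2)
    headers = []
    for line in header_lines:
        name, _, value = line.partition(": ")
        headers.append((name, value))
    return version, int(code), phrase, headers, body


raw = (
    b"HTTP/1.1 200 OK\r\n"
    b"Server: Challenger\r\n"
    b"Content-length: 5\r\n"
    b"\r\n"
    b"hello"
)
print(parse_response(raw))
```

Splitting the status line at most twice keeps a multiword phrase such as "Not Modified" in one piece.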

Example 26.5
This example retrieves a document (see Figure 26.6). We use the GET method to retrieve an
image with the path /usr/bin/image1. The request line shows the method (GET), the URL, and
the HTTP version (1.1). The header has two lines that show that the client can accept images in
the GIF or JPEG format. The request does not have a body. The response message contains the
status line and four lines of header. The header lines define the date, server, content encoding
(MIME version, which will be described in electronic mail), and length of the document. The
body of the document follows the header.

Example 26.6
In this example, the client wants to send a web page to be posted on the server. We use the PUT
method. The request line shows the method (PUT), URL, and HTTP version (1.1). There are four
lines of headers. The request body contains the web page to be posted. The response message
contains the status line and four lines of headers. The created document, which is a CGI docu-
ment, is included as the body (see Figure 26.7).

Conditional Request
A client can add a condition in its request. In this case, the server will send the
requested web page if the condition is met or inform the client otherwise. One of
the most common conditions imposed by the client is the time and date the web

Figure 26.6 Example 26.5

Request (message 1, client to server)
   GET /usr/bin/image1 HTTP/1.1
   Accept: image/gif
   Accept: image/jpeg

Response (message 2, server to client)
   HTTP/1.1 200 OK
   Date: Mon, 10-Jan-2011 13:15:14 GMT
   Server: Challenger
   Content-encoding: MIME-version 1.0
   Content-length: 2048

   (Body of the document)

page is modified. The client can send the header line If-Modified-Since with the
request to tell the server that it needs the page only if it is modified after a certain
point in time.

Figure 26.7 Example 26.6

Request (message 1, client to server)
   PUT /cgi-bin/doc.pl HTTP/1.1
   Accept: */*
   Accept: image/gif
   Accept: image/jpeg
   Content-length: 50

   (Input information)

Response (message 2, server to client)
   HTTP/1.1 200 OK
   Date: Mon, 10-Jan-2011 13:15:14 GMT
   Server: Challenger
   Content-encoding: MIME-version 1.0
   Content-length: 2000

   (Body of the document)

Example 26.7
The following shows how a client imposes the modification date and time condition on
a request.
   GET http://www.commonServer.com/information/file1 HTTP/1.1    Request line
   If-Modified-Since: Thu, Sept 04 00:00:00 GMT                  Header line
                                                                 Blank line

The status line in the response shows the file was not modified after the defined point in
time. The body of the response message is also empty.
   HTTP/1.1 304 Not Modified                                     Status line
   Date: Sat, Sept 06 08 16:22:46 GMT                            First header line
   Server: commonServer.com                                      Second header line
                                                                 Blank line
   (Empty Body)                                                  Empty body
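The decision the server makes for a conditional request can be sketched as a comparison of two points in time. The sketch assumes the header value is a complete date string that Python's email.utils.parsedate_to_datetime can parse; the dates used and the function name conditional_status are chosen for illustration only.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def conditional_status(last_modified, if_modified_since):
    """Return 200 if the page changed after the given date, otherwise 304."""
    threshold = parsedate_to_datetime(if_modified_since)
    return 200 if last_modified > threshold else 304


header_value = "Thu, 04 Sep 2008 00:00:00 GMT"   # assumed full-format date
unchanged = datetime(2008, 9, 1, 12, 0, 0, tzinfo=timezone.utc)
changed = datetime(2008, 9, 5, 12, 0, 0, tzinfo=timezone.utc)

print(conditional_status(unchanged, header_value))  # 304: send no body
print(conditional_status(changed, header_value))    # 200: send the page
```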

Cookies
The World Wide Web was originally designed as a stateless entity. A client sends a request;
a server responds. Their relationship is over. The original purpose of the Web, retrieving
publicly available documents, exactly fits this design. Today the Web has other functions
that need to remember some information about the clients; some are listed below:
❑ Websites are being used as electronic stores that allow users to browse through the
store, select wanted items, put them in an electronic cart, and pay at the end with a
credit card.
❑ Some websites need to allow access to registered clients only.
❑ Some websites are used as portals: the user selects the web pages he wants to see.
❑ Some websites are just advertising agencies.
For these purposes, the cookie mechanism was devised.
Creating and Storing Cookies
The creation and storing of cookies depend on the implementation; however, the princi-
ple is the same.
1. When a server receives a request from a client, it stores information about the client
in a file or a string. The information may include the domain name of the client, the
contents of the cookie (information the server has gathered about the client such as
name, registration number, and so on), a timestamp, and other information depend-
ing on the implementation.
2. The server includes the cookie in the response that it sends to the client.
3. When the client receives the response, the browser stores the cookie in the cookie
directory, which is sorted by the server domain name.
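The three steps can be sketched with Python's standard http.cookies module. Several things here are assumptions made for illustration: the cookie name id, the domain besttoys.example, the value 12343 (borrowed from the example later in this section), and the use of a dictionary keyed by server domain name in place of the browser's cookie directory.

```python
from http.cookies import SimpleCookie


def make_set_cookie_header(name, value):
    """Steps 1 and 2: the server builds the header line that carries the cookie."""
    cookie = SimpleCookie()
    cookie[name] = value
    return cookie.output()          # a line of the form "Set-Cookie: name=value"


def store_cookie(cookie_directory, server_domain, header_line):
    """Step 3: the browser files the cookie under the server's domain name."""
    _, _, payload = header_line.partition(": ")
    parsed = SimpleCookie(payload)
    entry = cookie_directory.setdefault(server_domain, {})
    for name, morsel in parsed.items():
        entry[name] = morsel.value


cookie_directory = {}
header_line = make_set_cookie_header("id", "12343")
store_cookie(cookie_directory, "besttoys.example", header_line)
print(header_line)
print(cookie_directory)
```

On a later request to the same domain, the browser would look up the entry for that domain and send the stored value back in a Cookie header line.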
Using Cookies
When a client sends a request to a server, the browser looks in the cookie directory to
see if it can find a cookie sent by that server. If found, the cookie is included in the

request. When the server receives the request, it knows that this is an old client, not a
new one. Note that the contents of the cookie are never read by the browser or disclosed
to the user. It is a cookie made by the server and eaten by the server. Now let us see how
a cookie is used for the four previously mentioned purposes:
❑ An electronic store (e-commerce) can use a cookie for its client shoppers. When a
client selects an item and inserts it in a cart, a cookie that contains information
about the item, such as its number and unit price, is sent to the browser. If the client
selects a second item, the cookie is updated with the new selection information,
and so on. When the client finishes shopping and wants to check out, the last
cookie is retrieved and the total charge is calculated.
❑ The site that restricts access to registered clients only sends a cookie to the client
when the client registers for the first time. For any repeated access, only those cli-
ents that send the appropriate cookie are allowed.
❑ A web portal uses the cookie in a similar way. When a user selects her favorite
pages, a cookie is made and sent. If the site is accessed again, the cookie is sent to
the server to show what the client is looking for.
❑ A cookie is also used by advertising agencies. An advertising agency can place ban-
ner ads on some main website that is often visited by users. The advertising agency
supplies only a URL that gives the advertising agency’s address instead of the ban-
ner itself. When a user visits the main website and clicks the icon of a corporation, a
request is sent to the advertising agency. The advertising agency sends the requested
banner, but it also includes a cookie with the ID of the user. Any future use of
the banners adds to the database that profiles the Web behavior of the user. The
advertising agency has compiled the interests of the user and can sell this informa-
tion to other parties. This use of cookies has made them very controversial. Hope-
fully, some new regulations will be devised to preserve the privacy of users.

Example 26.8
Figure 26.8 shows a scenario in which an electronic store can benefit from the use of cookies.
Assume a shopper wants to buy a toy from an electronic store named BestToys. The shopper's
browser (client) sends a request to the BestToys server. The server creates an empty shopping cart
(a list) for the client and assigns an ID to the cart (for example, 12343). The server then sends a
response message, which contains the images of all available toys, with a link under each toy that
selects the toy when it is clicked. This response message also includes the Set-Cookie header
line whose value is 12343. The client displays the images and stores the cookie value in a file
named BestToys. The cookie is not revealed to the shopper. Now the shopper selects one of the
toys and clicks on it. The client sends a request, but includes the ID 12343 in the Cookie header
line. Although the server may have been busy and forgotten about this shopper, when it receives
the request and checks the header, it finds the value 12343 as the cookie. The server knows that
the customer is not new; it searches for a shopping cart with ID 12343. The shopping cart (list) is
opened and the selected toy is inserted in the list. The server now sends another response to the
shopper to tell her the total price and ask her to provide payment. The shopper provides
information about her credit card and sends a new request with the ID 12343 as the cookie value.
When the request arrives at the server, it again sees the ID 12343, and accepts the order and the
payment and sends a confirmation in a response. Other information about the client is stored in

Figure 26.8 Example 26.8
(Time-line diagram between client and server with six numbered messages.)
   1. Request: GET BestToys.com HTTP/1.1. A customer file is created at the server with ID: 12343.
   2. Response: HTTP/1.1 200 OK, Set-Cookie: 12343, followed by the page representing the toys. A vendor file is created at the client with cookie: 12343.
   3. Request: GET image HTTP/1.1, Cookie: 12343.
   4. Response: HTTP/1.1 200 OK, followed by the page representing the price.
   5. Request: GET image HTTP/1.1, Cookie: 12343, followed by the information about the payment.
   6. Response: HTTP/1.1 200 OK, followed by the order confirmation.
(The figure labels requests 3 and 5 "Cookie" and marks an "Update" at steps 2, 4, and 6.)

the server. If the shopper accesses the store sometime in the future, the client sends the cookie
again; the store retrieves the file and has all the information about the client.
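The server side of this scenario can be sketched as a table of shopping carts keyed by the cookie value. The class name ToyStore, the method name handle, and the toy name are hypothetical; only the starting ID 12343 comes from the example.

```python
import itertools


class ToyStore:
    """Keeps one shopping cart (a list) per cookie ID."""

    def __init__(self, first_id=12343):
        self._ids = itertools.count(first_id)
        self.carts = {}

    def handle(self, cookie=None, selected_toy=None):
        """Process one request; return (cookie to set or None, cart contents)."""
        if cookie is None or cookie not in self.carts:
            # New customer: create an empty cart and hand out its ID as a cookie.
            cookie = next(self._ids)
            self.carts[cookie] = []
            cookie_to_set = cookie
        else:
            # Returning customer: the cookie identifies the existing cart.
            cookie_to_set = None
        if selected_toy is not None:
            self.carts[cookie].append(selected_toy)
        return cookie_to_set, list(self.carts[cookie])


store = ToyStore()
first_visit = store.handle()                                   # no cookie yet
second_visit = store.handle(cookie=12343, selected_toy="robot")
print(first_visit)
print(second_visit)
```

The server does not need to remember anything about the connection itself; the cookie value in each request is enough to find the right cart.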

Web Caching: Proxy Servers


HTTP supports proxy servers. A proxy server is a computer that keeps copies of
responses to recent requests. The HTTP client sends a request to the proxy server. The
proxy server checks its cache. If the response is not stored in the cache, the proxy
server sends the request to the corresponding server. Incoming responses are sent to the
proxy server and stored for future requests from other clients.
The proxy server reduces the load on the original server, decreases traffic, and
improves latency. However, to use the proxy server, the client must be configured to
access the proxy instead of the target server.

Note that the proxy server acts as both server and client. When it receives a request
from a client for which it has a response, it acts as a server and sends the response to the
client. When it receives a request from a client for which it does not have a response, it
first acts as a client and sends a request to the target server. When the response has been
received, it acts again as a server and sends the response to the client.
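The double role of the proxy can be sketched as a small class. The names CachingProxy, handle, and fetch_from_origin are illustrative, and the target server is replaced by a plain function so that the example runs without a network.

```python
class CachingProxy:
    """Answers from its cache when it can; otherwise asks the target server."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin   # stands in for a request to the target server
        self._cache = {}

    def handle(self, url):
        if url in self._cache:
            # Acting as a server: the stored response is returned directly.
            return self._cache[url]
        # Acting as a client: forward the request to the target server.
        response = self._fetch(url)
        self._cache[url] = response       # keep a copy for future requests
        return response


origin_requests = []


def fake_origin(url):
    """Hypothetical target server that records every request it receives."""
    origin_requests.append(url)
    return "contents of " + url


proxy = CachingProxy(fake_origin)
print(proxy.handle("/news"))   # fetched from the target server
print(proxy.handle("/news"))   # served from the cache
print(origin_requests)         # the target server was contacted only once
```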
Proxy Server Location
The proxy servers are normally located at the client site. This means that we can have a
hierarchy of proxy servers, as shown below:
1. A client computer can also be used as a proxy server, in a small capacity, that
stores responses to requests often invoked by the client.
2. In a company, a proxy server may be installed on the computer LAN to reduce the
load going out of and coming into the LAN.
3. An ISP with many customers can install a proxy server to reduce the load going
out of and coming into the ISP network.

Example 26.9
Figure 26.9 shows an example of a use of a proxy server in a local network, such as the network

Figure 26.9 Example of a proxy server
(Diagram: several clients and a proxy server share a local network, which connects through a WAN to the Internet, where the web servers are located.)

on a campus or in a company. The proxy server is installed in the local network. When an HTTP
request is created by any of the clients (browsers), the request is first directed to the proxy server.
If the proxy server already has the corresponding web page, it sends the response to the client.
Otherwise, the proxy server acts as a client and sends the request to the web server on the Internet.
When the response is returned, the proxy server makes a copy and stores it in its cache before
sending it to the requesting client.
Cache Update
A very important question is how long a response should remain in the proxy server
before being deleted and replaced. Several different strategies are used for this purpose.
One solution is to store the list of sites whose information remains the same for a while.
For example, a news agency may change its news page every morning. This means that

a proxy server can get the news early in the morning and keep it until the next day.
Another recommendation is to add some headers to show the last modification time of
the information. The proxy server can then use the information in this header to guess
how long the information would be valid.
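One way a proxy might make this guess is sketched below. The rule used, treating a response as valid for one tenth of the time that had passed since its last modification when it was fetched, is an assumption chosen for illustration and is not prescribed by the text; the function names are hypothetical.

```python
from datetime import datetime, timezone


def guess_expiry(fetched_at, last_modified, divisor=10):
    """Guess when a cached response stops being valid (assumed heuristic)."""
    age_at_fetch = fetched_at - last_modified
    return fetched_at + age_at_fetch / divisor


def is_fresh(now, fetched_at, last_modified):
    """Return True while the cached copy is still considered valid."""
    return now < guess_expiry(fetched_at, last_modified)


last_modified = datetime(2010, 12, 31, tzinfo=timezone.utc)
fetched_at = datetime(2011, 1, 10, tzinfo=timezone.utc)     # 10 days later

print(guess_expiry(fetched_at, last_modified))              # one day after the fetch
print(is_fresh(datetime(2011, 1, 10, 12, tzinfo=timezone.utc), fetched_at, last_modified))
print(is_fresh(datetime(2011, 1, 12, tzinfo=timezone.utc), fetched_at, last_modified))
```

A page that has not changed for a long time is kept longer than one that changed just before it was fetched.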
HTTP Security
HTTP per se does not provide security. However, as we show in Chapter 32, HTTP can
be run over the Secure Sockets Layer (SSL). In this case, HTTP is referred to as HTTPS.
HTTPS provides confidentiality, client and server authentication, and data integrity.

26.2 FTP
File Transfer Protocol (FTP) is the standard protocol provided by TCP/IP for copy-
ing a file from one host to another. Although transferring files from one system to
another seems simple and straightforward, some problems must be dealt with first.
For example, two systems may use different file name conventions. Two systems may
have different ways to represent data. Two systems may have different directory
structures. All of these problems have been solved by FTP in a very simple and ele-
gant approach. Although we can transfer files using HTTP, FTP is a better choice to
transfer large files or to transfer files using different formats. Figure 26.10 shows the

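Figure 26.10 FTP
(Diagram: the client contains a user interface, a control process, and a data transfer process attached to the local file system; the server contains a control process and a data transfer process attached to the remote file system. The control connection joins the two control processes, and the data connection joins the two data transfer processes.)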

basic model of FTP. The client has three components: the user interface, the client
control process, and the client data transfer process. The server has two components:
the server control process and the server data transfer process. The control connec-
tion is made between the control processes. The data connection is made between the
data transfer processes.
Separation of commands and data transfer makes FTP more efficient. The control
connection uses very simple rules of communication. We need to transfer only a line of
command or a line of response at a time. The data connection, on the other hand, needs
more complex rules due to the variety of data types transferred.
