Unit 3 Cna
without changing the way we send data or receive data. Figure 25.5 shows the idea and
compares the sockets with other sources and sinks.
Figure 25.5 Sockets used the same way as other sources and sinks
Although a socket is supposed to behave like a terminal or a file, it is not a physical
entity like them; it is an abstraction. It is an object that is created and used by the appli-
cation program.
We can say that, as far as the application layer is concerned, communication
between a client process and a server process is communication between two sockets,
created at two ends, as shown in Figure 25.6. The client thinks that the socket is the
entity that receives the request and gives the response; the server thinks that the socket
is the one that has a request and needs the response. If we create two sockets, one at
each end, and define the source and destination addresses correctly, we can use the
available instructions to send and receive data. The rest is the responsibility of the oper-
ating system and the embedded TCP/IP protocol.
CHAPTER 25 INTRODUCTION TO APPLICATION LAYER 825
Figure 25.6 (client process and server process, each with a socket at the application layer, exchanging requests and responses over a logical connection)
Socket Addresses
The interaction between a client and a server is two-way communication. In a two-way
communication, we need a pair of addresses: local (sender) and remote (receiver). The
local address in one direction is the remote address in the other direction and vice
versa. Since communication in the client-server paradigm is between two sockets, we
need a pair of socket addresses for communication: a local socket address and a
remote socket address. However, we need to define a socket address in terms of identi-
fiers used in the TCP/IP protocol suite.
A socket address should first define the computer on which a client or a server is
running. As we discussed in Chapter 18, a computer in the Internet is uniquely defined
by its IP address, a 32-bit integer in the current Internet version. However, several client
or server processes may be running at the same time on a computer, which means that
we need another identifier to define the specific client or server involved in the commu-
nication. As we discussed in Chapter 24, an application program can be defined by a
port number, a 16-bit integer. This means that a socket address should be a combination
of an IP address and a port number as shown in Figure 25.7.
Figure 25.7 (a socket address: a 32-bit IP address followed by a 16-bit port number)
Since a socket defines the end-point of the communication, we can say that a
socket is identified by a pair of socket addresses, a local and a remote.
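The layout described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names pack_socket_address and unpack_socket_address are hypothetical, and the sample address and port are borrowed from examples used elsewhere in the text.

```python
import socket
import struct

def pack_socket_address(ip, port):
    """Pack an IPv4 address (32 bits) and a port number (16 bits) into 6 bytes."""
    return socket.inet_aton(ip) + struct.pack("!H", port)

def unpack_socket_address(raw):
    """Recover the (IP address, port number) pair from the 6-byte form."""
    ip = socket.inet_ntoa(raw[:4])
    (port,) = struct.unpack("!H", raw[4:6])
    return ip, port

# Sample values only; any valid IPv4 address and port would do.
packed = pack_socket_address("64.23.56.17", 80)
unpacked = unpack_socket_address(packed)
```

The 6-byte result is 32 + 16 = 48 bits long, which matches the two fields of a socket address.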
Finding Socket Addresses
How can a client or a server find a pair of socket addresses for communication? The sit-
uation is different for each site.
Server Site
The server needs a local (server) and a remote (client) socket address for communication.
826 PART VI APPLICATION LAYER
Local Socket Address The local (server) socket address is provided by the operating
system. The operating system knows the IP address of the computer on which the
server process is running. The port number of a server process, however, needs to be
assigned. If the server process is a standard one defined by the Internet authority, a port
number is already assigned to it. For example, the assigned port number for a Hypertext
Transfer Protocol (HTTP) server is the integer 80, which cannot be used by any other process.
We discussed these well-known port numbers in Chapter 24. If the server process is
not standard, the designer of the server process can choose a port number, in the range
defined by the Internet authority, and assign it to the process. When a server starts run-
ning, it knows the local socket address.
Remote Socket Address The remote socket address for a server is the socket address
of the client that makes the connection. Since the server can serve many clients, it does
not know beforehand the remote socket address for communication. The server can find
this socket address when a client tries to connect to the server. The client socket
address, which is contained in the request packet sent to the server, becomes the remote
socket address that is used for responding to the client. In other words, although the
local socket address for a server is fixed and used during its lifetime, the remote socket
address is changed in each interaction with a different client.
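A small sketch may make this concrete. It assumes UDP sockets on the loopback interface and lets the operating system pick the server port (port 0) so that the demonstration does not collide with a real service; a real server would bind to its assigned port. The server learns the remote socket address only when the request arrives.

```python
import socket

# Server side: the local socket address is fixed once the socket is bound.
server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind(("127.0.0.1", 0))          # port 0 is a demo choice, not a real service port
server.settimeout(2.0)
server_addr = server.getsockname()

# Client side: send a request to the server's socket address.
client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
client.settimeout(2.0)
client.sendto(b"request", server_addr)

# The request carries the client's socket address; it becomes the remote address.
data, remote_addr = server.recvfrom(1024)
server.sendto(b"response", remote_addr)

reply, _ = client.recvfrom(1024)
client_port = client.getsockname()[1]
client.close()
server.close()
```

The value of remote_addr changes with each client that sends a request, while server_addr stays the same for the lifetime of the server.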
Client Site
The client also needs a local (client) and a remote (server) socket address for
communication.
Local Socket Address The local (client) socket address is also provided by the oper-
ating system. The operating system knows the IP address of the computer on which the
client is running. The port number, however, is a 16-bit temporary integer that is
assigned to a client process each time the process needs to start the communication.
It must be chosen from a set of integers defined by the Internet authority and called
the ephemeral (temporary) port numbers, which we discussed in Chapter 24. The
operating system also needs to guarantee that the new port number is not used by any
other running client process. The operating system
needs to remember the port number to be able to redirect the response received from
the server process to the client process that sent the request.
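The following sketch shows the operating system handing out temporary port numbers. Binding to port 0 is used here as an explicit way to ask for an OS-chosen port; in ordinary client code the same assignment usually happens implicitly when the client connects. The helper name open_client_socket is hypothetical.

```python
import socket

def open_client_socket():
    """Create a TCP socket and let the operating system choose its local port."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.bind(("127.0.0.1", 0))   # port 0 means: pick an unused port for me
    return s

first = open_client_socket()
second = open_client_socket()
port_first = first.getsockname()[1]
port_second = second.getsockname()[1]
first.close()
second.close()
```

While both sockets are open, the two port numbers differ, which is the guarantee the paragraph above describes.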
Remote Socket Address Finding the remote (server) socket address for a client, how-
ever, needs more work. When a client process starts, it should know the socket address
of the server it wants to connect to. We will have two situations in this case.
❑ Sometimes, the user who starts the client process knows both the server port
number and IP address of the computer on which the server is running. This usu-
ally occurs in situations when we have written client and server applications and
we want to test them. For example, at the end of this chapter we write some sim-
ple client and server programs and we test them using this approach. In this situ-
ation, the programmer can provide these two pieces of information when he runs
the client program.
❑ Although each standard application has a well-known port number, most of the
time, we do not know the IP address. This happens in situations such as when we
need to contact a web page, send an e-mail to a friend, copy a file from a remote
site, and so on. In these situations, the server has a name, an identifier that
uniquely defines the server process. Examples of these identifiers are URLs,
such as www.xxx.yyy, or e-mail addresses, such as [email protected]. The client
process should now change this identifier (name) to the corresponding server
socket address. The client process normally knows the port number because it
should be a well-known port number, but the IP address can be obtained using
another client-server application called the Domain Name System (DNS). We
will discuss DNS in Chapter 26, but it is enough to know that it acts as a directory in
the Internet. Compare the situation with the telephone directory. We want to call
someone whose name we know but whose telephone number we do not; the number can be
obtained from the telephone directory. The telephone directory maps the name to the
telephone number; DNS maps the server name to the IP address of the computer running
that server.
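As a rough illustration of the second situation, the sketch below turns a server name and a well-known port number into a server socket address by asking the system resolver. The name localhost is used only so that the example works without Internet access; the helper name resolve_server is hypothetical.

```python
import socket

def resolve_server(name, port):
    """Map a server name and a known port number to an (IP address, port) pair."""
    infos = socket.getaddrinfo(name, port,
                               family=socket.AF_INET,
                               type=socket.SOCK_STREAM)
    ip, resolved_port = infos[0][4]    # sockaddr of the first result
    return ip, resolved_port

# The client knows the name and the well-known port; the resolver supplies the IP address.
server_address = resolve_server("localhost", 80)
```

For a name on the Internet, the resolver would consult DNS in the same way; only the name passed to resolve_server changes.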
TCP Protocol
TCP provides connection-oriented, reliable, byte-stream service. TCP requires that two
ends first create a logical connection between themselves by exchanging some
trying to establish connection. After the connection is established, the server creates
a socket to exchange data with the client and finally to terminate the connection. The
client uses only one socket for both connection establishment and data exchange
(see Figure 25.10).
Figure 25.10 (the server's listen socket handles connection establishment for client 1 and client 2; for each client the server creates a separate socket for data transfer and termination)
Flow Diagram
Figure 25.11 shows a simplified flow diagram for iterative communication using TCP.
There are multiple clients, but only one server. One client is served in each iteration of
the loop. The flow diagram is similar to the one for UDP, but there are differences
that we explain for each site.
Server Process
In Figure 25.11, the TCP server process, like the UDP server process, creates a socket
and binds it, but these two commands create the listen socket to be used only for the
connection establishment phase. The server process then calls the listen procedure, to
allow the operating system to start accepting the clients, completing the connection
phase, and putting them in the waiting list to be served.
The server process now starts a loop and serves the clients one by one. In each iter-
ation, the server process issues the accept procedure that removes one client from the
waiting list of the connected clients for serving. If the list is empty, the accept proce-
dure blocks until there is a client to be served. When the accept procedure returns, it
creates a new socket for data transfer. The server process now uses the client socket
address obtained during the connection establishment to fill the remote socket address
field in the newly created socket. At this time the client and server can exchange data.
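The server steps above can be sketched as follows. This is a minimal illustration, not the program developed later in the text: it runs on the loopback interface, serves a fixed number of clients instead of looping forever, and simply echoes each request in uppercase. The function name serve and the message contents are assumptions made for the example.

```python
import socket
import threading

def serve(listen_sock, n_clients, log):
    """Iterative server: one client is accepted and served per loop iteration."""
    for _ in range(n_clients):                  # the text uses an infinite loop
        conn, remote = listen_sock.accept()     # blocks until a client is waiting
        with conn:                              # conn is the new data-transfer socket
            data = conn.recv(1024)
            log.append(remote)                  # remote socket address of this client
            conn.sendall(data.upper())

# Create and bind the listen socket, then let the OS start queuing clients.
listen_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listen_sock.bind(("127.0.0.1", 0))              # port 0 is a demo choice
listen_sock.listen(5)
server_addr = listen_sock.getsockname()

seen = []
worker = threading.Thread(target=serve, args=(listen_sock, 2, seen))
worker.start()

# Two clients, served one after the other.
replies = []
for msg in (b"first", b"second"):
    with socket.create_connection(server_addr, timeout=2.0) as c:
        c.sendall(msg)
        replies.append(c.recv(1024))

worker.join(timeout=5.0)
listen_sock.close()
```

Note that the listen socket is never used for data; each call to accept returns a separate socket that is destroyed when the client has been served.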
Client Process
The client flow diagram is similar to the UDP version except that the client
data-transfer box needs to be defined for each specific case. We do so when we write a
specific program later.
Figure 25.11 (flow diagram for iterative communication using TCP. Server: create socket, bind socket, listen, then in an infinite loop accept a client, exchange data, and destroy the data-transfer socket. Client: create socket, connect, exchange data, destroy socket. Accept and connect block until connection establishment completes.)
Standard Client-Server Protocols
After introducing the application layer in the previous chapter, we discuss some
standard application-layer protocols in this chapter. During the lifetime of the
Internet, several client-server application programs have been developed. We do not
have to redefine them, but we need to understand what they do. For each application,
we also need to know the options available to us. The study of these applications and
the ways they provide different services can help us to create customized applications
in the future.
We have selected six standard application programs in this chapter. Some other
applications have been or will be discussed in other chapters. Dynamic Host Configura-
tion Protocol (DHCP) was discussed in Chapter 18 and Simple Network Management
Protocol (SNMP) will be discussed in Chapter 27.
This chapter is made of six sections:
❑ The first section introduces the World Wide Web. It then discusses the HyperText
Transfer Protocol, the most common client-server application program used in
relation to the World Wide Web.
❑ The second section discusses the File Transfer Protocol, which is the standard
protocol provided by TCP/IP for copying a file from one host to another.
❑ The third section discusses electronic mail, which involves two protocols: SMTP
and POP. As we will see, the nature of this application is different from the
two previous applications. We need two different protocols to handle electronic
mail.
❑ The fourth section discusses TELNET, a general client-server program that allows
users to log in to a remote machine and use any application available on the remote
host.
❑ The fifth section discusses Secure Shell, which can be used as a secured TELNET,
but it can also provide a secure tunnel for other applications.
❑ The sixth section talks about the Domain Name System, which acts as the direc-
tory system in the Internet. It maps the name of an entity to its IP address.
Example 26.1
Assume we need to retrieve a scientific document that contains one reference to another text file
and one reference to a large image. Figure 26.1 shows the situation.
Figure 26.1 (the client retrieves three files in three request/response transactions. Site I holds A, the original document, and B, the image; Site II holds C, the referenced file.)
The main document and the image are stored in two separate files (file A and file B) in the
same site; the referenced text file (file C) is stored in another site. Since we are dealing with three
different files, we need three transactions if we want to see the whole document. The first
transaction (request/response) retrieves a copy of the main document (file A), which has references (point-
ers) to the second and third files. When a copy of the main document is retrieved and browsed, the
user can click on the reference to the image to invoke the second transaction and retrieve a copy of
the image (file B). If the user needs to see the contents of the referenced text file, she can click on its
reference (pointer) invoking the third transaction and retrieving a copy of file C. Note that although
files A and B both are stored in site I, they are independent files with different names and addresses.
Two transactions are needed to retrieve them. A very important point we need to remember is that
file A, file B, and file C in Example 26.1 are independent web pages, each with independent names
and addresses. Although references to file B or C are included in file A, it does not mean that each
of these files cannot be retrieved independently. A second user can retrieve file B with one transac-
tion. A third user can retrieve file C with one transaction.
Web Client (Browser)
A variety of vendors offer commercial browsers that interpret and display a web
page, and all of them use nearly the same architecture. Each browser usually consists
of three parts: a controller, client protocols, and interpreters (see Figure 26.2).
Figure 26.2 (browser: a controller, client protocols such as HTTP, FTP, SSH, and SMTP, and interpreters for HTML, JavaScript, and Java)
The controller receives input from the keyboard or the mouse and uses the client
programs to access the document. After the document has been accessed, the controller
uses one of the interpreters to display the document on the screen. The client protocol
can be one of the protocols described later, such as HTTP or FTP. The interpreter can
be HTML, Java, or JavaScript, depending on the type of document. Some commercial
browsers include Internet Explorer, Netscape Navigator, and Firefox.
Web Server
The web page is stored at the server. Each time a request arrives, the corresponding
document is sent to the client. To improve efficiency, servers normally store requested
files in a cache in memory; memory is faster to access than a disk. A server can also
become more efficient through multithreading or multiprocessing. In this case, a server
can answer more than one request at a time. Some popular web servers include Apache
and Microsoft Internet Information Server.
Uniform Resource Locator (URL)
A web page, as a file, needs to have a unique identifier to distinguish it from other
web pages. To define a web page, we need three identifiers: host, port, and path.
However, before defining the web page, we need to tell the browser what client-
server application we want to use, which is called the protocol. This means we need
four identifiers to define the web page. The first is the type of vehicle to be used to
fetch the web page; the last three make up the combination that defines the destina-
tion object (web page).
❑ Protocol. The first identifier is the abbreviation for the client-server program that
we need in order to access the web page. Although most of the time the protocol is
HTTP (HyperText Transfer Protocol), which we will discuss shortly, we can also
use other protocols such as FTP (File Transfer Protocol).
❑ Host. The host identifier can be the IP address of the server or the unique name
given to the server. IP addresses can be defined in dotted decimal notation, as
described in Chapter 18 (such as 64.23.56.17); the name is normally the domain
name that uniquely defines the host, such as forouzan.com, which we discuss in
Domain Name System (DNS) later in this chapter.
❑ Port. The port, a 16-bit integer, is normally predefined for the client-server appli-
cation. For example, if the HTTP protocol is used for accessing the web page, the
well-known port number is 80. However, if a different port is used, the number can
be explicitly given.
❑ Path. The path identifies the location and the name of the file in the underlying
operating system. The format of this identifier normally depends on the operat-
ing system. In UNIX, a path is a set of directory names followed by the file
name, all separated by a slash. For example, /top/next/last/myfile is a path that
uniquely defines a file named myfile, stored in the directory last, which itself is
part of the directory next, which itself is under the directory top. In other words,
the path lists the directories from the top to the bottom, followed by the file
name.
CHAPTER 26 STANDARD CLIENT-SERVER PROTOCOLS 875
To combine these four pieces together, the uniform resource locator (URL) has
been designed; it uses three different separators between the four pieces as shown
below:
protocol://host/path Used most of the time
protocol://host:port/path Used when port number is needed
Example 26.2
The URL http://www.mhhe.com/compsci/forouzan/ defines the web page related to one of the
authors of this book. The string www.mhhe.com is the name of the computer in the McGraw-Hill
company (the three letters www are part of the host name and are added to the commercial host).
The path is compsci/forouzan/, which defines Forouzan’s web page under the directory compsci
(computer science).
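To see the four identifiers separated in practice, the sketch below splits a URL with Python's standard urllib.parse module. The function name url_identifiers and the small table of default ports are assumptions for illustration, and the second URL with port 8080 is a made-up variant of the one in Example 26.2.

```python
from urllib.parse import urlsplit

# Well-known ports used when the URL does not give one explicitly.
DEFAULT_PORTS = {"http": 80, "ftp": 21}

def url_identifiers(url):
    """Return the (protocol, host, port, path) identifiers of a URL."""
    parts = urlsplit(url)
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(parts.scheme)
    return parts.scheme, parts.hostname, port, parts.path

without_port = url_identifiers("http://www.mhhe.com/compsci/forouzan/")
with_port = url_identifiers("http://www.mhhe.com:8080/compsci/forouzan/")
```

The first call falls back to the well-known port for the protocol; the second uses the port written in the URL.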
Web Documents
The documents in the WWW can be grouped into three broad categories: static, dynamic,
and active.
Static Documents
Static documents are fixed-content documents that are created and stored in a server.
The client can get a copy of the document only. In other words, the contents of the file
are determined when the file is created, not when it is used. Of course, the contents in
the server can be changed, but the user cannot change them. When a client accesses the
document, a copy of the document is sent. The user can then use a browser to see the
document. Static documents are prepared using one of several languages: HyperText
Markup Language (HTML), Extensible Markup Language (XML), Extensible Style
Language (XSL), and Extensible Hypertext Markup Language (XHTML). We discuss
these languages in Appendix C.
Dynamic Documents
A dynamic document is created by a web server whenever a browser requests the docu-
ment. When a request arrives, the web server runs an application program or a script that
creates the dynamic document. The server returns the result of the program or script as a
response to the browser that requested the document. Because a fresh document is created
for each request, the contents of a dynamic document may vary from one request to
another. A very simple example of a dynamic document is the retrieval of the time and
date from a server. Time and date are kinds of information that are dynamic in that they
change from moment to moment. The client can ask the server to run a program such as
the date program in UNIX and send the result of the program to the client. Although the
Common Gateway Interface (CGI) was used to retrieve a dynamic document in the past,
today’s options include one of the scripting languages such as Java Server Pages (JSP),
which uses the Java language for scripting, or Active Server Pages (ASP), a Microsoft
product that uses Visual Basic language for scripting, or ColdFusion, which embeds que-
ries in a Structured Query Language (SQL) database in the HTML document.
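The time-and-date example can be sketched as a function that builds a fresh document on every call. This is only an illustration of the idea; the function name dynamic_document and the HTML wording are hypothetical, and a real server would run such code through one of the technologies named above.

```python
from datetime import datetime, timezone

def dynamic_document(now=None):
    """Build the document at request time, so its contents vary between requests."""
    if now is None:
        now = datetime.now(timezone.utc)
    stamp = now.strftime("%Y-%m-%d %H:%M:%S")
    return "<html><body>Server time: " + stamp + " UTC</body></html>"

# A fixed time is passed here only to make the result repeatable.
page = dynamic_document(datetime(2011, 1, 10, 13, 15, 14, tzinfo=timezone.utc))
```

Calling dynamic_document() with no argument on two different occasions would return two different documents, which is what distinguishes it from a static file.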
Active Documents
For many applications, we need a program or a script to be run at the client site. These are
called active documents. For example, suppose we want to run a program that creates
animated graphics on the screen or a program that interacts with the user. The program
definitely needs to be run at the client site where the animation or interaction takes place.
When a browser requests an active document, the server sends a copy of the document or
a script. The document is then run at the client (browser) site. One way to create an active
document is to use a Java applet, a program written in Java that is stored on the server, compiled,
and ready to be run. The document is in bytecode (binary) format. Another way is to use
JavaScript; the script is downloaded and run at the client site.
Example 26.3
Figure 26.3 shows an example of a nonpersistent connection. The client needs to access a file that
contains one link to an image. The text file and image are located on the same server. Here we
need two connections. For each connection, TCP requires at least three handshake messages to
establish the connection, but the request can be sent with the third one. After the connection is
established, the object can be transferred. After receiving an object, another three handshake
messages are needed to terminate the connection, as we saw in Chapter 24. This means that the
client and server are involved in two connection establishments and two connection terminations.
If the transaction involves retrieving 10 or 20 objects, the round trip times spent for these
handshakes add up to a big overhead. When we describe the client-server programming at the end of
the chapter, we will show that for each connection the client and server need to allocate extra
resources such as buffers and variables. This is another burden on both sites, but especially on the
server site.
Figure 26.3 (nonpersistent connection: for the file, and then again for the image, the client and server go through a three-message handshake with the request sent on the third message, the response, and the connection termination)
Persistent Connections
HTTP version 1.1 specifies a persistent connection by default. In a persistent connec-
tion, the server leaves the connection open for more requests after sending a response.
The server can close the connection at the request of a client or if a time-out has been
reached. The sender usually sends the length of the data with each response. However,
there are some occasions when the sender does not know the length of the data. This is
the case when a document is created dynamically or actively. In these cases, the server
informs the client that the length is not known and closes the connection after sending
the data so the client knows that the end of the data has been reached. Time and
resources are saved using persistent connections. Only one set of buffers and variables
needs to be set for the connection at each site. The round trip time for connection estab-
lishment and connection termination is saved.
Example 26.4
Figure 26.4 shows the same scenario as in Example 26.3, but using a persistent connection.
Only one connection establishment and connection termination is used, but the request for the
image is sent separately.
Figure 26.4 (persistent connection: after a single three-message handshake with the first request sent on the third message, the client receives the file, sends a separate request for the image, receives it, and the connection is terminated once)
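The scenario of Example 26.4 can be imitated on one machine with Python's standard http.server and http.client modules. This is a sketch under several assumptions: the server runs on the loopback interface on an OS-chosen port, the handler class name Handler and the paths /file and /image are hypothetical, and the returned bodies are placeholders rather than a real file and image.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"          # persistent connections by default

    def do_GET(self):
        body = ("contents of " + self.path).encode()
        self.send_response(200)
        # Sending the length lets the client find the end of the data
        # without the server closing the connection.
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):          # keep the demo quiet
        pass

server = ThreadingHTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# One connection, two requests: first the file, then the image.
conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1], timeout=5)
results = []
local_ports = []
for path in ("/file", "/image"):
    conn.request("GET", path)
    resp = conn.getresponse()
    results.append((resp.status, resp.read()))
    local_ports.append(conn.sock.getsockname()[1])   # unchanged port: same TCP connection
conn.close()
server.shutdown()
server.server_close()
```

Because both requests leave from the same local port, only one connection establishment and one termination take place, as in Figure 26.4.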
Message Formats
The HTTP protocol defines the format of the request and response messages, as shown
in Figure 26.5. We have put the two formats next to each other for comparison. Each
message is made of four sections. The first section in the request message is called the
request line; the first section in the response message is called the status line. The other
three sections have the same names in the request and response messages. However, the
similarities between these sections are only in the names; they may have different con-
tents. We discuss each message type separately.
Request Message
As we said before, the first line in a request message is called a request line. There are
three fields in this line separated by one space and terminated by two characters (car-
riage return and line feed) as shown in Figure 26.5. The fields are called method, URL,
and version.
The method field defines the request types. In version 1.1 of HTTP, several
methods are defined, as shown in Table 26.1. Most of the time, the client uses the
GET method to send a request. In this case, the body of the message is empty. The
HEAD method is used when the client needs only some information about the web
page from the server, such as the last time it was modified. It can also be used to test
the validity of a URL. The response message in this case has only the header section;
the body section is empty. The PUT method is the inverse of the GET method; it
allows the client to post a new web page on the server (if permitted). The POST
method is similar to the PUT method, but it is used to send some information to the
server to be added to the web page or to modify the web page. The TRACE method is
used for debugging; the client asks the server to echo back the request to check
whether the server is getting the requests. The DELETE method allows the client to
delete a web page on the server if the client has permission to do so. The CONNECT
method was originally made as a reserve method; it may be used by proxy servers, as
discussed later. Finally, the OPTIONS method allows the client to ask about the prop-
erties of a web page.
The second field, URL, was discussed earlier in the chapter. It defines the address
and name of the corresponding web page. The third field, version, gives the version of
the protocol; the most current version of HTTP is 1.1.
After the request line, we can have zero or more request header lines. Each
header line sends additional information from the client to the server. For example,
the client can request that the document be sent in a special format. Each header line
has a header name, a colon, a space, and a header value (see Figure 26.5). Table 26.2
shows some header names commonly used in a request. The value field defines the
values associated with each header name. The list of values can be found in the corre-
sponding RFCs.
The body can be present in a request message. Usually, it contains the comment
to be sent or the file to be published on the website when the method is PUT or
POST.
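The request format described in this section can be assembled by hand, as the sketch below shows. The helper names build_request and parse_request_line are hypothetical, and the sample request copies the one used in Example 26.5. A real HTTP/1.1 request would also carry a Host header, which is left out here to stay close to the textbook example.

```python
def build_request(method, url, headers, body=b"", version="HTTP/1.1"):
    """Assemble request line, header lines, blank line, and optional body."""
    lines = [f"{method} {url} {version}"]                 # three fields, one space apart
    lines += [f"{name}: {value}" for name, value in headers]
    head = "\r\n".join(lines) + "\r\n\r\n"                # last CRLF pair ends with the blank line
    return head.encode("ascii") + body

def parse_request_line(message):
    """Split the first line of a request into method, URL, and version."""
    first_line = message.split(b"\r\n", 1)[0].decode("ascii")
    method, url, version = first_line.split(" ")
    return method, url, version

request = build_request("GET", "/usr/bin/image1",
                        [("Accept", "image/gif"), ("Accept", "image/jpeg")])
fields = parse_request_line(request)
```

Every line ends with carriage return and line feed, and the empty line after the headers marks where the body, if any, begins.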
Response Message
The format of the response message is also shown in Figure 26.5. A response mes-
sage consists of a status line, header lines, a blank line, and sometimes a body. The
first line in a response message is called the status line. There are three fields in this
line separated by spaces and terminated by a carriage return and line feed. The first
field defines the version of HTTP protocol, currently 1.1. The status code field
defines the status of the request. It consists of three digits. Whereas the codes in the
100 range are only informational, the codes in the 200 range indicate a successful
request. The codes in the 300 range redirect the client to another URL, and the codes
in the 400 range indicate an error at the client site. Finally, the codes in the 500 range
indicate an error at the server site. The status phrase explains the status code in text
form.
After the status line, we can have zero or more response header lines. Each header
line sends additional information from the server to the client. For example, the sender
can send extra information about the document. Each header line has a header name, a
colon, a space, and a header value. We will show some header lines in the examples at
the end of this section. Table 26.3 shows some header names commonly used in a
response message.
The body contains the document to be sent from the server to the client. The body
is present unless the response is an error message.
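A short sketch of reading a status line follows. It splits the line into its three fields and classifies the code by its first digit, using the ranges listed above. The function name parse_status_line and the wording of the class labels are assumptions made for the example.

```python
# Meaning of the first digit of the status code, following the ranges in the text.
STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}

def parse_status_line(line):
    """Return (version, status code, status phrase, class of the code)."""
    version, code, phrase = line.rstrip("\r\n").split(" ", 2)   # phrase may contain spaces
    code = int(code)
    return version, code, phrase, STATUS_CLASSES[code // 100]

ok = parse_status_line("HTTP/1.1 200 OK\r\n")
missing = parse_status_line("HTTP/1.1 404 Not Found\r\n")
```

Splitting at most twice keeps a multiword status phrase such as Not Found in one piece.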
Example 26.5
This example retrieves a document (see Figure 26.6). We use the GET method to retrieve an
image with the path /usr/bin/image1. The request line shows the method (GET), the URL, and
the HTTP version (1.1). The header has two lines that show that the client can accept images in
the GIF or JPEG format. The request does not have a body. The response message contains the
status line and four lines of header. The header lines define the date, server, content encoding
(MIME version, which will be described in electronic mail), and length of the document. The
body of the document follows the header.
Example 26.6
In this example, the client wants to send a web page to be posted on the server. We use the PUT
method. The request line shows the method (PUT), URL, and HTTP version (1.1). There are four
lines of headers. The request body contains the web page to be posted. The response message
contains the status line and four lines of headers. The created document, which is a CGI docu-
ment, is included as the body (see Figure 26.7).
Figure 26.6 (Example 26.5)
Request (client to server):
    GET /usr/bin/image1 HTTP/1.1
    Accept: image/gif
    Accept: image/jpeg
Response (server to client):
    HTTP/1.1 200 OK
    Date: Mon, 10-Jan-2011 13:15:14 GMT
    Server: Challenger
    Content-encoding: MIME-version 1.0
    Content-length: 2048
Figure 26.7 (Example 26.6)
Request (client to server):
    PUT /cgi-bin/doc.pl HTTP/1.1
    Accept: */*
    Accept: image/gif
    Accept: image/jpeg
    Content-length: 50
    (Input information)
Response (server to client):
    HTTP/1.1 200 OK
    Date: Mon, 10-Jan-2011 13:15:14 GMT
    Server: Challenger
    Content-encoding: MIME-version 1.0
    Content-length: 2000
    (Body of the document)
Conditional Request
A client can add a condition in its request. In this case, the server will send the
requested web page if the condition is met or inform the client otherwise. One of
the most common conditions imposed by the client is the time and date the web
page is modified. The client can send the header line If-Modified-Since with the
request to tell the server that it needs the page only if it is modified after a certain
point in time.
Example 26.7
The following shows how a client imposes the modification date and time condition on
a request.
GET http://www.commonServer.com/information/file1 HTTP/1.1 Request line
If-Modified-Since: Thu, Sept 04 00:00:00 GMT Header line
Blank line
The status line in the response shows the file was not modified after the defined point in
time. The body of the response message is also empty.
HTTP/1.1 304 Not Modified Status line
Date: Sat, Sept 06 08 16:22:46 GMT First header line
Server: commonServer.com Second header line
Blank line
(Empty Body) Empty body
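The decision the server makes in Example 26.7 can be sketched as a single comparison of two times. The function name conditional_get, the placeholder body, and the sample dates are assumptions for illustration; the header value is parsed with Python's standard email.utils helpers, which accept the usual HTTP date format.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime

def conditional_get(last_modified, if_modified_since=None):
    """Return (status code, status phrase, body) for a possibly conditional GET."""
    if if_modified_since is not None:
        threshold = parsedate_to_datetime(if_modified_since)
        if last_modified <= threshold:
            return 304, "Not Modified", b""          # empty body, as in the example
    return 200, "OK", b"(body of the document)"

# Hypothetical modification time of the page on the server.
page_changed = datetime(2011, 1, 10, 13, 15, 14, tzinfo=timezone.utc)

later = format_datetime(datetime(2011, 2, 1, tzinfo=timezone.utc), usegmt=True)
earlier = format_datetime(datetime(2011, 1, 1, tzinfo=timezone.utc), usegmt=True)

not_modified = conditional_get(page_changed, later)
modified = conditional_get(page_changed, earlier)
unconditional = conditional_get(page_changed)
```

When the page has not changed since the time the client names, the server answers with the 304 status line and no body; otherwise it sends the page as usual.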
Cookies
The World Wide Web was originally designed as a stateless entity. A client sends a request;
a server responds. Their relationship is over. The original purpose of the Web, retrieving
publicly available documents, exactly fits this design. Today the Web has other functions
that need to remember some information about the clients; some are listed below:
❑ Websites are being used as electronic stores that allow users to browse through the
store, select wanted items, put them in an electronic cart, and pay at the end with a
credit card.
❑ Some websites need to allow access to registered clients only.
❑ Some websites are used as portals: the user selects the web pages he wants to see.
❑ Some websites are just advertising agencies.
For these purposes, the cookie mechanism was devised.
Creating and Storing Cookies
The creation and storing of cookies depend on the implementation; however, the princi-
ple is the same.
1. When a server receives a request from a client, it stores information about the client
in a file or a string. The information may include the domain name of the client, the
contents of the cookie (information the server has gathered about the client such as
name, registration number, and so on), a timestamp, and other information depend-
ing on the implementation.
2. The server includes the cookie in the response that it sends to the client.
3. When the client receives the response, the browser stores the cookie in the cookie
directory, which is sorted by the server domain name.
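The three steps above can be sketched with Python's standard http.cookies module. The cookie name registration, the value 12343, the domain www.example.com, and the dictionary used as a cookie directory are all assumptions chosen for illustration; real browsers store cookies in their own formats.

```python
from http.cookies import SimpleCookie

# Steps 1 and 2 (server side): record information about the client in a
# cookie and turn it into a Set-Cookie header line for the response.
cookie = SimpleCookie()
cookie["registration"] = "12343"            # hypothetical cookie contents
cookie["registration"]["domain"] = "www.example.com"
cookie["registration"]["max-age"] = 3600    # lifetime in seconds
set_cookie_header = cookie.output()         # "Set-Cookie: registration=12343; ..."


def store_cookie(directory, server_domain, header_line):
    """Step 3 (client side): file the cookie under the server domain name."""
    _, _, value = header_line.partition(": ")
    parsed = SimpleCookie()
    parsed.load(value)
    entry = directory.setdefault(server_domain, {})
    for name, morsel in parsed.items():
        entry[name] = morsel.value


cookie_directory = {}
store_cookie(cookie_directory, "www.example.com", set_cookie_header)
print(set_cookie_header)
print(cookie_directory)
```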
Using Cookies
When a client sends a request to a server, the browser looks in the cookie directory to
see if it can find a cookie sent by that server. If found, the cookie is included in the
request. When the server receives the request, it knows that this is an old client, not a
new one. Note that the contents of the cookie are never read by the browser or disclosed
to the user. It is a cookie made by the server and eaten by the server. Now let us see how
a cookie is used for the four previously mentioned purposes:
❑ An electronic store (e-commerce) can use a cookie for its client shoppers. When a
client selects an item and inserts it in a cart, a cookie that contains information
about the item, such as its number and unit price, is sent to the browser. If the client
selects a second item, the cookie is updated with the new selection information,
and so on. When the client finishes shopping and wants to check out, the last
cookie is retrieved and the total charge is calculated.
❑ The site that restricts access to registered clients only sends a cookie to the client
when the client registers for the first time. For any repeated access, only those cli-
ents that send the appropriate cookie are allowed.
❑ A web portal uses the cookie in a similar way. When a user selects her favorite
pages, a cookie is made and sent. If the site is accessed again, the cookie is sent to
the server to show what the client is looking for.
❑ A cookie is also used by advertising agencies. An advertising agency can place ban-
ner ads on some main website that is often visited by users. The advertising agency
supplies only a URL that gives the advertising agency’s address instead of the ban-
ner itself. When a user visits the main website and clicks the icon of a corporation, a
request is sent to the advertising agency. The advertising agency sends the requested
banner, but it also includes a cookie with the ID of the user. Any future use of
the banners adds to the database that profiles the Web behavior of the user. The
advertising agency has compiled the interests of the user and can sell this informa-
tion to other parties. This use of cookies has made them very controversial. Hope-
fully, some new regulations will be devised to preserve the privacy of users.
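The lookup described at the start of this section, in which the browser searches its cookie directory before sending a request, can be sketched as follows. The function build_request, the directory layout (a dictionary keyed by server domain name), and the host www.example.com are assumptions made for this sketch only.

```python
def build_request(cookie_directory, host, path="/"):
    """Build a GET request, adding a Cookie header if one is stored for host."""
    lines = [f"GET {path} HTTP/1.1", f"Host: {host}"]
    cookies = cookie_directory.get(host, {})
    if cookies:
        # All cookies previously sent by this server go back in one header.
        pairs = "; ".join(f"{name}={value}" for name, value in cookies.items())
        lines.append(f"Cookie: {pairs}")
    # A blank line ends the header section.
    return "\r\n".join(lines) + "\r\n\r\n"


# Hypothetical directory holding one cookie from one server.
directory = {"www.example.com": {"id": "12343"}}

returning_client = build_request(directory, "www.example.com")
new_client = build_request({}, "www.example.com")
print(returning_client)
print(new_client)
```

The server can tell the two requests apart only by the presence of the Cookie header, which is how it recognizes an old client.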
Example 26.8
Figure 26.8 shows a scenario in which an electronic store can benefit from the use of cookies.
Assume a shopper wants to buy a toy from an electronic store named BestToys. The shopper's
browser (client) sends a request to the BestToys server. The server creates an empty shopping cart
(a list) for the client and assigns an ID to the cart (for example, 12343). The server then sends a
response message, which contains the images of all toys available, with a link under each toy that
selects the toy when it is clicked. This response message also includes the Set-Cookie header
line whose value is 12343. The client displays the images and stores the cookie value in a file
named BestToys. The cookie is not revealed to the shopper. Now the shopper selects one of the
toys and clicks on it. The client sends a request, but includes the ID 12343 in the Cookie header
line. Although the server may have been busy and forgotten about this shopper, when it receives
the request and checks the header, it finds the value 12343 as the cookie. The server knows that
the customer is not new; it searches for a shopping cart with ID 12343. The shopping cart (list) is
opened and the selected toy is inserted in the list. The server now sends another response to the
shopper to tell her the total price and ask her to provide payment. The shopper provides
information about her credit card and sends a new request with the ID 12343 as the cookie value.
When the request arrives at the server, it again sees the ID 12343, and accepts the order and the
payment and sends a confirmation in a response. Other information about the client is stored in
the server. If the shopper accesses the store sometime in the future, the client sends the cookie
again; the store retrieves the file and has all the information about the client.
Figure 26.8 (client-server timeline for this example): the client's request GET BestToys.com
HTTP/1.1 causes the server to create a customer file with ID 12343; the response HTTP/1.1 200 OK
carries Set-Cookie: 12343, and the client creates a vendor file with cookie 12343; later requests
(GET image HTTP/1.1) carry Cookie: 12343, and the server updates the file and returns first the
page representing the price and then the order confirmation.
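The server's side of this scenario can be sketched as a small Python class. The class ToyStore, its method handle, and the toy name are hypothetical; the cookie value is sent bare (Set-Cookie: 12343) only to mirror the simplified header used in the example.

```python
import itertools


class ToyStore:
    """Toy model of the BestToys server: one shopping cart (list) per cookie ID."""

    def __init__(self, first_id=12343):
        self._ids = itertools.count(first_id)
        self.carts = {}

    def handle(self, cookie=None, selected_toy=None):
        """Process one request; return (response header lines, cart contents)."""
        headers = ["HTTP/1.1 200 OK"]
        if cookie is None or cookie not in self.carts:
            # New shopper: create an empty cart and hand out its ID as a cookie.
            cookie = str(next(self._ids))
            self.carts[cookie] = []
            headers.append(f"Set-Cookie: {cookie}")
        if selected_toy is not None:
            self.carts[cookie].append(selected_toy)
        return headers, self.carts[cookie]


store = ToyStore()
first_headers, _ = store.handle()                     # first visit, no cookie
cart_id = first_headers[1].split(": ")[1]             # browser saves the ID
next_headers, cart = store.handle(cookie=cart_id, selected_toy="robot")
print(first_headers, next_headers, cart)
```

Because the cart is found by the ID in the cookie, the server does not need to remember anything else about the connection between the two requests.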
Note that the proxy server acts as both server and client. When it receives a request
from a client for which it has a response, it acts as a server and sends the response to the
client. When it receives a request from a client for which it does not have a response, it
first acts as a client and sends a request to the target server. When the response has been
received, it acts again as a server and sends the response to the client.
Proxy Server Location
The proxy servers are normally located at the client site. This means that we can have a
hierarchy of proxy servers, as shown below:
1. A client computer can also be used as a proxy server, in a small capacity, that
stores responses to requests often invoked by the client.
2. In a company, a proxy server may be installed on the computer LAN to reduce the
load going out of and coming into the LAN.
3. An ISP with many customers can install a proxy server to reduce the load going
out of and coming into the ISP network.
Example 26.9
Figure 26.9 shows an example of a use of a proxy server in a local network, such as the network
on a campus or in a company. The proxy server is installed in the local network. When an HTTP
request is created by any of the clients (browsers), the request is first directed to the proxy server.
If the proxy server already has the corresponding web page, it sends the response to the client.
Otherwise, the proxy server acts as a client and sends the request to the web server in the Internet.
When the response is returned, the proxy server makes a copy and stores it in its cache before
sending it to the requesting client.
Figure 26.9 (layout): a proxy server on the local network, which is connected through a WAN
to web servers in the Internet.
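The behavior described in this example can be sketched in a few lines of Python. The class ProxyServer, the stand-in function fake_web_server, and the URL are assumptions for illustration; a real proxy would forward the request over the network and would also decide when cached copies expire.

```python
class ProxyServer:
    """Minimal caching proxy: answers from its cache, else asks the web server."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin   # callable standing in for the web server
        self._cache = {}
        self.origin_requests = 0

    def get(self, url):
        if url in self._cache:
            # Acts as a server: the response is already stored.
            return self._cache[url]
        # Acts as a client: forward the request to the target web server.
        self.origin_requests += 1
        response = self._fetch(url)
        self._cache[url] = response       # keep a copy before replying
        return response


def fake_web_server(url):
    """Hypothetical stand-in for a web server in the Internet."""
    return f"<html>page for {url}</html>"


proxy = ProxyServer(fake_web_server)
first = proxy.get("http://www.example.com/index.html")
second = proxy.get("http://www.example.com/index.html")
print(first == second, proxy.origin_requests)
```

The second request is answered from the cache, so only one request leaves the local network.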
Cache Update
A very important question is how long a response should remain in the proxy server
before being deleted and replaced. Several different strategies are used for this purpose.
One solution is to store the list of sites whose information remains the same for a while.
For example, a news agency may change its news page every morning. This means that
a proxy server can get the news early in the morning and keep it until the next day.
Another recommendation is to add some headers to show the last modification time of
the information. The proxy server can then use the information in this header to guess
how long the information would be valid.
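One way a proxy might make this guess is sketched below. The rule used here, treating a response as valid for 10 percent of the time that had passed since its last modification when it was fetched, is an assumption chosen for the sketch (it is a commonly cited heuristic, not something prescribed by the text), and the function names and dates are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def heuristic_lifetime(date_header, last_modified_header, fraction=0.1):
    """Guess how long a response stays valid from its Last-Modified header."""
    fetched = parsedate_to_datetime(date_header)
    modified = parsedate_to_datetime(last_modified_header)
    # Guard against a modification time later than the response date.
    age_at_fetch = max(fetched - modified, timedelta(0))
    return age_at_fetch * fraction


def is_fresh(now, date_header, last_modified_header):
    """True if the cached copy can still be served without asking the server."""
    fetched = parsedate_to_datetime(date_header)
    return now - fetched <= heuristic_lifetime(date_header, last_modified_header)


# Hypothetical response: fetched ten days after the page was last modified,
# so the guessed lifetime is about one day.
date = "Mon, 10 Jan 2011 13:15:14 GMT"
last_modified = "Fri, 31 Dec 2010 13:15:14 GMT"

same_evening = datetime(2011, 1, 10, 23, 0, 0, tzinfo=timezone.utc)
two_days_later = datetime(2011, 1, 12, 13, 15, 14, tzinfo=timezone.utc)
print(is_fresh(same_evening, date, last_modified))
print(is_fresh(two_days_later, date, last_modified))
```

A page that has not changed for a long time is assumed to stay unchanged for longer, which is why the lifetime grows with the age of the page.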
HTTP Security
HTTP per se does not provide security. However, as we show in Chapter 32, HTTP can
be run over the Secure Socket Layer (SSL). In this case, HTTP is referred to as HTTPS.
HTTPS provides confidentiality, client and server authentication, and data integrity.
26.2 FTP
File Transfer Protocol (FTP) is the standard protocol provided by TCP/IP for copy-
ing a file from one host to another. Although transferring files from one system to
another seems simple and straightforward, some problems must be dealt with first.
For example, two systems may use different file name conventions. Two systems may
have different ways to represent data. Two systems may have different directory
structures. All of these problems have been solved by FTP in a very simple and ele-
gant approach. Although we can transfer files using HTTP, FTP is a better choice to
transfer large files or to transfer files using different formats. Figure 26.10 shows the
basic model of FTP. The client has three components: the user interface, the client
control process, and the client data transfer process. The server has two components:
the server control process and the server data transfer process. The control connec-
tion is made between the control processes. The data connection is made between the
data transfer processes.
Figure 26.10 (FTP model): on the client, a user interface, a control process, and a data transfer
process with access to the local file system; on the server, a control process and a data transfer
process with access to the remote file system; a control connection joins the two control
processes and a data connection joins the two data transfer processes.
Separation of commands and data transfer makes FTP more efficient. The control
connection uses very simple rules of communication. We need to transfer only a line of
command or a line of response at a time. The data connection, on the other hand, needs
more complex rules due to the variety of data types transferred.