Chap 1 Web Technology
Although the physical network connections, the hardware communication devices and
the software communication protocols are required for communication across the
Internet, it is the application software that provides the functionality users find useful.
In a network application, two application programs participate in any communication:
one application initiates communication and the other accepts it. This is known as the
Client-Server interaction. This is the methodology used for internet communication.
1.2 Client-Server
Client and Server are the two applications involved in communication. These components
work together over a network: the client requests a service from the server, and
the server provides the requested service.
The typical features of the Client are:
It is the front end of an application.
It manages the user-interface portion.
It validates data entered by the user.
It dispatches requests to the server program.
The typical features of the Server are:
1. Introduction to Web Technology
1.2.1 Client
Thin Client: A client application with minimum functions that uses the resources provided
by a host computer and its job is usually just to display results processed by a server. It
simply relies on a server to do most or all of its processing.
Thick/Fat Client: This is the opposite of the thin client. It can do most of its processing
and does not necessarily rely on a central server, but may need to connect to one for
some information, uploading, or to update data or the program itself.
Example: Anti-virus software belongs to this category because it does not really need to
connect to a server to do its job, although it must connect periodically to download
new virus definitions and upload data.
Hybrid Client: Exhibits characteristics from the two above types. It can do most processes
on its own but may rely on a server for critical data or for storage.
A thin client is a networked computer with few locally stored programs and a heavy
dependence on network resources. It may have very limited resources of its own, perhaps
operating without auxiliary drives, CD-R/W/DVD drives or even software applications.
Typically, a thin client is one of many network computers that share computation needs
by using the resources of one server. A thin client often has low cost hardware with few
moving parts and can usually function better in a hostile environment than a fat or rich
client.
A thick client is a computing workstation that includes most or all of the components
essential for operating and executing software applications independently.
A thick client is a type of client device in client-server architecture that has most
hardware resources on board to perform computation operations, run applications
and perform other functions independently.
Although a thick client can perform most operations, it still needs to be connected to
the primary server to download programs and data, and to update the operating
system.
In contrast to a thin client, a fat or rich client is a computer with many locally stored
programs and resources and little dependence on network resources.
Web Browsers
A Web browser is a software program that is used to access the World Wide Web (WWW).
It allows users to view Web pages and navigate between them.
Examples of Web Browsers are: Mozilla, Microsoft Internet Explorer, Opera, Chrome,
Netscape etc. Web Browsers are known as Universal Clients because they act as the
common Client for all Web-based applications. They are the Web Clients that request
services from a Web Server, which is located somewhere on the Internet or Intranet.
1.2.2 Server Program & Server System
Generally, the term `Server' refers to a program that waits for a request and provides
service. However, a computer that runs many such Server programs is also known as a
Server.
Computers that have fast CPUs, large memories and powerful operating systems are also
called Server Machines (or Server Systems or Server Computers).
“A Server is the program that provides Service to a client”.
Working of Server
A server offers one or more Services to clients. By default, it does not do any processing
until a client sends in a request. It waits for a client to make a request. This is known as
“listening mode” of the server.
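The listening behaviour described above can be sketched with Python's standard socket module. This is only an illustrative sketch: the function names, the loopback address 127.0.0.1, the reply prefix and the use of port 0 (any free port) are assumptions made for the example, not part of these notes.

```python
import socket
import threading


def serve_once(listener: socket.socket) -> None:
    """Server side: wait for one client, read its request, send a reply."""
    conn, _addr = listener.accept()      # blocks here: the "listening mode"
    with conn:
        request = conn.recv(1024)
        conn.sendall(b"SERVED: " + request)


def request_service(port: int, message: bytes) -> bytes:
    """Client side: initiate the connection, send a request, read the reply."""
    with socket.create_connection(("127.0.0.1", port)) as client:
        client.sendall(message)
        return client.recv(1024)


def demo() -> bytes:
    """Run one request-response exchange on the local machine."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        # Port 0 asks the operating system for any free port (an assumption
        # made so the sketch does not collide with a real service).
        listener.bind(("127.0.0.1", 0))
        listener.listen()
        port = listener.getsockname()[1]
        worker = threading.Thread(target=serve_once, args=(listener,))
        worker.start()
        reply = request_service(port, b"time please")
        worker.join()
    return reply
```

The server does nothing until accept() returns, which happens only when a client initiates the connection; this mirrors the roles of the two programs described earlier.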
Client–server model:
Client–server model is a distributed application structure that partitions tasks or
workloads between the providers of a resource or service, called servers, and service
requesters, called clients.
Often clients and servers communicate over a computer network on separate hardware,
but both client and server may reside in the same system.
A server host runs one or more server programs, which share their resources with clients.
A client usually does not share any of its resources, but it requests content or service from
a server. Clients initiate communication sessions with servers, which await incoming
requests.
Examples of computer applications that use the client–server model are email, network
printing, and the World Wide Web.
To communicate, the computers must have a common language, and they must follow rules so
that both the client and the server know what to expect.
The language and rules of communication are defined in a communications protocol. All
client-server protocols operate in the application layer.
The application layer protocol defines the basic patterns of the dialogue. To formalize the data
exchange even further, the server may implement an application programming
interface (API). The API is an abstraction layer for accessing a service.
A server may receive requests from many distinct clients in a short period of time. A computer
can only perform a limited number of tasks at any moment, and relies on a scheduling system to
prioritize incoming requests from clients to accommodate them.
To prevent abuse and maximize availability, the server software may limit the availability to
clients.
Denial of service attacks are designed to exploit a server's obligation to process requests by
overloading it with excessive request rates.
Encryption should be applied if sensitive information is to be communicated between the client
and the server.
Example
When a bank customer accesses online banking services with a web browser (the client), the
client initiates a request to the bank's web server. The customer's login credentials may be stored
in a database, and the web server accesses the database server as a client. An application
server interprets the returned data by applying the bank's business logic, and provides
the output to the web server. Finally, the web server returns the result to the client web browser
for display.
In each step of this sequence of client–server message exchanges, a computer processes a
request and returns data. This is the request-response messaging pattern. When all the requests
are met, the sequence is complete and the web browser presents the data to the customer.
1.4 Internet
The Internet is often described as an information superhighway that gives access to information over the web.
The Internet is a worldwide system of interconnected computer networks.
The Internet uses the standard Internet protocol suite (TCP/IP).
Every computer on the Internet is identified by a unique IP address.
An IP address is a unique set of numbers (such as 110.22.33.114) which identifies a
computer's location.
The Domain Name System (DNS), served by special computers called name servers, maps names to IP addresses so
that a user can locate a computer by name.
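As a small illustration of what the dotted notation stands for, the sketch below uses Python's standard ipaddress module to convert between the dotted form and the single 32-bit number behind it. The helper names ip_to_int and int_to_ip are hypothetical, chosen only for this example.

```python
import ipaddress


def ip_to_int(dotted: str) -> int:
    """Return the 32-bit integer behind a dotted-decimal IPv4 address."""
    return int(ipaddress.IPv4Address(dotted))


def int_to_ip(value: int) -> str:
    """Return the dotted-decimal form of a 32-bit integer."""
    return str(ipaddress.IPv4Address(value))
```

Each of the four numbers occupies one byte, so 110.22.33.114 is the value (110 << 24) | (22 << 16) | (33 << 8) | 114. Name lookup through DNS is not shown because it needs a working resolver.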
The Internet allows us to communicate with people at remote locations. There are
various apps available on the web that use the Internet as a medium for communication.
One can find various social networking sites such as:
o Facebook
o Twitter
o Yahoo
o Google+
o Flickr
o Orkut
One can surf for any kind of information over the internet. Information regarding various
topics such as Technology, Health & Science, Social Studies, Geographical Information,
Information Technology, Products etc can be surfed with help of a search engine.
Apart from being a means of communication and a source of information, the Internet also serves as a medium for
entertainment. The following are various modes of entertainment over the Internet.
o Online Television
o Online Games
o Songs
o Videos
o Matrimonial Services
o Online Shopping
o Data Sharing
o E-mail
The Internet supports electronic commerce, which allows business deals to be
conducted on electronic systems.
Disadvantages
Although the Internet has proved to be a powerful source of information in almost every field,
it also has several disadvantages, discussed below:
There is always a chance of losing personal information such as name, address or credit
card number. Therefore, one should be very careful while sharing such information. One
should use credit cards only on authenticated sites.
Spamming refers to unwanted bulk e-mails. These e-mails serve no purpose
and can obstruct the entire system.
Viruses can easily spread to computers connected to the Internet. Such virus attacks
may cause your system to crash or your important data to be deleted.
Another major threat on the Internet is pornography. Many pornographic sites
can be found easily, so letting children use the Internet unsupervised may indirectly harm their
healthy mental development.
There are various websites that do not provide authenticated information. This leads
to misconceptions among many people.
1.5 Internet Protocol
The Internet Protocol (IP) is a set of requirements for addressing and routing data on the Internet.
IP can be used with several transport protocols, including TCP and UDP.
1.5.1 Transmission Control Protocol (TCP)
TCP is a connection-oriented protocol and offers end-to-end packet delivery. It acts as the backbone
of a connection. It exhibits the following key features:
Transmission Control Protocol (TCP) corresponds to the Transport Layer of OSI Model.
TCP is a reliable and connection oriented protocol.
TCP offers:
o Stream Data Transfer.
o Reliability.
TCP provides reliable message delivery. TCP ensures that data is not damaged,
lost, duplicated, or delivered out of order to a receiving process. It retransmits
the bytes not acknowledged within a specified time period.
o Efficient Flow Control.
o Full-Duplex Operation.
TCP is a full duplex protocol, meaning that each TCP connection supports a pair
of byte streams, one flowing in each direction.
o Multiplexing.
Multiplexing is the process of combining two or more data streams into a single
physical connection. TCP provides multiplexing facilities by using source and
destination port numbers. These port numbers allow TCP to set up a number of
virtual connections over a physical connection and multiplex the data stream
through that connection.
TCP Services
TCP offers the following services to the processes at the application layer:
Stream Delivery Service
TCP is stream oriented because it allows the sending process to send data as a stream of
bytes and the receiving process to obtain data as a stream of bytes.
Sending and Receiving Buffers
The sending and receiving processes may not produce and consume data at the same
speed; therefore, TCP needs buffers for storage at the sending and receiving ends.
Bytes and Segments
At the transport layer, the Transmission Control Protocol (TCP) groups the bytes into a packet. This
packet is called a segment. Before transmission, each segment is
encapsulated into an IP datagram.
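The grouping of bytes into segments can be illustrated with a short Python sketch. It is a simplification under stated assumptions: the names segment_stream and reassemble, the parameter mss (maximum payload bytes per segment) and the starting sequence number are hypothetical, and real TCP also handles acknowledgements, retransmission and windowing, which are not modelled here.

```python
def segment_stream(data: bytes, mss: int, first_seq: int) -> list[tuple[int, bytes]]:
    """Split a byte stream into (sequence_number, payload) segments.

    Each segment is labelled with the sequence number of its first byte.
    """
    if mss <= 0:
        raise ValueError("mss must be positive")
    segments = []
    for offset in range(0, len(data), mss):
        seq = (first_seq + offset) % 2**32   # sequence numbers are 32 bits wide
        segments.append((seq, data[offset:offset + mss]))
    return segments


def reassemble(segments: list[tuple[int, bytes]], first_seq: int) -> bytes:
    """Rebuild the stream from segments that may arrive out of order."""
    ordered = sorted(segments, key=lambda s: (s[0] - first_seq) % 2**32)
    return b"".join(payload for _seq, payload in ordered)
```

For example, ten bytes with mss set to 4 and a first sequence number of 1000 give segments numbered 1000, 1004 and 1008, and the receiver can restore the original order from those numbers alone.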
Source Port address and Destination Port address fields (16 bits each) identify the end
points of the connection.
Sequence Number field (32 bits) specifies the number assigned to the first byte of data in the
current message.
Acknowledgement Number field (32 bits) contains the value of the next sequence number
that the sender of the segment is expecting to receive.
Header length field (4 bits) tells how many 32-bit words are contained in the TCP header
(length of TCP header).
Reserved field (6 bits) must be zero. This is for future use.
Flags field (6 bits) contains the various flags:
URG—Indicates that some urgent data has been placed.
ACK—Indicates that acknowledgement number is valid.
PSH—Indicates that data should be passed to the application as soon as possible.
RST—Resets the connection.
SYN—Synchronizes sequence numbers to initiate a connection.
FIN—Means that the sender of the flag has finished sending data.
Window field (16 bits) specifies the size of the sender's receive window (that is, buffer space
available for incoming data).
Checksum field (16 bits) indicates whether the segment (header and data) was damaged in transit. This field
holds the checksum for error control.
Urgent pointer field (16 bits) is valid only if the URG control flag is set. It is used to
point to urgent data that needs to reach the receiving process at the
earliest. The value of this field is added to the sequence number to get the byte number of
the last urgent byte.
Options field (variable length) specifies various TCP options.
Data field (variable length) contains upper-layer information.
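The field widths listed above can be checked with a small decoder. The following is a minimal sketch using Python's struct module; the names TcpHeader and parse_tcp_header are hypothetical, options are ignored, and the 6-bit reserved and 6-bit flags layout follows the description in these notes.

```python
import struct
from typing import NamedTuple

# Flag names ordered from the least significant bit (bit 0) upwards.
FLAG_NAMES = ("FIN", "SYN", "RST", "PSH", "ACK", "URG")


class TcpHeader(NamedTuple):
    src_port: int
    dst_port: int
    seq: int
    ack: int
    header_len_bytes: int
    flags: tuple
    window: int
    checksum: int
    urgent_ptr: int


def parse_tcp_header(raw: bytes) -> TcpHeader:
    """Decode the fixed 20-byte part of a TCP header (options are ignored)."""
    src, dst, seq, ack, off_flags, window, checksum, urgent = struct.unpack(
        "!HHIIHHHH", raw[:20]
    )
    header_words = off_flags >> 12   # top 4 bits: header length in 32-bit words
    flag_bits = off_flags & 0x3F     # low 6 bits: URG ACK PSH RST SYN FIN
    flags = tuple(n for i, n in enumerate(FLAG_NAMES) if flag_bits & (1 << i))
    return TcpHeader(src, dst, seq, ack, header_words * 4, flags,
                     window, checksum, urgent)
```

The format string "!HHIIHHHH" reads the fields in network byte order: two 16-bit ports, two 32-bit numbers, one 16-bit word holding header length, reserved bits and flags, then window, checksum and urgent pointer.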
Fragment Offset: This field gives the position of the fragment's data relative to the start of
the data in the original Datagram. It is specified in units of 8 bytes,
which gives a maximum offset of 65,528 bytes.
Time to live: It is an 8-bit field that limits how long the Datagram may stay in
the internet system. The value was originally defined in seconds, but in practice it is a hop count;
when the value of TTL reaches zero, the Datagram is discarded.
Every time a router processes a datagram, its TTL value is decreased by one. TTL is used so
that datagrams that cannot be delivered are discarded automatically instead of circulating forever.
The value of TTL can be 0 to 255.
Protocol: This 8-bit field identifies the protocol carried in the data portion
of the Datagram. For example, 6 indicates TCP, and 17
indicates UDP.
Header Checksum: This 16-bit field is used to
check the header for errors. The receiver recomputes the checksum over the IP header and compares it with the value in this field.
When the values do not match, the packet is discarded.
Source Address: The source address is the 32-bit address of the sender of the packet.
Destination address: The destination address is also 32 bits in size and stores the address of the
receiver.
IP Options: It is an optional field of the IP header, used when the value of IHL (Internet Header
Length) is greater than 5. It contains values and settings related to security, record
route, time stamp, etc.
Data: This field stores the data from the upper-layer protocol that handed the data over to
the IP layer.
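The header fields above, including the checksum test, can be sketched in Python as follows. The function names and the dictionary keys are hypothetical choices for this example, and only the fixed 20-byte part of an IPv4 header is decoded.

```python
import ipaddress
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"                      # pad to a whole number of words
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:                       # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def parse_ipv4_header(raw: bytes) -> dict:
    """Decode the fixed 20-byte part of an IPv4 header."""
    (ver_ihl, _tos, total_len, _ident, flags_frag,
     ttl, protocol, _checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", raw[:20])
    ihl_bytes = (ver_ihl & 0x0F) * 4         # IHL counts 32-bit words
    return {
        "header_bytes": ihl_bytes,
        "total_length": total_len,
        "fragment_offset_bytes": (flags_frag & 0x1FFF) * 8,  # 13 bits, units of 8
        "ttl": ttl,
        "protocol": {6: "TCP", 17: "UDP"}.get(protocol, str(protocol)),
        # Summing a valid header, checksum field included, yields zero.
        "checksum_ok": internet_checksum(raw[:ihl_bytes]) == 0,
        "source": str(ipaddress.IPv4Address(src)),
        "destination": str(ipaddress.IPv4Address(dst)),
    }
```

A sender computes internet_checksum over the header with the checksum field set to zero and stores the result; a receiver runs the same sum over the header as received and discards the packet if the result is not zero.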
User Datagram Protocol (UDP) is a Transport Layer protocol. UDP is a part of the Internet
Protocol suite, and the combination is referred to as UDP/IP.
It is an unreliable and connectionless protocol, so there is no need to establish a connection
prior to data transfer.
Transmission Control Protocol (TCP) is the dominant transport layer protocol
used with most Internet services. It provides assured delivery, reliability and much more,
but all these services come at the cost of additional overhead (the excess resources
required to perform a specific task such as transferring data) and latency (the
amount of time it takes for data to travel from source to
destination). This is where UDP comes into the picture.
Real-time services such as computer gaming, voice or video communication and live
conferences benefit from UDP. Since high performance is needed, UDP permits packets to
be dropped instead of processing delayed packets.
UDP performs no error recovery (no acknowledgements or retransmissions), so it also saves bandwidth.
UDP transmits data in the form of a datagram. The following figure shows the datagram
format of UDP:
1. Source Port: Source Port is a 2-byte field used to identify the port number of the source.
2. Destination Port: It is a 2-byte field used to identify the port to which the packet is destined.
3. Length: Length is the length of the UDP datagram, including header and data. It is a 16-bit field.
4. Checksum: Checksum is a 2-byte field. It is the 16-bit one's complement of the one's
complement sum of a pseudo header, the UDP header and the data.
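The four 2-byte fields can be packed and unpacked with Python's struct module. This is a minimal sketch with hypothetical function names; the checksum is left at 0, which UDP over IPv4 treats as "no checksum computed", so the pseudo-header calculation is not shown.

```python
import struct


def build_udp_datagram(src_port: int, dst_port: int, payload: bytes) -> bytes:
    """Prefix the payload with an 8-byte UDP header (checksum left at 0)."""
    length = 8 + len(payload)            # header plus data, in bytes
    header = struct.pack("!HHHH", src_port, dst_port, length, 0)
    return header + payload


def parse_udp_datagram(datagram: bytes) -> tuple[int, int, int, int, bytes]:
    """Return (source port, destination port, length, checksum, payload)."""
    src_port, dst_port, length, checksum = struct.unpack("!HHHH", datagram[:8])
    return src_port, dst_port, length, checksum, datagram[8:length]
```

The whole header is only 8 bytes, compared with at least 20 for TCP, which is part of the lower overhead mentioned earlier.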
FTP Session:
When an FTP session is started between a client and a server, the client initiates a control
TCP connection with the server side. The client sends control information over this connection.
When the server receives this, it initiates a data connection to the client side. Only one
file can be sent over one data connection, but the control connection remains active
throughout the user session. FTP maintains state about its user throughout the session.
Data Structures:
FTP allows three types of data structures:
File Structure – In file-structure there is no internal structure and the file is considered to
be a continuous sequence of data bytes.
Record Structure – In record-structure the file is made up of sequential records.
Page Structure – In page-structure the file is made up of independent indexed pages.
Anonymous FTP:
Anonymous FTP is enabled on some sites whose files are available for public access. A
user can access these files without having a personal username or password.
Instead, the username is set to anonymous and the password to guest by default. Here,
user access is very limited.
For example, the user can be allowed to copy the files but not to navigate through
directories.
The World Wide Web (WWW) is an information sharing model that allows accessing
information over the medium of the Internet.
It is the collection of electronic documents that are linked together. These electronic
documents are known as `Web Pages'.
A collection of related Web Pages is known as a `Web Site'. A Web Site resides on Server
computers that are located around the world.
Information on the WWW is always accessible, from anywhere in the world. The basic
architecture is characterized by a Web Browser that displays information content and
a Web Server that transfers information to the client.
This architecture depends on three key standards for creating, publishing and finding Web
documents on the Web:
HTML: Hyper Text Markup Language is used for creating and editing document content.
HTML is the authoring language used to create documents on the WWW. HTML
makes documents readable across variety of computing platforms.
URL: Uniform Resource Locator is used for locating a resource on the Internet. A URL is the
unique address that identifies each web page or resource on the Internet. It
indicates where the web pages are stored on the Internet. The URL is the standard way
of addressing resources on the Internet that are part of the WWW. It supplies the
Internet address of a resource on the WWW, along with the protocol by which the
resource is accessed. URLs are used by Web Browsers to connect to a specific
server and to get a specific document or page on the Web.
HTTP: Hyper Text Transfer Protocol is used to transfer the data.
A URL looks like
Protocol://ServerDomainName/Path
Examples
http://www.google.com
(Protocol: http; Resource: www.google.com)
http://192.168.10.1/download
(Protocol: http; IP address of the Resource: 192.168.10.1)
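The Protocol://ServerDomainName/Path pattern can be taken apart with Python's standard urllib.parse module. The helper name describe_url and the dictionary keys are hypothetical; treating a missing path as "/" is an assumption of this sketch.

```python
from urllib.parse import urlsplit


def describe_url(url: str) -> dict:
    """Break a URL into its protocol, server, port and path parts."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,
        "server": parts.hostname,
        "port": parts.port,          # None when the URL does not name a port
        "path": parts.path or "/",   # an empty path means the root resource
    }
```

For http://192.168.10.1/download this gives the protocol http, the server 192.168.10.1 and the path /download.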
Web browsers and Web Servers communicate with each other using the HTTP.
It is a simple protocol, which standardizes the way requests are sent and processed. This
allows different Clients to communicate with any vendor's server without compatibility
problems.
HTTP is an application level protocol of the TCP/IP suite, which is used to deliver virtually
all files and other data on the WWW. It is used to transmit resources that are identified by
a URL.
The most common kind of resource is a file, but a resource can also be dynamically
generated content, which is the result of executing a script or an application on the
server.
1.7.1 Features of the HTTP protocol:
HTTP is connectionless: The HTTP client, i.e., a browser, opens a connection and sends
the HTTP request, and the connection is released once the response has been delivered.
(HTTP/1.1 may keep the underlying TCP connection open for further requests, but each
request is still handled independently.)
HTTP is stateless: The client and server are aware of each other during the current request
only. Afterwards, both of them forget each other. Due to the stateless nature of the protocol,
neither the client nor the server retains information between different requests across
the web pages.
HTTP is simple: HTTP is designed to be plain and human-readable, even though HTTP/2
adds complexity by encapsulating HTTP messages into frames.
HTTP is extensible/customizable: New functionality can be added to HTTP through
a simple agreement between a client and a server, typically about the meaning of a new header.
The client initiates an HTTP session by opening a TCP connection to the HTTP server with
which it wishes to communicate.
It then sends request messages to the server, each of which specifies a particular type of
action that the user of the HTTP client would like the server to take.
Requests can be generated either by specific user action (such as clicking a hyperlink in a
Web browser) or indirectly as a result of a prior action (such as a reference to an inline
image in an HTML document leading to a request for that image.)
<request-line>
<general-headers>
<request-headers>
<entity-headers>
<empty-line>
[<message-body>]
[<message-trailers>]
Request Line
The generic start line that begins all HTTP messages is called a request line in request
messages.
It has a three-fold purpose:
1. to indicate the command or action that the client wants performed;
2. to specify a resource upon which the action should be taken;
3. to indicate to the server what version of HTTP the client is using.
The formal syntax for the request line is:
<METHOD> <request-uri> <HTTP-VERSION>
The method is simply the type of action that the client wants the server to take.
It is always specified in upper case letters.
There are eight standard methods defined in HTTP/1.1, of which three are widely
used: GET, HEAD and POST.
They are called “methods” rather than “commands” because the HTTP standard uses
terminology from object-oriented programming.
Request URI
The request URI is the uniform resource identifier of the resource to which the request
applies.
While URIs can theoretically refer to either uniform resource locators (URLs) or uniform
resource names (URNs), at the present time a URI is almost always an HTTP URL that
follows the standard syntax rules of Web URLs.
The exact form of the URL used in the HTTP request line usually differs from that used in
HTML documents or entered by users.
This is because some of the information in a full URL is used to control HTTP itself. It is
needed as part of the communication between the user and the HTTP client, but not in
the request from the client to the server.
The standard method of specifying a resource in a request is to include the path and file
name in the request line (as well as any optional query information) while specifying the
host in the special Host header that must be used in HTTP/1.1 requests.
For example, suppose the user enters a URL such as this:
http://www.myfavoritewebsite.com:8080/chatware/chatroom.php
There is no need to send the “http:” to the server. The client would take the remaining
information and split it so the URI was specified as “/chatware/chatroom.php” and
the Host line would contain “www.myfavoritewebsite.com:8080”. Thus, the start of the
request would look like this:
The exception to this rule is when a request is being made to a proxy server. In that case,
the request is made using the full URL in its original form, so that it can be processed by
the proxy just as the original client did. The request would be:
Finally, there is one special case where a single asterisk can be used instead of a real URL.
This is for the OPTIONS method, which does not require the specification of a resource.
(Nominally, the asterisk means the method refers to the server itself.)
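The three forms of the request URI described above (path only, full URL for a proxy, and the asterisk) can be sketched in Python. The function name request_head and its parameters are hypothetical, GET is assumed as the default method, and treating an OPTIONS request with an empty path as a server-wide request is an assumption made only for this example.

```python
from urllib.parse import urlsplit


def request_head(url: str, method: str = "GET", via_proxy: bool = False) -> str:
    """Return the request line and Host header for an HTTP/1.1 request."""
    parts = urlsplit(url)
    if method == "OPTIONS" and not parts.path:
        target = "*"                  # the request refers to the server itself
    elif via_proxy:
        target = url                  # a proxy receives the full original URL
    else:
        target = parts.path or "/"    # path and file name only
        if parts.query:
            target += "?" + parts.query
    # The host (and port, if any) travels in the Host header, not the request line.
    return f"{method} {target} HTTP/1.1\r\nHost: {parts.netloc}\r\n"
```

With the chat room URL from the example, the default call puts /chatware/chatroom.php in the request line and www.myfavoritewebsite.com:8080 in the Host header, while via_proxy=True keeps the full URL in the request line.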
HTTP Version
The HTTP-VERSION element tells the server what version the client is using so the server
knows how to interpret the request, and what to send and not to send the client in its
response.
For example, a server receiving a request from a client using versions 0.9 or 1.0 will
assume that a transitory connection is being used rather than a persistent one, and will
avoid using version 1.1 headers in its reply. The version token is sent in upper case as
“HTTP/0.9”, “HTTP/1.0” or “HTTP/1.1”.
Headers
After the request line, the client can include any headers it wants in
the message; these headers give the server the details of the request.
All headers use the same structure, but are organized into categories based on the
functions they serve, and whether they are specific to one kind of message or not:
o General Headers: General headers refer mainly to the message itself, as opposed to its
contents, and are used to control its processing or provide the recipient with extra
information. They are not particular to either request or response messages, so they can
appear in either. They are likewise not specifically relevant to any entity the message may
be carrying.
o Request Headers: These headers convey to the server more details about the nature of
the client's request, and give the client more control over how the request is handled. For
example, special request headers can be used by the client to specify a conditional
request—one that is only filled if certain criteria are met. Others can tell the server which
formats or encodings the client is able to process in a response message.
o Entity Headers: These are headers that describe the entity contained in the body of the
request, if any.
Request headers are obviously used only in request messages, but both general headers
and entity headers can appear in either a request or a response message.
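To make the generic request format concrete, the sketch below assembles a request message from a request line, a set of headers, the empty line and an optional body. The function name build_request and the sample header values are assumptions for illustration; message trailers are not handled.

```python
def build_request(method: str, target: str, headers: dict[str, str],
                  body: bytes = b"") -> bytes:
    """Assemble request line, headers, empty line and optional body."""
    lines = [f"{method} {target} HTTP/1.1"]
    all_headers = dict(headers)
    if body:
        # Content-Length is an entity header: it describes the body.
        all_headers.setdefault("Content-Length", str(len(body)))
    lines += [f"{name}: {value}" for name, value in all_headers.items()]
    head = "\r\n".join(lines) + "\r\n\r\n"   # the empty line ends the headers
    return head.encode("ascii") + body
```

In a call such as build_request("POST", "/form", {"Host": "example.com", "Connection": "close"}, b"a=1"), Host is a request header, Connection is a general header and the generated Content-Length is an entity header.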
Each request message sent by an HTTP client to a server prompts the server to send back
a response message.
In certain cases the server may in fact send two responses, a preliminary response
followed by the real one. Usually though, one request yields one response, which
indicates the results of the server's processing of the request, and often also carries an
entity (file or resource) in the message body.
Specific message format of response message:
<status-line>
<general-headers>
<response-headers>
<entity-headers>
<empty-line>
[<message-body>]
[<message-trailers>]
Status Line
The status line is the start line used for response messages.
It has two functions: to tell the client what version of the protocol the server is using,
and to communicate a summary of the results of processing the client's request.
The formal syntax for the status line is:
<HTTP-VERSION> <status-code> <reason-phrase>
HTTP Version
The HTTP-VERSION label in the status line tells the client the version number that the
server is using for its response.
It uses the same format as in the request line, in upper case as “HTTP/0.9”, “HTTP/1.0”
or “HTTP/1.1”.
The server is required to return an HTTP version number that is no greater than the one the
client sent in its request.
The status code and reason phrase provide information about the results of processing
the client's request in two different forms.
The status code is a three-digit number that indicates the formal result that the server is
communicating to the client; it is intended for the client HTTP implementation to
process so the software can take appropriate action.
The reason phrase is an additional, descriptive text string, which can be displayed to the
human user of the HTTP client so he or she can see how the server responded.
Headers
The response message will always include a number of headers that provide extra information
about it. Response message headers fall into these categories:
o General Headers: General headers that refer to the message itself and are not specific to
response messages or the entity in the message body. These are the same as the generic
headers that can appear in request messages (though certain headers appear more often
in responses and others are more common in requests).
o Response Headers: These headers provide additional data that expands upon the
summary result information in the status line. The server may also return extra result
information in the body of the message, especially when an error occurs.
o Entity Headers: These are headers that describe the entity contained in the body of the
response, if any. These are the same entity headers that can appear in a request message,
but they are seen more often in response messages. The reason for this is simply that
responses more often carry an entity than requests, because most requests are to
retrieve a resource.
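The response format (status line, headers, empty line, body) can be taken apart with a few lines of Python. This is a simplified sketch: the name parse_response is hypothetical, header names are lower-cased for convenience, and repeated headers, trailers and body encodings are not handled.

```python
def parse_response(raw: bytes) -> tuple[str, int, str, dict[str, str], bytes]:
    """Split a response into version, status code, reason, headers, body."""
    head, _sep, body = raw.partition(b"\r\n\r\n")   # the empty line ends the headers
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    fields = status_line.split(" ", 2)              # the reason phrase may contain spaces
    version, code = fields[0], int(fields[1])
    reason = fields[2] if len(fields) > 2 else ""
    headers = {}
    for line in header_lines:
        name, _colon, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, code, reason, headers, body
```

The numeric status code is what client software acts on, while the reason phrase is only meant for display to a human user.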
HTTP commands
A browser uses some commands when it sends an HTTP request to a Web server. These
commands are case-sensitive.
GET: A browser uses this command to ask a Web server to send a particular
Web page.
HEAD: This command does not request a Web page, but only its headers.
For instance, if a browser wants to know the last modified date of a Web page, it would
use the HEAD command, rather than the GET command.
PUT: This command is the opposite of the GET command. Rather than requesting
a file, it sends a file to the server to be stored there.
POST: This command is similar to the PUT command. However, whereas the PUT
command is used to store a file at the given location, the POST command submits data to an
existing resource, for example to update it with additional data.
DELETE: This command allows a browser to send an HTTP request for deleting a particular
Web page.
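LINK: This command is used to establish hyperlinks between two pages.
UNLINK: This command is used to remove existing hyperlinks between two pages.
(LINK and UNLINK come from early HTTP specifications and are not among the eight
standard HTTP/1.1 methods mentioned earlier.)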
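The difference between GET and HEAD can be observed with Python's standard http.server and http.client modules. The sketch below is illustrative only: the handler class PageHandler, the helper fetch, the sample page content and the use of a throwaway server on 127.0.0.1 are all assumptions made for the example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

PAGE = b"<html><body>Hello</body></html>"


class PageHandler(BaseHTTPRequestHandler):
    def _send_headers(self) -> None:
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()

    def do_GET(self) -> None:        # GET: headers followed by the page
        self._send_headers()
        self.wfile.write(PAGE)

    def do_HEAD(self) -> None:       # HEAD: the same headers, no body
        self._send_headers()

    def log_message(self, *args) -> None:   # keep the demo quiet
        pass


def fetch(method: str) -> tuple[int, str, bytes]:
    """Send one request to a throwaway local server and return the result."""
    server = HTTPServer(("127.0.0.1", 0), PageHandler)   # port 0: any free port
    thread = threading.Thread(target=server.handle_request)
    thread.start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
    conn.request(method, "/")
    response = conn.getresponse()
    result = (response.status, response.getheader("Content-Length"), response.read())
    conn.close()
    thread.join()
    server.server_close()
    return result
```

Calling fetch("GET") and fetch("HEAD") returns the same status and Content-Length header, but only the GET result carries the page bytes; the HEAD body is empty.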
1.8 Web Browser
A web browser is a client program (a software tool) through which we send HTTP
requests to a web server. The main purpose of a web browser is to locate content on
the World Wide Web and display it as a web page, image, audio or video.
The browser is the client in the client-server model because it contacts the web server for the desired
information. If the requested data is available on the web server, the server sends
the requested information back to the web browser.
Microsoft Internet Explorer, Mozilla Firefox, Safari, Opera and Google Chrome are
examples of web browsers. They are more advanced than earlier web browsers
because they can process HTML, JavaScript, AJAX, etc. Nowadays,
web browsers for mobile devices are also available; these are called micro browsers.