Unit 2 - HTTP

Uploaded by

tanishgontlag

HTTP – Hypertext Transfer Protocol


• Web browsers communicate with web servers using a protocol called HTTP.
• HTTP is a client-server protocol that defines how messages are
formatted and transmitted, and what action web servers and
browsers should take in response to various commands.
Characteristics of HTTP
• The fundamental characteristics of the HTTP protocol are:
• HTTP uses the request/response paradigm: an HTTP client program
sends an HTTP request message to an HTTP server, which returns an
HTTP response message.
• HTTP is a pull protocol; the client pulls information from the server
(instead of server pushing information down to the client).
• HTTP is a stateless protocol, that is, each request-response exchange
is treated independently.
• HTTP is media independent: Any type of data can be sent by HTTP if
both the client and the server know how to handle the data content.
HTTP Request-Response
HTTP
• The latest version of HTTP is HTTP/3 (standardized in 2022).
• HTTP uses a <major>.<minor> numbering scheme to indicate versions
of the protocol. The version of an HTTP message is indicated by an
HTTP-Version field in the first line. The general syntax is
HTTP/<major>.<minor>, for example: HTTP/1.1

Ack : from Wikipedia


HTTP Connections
• A client and a server can communicate over two types of
connections: non-persistent and persistent.
• HTTP/1.0 used non-persistent connections.
• HTTP/1.1 uses persistent connections by default, also known as
keep-alive connections, in which multiple messages or objects are
sent over a single TCP connection between client and server.
Non-persistent HTTP
• HTTP/1.0 used a non-persistent connection in which only one object can be
sent over a TCP connection.
• The steps involved in setting up a connection with non-persistent HTTP are:
• 1. Client (Browser) initiates a TCP connection to www.anyCollege.edu
(Server): Handshake.
• 2. Server at host www.anyCollege.edu accepts connection and acknowledges.
• 3. Client sends HTTP request for file /someDir/file.html.
• 4. Server receives message, finds and sends file in HTTP response.
• 5. Client receives the response, terminates the connection, and parses the object.
• 6. Steps 1–5 are repeated for each embedded object.
Non-persistent HTTP
Persistent HTTP
• To overcome the overhead of HTTP/1.0, HTTP/1.1 introduced persistent
connections, through which multiple objects can be sent over a single TCP
connection between the client and server.
• The steps involved in setting up a connection with persistent HTTP are:
• 1. Client (Browser) initiates a TCP connection to www.anyCollege.edu
(Server): Handshake.
• 2. Server at host www.anyCollege.edu accepts connection and acknowledges.
• 3. Client sends HTTP request for file /someDir/file.html.
• 4. Server receives request, finds and sends object in HTTP response.
• 5. Client receives the response and parses the object. The connection stays open.
• 6. Steps 3–5 are repeated for each embedded object over the same connection.
Persistent HTTP
Non-persistent Vs Persistent
• With HTTP/1.0, every object costs one RTT to set up the TCP
connection plus one RTT for the request/response exchange. If there
are 10 objects, the Total Transmission Time (TTT) is:
• TTT = [10 * 1 TCP RTT] + [10 * 1 REQ/RESP RTT] = 20 RTT
• With HTTP/1.1, persistent connections help with multi-object
requests because the server keeps the TCP connection open by
default, so the setup cost is paid only once:
• TTT = [1 * 1 TCP RTT] + [10 * 1 REQ/RESP RTT] = 11 RTT

Note : Round-trip time (RTT) is the duration in milliseconds (ms) it takes for a network
request to go from a starting point to a destination and back again to the starting point.
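The two totals above can be reproduced with a short Python sketch. It assumes the same simplified model as the slide (one RTT per TCP handshake, one RTT per request/response, no transfer time); the function name total_rtts is only illustrative.

```python
def total_rtts(num_objects: int, persistent: bool) -> int:
    """Return the total transmission time in RTTs under the simple model."""
    # Every object needs one request/response exchange: 1 RTT each.
    request_rtts = num_objects
    # A non-persistent connection pays the TCP handshake for every object;
    # a persistent connection pays it only once.
    handshake_rtts = 1 if persistent else num_objects
    return handshake_rtts + request_rtts


print(total_rtts(10, persistent=False))  # 20
print(total_rtts(10, persistent=True))   # 11
```

The saving grows with the number of embedded objects, since only the handshake term differs between the two cases.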
Non-persistent Vs Persistent
• Advantages of a persistent connection include:
• Savings in CPU time in routers and hosts.
• Reduction in network congestion.
• Reduction in latency on subsequent requests.
• Persistent HTTP connections can be used with or without pipelining.
• Without pipelining, the client can send a new request only after
the previous response has been received.
• With pipelining, the client sends a request as soon as it encounters
a reference, so multiple requests can be outstanding at once.
• HTTP/1.1 supports pipelining.
HTTP Communication
• The basic HTTP communication model has four steps:
• Handshaking: Opening a TCP connection to the web server.
• Client request: After a TCP connection is created, the HTTP client
sends a request message formatted according to the rules of the HTTP
standard (an HTTP request).
• Server response: The server reads and interprets the request. It takes
action relevant to the request and creates an HTTP response message,
which it sends back to the client.
• Closing: Closing the connection (optional).
• Let us first look at what a request and a response look like (next slide).
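The four steps can be tried out locally with Python's standard library. This is only a sketch under stated assumptions: HelloHandler, fetch_once, and demo are hypothetical names, the server is a throwaway one bound to 127.0.0.1 on a free port, and the client asks for the connection to be closed so that reading until end-of-stream is enough.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: answers every GET with a tiny text body."""

    def do_GET(self):
        body = b"hello"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch_once(host, port, path="/"):
    # 1. Handshaking: open a TCP connection to the web server.
    with socket.create_connection((host, port), timeout=5) as sock:
        # 2. Client request: send a message formatted as an HTTP request.
        request = f"GET {path} HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
        sock.sendall(request.encode("ascii"))
        # 3. Server response: read until the server closes its side.
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    # 4. Closing: leaving the "with" block closes the client socket.
    return b"".join(chunks).decode("iso-8859-1")


def demo():
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0 = any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return fetch_once("127.0.0.1", server.server_port)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

The printed text starts with the status line, followed by the response headers, an empty line, and the body.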
Request Message
• After handshaking in the first step, the client (browser) requests an
object (file) from the server. This is done with a human-readable
message. Every HTTP request message has the same basic structure
as shown in Figure 4.4.
HTTP Request
Sample HTTP Request
HTTP Request
• Start line
• Every start line consists of three parts: the request method, the
Request-URI, and the HTTP version, with a single space separating
adjacent parts.
• Request method: It indicates the type of request the client wants to
send. Request methods are also simply called methods. A method makes
a message either a request or a command to the server: a request
retrieves data from the server, whereas a command tells the
server to perform a specific task.
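As a small illustration of the three-part start line, the following Python sketch splits one into its parts. The names StartLine and parse_start_line are hypothetical, and the check is deliberately minimal (it is not a full HTTP parser).

```python
from typing import NamedTuple


class StartLine(NamedTuple):
    method: str
    request_uri: str
    http_version: str


def parse_start_line(line: str) -> StartLine:
    """Split a request start line into its three space-separated parts."""
    parts = line.rstrip("\r\n").split(" ")
    if len(parts) != 3:
        raise ValueError(f"expected 3 parts, got {len(parts)}: {line!r}")
    method, request_uri, http_version = parts
    if not http_version.startswith("HTTP/"):
        raise ValueError(f"not an HTTP version: {http_version!r}")
    return StartLine(method, request_uri, http_version)


print(parse_start_line("GET /someDir/file.html HTTP/1.1\r\n"))
```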
Request Methods
• Method = GET | HEAD | POST | PUT | DELETE | TRACE | OPTIONS |
CONNECT | COPY | MOVE
• GET:
• Requests the server to return the resource specified by the Request-URI as
the body of a response. It is the most frequently used method on the
web.
• The message body is empty for the GET method.
• Any information is sent by appending it to the URL as name-value
pairs.
Request Methods
• HEAD: Requests the server to return the same HTTP header fields that
would be returned for a GET request, but without the message body.
It is useful for inspecting the characteristics of a (possibly large)
resource without actually downloading it.
• POST: Requests the server to pass the body of this request message as
data to be processed by the resource specified by the Request-URI.
The actual information is included in the body of the request
message instead of being appended to the URL as in the GET
method.
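The difference between GET and POST can be seen by sending the same name-value pairs both ways. The sketch below only builds the request text; the helper names, the host www.example.com, the path /register, and the form fields are all made up for illustration.

```python
from urllib.parse import urlencode


def build_get(path: str, host: str, fields: dict) -> str:
    """GET: the name-value pairs travel in the URL; the body is empty."""
    query = urlencode(fields)
    return f"GET {path}?{query} HTTP/1.1\r\nHost: {host}\r\n\r\n"


def build_post(path: str, host: str, fields: dict) -> str:
    """POST: the same pairs travel in the message body instead."""
    body = urlencode(fields)
    return (
        f"POST {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Content-Type: application/x-www-form-urlencoded\r\n"
        f"Content-Length: {len(body.encode('ascii'))}\r\n"
        "\r\n"
        f"{body}"
    )


form = {"name": "Ann", "course": "web tech"}  # hypothetical form data
print(build_get("/register", "www.example.com", form))
print(build_post("/register", "www.example.com", form))
```

In the GET request the pairs are visible in the start line, while in the POST request the start line carries only the path and the pairs follow the empty line that ends the headers.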
Request Methods
• PUT: Requests the server to store the body of this message on the server
and assign the specified Request-URI to the stored data, so that future
GET requests containing this Request-URI receive this
data in their responses. It is used to upload a new resource
or replace an existing document. The actual document is carried in
the body.
• DELETE: Requests the server to respond to future HTTP request messages
that contain the specified Request-URI with a response indicating that
there is no resource associated with this Request-URI.
Request Methods
• TRACE: Requests the server to return a copy of the complete HTTP request
message, including start line, header fields, and body, as received by the
server.
• OPTIONS: Requests the server to return a list of HTTP methods that may
be used to access the resource specified by the Request-URI. It is
usually used to check whether a server is functioning properly before
performing other tasks.
Request Methods
• CONNECT: It is used to convert the request connection into a
transparent TCP/IP tunnel. It is usually done to facilitate Secure Sockets
Layer (SSL) encrypted communication, such as HTTPS, through an
unencrypted HTTP proxy server.
• COPY: HTTP may be used to copy a file from one location
to another. The COPY method (a WebDAV extension rather than a core
HTTP method) is used for this purpose. The URL in the request line
specifies the location of the source file. The location of the target
file is specified in a header (Destination in WebDAV). Like other
methods that change files on the server, it is a security risk if
left open to untrusted clients.
• MOVE: It is similar to the COPY method except that it also deletes the source
file.
Request
• Request-URI: The second part of the start line is known as the
Request-URI. It identifies the resource to which the request is to
be applied.
• HTTP version: The third part of the start line names the protocol
version of the message, for example HTTP/1.1.
Headers
• HTTP headers are a form of message metadata.
• Headers are used
• to construct sophisticated applications that establish and maintain
sessions,
• set caching policies,
• control authentication, and
• implement business logic, as they collectively specify the
characteristics of the resource requested and the data that is
provided.
Headers
• The HTTP protocol specification makes a clear distinction between
general headers, request headers, response headers, and entity
headers.
• Format: the header section consists of one or more lines. Each line is a
single header of the following form:
• Header-name: Header-value
• Each header line consists of a header name followed by a colon (:),
one or more spaces, and the header value.
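The "Header-name: Header-value" form is simple enough to parse by hand. The following Python sketch is one way to do it; parse_headers is a hypothetical helper, it lowercases names because header names are case-insensitive, and it simply keeps the last value if a header is repeated.

```python
def parse_headers(lines):
    """Turn 'Header-name: Header-value' lines into a dict keyed by lowercase name."""
    headers = {}
    for line in lines:
        # Split at the first colon only: values such as dates contain colons too.
        name, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"missing colon in header line: {line!r}")
        headers[name.strip().lower()] = value.strip()
    return headers


sample = [
    "Host: www.netsurf.com",
    "Accept: text/plain",
    "Date: Mon, 12 Sep 2016 16:09:05 GMT",
]
print(parse_headers(sample))
```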
General Headers
• General headers do not describe the body of the message. They provide information about the
message itself rather than the content it carries. They are primarily used to specify how
messages are processed and handled.
• Some of the general headers with their brief description are given in the following:
• Date: Mon, 12 Sep 2016 16:09:05 GMT
• This header specifies the time and date of the creation of the message.
• Connection: close
• This header indicates whether the client or server that generated the message intends to keep the
connection open.
• Warning: Danger. This site may be hacked!
• This header stores text for human consumption, something that would be
useful when tracing a problem.
• Cache-Control: no-cache
• This header indicates whether caching should be used.
Request Header
• A request header allows the client to pass additional information about
itself and about the request, such as the data format that the client expects.
• Some of the request headers with their brief description are given in the
following:
• User-Agent: Mozilla/4.75
• This header identifies the software (e.g., a web browser) responsible
for making the request.
• Host: www.netsurf.com
• This header was introduced to support virtual hosting, a
feature that allows a web server to service more than one domain.
• Referer: http://wwwdtu.ac.in/~akshi/index.html
• This header provides the server with context information about the request.
If the request came about because a user clicked on a link found on a web page,
this header contains the URL of that referring page.
• Accept: text/plain
• This header specifies the format of the media that the client
can accept.
Entity Header
• An entity header carries information about the message body or, when a
request message has no body, about the target resource.
• Some of the entity headers with their brief description are:
• Content-Type: mime-type/mime-subtype
• This header specifies the MIME type of the content of the message body.
• Content-Length: 546
• This optional header provides the length of the message body.
Although it is optional, it is useful for clients such as web browsers that wish to show
the progress of a request. Where this header is omitted, the browser can
only display the amount of data downloaded. But when the header is included, the browser
can display the amount of data as a percentage of the total size of the message body.
• Last-Modified: Sun, 11 Sep 2016 13:28:31 GMT
• This header provides the last modification date of the content that is transmitted
in the body of the message. It is critical for the proper functioning of caching mechanisms.
• Allow: GET, HEAD, POST
• This header specifies the list of valid methods that can be
applied to a URL.
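The progress behaviour described for Content-Length can be sketched in a few lines of Python. The function progress_label and its output wording are hypothetical; the headers are assumed to be in a dict with lowercase names.

```python
def progress_label(bytes_received: int, headers: dict) -> str:
    """Describe download progress, using Content-Length when it is present."""
    length = headers.get("content-length")
    if length is None:
        # Without the header, only the amount downloaded so far is known.
        return f"{bytes_received} bytes"
    total = int(length)
    if total == 0:
        return "100% of 0 bytes"
    percent = 100 * bytes_received // total
    return f"{percent}% of {total} bytes"


print(progress_label(273, {"content-length": "546"}))  # 50% of 546 bytes
print(progress_label(273, {}))                         # 273 bytes
```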
Message Body
• The message body is optional for an HTTP message. If it is
present, it carries the entity-body associated with the
request.
• If an entity-body is present, the Content-Type and
Content-Length header lines usually specify the nature of the
body.
• The message body carries the actual HTTP request data
(including form data, uploaded files, etc.).
• An example of an HTTP request message with no message body is given
in Figure 4.5.
HTTP Request
HTTP Response
• A server responds to the client's HTTP request by providing the
requested document (object, entity), for example by transferring an HTML file.
• Before the actual data, the server sends a status code (e.g., 202,
“Accepted”), as well as information about the document and the
server.
• Similar to an HTTP request message, an HTTP response message
consists of a status line, header fields, and the body of the response,
in the following format (Figure 4.6).
HTTP Response-Status Line
• The status line consists of three parts: HTTP version, status code, and status
phrase. Adjacent parts are separated by a space.

• Status code: It is a three-digit code that indicates the status of the response.
The status codes are classified with respect to their functionality into five
groups as follows:
• 1xx series (Informational)
• 2xx series (Success)
• 3xx series (Redirection)
• 4xx series (Client error)
• 5xx series (Server error)
• Status phrase: It is also known as the Reason-Phrase and is intended to give a
short textual description of the status code.
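A status line can be split into its three parts, and the first digit of the status code gives its group. The Python sketch below shows one way to do both; parse_status_line and status_class are hypothetical helper names.

```python
STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client error",
    5: "Server error",
}


def parse_status_line(line: str):
    """Split a status line into (HTTP version, status code, status phrase)."""
    # Split at most twice: the status phrase itself may contain spaces.
    parts = line.rstrip("\r\n").split(" ", 2)
    if len(parts) < 2:
        raise ValueError(f"malformed status line: {line!r}")
    phrase = parts[2] if len(parts) == 3 else ""
    return parts[0], int(parts[1]), phrase


def status_class(code: int) -> str:
    """Map a three-digit status code to its group using the first digit."""
    try:
        return STATUS_CLASSES[code // 100]
    except KeyError:
        raise ValueError(f"not a valid HTTP status code: {code}") from None


version, code, phrase = parse_status_line("HTTP/1.1 404 Not Found\r\n")
print(version, code, phrase, "->", status_class(code))
```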
Headers
• Response headers help the server pass additional information about the response that cannot be
inferred from the status code alone, such as information about the server and the data being sent. Some
response headers are:
• Location: http://www.mywebsite.com/relocatedPage.html
• This header specifies a URL to which the client should redirect its original request. It
accompanies the “301” and “302” status codes that direct clients to try a new location.
• WWW-Authenticate: Basic
• This header accompanies the “401” status code, which indicates an authentication challenge. It specifies
the authentication scheme that should be used to access the requested entity. In the case of web
browsers, the combination of the “401” status code and the WWW-Authenticate header causes users to
be prompted for IDs and passwords.
• Server: Apache/1.2.5
• This header is not tied to a particular status code. It is an optional header that identifies the server
software.
• Age: 22
• This header specifies the age of the resource in the proxy cache in seconds.
Message Body
• Similar to HTTP request messages, the message body in an HTTP
response message is also optional. The message body carries the
actual HTTP response data from the server (including files, images,
and so on) to the client.
Sample HTTP Response
HTTP Secure
• HTTPS is a way to communicate securely over the Internet.
• It was developed by Netscape. Strictly speaking, it is not a separate
protocol but the combination of HTTP with the SSL/TLS (Secure Sockets
Layer/Transport Layer Security) protocol.
• It is also called secure HTTP, as it sends and receives everything in
encrypted form, adding an element of safety.
HTTPS
• HTTPS is often used to protect highly confidential online transactions like online
banking and online shopping order forms.
• The use of HTTPS protects against eavesdropping and man-in-the-middle attacks.
• While using HTTPS, servers and clients still speak exactly the same HTTP to each
other, but over a secure SSL/TLS connection that encrypts and decrypts their
requests and responses.
• The SSL layer has two main purposes:
• Verifying that you are talking directly to the server that you think you are talking
to.
• Ensuring that only the server can read what you send it, and only you can read
what it sends back.
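The first purpose (checking that the server is who it claims to be) shows up in how a client configures its TLS layer. As a hedged illustration, Python's standard ssl module creates a client context that requires a valid certificate and checks the host name by default; describe_default_tls_context is a hypothetical helper that only inspects those settings and opens no connection.

```python
import ssl


def describe_default_tls_context() -> dict:
    """Report the identity checks enabled in Python's default client context."""
    context = ssl.create_default_context()
    return {
        # The server must present a certificate that chains to a trusted CA.
        "verifies_certificate": context.verify_mode == ssl.CERT_REQUIRED,
        # The certificate must also match the host name being contacted.
        "checks_hostname": context.check_hostname,
    }


print(describe_default_tls_context())
```

The second purpose, confidentiality, comes from the encryption negotiated when a socket is actually wrapped with such a context.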
HTTPS
• Let us understand the working process of HTTPS with an example:
• Suppose you visit a web site to view their online catalog. When you’re ready to order, you
will be given a web page order form with a Uniform Resource Locator (URL) that starts
with https://.
• When you click “Send,” to send the page back to the catalog retailer, your browser’s
HTTPS layer will encrypt it.
• The acknowledgement you receive from the server will also travel in encrypted form,
arrive with an https:// URL, and be decrypted for you by your browser’s HTTPS sub-layer.
• The effectiveness of HTTPS can be limited by poor implementation of browser or server
software or a lack of support for some algorithms.
• Furthermore, although HTTPS secures data as it travels between the server and the client,
once the data is decrypted at its destination, it is only as secure as the host computer.
HTTP State Retention : Cookies
• HTTP is a stateless protocol.
• Cookies are an application-based solution to provide state retention over a stateless
protocol.
• They are small pieces of information that are sent in response from the web server to the
client.
• Cookies are the simplest technique used for storing client state.
• A cookie is also known as HTTP cookie, web cookie, or browser cookie.
• Cookies are not software; they cannot be programmed, cannot carry viruses, and cannot
install malware on the host computer.
• However, they can be used by spyware to track a user’s browsing activities.
• Cookies are stored on a client’s computer.
• They have a lifespan and are destroyed by the client browser at the end of that lifespan.
• A cookie is a combination of name and value, as in (name, value).
• A web application can generate multiple cookies, set their life spans in
terms of how many seconds each of them should be alive, and send
them back to a web browser as part of an HTTP response.
• If cookies are allowed, a web browser will save all cookies on its hosting
computer, along with their originating URLs and life spans.
• When an HTTP request is sent from a web browser of the same type on the
same computer to a web site, all live cookies originated from that web site
will be sent to the web site as part of the HTTP request.
• Therefore, session data can be stored in cookies, which lets state be
retained across requests.
• An HTTP cookie is a small piece of data that is provided by the server in an
HTTP response header, stored by the client, and returned to the server
in an HTTP request header.
• This is the simplest approach to maintain session data. Since the web
server doesn’t need to commit any resources for the session data, this
is the most scalable approach to support session data for a large number
of users.
• But it is not secure or efficient for cookies to go between a web
browser and a web site for every HTTP request, and hackers could
eavesdrop for the session data along the Internet path.
• FUN FACT: Cookies are browser specific. Each browser stores the
cookies in a different location.
• A cookie created in one browser (Google Chrome) cannot be
accessed by another browser (Internet Explorer or Firefox).
• Most browsers restrict the length of the text
stored in a cookie. The limit is generally 4096 bytes (4 KB), but it can vary from
browser to browser.
• Some browsers limit the number of cookies stored by each domain
(for example, 20 cookies). If the limit is exceeded, new cookies replace the
old cookies.
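The cookie round trip described above (server sets a name-value pair with a lifespan, browser sends it back) can be sketched with Python's standard http.cookies module. The helper names, the cookie names sessionId and theme, and their values are made up for illustration.

```python
from http.cookies import SimpleCookie


def make_set_cookie_value(name: str, value: str, max_age_seconds: int) -> str:
    """Server side: build the value of a Set-Cookie response header."""
    cookie = SimpleCookie()
    cookie[name] = value
    cookie[name]["max-age"] = max_age_seconds  # lifespan in seconds
    cookie[name]["path"] = "/"
    return cookie[name].OutputString()


def parse_cookie_value(header_value: str) -> dict:
    """Server side: read the name-value pairs from a Cookie request header."""
    cookie = SimpleCookie()
    cookie.load(header_value)
    return {name: morsel.value for name, morsel in cookie.items()}


# The server sends this in its response ...
print("Set-Cookie:", make_set_cookie_value("sessionId", "abc123", 3600))
# ... and the browser returns the live cookies with later requests.
print(parse_cookie_value("sessionId=abc123; theme=dark"))
```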
