Unit 2 - HTTP
Note: Round-trip time (RTT) is the time, in milliseconds (ms), that a network
request takes to go from a starting point to a destination and back again.
Non-persistent Vs Persistent
• Advantages of a persistent connection include:
• savings in CPU time in routers and hosts,
• reduced network congestion, and
• reduced latency on subsequent requests.
• Persistent HTTP connections can be used with or without pipelining.
• Without pipelining, the client can send a new request only after the
previous response has been received.
• With pipelining, the client sends a request as soon as it encounters a
reference, so several requests and responses can be in flight at once.
• HTTP/1.1 supports pipelining.
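As a rough illustration of the latency benefit, the sketch below counts round trips under a simplified textbook model. It assumes at least one object is fetched, that each request/response exchange costs one RTT, that opening a TCP connection costs one extra RTT, that transmission time is ignored, and that all object references are known up front. The function names are hypothetical.

```python
def non_persistent_rtts(num_objects: int) -> int:
    """New TCP connection per object: one handshake RTT plus one request RTT each."""
    return 2 * num_objects


def persistent_rtts(num_objects: int) -> int:
    """One handshake RTT, then one request/response RTT per object, one at a time."""
    return 1 + num_objects


def pipelined_rtts(num_objects: int) -> int:
    """One handshake RTT, then all requests sent back to back: about one more RTT."""
    return 2  # independent of num_objects under this simplified model


if __name__ == "__main__":
    for n in (1, 5, 10):
        print(n, non_persistent_rtts(n), persistent_rtts(n), pipelined_rtts(n))
```

Under these assumptions, the gap between the three modes grows with the number of objects on a page.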
HTTP Communication
• The basic HTTP communication model has four steps:
• Handshaking: Opening a TCP connection to the web server.
• Client request: After a TCP connection is created, the HTTP client
sends a request message formatted according to the rules of the HTTP
standard—an HTTP Request.
• Server response: The server reads and interprets the request. It takes
action relevant to the request and creates an HTTP response message,
which it sends back to the client.
• Closing: Closing the connection (optional).
Let us first understand what a request and a response look like (next slide).
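The four steps above can be sketched with a raw socket. This is a minimal illustration rather than production code: HelloHandler, fetch_once, and the throwaway server on 127.0.0.1 are hypothetical stand-ins, used so that the example runs without Internet access.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: answers every GET with a short text body."""

    def do_GET(self):
        body = b"hello"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch_once() -> bytes:
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address
    try:
        # Step 1, handshaking: open a TCP connection to the web server.
        with socket.create_connection((host, port), timeout=5) as sock:
            # Step 2, client request: send an HTTP request message.
            request = f"GET / HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
            sock.sendall(request.encode("ascii"))
            # Step 3, server response: read until the server closes its side.
            chunks = []
            while True:
                data = sock.recv(4096)
                if not data:
                    break
                chunks.append(data)
        # Step 4, closing: leaving the with-block closes the client socket.
        return b"".join(chunks)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(fetch_once().decode("iso-8859-1"))
```

Calling fetch_once() returns the raw bytes of the response: a status line, the header lines, a blank line, and the body.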
Request Message
• After handshaking in the first step, the client (browser) requests an
object (file) from the server. This is done with a human-readable
message. Every HTTP request message has the same basic structure
as shown in Figure 4.4.
HTTP Request
Sample HTTP Request
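A sample request can be assembled by hand to show the basic structure. The sketch below assumes the standard HTTP/1.1 layout (start line, header lines, a blank line, then an optional body); it is not a copy of Figure 4.4. The helper build_request and the host www.example.com are hypothetical.

```python
def build_request(method: str, uri: str, headers: dict[str, str], body: str = "") -> str:
    """Assemble a start line, header lines, a blank line, and an optional body."""
    start_line = f"{method} {uri} HTTP/1.1"
    header_lines = [f"{name}: {value}" for name, value in headers.items()]
    # CRLF ends every line; the empty string yields the blank line before the body.
    return "\r\n".join([start_line, *header_lines, "", body])


if __name__ == "__main__":
    sample = build_request(
        "GET",
        "/index.html",
        {"Host": "www.example.com", "Accept": "text/html"},
    )
    print(sample.replace("\r\n", "\n"))
```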
HTTP Request
• Start line
• Every start line consists of three parts: the request method, the
Request-URI, and the HTTP version, with a single space separating
adjacent parts.
• Request method: It indicates the type of request the client wants to
send. The method makes a message either a request or a command to
the server: a request retrieves data from the server, whereas a
command tells the server to perform a specific task.
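Because the three parts of the start line are separated by single spaces, splitting it is straightforward. The following is a minimal sketch; StartLine and parse_start_line are hypothetical names, and the strict three-part check is an assumption made to keep the example simple.

```python
from typing import NamedTuple


class StartLine(NamedTuple):
    method: str
    request_uri: str
    version: str


def parse_start_line(line: str) -> StartLine:
    """Split a request start line into method, Request-URI, and HTTP version."""
    parts = line.rstrip("\r\n").split(" ")
    if len(parts) != 3:
        raise ValueError(f"expected 3 space-separated parts, got {len(parts)}")
    return StartLine(*parts)


if __name__ == "__main__":
    print(parse_start_line("GET /index.html HTTP/1.1\r\n"))
```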
Request Methods
• Method = GET | HEAD | POST | PUT | DELETE | TRACE | OPTIONS |
CONNECT | COPY | MOVE
• GET:
• Requests the server to return the resource specified by the
Request-URI as the body of a response. It is the most frequently used
method on the web.
• A GET request normally has an empty message body.
• Any information is sent by appending it to the URL as name-value
pairs.
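Appending name-value pairs to a URL can be sketched with Python's standard urllib.parse module. The helper url_with_query, the host www.example.com, and the parameter names are hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def url_with_query(base_url: str, params: dict[str, str]) -> str:
    """Append name-value pairs to a URL as a query string."""
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    url = url_with_query("http://www.example.com/search", {"q": "http methods", "page": "2"})
    print(url)
    # The receiving side can recover the pairs from the query string.
    print(parse_qs(urlsplit(url).query))
```

Note that urlencode escapes characters that are not allowed in a URL, so the space in the value is sent as "+".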
Request Methods
• HEAD: Requests the server to return the same HTTP header fields that
would be returned for a GET request, but without the message body. It
is useful for inspecting the characteristics of a (possibly large)
resource without actually downloading it.
• POST: Requests the server to pass the body of this request message as
data to be processed by the resource specified by the Request-URI.
The actual information is included in the body of the request
message instead of being appended to the URL as in the GET
method.
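The difference between HEAD and POST can be seen by constructing the requests without sending them. This sketch uses urllib.request.Request from the standard library; build_head, build_form_post, the URL, and the form fields are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_head(url: str) -> Request:
    """A HEAD request: same headers as GET would produce, but no body is returned."""
    return Request(url, method="HEAD")


def build_form_post(url: str, fields: dict[str, str]) -> Request:
    """A POST request: the name-value pairs travel in the body, not in the URL."""
    body = urlencode(fields).encode("ascii")
    return Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


if __name__ == "__main__":
    post = build_form_post("http://www.example.com/order", {"user": "alice", "item": "42"})
    print(post.get_method(), post.full_url, post.data)
    head = build_head("http://www.example.com/big-file.zip")
    print(head.get_method(), head.data)
```

Nothing is sent over the network here; the Request objects only describe what would be sent.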
Request Methods
• PUT: Requests the server to store the body of this message and
associate it with the specified Request-URI, so that future GET
requests for this Request-URI receive this data in their response
messages. It is used to upload a new resource or replace an existing
document. The actual document is carried in the message body.
• DELETE: Requests the server to respond to future HTTP requests that
contain the specified Request-URI with a response indicating that
there is no resource associated with this Request-URI.
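The effect of PUT and DELETE on later GET requests can be illustrated with a toy in-memory model. ResourceStore and its methods are hypothetical, and the status codes returned (201, 200, 204, 404) are one common choice, not something the text above prescribes.

```python
class ResourceStore:
    """Toy stand-in for a server's resources, keyed by Request-URI."""

    def __init__(self) -> None:
        self._resources: dict[str, bytes] = {}

    def put(self, uri: str, body: bytes) -> int:
        """Store or replace the body under the URI."""
        created = uri not in self._resources
        self._resources[uri] = body
        return 201 if created else 200  # Created vs. OK (replaced)

    def get(self, uri: str) -> tuple[int, bytes]:
        """Return the stored body, or 404 if nothing is associated with the URI."""
        if uri in self._resources:
            return 200, self._resources[uri]
        return 404, b""

    def delete(self, uri: str) -> int:
        """Remove the association so that future GETs report no resource."""
        if self._resources.pop(uri, None) is None:
            return 404
        return 204  # No Content


if __name__ == "__main__":
    store = ResourceStore()
    print(store.put("/notes.txt", b"first draft"))
    print(store.get("/notes.txt"))
    print(store.delete("/notes.txt"))
    print(store.get("/notes.txt"))
```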
Request Methods
• TRACE: Requests the server to return a copy of the complete HTTP
request message, including start line, header fields, and body, as
received by the server.
• OPTIONS: Requests the server to return a list of HTTP methods that may
be used to access the resource specified by the Request-URI. It can
be used to check whether a server is functioning properly before
performing other tasks.
Request Methods
• CONNECT: It is used to turn the request connection into a
transparent TCP/IP tunnel. This is usually done to carry Secure Sockets
Layer (SSL) encrypted communication, such as HTTPS, through an
unencrypted HTTP proxy server.
• COPY: Copies a resource from one location to another. The URL in the
request line specifies the location of the source file. The location of
the target file is specified in a request header (Destination, in the
WebDAV extension that defines COPY and MOVE). This method can also
be a security risk if it is left enabled without access control.
• MOVE: It is similar to the COPY method except that it deletes the source
file.
Request
• Request-URI: The second part of the start line is known as the
Request-URI. It identifies the resource to which the request
applies.
• HTTP version: The third part gives the protocol version. HTTP/1.1 is
used here; newer versions (HTTP/2 and HTTP/3) also exist.
Headers
• HTTP headers are a form of message metadata.
• Headers are used
• to construct sophisticated applications that establish and maintain
sessions,
• set caching policies,
• control authentication, and
• implement business logic, as they collectively specify the
characteristics of the resource requested and the data that is
provided.
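Since headers are plain "name: value" lines, they can be read into a lookup structure with the standard library. The sketch below reuses Python's email parser, which handles this line format; parse_headers and the sample header values are hypothetical.

```python
from email.message import Message
from email.parser import Parser


def parse_headers(header_block: str) -> Message:
    """Parse 'Name: value' lines; lookups on the result ignore header-name case."""
    return Parser().parsestr(header_block, headersonly=True)


if __name__ == "__main__":
    headers = parse_headers(
        "Host: www.example.com\r\n"
        "Cache-Control: no-cache\r\n"
        "Accept: text/html\r\n"
        "\r\n"
    )
    print(headers["host"])           # session, caching, and content negotiation
    print(headers["Cache-Control"])  # decisions are all driven by such fields
    print(headers.keys())
```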
Headers
• The HTTP protocol specification makes a clear distinction between
general headers, request headers, response headers, and entity
headers. The format
• Status code: It is a three-digit code that indicates the status of the response.
The status codes are classified with respect to their functionality into five
groups as follows:
• 1xx series (Informational)
• 2xx series (Success)
• 3xx series (Redirection)
• 4xx series (Client error)
• 5xx series (Server error)
• Status phrase: It is also known as Reason-phrase and is intended to give a
Headers
• Response headers help the server to pass additional information about the response that cannot be
inferred from the status code alone, such as information about the server and the data being sent. Some
response headers are:
• Location: http://www.mywebsite.com/relocatedPage.html
• This header specifies a URL to which the client should redirect its original request. It
accompanies the “301” and “302” status codes, which direct clients to try a new location.
• WWW-Authenticate: Basic
• This header accompanies the “401” status code, which indicates an authentication challenge. It specifies
the authentication scheme that should be used to access the requested entity. In the case of web
browsers, the combination of the “401” status code and the WWW-Authenticate header causes users to
be prompted for user IDs and passwords.
• Server: Apache/1.2.5
• This header is not tied to a particular status code. It is an optional header that identifies the server
software.
• Age: 22
• This header specifies the age of the resource in the proxy cache in seconds.
Message Body
• Similar to HTTP request messages, the message body in an HTTP
response message is also optional. The message body carries the
actual HTTP response data from the server (including files, images,
and so on) to the client.
Sample HTTP Response
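A sample response can be taken apart in the same way as a request. The sketch below assumes the standard HTTP/1.x layout (status line, header lines, a blank line, then the body) and reuses the Server and Age values mentioned earlier; parse_response, STATUS_CLASSES, and the sample body are hypothetical.

```python
STATUS_CLASSES = {
    "1": "Informational",
    "2": "Success",
    "3": "Redirection",
    "4": "Client error",
    "5": "Server error",
}


def parse_response(raw: bytes) -> dict:
    """Split a raw HTTP/1.x response into status line fields, headers, and body."""
    head, _, body = raw.partition(b"\r\n\r\n")  # the blank line ends the headers
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    parts = status_line.split(" ", 2)  # version, status code, optional phrase
    if len(parts) < 2:
        raise ValueError("malformed status line")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return {
        "version": parts[0],
        "status_code": int(parts[1]),
        "status_class": STATUS_CLASSES.get(parts[1][0], "Unknown"),
        "status_phrase": parts[2] if len(parts) > 2 else "",
        "headers": headers,
        "body": body,
    }


SAMPLE_RESPONSE = (
    b"HTTP/1.1 200 OK\r\n"
    b"Server: Apache/1.2.5\r\n"
    b"Age: 22\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-Length: 13\r\n"
    b"\r\n"
    b"<p>Hello</p>\n"
)

if __name__ == "__main__":
    print(parse_response(SAMPLE_RESPONSE))
```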
HTTP Secure
• HTTPS is a way of communicating securely over the Internet.
• It was developed by Netscape. It is not a separate protocol; it is
the combination of HTTP with the SSL/TLS (Secure Sockets
Layer/Transport Layer Security) protocol.
• It is also called secure HTTP, as it sends and receives everything in
encrypted form, adding an element of safety.
HTTPS
• HTTPS is often used to protect highly confidential online transactions like online
banking and online shopping order forms.
• The use of HTTPS protects against eavesdropping and man-in-the-middle attacks.
• With HTTPS, servers and clients still speak exactly the same HTTP to each
other, but over a secure SSL/TLS connection that encrypts and decrypts their
requests and responses.
• The SSL layer has two main purposes:
• Verifying that you are talking directly to the server that you think you are talking
to.
• Ensuring that only the server can read what you send it, and only you can read
what it sends back.
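On the client side, the first purpose (checking that you are talking to the intended server) corresponds to certificate verification and hostname checking. As a minimal sketch, Python's standard ssl module enables both in its default client context; make_client_context is a hypothetical helper name, and no connection is opened here.

```python
import ssl


def make_client_context() -> ssl.SSLContext:
    """Create a client-side TLS context with the library's default settings."""
    return ssl.create_default_context()


if __name__ == "__main__":
    context = make_client_context()
    # The server must present a certificate that chains to a trusted authority.
    print(context.verify_mode == ssl.CERT_REQUIRED)
    # The certificate must also match the host name the client asked for.
    print(context.check_hostname)
```

A real client would then wrap its TCP socket with this context before sending any HTTP request, so that the request and response travel encrypted.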
HTTPS
• Let us understand the working process of HTTPS with an example:
• Suppose you visit a website to view its online catalog. When you’re ready to order, you
will be given a web page order form with a Uniform Resource Locator (URL) that starts
with https://.
• When you click “Send,” to send the page back to the catalog retailer, your browser’s
HTTPS layer will encrypt it.
• The acknowledgement you receive from the server will also travel in encrypted form,
arrive with an https:// URL, and be decrypted for you by your browser’s HTTPS sub-layer.
• The effectiveness of HTTPS can be limited by poor implementation of browser or server
software or a lack of support for some algorithms.
• Furthermore, although HTTPS secures data as it travels between the server and the client,
once the data is decrypted at its destination, it is only as secure as the host computer.
HTTP State Retention : Cookies
• HTTP is a stateless protocol.
• Cookies are an application-based solution to provide state retention over a stateless
protocol.
• They are small pieces of information that are sent in response from the web server to the
client.
• Cookies are the simplest technique used for storing client state.
• A cookie is also known as HTTP cookie, web cookie, or browser cookie.
• Cookies are not software; they cannot be programmed, cannot carry viruses, and cannot
install malware on the host computer.
• However, they can be used by spyware to track a user’s browsing activities.
• Cookies are stored on a client’s computer.
• They have a lifespan and are destroyed by the client browser at the end of that lifespan.
• A cookie is a combination of name and value, as in (name, value).
• A web application can generate multiple cookies, set the lifespan of each
one (for example, as a number of seconds through the Max-Age attribute),
and send them to a web browser as part of an HTTP response.
• If cookies are allowed, a web browser will save all cookies on its hosting
computer, along with their originating URLs and life spans.
• When an HTTP request is sent from a web browser of the same type on the
same computer to a web site, all live cookies originated from that web site
will be sent to the web site as part of the HTTP request.
• Therefore, session data can be stored in cookies, which allows state to be
retained across requests.
• An HTTP cookie is a small file that is provided by the server in an
HTTP response header (Set-Cookie), stored by the client, and returned
to the server in an HTTP request header (Cookie).
• This is the simplest approach to maintain session data. Since the web
server doesn’t need to commit any resources for the session data, this
is the most scalable approach to support session data of large number
of users.
• But it is neither secure nor efficient for cookies to travel between a
web browser and a web site with every HTTP request, and attackers could
eavesdrop on the session data along the Internet path.
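The cookie exchange described above, where the server sets a cookie in a response header and the browser returns it in a request header, can be sketched with Python's standard http.cookies module. The helpers make_set_cookie and read_cookie_header, and the cookie names and values, are hypothetical.

```python
from http.cookies import SimpleCookie


def make_set_cookie(name: str, value: str, max_age_seconds: int) -> str:
    """Server side: build the Set-Cookie response header line for one cookie."""
    cookie = SimpleCookie()
    cookie[name] = value
    cookie[name]["max-age"] = max_age_seconds  # lifespan, in seconds
    cookie[name]["path"] = "/"
    return cookie.output(header="Set-Cookie:")


def read_cookie_header(header_value: str) -> dict[str, str]:
    """Server side: recover (name, value) pairs from a Cookie request header."""
    cookie = SimpleCookie()
    cookie.load(header_value)
    return {name: morsel.value for name, morsel in cookie.items()}


if __name__ == "__main__":
    print(make_set_cookie("session_id", "abc123", 3600))
    print(read_cookie_header("session_id=abc123; theme=dark"))
```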
• FUN FACT: Cookies are browser specific. Each browser stores the
cookies in a different location.
• A cookie created in one browser (Google Chrome) will not be
accessed by another browser (Internet Explorer or Firefox).
• Most browsers restrict the length of the text stored in a cookie. The
limit is generally 4096 bytes (4 KB), but it can vary from
browser to browser.
• Some browsers limit the number of cookies stored by each domain
(20 cookies). If the limit is exceeded, the new cookies will replace the
old cookies.
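The two limits above can be modelled in a few lines. This is only a sketch under stated assumptions: the 4096-byte and 20-cookie figures are the general values quoted above, the size is measured on "name=value" alone, and evicting the oldest cookie first is an assumed policy, since real browsers differ. The names fits_size_limit and store_cookie are hypothetical.

```python
COOKIE_SIZE_LIMIT = 4096      # bytes; general figure, varies by browser
MAX_COOKIES_PER_DOMAIN = 20   # general figure, varies by browser


def fits_size_limit(name: str, value: str, limit: int = COOKIE_SIZE_LIMIT) -> bool:
    """Check the encoded size of 'name=value' against the size limit."""
    return len(f"{name}={value}".encode("utf-8")) <= limit


def store_cookie(
    jar: dict[str, str],
    name: str,
    value: str,
    max_cookies: int = MAX_COOKIES_PER_DOMAIN,
) -> None:
    """Add a cookie to one domain's jar, evicting the oldest entry when full."""
    if not fits_size_limit(name, value):
        raise ValueError("cookie exceeds the size limit")
    if name not in jar and len(jar) >= max_cookies:
        oldest = next(iter(jar))  # dicts preserve insertion order
        del jar[oldest]
    jar[name] = value


if __name__ == "__main__":
    jar: dict[str, str] = {}
    for i in range(21):
        store_cookie(jar, f"c{i}", "v")
    print(len(jar), "c0" in jar, "c20" in jar)
```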