Lect15 - HTTP
Objectives:
Learn about the two main versions of the HTTP protocol, namely:
o HTTP 1.0 (RFC 1945) and
o HTTP 1.1 (RFC 2616)
Internet Protocols
Standard Internet protocols evolve through a community process
called Request for Comments (RFC), through which all members of
the Internet community can participate.
HTTP Protocol
HTTP is an application-level protocol based on client-server
architecture, designed for delivering hypermedia information on the
web.
The first version of the protocol was version 0.9. However, the
two versions now in use are version 1.0 and version 1.1.
The following sections briefly discuss these two versions. For details
about the protocols refer to the relevant links for the RFCs given
above.
[Figure: structure of a request message: Request-Line, Headers, Message-body]
Each client request message has the following format, where each
line ends with CRLF ("\r\n"):
method request-URI HTTP-version (request-line)
headers (0 or more lines)
<blank line> (CRLF)
message-body (only if a POST method)
request-URI
This specifies the full path of the resource relative to the server.
e.g.:
/swe344/lectures/lecture1.html
HTTP-version
This specifies the version of HTTP protocol that the client is able
to handle. The values are: HTTP/1.0 or HTTP/1.1
Header Lines
o Header lines provide information about the request or
response, or about the object sent in the message body.
o The header lines are in the form "Header-Name: value",
ending with CRLF.
o The header name is not case-sensitive (but the value may
be).
o Any number of spaces or tabs may be between the ":" and
the value.
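The header-line rules above can be sketched in a few lines of Python. This is only an illustration: the function name parse_header_line is hypothetical, and lower-casing the name is just one way to get case-insensitive comparison.

```python
def parse_header_line(line: str) -> tuple:
    """Split one 'Header-Name: value' line into (name, value)."""
    # Drop the trailing CRLF, then split at the first colon only,
    # since the value itself may contain colons (e.g. a time of day).
    name, sep, value = line.rstrip("\r\n").partition(":")
    if not sep:
        raise ValueError(f"not a header line: {line!r}")
    # Header names are not case-sensitive, so normalise them to lower
    # case; spaces or tabs between ":" and the value are not significant.
    return name.strip().lower(), value.strip(" \t")


print(parse_header_line("Content-Type:   text/html\r\n"))
print(parse_header_line("USER-AGENT:\tMozilla/2.02Gold\r\n"))
```

Note that the value is returned unchanged apart from the surrounding whitespace, because the value may be case-sensitive.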
Example of a request:
GET /index.html HTTP/1.0
User-Agent: Mozilla/2.02Gold
Accept: image/gif, image/jpeg, */*
<blank line here>
[Figure: structure of a response message: Status-Line, Headers, Message-body]
Example of a response:
HTTP/1.0 200 OK
Date: Fri, 31 Dec 1999 23:59:59 GMT
Content-Type: text/html
Content-Length: 1354
<blank line here>
<html>
<body>
<h1>Happy New Millennium!</h1>
(more file contents)
.
.
</body>
</html>
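As a rough sketch of how a client might take such a response apart, the Python below splits the raw bytes into status line, headers and message body at the first blank line. The function name split_response and the shortened sample body are assumptions made for illustration only.

```python
def split_response(raw: bytes):
    """Return (version, status_code, reason, headers, body)."""
    # The first blank line (CRLF CRLF) separates the headers from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    # Status line: HTTP-version, status code, reason phrase.
    parts = lines[0].split(" ", 2)
    version, code = parts[0], int(parts[1])
    reason = parts[2] if len(parts) > 2 else ""
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, code, reason, headers, body


sample = (b"HTTP/1.0 200 OK\r\n"
          b"Date: Fri, 31 Dec 1999 23:59:59 GMT\r\n"
          b"Content-Type: text/html\r\n"
          b"Content-Length: 13\r\n"
          b"\r\n"
          b"<html></html>")
version, code, reason, headers, body = split_response(sample)
print(version, code, reason)
print(headers["content-type"], len(body))
```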
You can use Telnet to connect to an HTTP server and see the above
examples in action.
Alternatively, you can write a TCP client using the TcpClient or the
Socket class to retrieve the document.
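// Requires: using System; using System.IO; using System.Net.Sockets;
void OnGetClicked(object sender, System.EventArgs e)
{
    string url = urlBox.Text;
    int doubleSlashIndex = url.IndexOf("//");
    if (doubleSlashIndex >= 0) { // remove protocol part, e.g. "http://"
        url = url.Substring(doubleSlashIndex + 2);
    }
    // Assumption: the connection code below is reconstructed; port 80 is assumed
    int slashIndex = url.IndexOf('/');
    string host = slashIndex < 0 ? url : url.Substring(0, slashIndex);
    string path = slashIndex < 0 ? "/" : url.Substring(slashIndex);
    using (TcpClient client = new TcpClient(host, 80))
    using (NetworkStream stream = client.GetStream())
    {
        StreamWriter writer = new StreamWriter(stream);
        StreamReader reader = new StreamReader(stream);
        // Send a minimal HTTP 1.0 request; the blank line ends the headers
        writer.Write("GET " + path + " HTTP/1.0\r\nHost: " + host + "\r\n\r\n");
        writer.Flush();
        string input;
        while ((input = reader.ReadLine()) != null) {
            resultBox.Text += input + "\r\n";
        }
    }
}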
HTTP 1.1 has recently been defined to address new needs and
overcome shortcomings of HTTP 1.0. Improvements include:
o Faster response, by allowing multiple transactions to take
place over a single persistent connection.
o Faster response and great bandwidth savings, by adding
cache support.
o Faster response for dynamically-generated pages, by
supporting chunked encoding, which allows a response to
be sent before its total length is known.
o Efficient use of IP addresses, by allowing multiple domains to
be served from a single IP address.
HTTP 1.1 requires a few extra things from both clients and servers
as explained below.
2.1 HTTP 1.1 Clients
Note: a ":80" port suffix isn't required in the Host header value,
since 80 is the default HTTP port.
Host is the only required header in an HTTP 1.1 request. It's also
the most urgently needed new feature in HTTP 1.1. Without it,
each host name requires a unique IP address, and we're quickly
running out of IP addresses with the explosion of new domains.
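A minimal sketch of a client building an HTTP 1.1 request with the required Host header might look like the Python below. The function name build_get_request and the host www.example.com are hypothetical; only the request-line format and the Host rule come from the notes.

```python
def build_get_request(host: str, path: str = "/", port: int = 80) -> bytes:
    """Build a minimal HTTP 1.1 GET request that includes the Host header."""
    # The port is only added to the Host value when it is not the default, 80.
    host_value = host if port == 80 else f"{host}:{port}"
    lines = [
        f"GET {path} HTTP/1.1",   # request-line
        f"Host: {host_value}",    # the one header HTTP 1.1 requires
        "",                       # blank line ends the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


print(build_get_request("www.example.com", "/path/file.html").decode("ascii"))
```

Because the host name travels in the request itself, a server listening on one IP address can tell which of its domains the client wants.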
1a; ignore-stuff-here
abcdefghijklmnopqrstuvwxyz
10
1234567890abcdef
0
some-footer: some-value
another-footer: another-value
[blank line here]
Thus, the length of the text data is 42 bytes (1a + 10, in hex), and
the data itself is
abcdefghijklmnopqrstuvwxyz1234567890abcdef.
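The decoding of the chunked body above can be sketched as follows. This is an illustrative Python function under simple assumptions (the whole body is already in memory and is well formed); the name decode_chunked is hypothetical.

```python
def decode_chunked(raw: bytes):
    """Decode a chunked message body; return (data, footers)."""
    data = b""
    pos = 0
    while True:
        # Chunk-size line: a hex size, optionally followed by ";" and
        # extra information that can be ignored.
        eol = raw.index(b"\r\n", pos)
        size_field = raw[pos:eol].split(b";")[0].decode("ascii").strip()
        size = int(size_field, 16)
        pos = eol + 2
        if size == 0:
            break                  # a chunk of size 0 marks the end
        data += raw[pos:pos + size]
        pos += size + 2            # skip the chunk data and its CRLF
    # Optional footer lines follow, ended by a blank line.
    footers = {}
    while True:
        eol = raw.index(b"\r\n", pos)
        line = raw[pos:eol].decode("iso-8859-1")
        pos = eol + 2
        if not line:
            break
        name, _, value = line.partition(":")
        footers[name.strip().lower()] = value.strip()
    return data, footers


body = (b"1a; ignore-stuff-here\r\n"
        b"abcdefghijklmnopqrstuvwxyz\r\n"
        b"10\r\n"
        b"1234567890abcdef\r\n"
        b"0\r\n"
        b"some-footer: some-value\r\n"
        b"another-footer: another-value\r\n"
        b"\r\n")
data, footers = decode_chunked(body)
print(len(data), data.decode("ascii"))
print(footers)
```

A real client would read the chunks from the socket as they arrive rather than from one buffer, but the size arithmetic is the same.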
In HTTP 1.0, TCP connections are closed after each request and
response, so each resource to be retrieved requires its own
connection.
A server might close the connection before all responses are sent,
so a client must keep track of requests and resend them as
needed.
When resending, don't pipeline the requests until you know the
connection is persistent. Don't pipeline at all if you know the server
won't support persistent connections (if it uses HTTP 1.0, based on
a previous response).
A "100 Continue" response means the server has received the first
part of the request. HTTP 1.1 clients must handle the "100 Continue"
response correctly (usually by just ignoring it).
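One possible way for a client to ignore "100 Continue" is sketched below in Python: any leading interim response is dropped and the final response is returned. The function name skip_interim_responses is hypothetical, and the sketch assumes the received bytes are already buffered.

```python
def skip_interim_responses(raw: bytes) -> bytes:
    """Drop leading '100 Continue' responses and return what follows."""
    while True:
        status_line = raw.split(b"\r\n", 1)[0].decode("iso-8859-1")
        parts = status_line.split(" ", 2)
        if len(parts) < 2 or parts[1] != "100":
            return raw             # not an interim response: keep it
        # A 100 response has no message body, so it ends at the first
        # blank line; the real response starts right after it.
        _, sep, rest = raw.partition(b"\r\n\r\n")
        if not sep:
            return raw             # interim response not fully received yet
        raw = rest


received = (b"HTTP/1.1 100 Continue\r\n"
            b"\r\n"
            b"HTTP/1.1 200 OK\r\n"
            b"Content-Length: 2\r\n"
            b"\r\n"
            b"hi")
print(skip_interim_responses(received).decode("ascii"))
```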
Servers are not allowed to tolerate HTTP 1.1 requests without the
Host header. Instead, they must return a "400 Bad Request"
response.
Example:
HTTP/1.1 400 Bad Request
Content-Type: text/html
Content-Length: 111
[blank line here]
<html><body>
<h2>No Host: header received</h2>
HTTP 1.1 requests must include the Host: header.
</body></html>
This requirement applies only to clients using HTTP 1.1, not any
future version of HTTP. See next section.
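The server-side check for the Host header could be sketched like this in Python. It assumes the request has already been parsed into a version string and a dictionary of lower-cased header names; the function name check_host_header is hypothetical.

```python
def check_host_header(version: str, headers: dict):
    """Return a 400 response as bytes, or None if the request may proceed."""
    # The rule applies only to HTTP 1.1 requests.
    if version != "HTTP/1.1" or "host" in headers:
        return None
    body = (b"<html><body>\r\n"
            b"<h2>No Host: header received</h2>\r\n"
            b"HTTP 1.1 requests must include the Host: header.\r\n"
            b"</body></html>\r\n")
    head = ("HTTP/1.1 400 Bad Request\r\n"
            "Content-Type: text/html\r\n"
            f"Content-Length: {len(body)}\r\n"   # computed, not hard-coded
            "\r\n").encode("ascii")
    return head + body


print(check_host_header("HTTP/1.1", {"host": "www.example.com"}))
print(check_host_header("HTTP/1.1", {}).decode("ascii"))
```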
Also, the server should close an idle connection after some timeout
period.
When an HTTP 1.1 server receives the first line of an HTTP 1.1 (or
later) request, it must respond with either "100 Continue" or an
error.
The If-Modified-Since: header says "only send the resource if it has
changed since this date"; the If-Unmodified-Since: header says the
opposite.
Clients aren't required to use them, but HTTP 1.1 servers are
required to honor requests that do use them.
The If-Modified-Since: header is used with a GET request. If the
requested resource has been modified since the given date, ignore
the header and return the resource as you normally would. Otherwise,
return a "304 Not Modified" response, including the Date:
header and no message body, like:
HTTP/1.1 304 Not Modified
Date: Fri, 31 Dec 1999 23:59:59 GMT
[blank line here]
The If-Unmodified-Since: header is similar, but can be used with
any method. If the requested resource has not been modified since
the given date, ignore the header and return the resource as you
normally would. Otherwise, return a "412 Precondition Failed"
response, like:
HTTP/1.1 412 Precondition Failed
[blank line here]
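The two rules above can be sketched as one decision function in Python. It assumes the headers are held in a dictionary with lower-cased names and that the resource's last-modified time is a timezone-aware datetime; the name conditional_status is hypothetical.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def conditional_status(method: str, headers: dict, last_modified: datetime):
    """Return 304, 412, or None (None means: handle the request normally)."""
    # If-Modified-Since is used with GET only.
    if method == "GET" and "if-modified-since" in headers:
        since = parsedate_to_datetime(headers["if-modified-since"])
        if last_modified <= since:
            return 304     # not modified since the given date
    # If-Unmodified-Since can be used with any method.
    if "if-unmodified-since" in headers:
        since = parsedate_to_datetime(headers["if-unmodified-since"])
        if last_modified > since:
            return 412     # modified since the given date
    return None


old = datetime(1999, 12, 1, tzinfo=timezone.utc)
new = datetime(2000, 1, 2, tzinfo=timezone.utc)
date = "Fri, 31 Dec 1999 23:59:59 GMT"
print(conditional_status("GET", {"if-modified-since": date}, old))
print(conditional_status("GET", {"if-modified-since": date}, new))
print(conditional_status("POST", {"if-unmodified-since": date}, new))
```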
2.2.8 Supporting the GET and HEAD methods
To comply with HTTP 1.1, a server must support at least the GET
and HEAD methods.
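A small Python sketch of a server answering both methods is given below. A HEAD response carries the same status line and headers as the GET response but no message body. The function name build_response is hypothetical, and answering other methods with "501 Not Implemented" is an assumption of this sketch, not something stated in these notes.

```python
def build_response(method: str, body: bytes,
                   content_type: str = "text/html") -> bytes:
    """Build a 200 response for GET or HEAD on an in-memory resource."""
    if method not in ("GET", "HEAD"):
        # Assumption: methods this sketch does not implement get a 501.
        return b"HTTP/1.1 501 Not Implemented\r\nContent-Length: 0\r\n\r\n"
    head = ("HTTP/1.1 200 OK\r\n"
            f"Content-Type: {content_type}\r\n"
            f"Content-Length: {len(body)}\r\n"
            "\r\n").encode("ascii")
    # HEAD: headers only; GET: headers followed by the resource itself.
    return head if method == "HEAD" else head + body


page = b"<html><body>Hello</body></html>"
print(build_response("HEAD", page).decode("ascii"))
print(build_response("GET", page).decode("ascii"))
```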