HTTP Protocol
SISTEMAS INFORMÁTICOS I
SLIDES 12
HTTP Protocol: Architecture
HTTP connections
Non-persistent HTTP
• At most one object is sent over a TCP connection.
• HTTP/1.0 uses non-persistent HTTP.
Persistent HTTP
• Multiple objects can be sent over a single TCP connection between client and server.
• HTTP/1.1 uses persistent connections in default mode.
Non-Persistent Connections (HTTP/1.0)
RTT for an HTML Page Request
Response time:
• one RTT to initiate the TCP connection
• one RTT for the HTTP request and the first bytes of the HTTP response to return
Persistent HTTP
Note:
• this does not account for server processing
• RTT: Round-trip-time (client to server and back to client)
• TTT: Total transmission time
Persistent HTTP (HTTP/1.1)
• this implies sequential request / response
• non-pipelining: next request not sent until response received for previous request
Persistent HTTP: Pipelining
Persistent without pipelining:
• client issues new request only when previous response has been received
• one RTT for each referenced object.
Persistent with pipelining:
• default in HTTP/1.1
• client sends requests as soon as it encounters a referenced object.
• as little as one RTT for all the referenced objects.
[Figure: Persistent with Pipelining]
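The three connection modes can be compared with a back-of-the-envelope model. The sketch below is illustrative only: it assumes one base HTML page plus a number of equally sized referenced objects, no parallel connections, and no server processing time. The function names and the example numbers (100 ms RTT, 10 ms TTT, 10 objects) are assumptions, not values from the slides.

```python
def non_persistent(rtt, ttt, n_referenced):
    """Every object pays one RTT for TCP setup, one RTT for the request, plus its TTT."""
    objects = 1 + n_referenced  # base page + referenced objects
    return objects * (2 * rtt + ttt)


def persistent_without_pipelining(rtt, ttt, n_referenced):
    """One TCP setup, then one RTT (plus TTT) per object, strictly in sequence."""
    objects = 1 + n_referenced
    return rtt + objects * (rtt + ttt)


def persistent_with_pipelining(rtt, ttt, n_referenced):
    """One TCP setup, the base page, then a single RTT for all referenced objects."""
    base_page = rtt + ttt
    if n_referenced == 0:
        return rtt + base_page
    return rtt + base_page + rtt + n_referenced * ttt


if __name__ == "__main__":
    rtt_ms, ttt_ms, n = 100, 10, 10  # hypothetical values
    for model in (non_persistent,
                  persistent_without_pipelining,
                  persistent_with_pipelining):
        print(f"{model.__name__}: {model(rtt_ms, ttt_ms, n)} ms")
```

With these assumed numbers the gap comes almost entirely from the number of round trips, which is why the savings grow with the RTT and with the number of referenced objects.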
HTTP Request Message
GET /somedir/page.html HTTP/1.1
User-agent: Mozilla (compatible; MSIE 5.01, Windows NT)
Accept: text/html, image/gif, image/jpeg
Accept-language: en-us
/* a blank line */
HTTP response message
HTTP/1.1 200 OK
Connection: close
Date: Thu, 06 Aug 1998 12:00:15 GMT
Server: Apache/1.3.0 (Unix)
Last-Modified: Mon, 22 Jun 1998 …...
Content-Length: 6821
Content-Type: text/html
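A response message like the one above is plain text: a status line, header lines, a blank line, then the body. The following is a minimal parsing sketch, not a full HTTP parser. The function name `parse_response` and the placeholder body are assumptions; the Last-Modified header is left out because its value is elided in the slides.

```python
def parse_response(raw):
    """Split a raw HTTP response into status line fields, headers, and body."""
    head, _, body = raw.partition("\r\n\r\n")  # the blank line ends the headers
    lines = head.split("\r\n")
    version, status, reason = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")  # split on the first colon only
        headers[name.strip().lower()] = value.strip()
    return version, int(status), reason, headers, body


SAMPLE = (
    "HTTP/1.1 200 OK\r\n"
    "Connection: close\r\n"
    "Date: Thu, 06 Aug 1998 12:00:15 GMT\r\n"
    "Server: Apache/1.3.0 (Unix)\r\n"
    "Content-Length: 6821\r\n"
    "Content-Type: text/html\r\n"
    "\r\n"
    "data data data"  # placeholder body, much shorter than 6821 bytes
)

if __name__ == "__main__":
    version, status, reason, headers, body = parse_response(SAMPLE)
    print(version, status, reason)
    print(headers["content-type"], headers["content-length"])
```

Header names are lowercased on the way in because HTTP header names are case-insensitive.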
HTTP response status codes
200 OK
– request succeeded, requested object later in this message
301 Moved Permanently
– requested object moved, new location specified later in this message (Location:)
304 Not Modified
400 Bad Request
– request message not understood by server
403 Forbidden
404 Not Found
– requested document not found on this server
500 Internal Server Error
503 Service Unavailable
505 HTTP Version Not Supported
Try out HTTP (client side) for yourself
1. Telnet to your favorite Web server:
telnet eden.dei.uc.pt 80
2. Type in a GET HTTP request:
GET /~sdp/sumt.htm HTTP/1.0
3. Look at response message sent by HTTP server!
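The same experiment can be scripted with a plain TCP socket instead of telnet. This is a rough sketch under stated assumptions: `build_get_request` and `fetch_raw` are hypothetical helper names, and the host from the slides may no longer be reachable, so the network call is defined but not executed here.

```python
import socket


def build_get_request(path, version="HTTP/1.0"):
    """Return what one would type into telnet: a request line and a blank line."""
    return f"GET {path} {version}\r\n\r\n".encode("ascii")


def fetch_raw(host, port, request, timeout=10.0):
    """Open a TCP connection, send the request, read until the server closes."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(request)
        while True:
            data = sock.recv(4096)
            if not data:  # empty read: the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)


if __name__ == "__main__":
    print(build_get_request("/~sdp/sumt.htm"))
```

Calling `fetch_raw("eden.dei.uc.pt", 80, build_get_request("/~sdp/sumt.htm"))` would correspond to steps 1 to 3, provided that server still answers. Reading until the connection closes works here because an HTTP/1.0 server closes the connection after the response.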
HEAD Request
• A HEAD request is just like a GET request, except it asks the server to return the status line and response headers only, and not the actual resource (i.e. no message body).
• This is useful to check characteristics of a resource without actually downloading it, thus saving bandwidth. Use HEAD when you don't actually need a file's contents.
Server-Side Programs: GETs and POSTs
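The difference between HEAD and GET described above can be observed against a throwaway local server. The sketch below uses Python's standard `http.server` and `http.client` modules; the handler class, the `fetch` helper, the page content, and the path `/page.html` are all hypothetical.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

BODY = b"<html><body>hello</body></html>"  # hypothetical resource


class DemoHandler(BaseHTTPRequestHandler):
    def _send_headers(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(BODY)))
        self.end_headers()

    def do_HEAD(self):
        self._send_headers()  # status line and headers only

    def do_GET(self):
        self._send_headers()
        self.wfile.write(BODY)  # GET additionally sends the body

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch(method):
    """Send one request to a local server; return (status, Content-Length, body)."""
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
        conn.request(method, "/page.html")
        resp = conn.getresponse()
        result = (resp.status, resp.getheader("Content-Length"), resp.read())
        conn.close()
        return result
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print("HEAD:", fetch("HEAD"))
    print("GET: ", fetch("GET"))
```

Both responses carry the same status and Content-Length, so a client can learn the size of the resource from HEAD alone; only the GET response has a non-empty body.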
Server-side Programming
[Figure: an HTML page with an HTML form passes parameters to the Web server, using GET or POST]
POST Request
• A POST request is used to send data to the server to be processed by a CGI script, an ASP page, a JSP/Java Servlet or a PHP program.
• A POST request is different from a GET request in the following ways:
(1) There's a block of data sent with the request, in the message body.
(2) There are usually extra headers to describe this message body, like Content-Type: and Content-Length:.
(3) The request URI is not a resource to retrieve; it's usually a program to handle the data you're sending.
(4) The HTTP response is normally program output, not a static file.
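The four points above can be seen in a tiny server-side program. This is a sketch using Python's standard `http.server` rather than CGI, ASP, JSP or PHP; the handler, the path `/cgi-bin/create`, and the greeting it returns are assumptions made for illustration.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlencode


class FormHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # (2) Content-Length tells the program how many body bytes to read.
        length = int(self.headers.get("Content-Length", 0))
        # (1) The form data arrives in the message body.
        form = parse_qs(self.rfile.read(length).decode("ascii"))
        user = form.get("user", ["?"])[0]
        # (4) The response is generated by the program, not read from a file.
        output = f"Hello, {user}".encode("ascii")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(output)))
        self.end_headers()
        self.wfile.write(output)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def post_form(fields):
    """POST url-encoded fields to a local server; return (status, response text)."""
    server = HTTPServer(("127.0.0.1", 0), FormHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        body = urlencode(fields).encode("ascii")
        conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
        # (3) The URI names the program that handles the data.
        conn.request("POST", "/cgi-bin/create", body=body, headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Content-Length": str(len(body)),
        })
        resp = conn.getresponse()
        text = resp.read().decode("ascii")
        conn.close()
        return resp.status, text
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(post_form({"user": "joe", "pass1": "1234", "pass2": "1234"}))
```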
Request Message for POST
POST /cgi-bin/create.p1 HTTP/1.1
Host: xpto.dei.uc.pt
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*
Content-type: application/x-www-form-urlencoded
Content-length: 30

user=joe&pass1=1234&pass2=1234
Data is sent to the server inside the body of the HTTP message.
Request Message for GET
GET /login.jsp?user=joe&pass1=1234&pass2=1234 HTTP/1.1
Host: xpto.dei.uc.pt
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*
Beware that some systems put a limit of 256 chars on the URL...
If the resulting URL is longer than 256 chars, it is better to use POST.
The other problem is related to security...
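The two request messages differ only in where the parameters travel. The sketch below builds both forms as text and falls back to POST when the URL would exceed the 256-character limit mentioned above. The helper names are hypothetical, and real URL limits vary from system to system.

```python
from urllib.parse import urlencode

URL_LIMIT = 256  # limit quoted in the text; actual limits differ per system


def build_get(host, path, params):
    """Parameters travel in the URL, after the '?'."""
    return (f"GET {path}?{urlencode(params)} HTTP/1.1\r\n"
            f"Host: {host}\r\n"
            "\r\n")


def build_post(host, path, params):
    """Parameters travel in the message body, described by two extra headers."""
    body = urlencode(params)
    return (f"POST {path} HTTP/1.1\r\n"
            f"Host: {host}\r\n"
            "Content-Type: application/x-www-form-urlencoded\r\n"
            f"Content-Length: {len(body.encode('ascii'))}\r\n"
            "\r\n"
            f"{body}")


def build_request(host, path, params):
    """Use GET while the URL stays within URL_LIMIT, otherwise switch to POST."""
    url = f"{path}?{urlencode(params)}"
    if len(url) > URL_LIMIT:
        return build_post(host, path, params)
    return build_get(host, path, params)


if __name__ == "__main__":
    params = {"user": "joe", "pass1": "1234", "pass2": "1234"}
    print(build_get("xpto.dei.uc.pt", "/login.jsp", params))
    print(build_post("xpto.dei.uc.pt", "/cgi-bin/create.p1", params))
```

Note that the GET form exposes the parameters, passwords included, in the request line, where they can end up in logs and browser history.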
- Proxies
- Web caching
• Thus, every HTTP request must specify which host name the request is intended for, with the Host: header. A complete HTTP 1.1 request might be:
How to solve the problem of a stateless HTTP?
• A problem of the HTTP protocol is that every request is completely unrelated to any other previous request.
• The protocol at the HTTP Server is stateless.
• Solution: use cookies.
• The server returns a "Set-cookie" header that gives a cookie name, expiry time and some more info.
• Cookies are stored as plain text files in the local disk.
• When the user returns to the same URL the browser returns the cookie if it hasn't expired.
Cookies: keeping “state”
Many major Web sites use cookies
Four components:
1) cookie header line in the HTTP Response message
2) cookie header line in HTTP Request message
3) cookie file kept on user’s host and managed by user’s browser
4) back-end database at Web site
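The Set-Cookie / Cookie exchange can be sketched from the browser's side with Python's standard `http.cookies` module. This is a simplified illustration, not how any particular browser stores cookies: the in-memory `cookie_file` dictionary, the helper names, the cookie name `id`, and the use of Max-Age for the expiry time are assumptions.

```python
from http.cookies import SimpleCookie


def store_cookies(cookie_file, site, set_cookie_header, now):
    """Record the cookies from one Set-Cookie header line in the cookie file."""
    jar = SimpleCookie()
    jar.load(set_cookie_header)
    for name, morsel in jar.items():
        max_age = morsel["max-age"]  # empty string when the attribute is absent
        expires_at = now + int(max_age) if max_age else None
        cookie_file.setdefault(site, {})[name] = (morsel.value, expires_at)


def cookie_header(cookie_file, site, now):
    """Build the value of the Cookie request header for a later visit to the site."""
    live = [
        f"{name}={value}"
        for name, (value, expires_at) in cookie_file.get(site, {}).items()
        if expires_at is None or now < expires_at  # expired cookies are not sent
    ]
    return "; ".join(live)


if __name__ == "__main__":
    cookie_file = {}  # stands in for the cookie file kept on the user's host
    store_cookies(cookie_file, "amazon", "id=1678; Max-Age=3600", now=0)
    print(cookie_header(cookie_file, "amazon", now=10))    # within the lifetime
    print(cookie_header(cookie_file, "amazon", now=4000))  # after expiry
```

The server side would use the returned identifier as a key into its back-end database, which is where the actual per-user state lives.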
[Figure: cookie exchange. Cookie file entries: ebay: 8734, amazon: 1678. Header lines: Set-Cookie: ...., cookie: 1678.]
Cookies (continued)
What cookies can bring:
• authorization
• shopping carts
• recommendations
• user session state (Web e-mail)
Cookies and privacy:
• cookies permit sites to learn a lot about you
• search engines use redirection & cookies to learn yet more
Local Caches + Proxy Caches
[Figure: local caches and the proxy cache proxy.dei.uc.pt:8080 in front of the Internet]
• That way, the proxy knows which server to forward the request to.
Why Caching?
• Reduce response time for client request.
• Reduce traffic on an institution’s access link.
Caching...
• Not all objects can be cached
– E.g., dynamic objects
• Cache consistency
– strong
– weak
• Cache Replacement Policies
– Variable size objects
– Varying cost of not finding an object (a “miss”) in the cache
• Prefetch?
– A large fraction of the requests are single requests.
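To make the replacement-policy point concrete, here is a minimal sketch of a cache whose capacity is counted in bytes, so that objects of different sizes compete for the same space. The slides do not name a policy; least-recently-used (LRU) eviction is assumed here purely as an example, and the class and method names are hypothetical.

```python
from collections import OrderedDict


class LRUByteCache:
    """Cache of url -> body with a byte budget and LRU eviction (an assumed policy)."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.items = OrderedDict()  # least recently used entry comes first
        self.hits = 0
        self.misses = 0

    def get(self, url):
        if url in self.items:
            self.items.move_to_end(url)  # mark as most recently used
            self.hits += 1
            return self.items[url]
        self.misses += 1  # a miss: the caller must go to the origin server
        return None

    def put(self, url, body):
        if len(body) > self.capacity:
            return  # larger than the whole cache: do not store it
        if url in self.items:
            self.used -= len(self.items.pop(url))
        while self.used + len(body) > self.capacity:
            _, evicted = self.items.popitem(last=False)  # drop the LRU entry
            self.used -= len(evicted)
        self.items[url] = body
        self.used += len(body)


if __name__ == "__main__":
    cache = LRUByteCache(capacity_bytes=10)
    cache.put("/a", b"12345")
    cache.put("/b", b"1234")
    cache.get("/a")            # /a becomes the most recently used entry
    cache.put("/c", b"1234")   # does not fit, so /b is evicted
    print(list(cache.items), cache.used)
```

A policy that also weighs the cost of a miss would need more than recency, for example the time it took to fetch each object; that is outside this sketch.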
Caching: first response
[Figure: first response, origin server www.yahoo.com]
Caching: next request
[Figure: next request, origin server www.yahoo.com]
Persistent Connections
• In HTTP 1.0, TCP connections are closed after each request and response, so each resource to be retrieved requires its own connection.
• Opening and closing TCP connections takes a substantial amount of CPU time, bandwidth, and memory.
• Most Web pages consist of several files on the same server, so much can be saved by allowing several requests and responses to be sent through a single persistent connection.
"Connection: close" Header
• If a client includes the "Connection: close" header in the request, then the connection will be closed after the corresponding response.
• Use this if you don't support persistent connections, or if you know a request will be the last on its connection.
• Similarly, if a response contains this header, then the server will close the connection following that response, and the client shouldn't send any more requests through that connection.
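The effect of persistent connections and of "Connection: close" can be measured by counting how many TCP connections a server accepts. The sketch below runs a local HTTP/1.1 server from Python's standard library; the handler, the `count_connections` helper, and the request paths are hypothetical.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class CountingHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # lets the server keep connections open
    connections = 0
    lock = threading.Lock()

    def setup(self):
        # setup() runs once per accepted TCP connection, not once per request.
        super().setup()
        with CountingHandler.lock:
            CountingHandler.connections += 1

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def count_connections(n_requests, persistent):
    """Fetch n_requests files and return how many TCP connections were opened."""
    CountingHandler.connections = 0
    server = ThreadingHTTPServer(("127.0.0.1", 0), CountingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = None
        for i in range(n_requests):
            if conn is None:
                conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
            headers = {} if persistent else {"Connection": "close"}
            conn.request("GET", f"/file{i}", headers=headers)
            conn.getresponse().read()
            if not persistent:
                conn.close()  # the server closes its side after the response
                conn = None
        if conn is not None:
            conn.close()
        return CountingHandler.connections
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print("persistent:       ", count_connections(3, persistent=True))
    print("Connection: close:", count_connections(3, persistent=False))
```

Three requests share one connection in the persistent case, while sending "Connection: close" with every request forces a new TCP connection each time.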
4 independent companies
How to provide CDN services
Usage Scenario