
HTTP Protocol

The document discusses the HTTP protocol and how web pages are retrieved. It provides the following key points: 1. A web page consists of HTML files and other objects like images, which are individually addressable by URLs. 2. HTTP uses the client-server model where a browser is the client that requests objects from a web server. 3. HTTP 1.1 introduced persistent connections, allowing multiple objects to be retrieved over a single TCP connection rather than opening a new connection for each object like in HTTP 1.0. This reduces overhead and speeds up page loads.

Uploaded by

Hao Nguyen

HTTP Protocol (Protocolo HTTP)

Computer Systems I (Sistemas Informáticos I)

Slides 12

Web and HTTP

• A Web page consists of objects.
• An object can be an HTML file, JPEG image, Java applet, audio file, …
• A Web page consists of a base HTML file which references several other objects.
• Each object is addressable by a URL.
• Example URL:
http://www.dei.uc.pt/aulas_sd/pic.gif

HTTP overview

HTTP protocol
• client/server model
– client: browser that requests, receives, and “displays” Web objects
– server: Web server that sends objects in response to requests
• HTTP 1.0: RFC 1945
• HTTP 1.1: RFC 2068

HTTP overview (cont.)

Uses TCP:
• client initiates TCP connection (creates socket) to server, port 80
• HTTP specifies the messages sent between the browser (HTTP client) and the Web server (HTTP server).
• The body of a response sent to the browser is often an HTML document.

HTTP is “stateless”
• server maintains no information about past client requests

HTTP Protocol: Architecture

HTTP connections

Non-persistent HTTP
• At most one object is sent over a TCP connection.
• HTTP/1.0 uses non-persistent HTTP.

Persistent HTTP
• Multiple objects can be sent over a single TCP connection between client and server.
• HTTP/1.1 uses persistent connections in default mode.

Nonpersistent HTTP

Suppose user enters URL www.dei.uc.pt/sd/home.index (contains text, references to 10 jpeg images)

1a. HTTP client initiates TCP connection to HTTP server (process) at www.dei.uc.pt on port 80.
1b. HTTP server at host www.dei.uc.pt, waiting for TCP connection at port 80, “accepts” connection, notifying client.
2. HTTP client sends HTTP request message (containing URL) into TCP connection socket. Message indicates that client wants object sd/home.index.
3. HTTP server receives request message, forms response message containing requested object, and sends message into its socket.

Nonpersistent HTTP (cont.)

4. HTTP server closes TCP connection.
5. HTTP client receives response message containing html file and then displays html. Parsing the html file, it finds 10 referenced jpeg objects.
6. Repeat steps 1-5 for each of 10 jpeg objects.
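The six steps can be tried on one machine. The sketch below is only an illustration under stated assumptions: a toy server on the loopback address stands in for www.dei.uc.pt, the object names and contents in OBJECTS are made up, and the response carries no headers. Each object is fetched over its own TCP connection, which the server closes after one response.

```python
import socket
import threading

# Hypothetical objects: a base page plus 10 images, as in the example above.
OBJECTS = {"/sd/home.index": b"<html>base page</html>"}
OBJECTS.update({f"/sd/img{i}.jpg": b"jpeg-bytes" for i in range(10)})


def serve(listener, n_connections):
    """Accept n_connections; answer exactly one request on each, then close."""
    for _ in range(n_connections):
        conn, _addr = listener.accept()                   # step 1b
        with conn:
            request = b""
            while b"\r\n\r\n" not in request:             # read up to the blank line
                chunk = conn.recv(1024)
                if not chunk:
                    break
                request += chunk
            path = request.split(b" ")[1].decode("ascii")
            body = OBJECTS.get(path, b"")
            status = b"200 OK" if path in OBJECTS else b"404 Not Found"
            conn.sendall(b"HTTP/1.0 " + status + b"\r\n\r\n" + body)  # step 3
        # leaving the "with" block closes the connection (step 4)


def fetch(port, path):
    """Open a new TCP connection, request one object, read until the server closes."""
    with socket.create_connection(("127.0.0.1", port)) as sock:       # step 1a
        sock.sendall(f"GET {path} HTTP/1.0\r\n\r\n".encode("ascii"))   # step 2
        response = b""
        while True:
            chunk = sock.recv(1024)
            if not chunk:                                 # server closed the connection
                break
            response += chunk
    return response.split(b"\r\n\r\n", 1)[1]              # step 5: the object itself


def demo():
    """Fetch the base page and the 10 images, one connection per object (step 6)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))                       # any free local port
    listener.listen()
    port = listener.getsockname()[1]
    server = threading.Thread(target=serve, args=(listener, len(OBJECTS)), daemon=True)
    server.start()
    bodies = [fetch(port, path) for path in OBJECTS]
    server.join()
    listener.close()
    return len(bodies), bodies[0]
```

Calling demo() opens 11 connections in total: one for the base page and one for each of the 10 images.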

Non-Persistent Connections (HTTP/1.0)

RTT for an HTML Page Request

[Figure: client/server timeline showing "initiate TCP connection" (one RTT), "request file" (one RTT), "time to transmit file", and "file received"]

Response time:
• one RTT to initiate the TCP connection
• one RTT for the HTTP request and the first few bytes of the HTTP response to return
• file transmission time

total = 2 RTT + transmit time
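As a quick check of the formula, here is a minimal sketch. The function name and the numbers (100 ms RTT, 100 kbit file, 1 Mbit/s link) are assumptions chosen for illustration; server processing time is ignored.

```python
def response_time(rtt, file_bits, rate_bps):
    """Seconds to fetch one object over a fresh TCP connection.

    total = 2 RTT + transmit time, where transmit time = file size / link rate.
    """
    transmit_time = file_bits / rate_bps
    return 2 * rtt + transmit_time


# Hypothetical numbers: 0.1 s RTT, 100 kbit file, 1 Mbit/s link.
example = response_time(rtt=0.1, file_bits=100_000, rate_bps=1_000_000)
# 2 * 0.1 s for the two round trips plus 0.1 s to transmit: about 0.3 s
```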

Non-Persistent with Parallel Sessions

Persistent HTTP

Non-persistent HTTP:
• requires 2 RTTs per object
• OS must allocate host resources for each TCP connection
• but browsers often open parallel TCP connections to fetch referenced objects.

Persistent HTTP (HTTP/1.1)

Persistent HTTP
• server leaves connection open after sending response
• subsequent HTTP messages between same client/server are sent over the same connection.
• HTTP server closes the connection when it has not been used for a certain time.

Overhead of HTTP/1.0

- 1 RTT of overhead for each start (each request / response)
- if 10 objects:
TTT = [10 * 1 tcp/rtt ] + [10 * 1 req/resp rtt] = 20 rtt

Note:
• this does not account for server processing
• RTT: round-trip time (client to server and back to client)
• TTT: total transmission time

HTTP/1.1

• HTTP/1.1 [1997, RFC 2068]; persistent connections are very helpful with multi-object requests
• by default, server keeps TCP connection open
• only one slow start per server connection
• if 10 objects
TTT = [1 * 1 tcp/rtt ] + [10 * 1 req/resp rtt] = 11 rtt

• this implies sequential request / response
• non-pipelining: next request not sent until response received for previous request

Persistent with Pipelining

Persistent HTTP: Pipelining

Persistent without pipelining:
• client issues new request only when previous response has been received
• one RTT for each referenced object.

Persistent with pipelining:
• default in HTTP/1.1
• client sends requests as soon as it encounters a referenced object.
• as little as one RTT for all the referenced objects.
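The RTT counts for the three connection styles can be compared side by side. This is a rough sketch with a made-up function name: it counts round trips only, ignores transmission and server processing time, and takes the best case ("as little as one RTT") for pipelining.

```python
def total_rtts(n_objects, mode):
    """Round trips needed to fetch n_objects from one server."""
    if mode == "non-persistent":
        # one TCP setup RTT plus one request/response RTT per object
        return 2 * n_objects
    if mode == "persistent":
        # one TCP setup RTT, then one request/response RTT per object
        return 1 + n_objects
    if mode == "pipelined":
        # one TCP setup RTT, then (best case) one RTT for all requests together
        return 1 + 1
    raise ValueError(f"unknown mode: {mode}")


for mode in ("non-persistent", "persistent", "pipelined"):
    print(mode, total_rtts(10, mode))
```

For 10 objects this gives 20 RTTs without persistence and 11 with a persistent connection, matching the TTT figures for HTTP/1.0 and HTTP/1.1.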

HTTP request message

• Two types of HTTP messages: request, response
• HTTP request message:

GET /somedir/page.html HTTP/1.1
Host: www.someschool.edu
User-agent: Mozilla/4.0
Connection: close
Accept-language: fr
(extra carriage return, line feed)

The "Connection: close" header asks for a non-persistent connection.

Method types

HTTP/1.0
• GET
• POST
• HEAD
– asks server to leave requested object out of response

HTTP/1.1
• GET, POST, HEAD
• PUT
– uploads file in entity body to path specified in URL field
• DELETE
– deletes file specified in the URL field
• TRACE: HTTP “echo” for debugging (added in 1.1)
• CONNECT: used by proxies for tunneling (1.1)
• OPTIONS: request for server/proxy options (1.1)
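A request message is plain text: a request line, header lines, and an extra carriage return + line feed. The sketch below assembles one and rejects methods that the slide does not list for the chosen version. The names METHODS and build_request are made up for this example.

```python
# Methods per version, as listed on the "Method types" slide.
METHODS = {
    "HTTP/1.0": {"GET", "POST", "HEAD"},
    "HTTP/1.1": {"GET", "POST", "HEAD", "PUT", "DELETE",
                 "TRACE", "CONNECT", "OPTIONS"},
}


def build_request(method, path, headers, version="HTTP/1.1"):
    """Return the request as text: request line, header lines, blank line."""
    if method not in METHODS[version]:
        raise ValueError(f"{method} is not listed for {version}")
    lines = [f"{method} {path} {version}"]                    # request line
    lines += [f"{name}: {value}" for name, value in headers.items()]
    return "\r\n".join(lines) + "\r\n\r\n"                    # extra CR LF ends the headers


request = build_request(
    "GET",
    "/somedir/page.html",
    {"Host": "www.someschool.edu", "Connection": "close"},
)
```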

HTTP response message

HTTP/1.1 200 OK
Connection: close
Date: Thu, 06 Aug 1998 12:00:15 GMT
Server: Apache/1.3.0 (Unix)
Last-Modified: Mon, 22 Jun 1998 …...
Content-Length: 6821
Content-Type: text/html

data data data data data ...

HTTP Request Message

GET /somedir/page.html HTTP/1.1
User-agent: Mozilla (compatible; MSIE 5.01, Windows NT)
Accept: text/html, image/gif, image/jpeg
Accept-language: en-us
/* a blank line */

HTTP Response Message

HTTP/1.1 200 OK
Date: Fri, 06 Aug 1999 12:00:36 GMT
Server: Apache/1.3.0 (Unix)
Last-Modified: Tue, 22 Jun 1999 09:23:24 GMT
Content-Length: 6821
Content-Type: text/html

...

HTTP Response: Status Codes

• The status code is a three-digit integer, and the first digit identifies the general category of response:
• 1xx indicates an informational message only
• 2xx indicates success of some kind
• 3xx redirects the client to another URL
• 4xx indicates an error on the client's part
• 5xx indicates an error on the server's part
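Because the category depends only on the first digit, a client can classify any status code with an integer division. A minimal sketch (the function name and the category labels are chosen for this example):

```python
CATEGORIES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def status_category(code):
    """Map a three-digit status code to its general category."""
    if not 100 <= code <= 599:
        raise ValueError(f"not a valid HTTP status code: {code}")
    return CATEGORIES[code // 100]   # first digit selects the category


for code in (200, 301, 404, 503):
    print(code, status_category(code))
```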

HTTP response status codes

200 OK
– request succeeded, requested object later in this message
301 Moved Permanently
– requested object moved, new location specified later in this message (Location:)
304 Not Modified
400 Bad Request
– request message not understood by server
403 Forbidden
404 Not Found
– requested document not found on this server
500 Internal Server Error
503 Service Unavailable
505 HTTP Version Not Supported

Try out HTTP (client side) for yourself

1. Telnet to your favorite Web server:
telnet eden.dei.uc.pt 80

2. Type in a GET HTTP request (press Enter twice so the request ends with a blank line):
GET /~sdp/sumt.htm HTTP/1.0

3. Look at the response message sent by the HTTP server!

HEAD Request
• A HEAD request is just like a GET request, except it asks the server to return the status line and response headers only, and not the actual resource (i.e. no message body).
• This is useful to check characteristics of a resource without actually downloading it, thus saving bandwidth. Use HEAD when you don't actually need a file's contents.

Server-Side Programs: GETs and POSTs

Server-side Programming

[Figure: an HTML page with an HTML form passes parameters to the Web server using GET or POST]

POST Request
• A POST request is used to send data to the server to be processed by a CGI script, an ASP page, a JSP/Java Servlet or a PHP program.
• A POST request is different from a GET request in the following ways:
(1) There's a block of data sent with the request, in the message body.
(2) There are usually extra headers to describe this message body, like Content-Type: and Content-Length:.
(3) The request URI is not a resource to retrieve; it's usually a program to handle the data you're sending.
(4) The HTTP response is normally program output, not a static file.

Passing Parameters with a POST

The CGI script/JSP/Java Servlet receives the message body and decodes it.
Here's a typical form submission, using POST:

POST /path/script.cgi HTTP/1.0
User-Agent: HTTPTool/1.0
Content-Type: application/x-www-form-urlencoded
Content-Length: 26

username=joe&password=xpto

Just make sure the sender and the receiving program agree on the format.

Web Page Example

<html>
<head>
<title> Create New Account </title>
</head>
<body>
<form action = "cgi-bin/create.p1" method = "post">
Enter user name: <input name = "user">
Password: <input name = "pass1" type = "password">
Re-enter Password : <input name = "pass2" type = "password">
<br>
<input type = "submit" value = "create account" >
<input type = "reset" value = "start over" >
</form>
</body> </html>

The form method can be "post" or "get". The user fills in the form and hits the SUBMIT button.

Request Message for POST

POST /cgi-bin/create.p1 HTTP/1.1
Host: xpto.dei.uc.pt
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*
Content-type: application/x-www-form-urlencoded
Content-length: 30

user=joe&pass1=1234&pass2=1234

Data is sent to the server inside the body of the HTTP message.

Request Message for GET

GET /login.jsp?user=joe&pass1=1234&pass2=1234 HTTP/1.1
Host: xpto.dei.uc.pt
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*

Beware that some systems put a limit of 256 characters on the URL. If the resulting URL is longer than 256 characters, it is better to use POST.
The other problem is security: with GET, the parameters (here, the passwords) appear in the URL.
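The same form fields can be encoded either way. The sketch below uses Python's urllib.parse.urlencode; the helper names form_request and pick_method are made up, and the 256-character limit is simply the figure quoted above, not a property of HTTP itself. Note that the encoded string user=joe&pass1=1234&pass2=1234 is 30 bytes long, which is what Content-Length has to say.

```python
from urllib.parse import urlencode

URL_LIMIT = 256  # limit quoted above; real systems differ


def form_request(path, fields, host, method="POST"):
    """Build the request text that submits form fields with GET or POST."""
    data = urlencode(fields)                    # e.g. "user=joe&pass1=1234"
    if method == "GET":
        # parameters travel in the URL, after the "?"
        return f"GET {path}?{data} HTTP/1.1\r\nHost: {host}\r\n\r\n"
    body = data.encode("ascii")
    # parameters travel in the message body, described by two extra headers
    return (f"POST {path} HTTP/1.1\r\n"
            f"Host: {host}\r\n"
            "Content-Type: application/x-www-form-urlencoded\r\n"
            f"Content-Length: {len(body)}\r\n\r\n"
            f"{data}")


def pick_method(path, fields):
    """Prefer POST when the GET URL would exceed the quoted limit."""
    return "POST" if len(f"{path}?{urlencode(fields)}") > URL_LIMIT else "GET"


fields = {"user": "joe", "pass1": "1234", "pass2": "1234"}
post_text = form_request("/cgi-bin/create.p1", fields, "xpto.dei.uc.pt")
get_text = form_request("/login.jsp", fields, "xpto.dei.uc.pt", method="GET")
```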

Advanced Topics:
- Cookies
- Proxies
- Web caching
- Conditional GET
- Content Distribution Networks

Host Header in HTTP 1.1

• Starting with HTTP 1.1, one server at one IP address can be multi-homed, i.e. the home of several Web domains.
• For example, "www.host1.com" and "www.host2.com" can live on the same server.
• Thus, every HTTP request must specify which host name the request is intended for, with the Host: header. A complete HTTP 1.1 request might be:

GET /path/file.html HTTP/1.1
Host: www.host1.com
[blank line here]
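On the server side, the Host: header is what lets one IP address serve several domains. The sketch below shows one way a server might pick the right site. The document roots in SITES are hypothetical, and answering 404 for an unknown host is a choice made for this example; a missing Host: header in an HTTP/1.1 request is answered with 400 Bad Request.

```python
# Hypothetical document roots for two domains on the same server.
SITES = {
    "www.host1.com": "/var/www/host1",
    "www.host2.com": "/var/www/host2",
}


def parse_host(request_text):
    """Return the host name from the Host: header, or None if it is absent."""
    for line in request_text.split("\r\n")[1:]:    # skip the request line
        if line == "":                             # blank line: end of headers
            break
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":
            return value.strip().lower().split(":")[0]   # drop an optional port
    return None


def resolve(request_text):
    """Return (status code, file path) for a request, chosen by its Host: header."""
    host = parse_host(request_text)
    if host is None:
        return 400, None                           # Host: is mandatory in HTTP 1.1
    if host not in SITES:
        return 404, None                           # example policy for unknown hosts
    path = request_text.split(" ")[1]              # second token of the request line
    return 200, SITES[host] + path


request = "GET /path/file.html HTTP/1.1\r\nHost: www.host1.com\r\n\r\n"
print(resolve(request))
```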

How to solve the problem of a stateless HTTP?
• A problem of the HTTP protocol is that every request is completely unrelated to any other previous request.
• The protocol at the HTTP server is stateless.
• Solution: use cookies.
• The server returns a "Set-cookie" header that gives a cookie name, expiry time and some more info.
• Cookies are stored as plain text files on the local disk.
• When the user returns to the same URL the browser returns the cookie if it hasn't expired.

Cookies: keeping “state”
Many major Web sites use cookies
Four components:
1) cookie header line in the HTTP response message
2) cookie header line in the HTTP request message
3) cookie file kept on user’s host and managed by user’s browser
4) back-end database at Web site
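The two cookie header lines can be produced and read with Python's standard http.cookies module. This is a small sketch, not a full session implementation: the cookie name "id", the value 1678 and the one-hour lifetime are assumptions, and the back-end database is reduced to a dictionary.

```python
from http.cookies import SimpleCookie

# Stand-in for the back-end database: cookie value -> user state (hypothetical).
DATABASE = {"1678": {"cart": ["book"]}}


def make_set_cookie(name, value, max_age):
    """Server side: header line for the HTTP response."""
    cookie = SimpleCookie()
    cookie[name] = value
    cookie[name]["max-age"] = max_age          # lifetime in seconds
    return cookie.output(header="Set-Cookie:")


def read_cookie(header_value, name):
    """Server side: pull one cookie out of the request's Cookie: header value."""
    cookie = SimpleCookie()
    cookie.load(header_value)
    return cookie[name].value if name in cookie else None


response_line = make_set_cookie("id", "1678", 3600)
user_id = read_cookie("id=1678; theme=dark", "id")
state = DATABASE.get(user_id)                  # look up what the site knows
```

The browser stores the value from the response and sends it back in later requests; the server then uses it as a key into its own database.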

Cookies: keeping “state” (cont.)

[Figure: client/server exchange. The client's cookie file initially holds "ebay: 8734". The server's response carries "Set-cookie: 1678" and the server creates an entry in its back-end database. The cookie file then holds "amazon: 1678" and "ebay: 8734", and later requests carry "cookie: 1678", which the server uses to access the database entry.]

Cookies

Cookie has a name, an expiry time and some more info.

[Figure: the user logs in to the Web server www.yahoo.com; the server replies with "Set-Cookie: ....", the browser stores the cookie locally and sends the cookie to the server on later requests to keep the session.]

Cookies (continued)

What cookies can bring:
• authorization
• shopping carts
• recommendations
• user session state (Web e-mail)

Cookies and privacy:
• cookies permit sites to learn a lot about you
• search engines use redirection & cookies to learn yet more

Local Caches + Proxy Caches

[Figure labels: Internet; proxy.dei.uc.pt:8080]
Web Caching Hierarchy

HTTP Proxies

• An HTTP proxy is a program that acts as an intermediary between a client and a server.
• It receives requests from clients, and forwards those requests to the intended servers. The responses pass back in the same way.
• Proxies are commonly used in firewalls, for LAN-wide caches, or in other situations.
• When a client uses a proxy, it typically sends all requests to that proxy, instead of to the servers in the URLs.
• Requests to a proxy differ from normal requests in one way: in the first line, they use the complete URL of the resource being requested, instead of just the path. For example,
GET http://www.somehost.com/path/file.html HTTP/1.0
• That way, the proxy knows which server to forward the request to.
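To forward a request, a proxy has to split the complete URL in the first line into the server to contact and the path to ask for. A minimal sketch using urllib.parse.urlsplit (the function name is made up, and port 80 is assumed when the URL gives none):

```python
from urllib.parse import urlsplit


def split_proxy_request_line(request_line):
    """Return (host, port, request line to send to the origin server)."""
    method, url, version = request_line.split(" ")
    parts = urlsplit(url)
    if not parts.hostname:
        raise ValueError("a proxy request needs the complete URL")
    port = parts.port or 80                    # default HTTP port
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    return parts.hostname, port, f"{method} {path} {version}"


target = split_proxy_request_line(
    "GET http://www.somehost.com/path/file.html HTTP/1.0"
)
```

Here the proxy would connect to www.somehost.com on port 80 and send the ordinary request line GET /path/file.html HTTP/1.0.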

Why Caching?

• Reduce response time for client request.
• Reduce traffic on an institution’s access link.

Caching...

• Not all objects can be cached
– E.g., dynamic objects
• Cache consistency
– strong
– weak
• Cache Replacement Policies
– Variable size objects
– Varying cost of not finding an object (a “miss”) in the cache
• Prefetch?
– A large fraction of the requests are single requests.

Conditional GET: client-side caching

• Goal: don’t send object if client has up-to-date cached version
• client: specify date of cached copy in HTTP request
If-modified-since: <date>
• server: response contains no object if cached copy is up-to-date:
HTTP/1.0 304 Not Modified

[Figure: two client/server exchanges. Object not modified: request with "If-modified-since: <date>", response "HTTP/1.0 304 Not Modified". Object modified: request with "If-modified-since: <date>", response "HTTP/1.0 200 OK" followed by <data>.]

Caching of HTML Documents

– If a browser already has a version of the document in its cache, it can include the field If-Modified-Since, set to the time it retrieved that version.
– The server can then check if the document has been modified since the browser last downloaded it and send it again if necessary.
– If the document hasn't changed, then the server can just say so and save some waiting and network traffic.

[Figure: browser with a local cache exchanging HTTP messages with the Web server]
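The server's side of a conditional GET is a date comparison. The sketch below parses and formats HTTP dates with Python's email.utils; the function names and the example timestamps are assumptions, and the status lines follow the HTTP/1.0 form shown above.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def http_date(moment):
    """Format a UTC datetime the way HTTP headers expect it."""
    return format_datetime(moment, usegmt=True)


def conditional_get(if_modified_since, last_modified, body):
    """Return (status line, body) for a GET that may carry If-modified-since.

    if_modified_since: header value sent by the client, or None
    last_modified:     timezone-aware UTC datetime of the object on the server
    """
    if if_modified_since is not None:
        cached_at = parsedate_to_datetime(if_modified_since)
        if last_modified <= cached_at:
            return "HTTP/1.0 304 Not Modified", b""     # cached copy is up to date
    return "HTTP/1.0 200 OK", body                      # send the object again


# Hypothetical object last changed on 22 Jun 1999.
changed = datetime(1999, 6, 22, 9, 23, 24, tzinfo=timezone.utc)
print(conditional_get("Fri, 06 Aug 1999 12:00:36 GMT", changed, b"<html>...</html>"))
```

A request without the header, or with a date older than the last change, gets the full 200 OK response.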

Caching: first response

Web Server: www.yahoo.com

HTTP/1.1 200 OK
Date: Fri, 06 Aug 1999 12:00:36 GMT
Content-Length: 6821
Content-Type: text/html
...

Caching: next request

Web Server: www.yahoo.com

GET /index.html HTTP/1.1
If-modified-since: Fri, 06 Aug 1999 12:00:36 GMT
Caching: second request

Web Server: www.yahoo.com

HTTP/1.1 304 Not Modified

Get HTML page from local cache

Content distribution networks (CDNs)

• Content replication
• CDN company installs hundreds of CDN servers throughout the Internet
– in lower-tier ISPs, close to users
• CDN replicates its customers’ content in CDN servers. When the provider updates content, the CDN updates its servers.

Persistent Connections

• In HTTP 1.0, TCP connections are closed after each request and response, so each resource to be retrieved requires its own connection.
• Opening and closing TCP connections takes a substantial amount of CPU time, bandwidth, and memory.
• Most Web pages consist of several files on the same server, so much can be saved by allowing several requests and responses to be sent through a single persistent connection.
• Persistent connections are the default in HTTP 1.1.
• The browser just opens a connection and sends several requests in series (called pipelining), and reads the responses in the same order as the requests were sent.

"Connection: close" Header

• If a client includes the "Connection: close" header in the request, then the connection will be closed after the corresponding response.
• Use this if you don't support persistent connections, or if you know a request will be the last on its connection.
• Similarly, if a response contains this header, then the server will close the connection following that response, and the client shouldn't send any more requests through that connection.
• Connection: close
• Connection: keep-alive
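The rules above reduce to a small decision: "Connection: close" always ends the connection, HTTP 1.1 otherwise keeps it open by default, and HTTP 1.0 keeps it open only if the peer asked for keep-alive. The sketch below encodes that decision; the function name is made up, and treating "Connection: keep-alive" as an opt-in for HTTP 1.0 is an assumption based on the header listed above.

```python
def keep_connection_open(version, headers):
    """Decide whether the TCP connection stays open after this message."""
    normalized = {name.lower(): value for name, value in headers.items()}
    tokens = {
        token.strip().lower()
        for token in normalized.get("connection", "").split(",")
    }
    if "close" in tokens:
        return False                 # either side may ask to close
    if version == "HTTP/1.1":
        return True                  # persistent by default
    return "keep-alive" in tokens    # HTTP 1.0: only when requested


print(keep_connection_open("HTTP/1.1", {"Host": "www.host1.com"}))
print(keep_connection_open("HTTP/1.1", {"Connection": "close"}))
```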

The Date: Header

• Caching is an important improvement in HTTP 1.1, and can't work without timestamped responses.
• So, servers must timestamp every response with a Date: header containing the current time, in the form:
Date: Fri, 31 Dec 1999 23:59:59 GMT
• All responses except those with 100-level status must include the Date: header.
• All time values in HTTP use Greenwich Mean Time.

Content Distribution Networks

4 independent companies

How to provide CDN services

[Figures: three slides; labels not recoverable]

Usage Scenario

[Figures: three slides showing a CDN usage scenario; labels not recoverable]

HTML pages are distributed by the content provider.
Video files are distributed by the company that provides the CDN service.
