Basics of WEB
Module 1
Web Architecture
• Web architecture refers to the overall structure of a website or web application, including the way it is
designed, implemented, and deployed. It involves the use of technologies and protocols such as HTML,
CSS, JavaScript, and HTTP to build and deliver web pages and applications to users.
• Web architecture consists of several components, including the client, the server, the network, and
the database.
1. The client is the web browser or application that the user interacts with, and
2. the server is the computer or group of computers that host the website or web application.
3. The network is the infrastructure that connects the client and the server, such as the internet.
4. The database is a collection of data that is used to store and retrieve information for the website or
web application.
• Web architecture also includes the design and layout of the website or web application, as well as the
way it is organized and the relationships between different pages and components.
• It also includes the way the website or web application is built and maintained, including the use of
frameworks and libraries, and the deployment and hosting of the website or web application.
Web Architecture Components: There are several components that make up a web architecture:
• The client: This is the web browser or application that the user interacts with. The client sends requests to the server and receives responses from the server.
• The server: This is the computer or group of computers that host the website or web application. The server processes requests from the client and sends back the appropriate response.
• The network: This is the infrastructure that connects the client and the server, such as the internet. It allows for communication between the client and the server.
• The database: This is a collection of data that is used to store and retrieve information for the website or web application. The database can be located on the same server as the website or web application, or it can be hosted on a separate server.
• The design and layout: This refers to the way the website or web application is structured and organized, including the layout, navigation, and overall appearance.
• The frameworks and libraries: These are tools and resources that are used to build and maintain the website or web application. They can include frameworks like Ruby on Rails or Django, or libraries like jQuery or React.
• The deployment and hosting: This refers to the way the website or web application is deployed and hosted, including the hosting environment (such as a shared hosting plan or a cloud platform) and the process for deploying updates and changes to the website or web application.
Why is Web Architecture Important?
• Web architecture is important because it plays a crucial role in the performance,
scalability, and maintenance of a website or web application.
• A well-designed web architecture can improve the user experience by ensuring that
the website or web application loads quickly and reliably. It can also make the
website or web application easier to maintain, as it provides a clear structure and
organization that makes it easier to find and modify different components.
• Web architecture is also important for scalability, as it determines the website or
web application's ability to handle increasing traffic and usage without experiencing
performance issues. A scalable web architecture can support growth and handle
changes in traffic patterns without requiring major changes to the system.
• Finally, a strong web architecture can improve the security of a website or web application by implementing security measures such as authentication, encryption, and access control.
World Wide Web
• The World Wide Web (WWW or simply the Web) is an information system that
enables content sharing over the Internet through user-friendly ways meant to
appeal to users beyond IT specialists and hobbyists. It allows documents and other
web resources to be accessed over the Internet according to specific rules of the
Hypertext Transfer Protocol (HTTP).
• The Web was invented by English computer scientist Tim Berners-Lee while at
CERN in 1989 and opened to the public in 1991. It was conceived as a "universal
linked information system".
• Documents and other media content are made available to the network through
web servers and can be accessed by programs such as web browsers.
• Servers and resources on the World Wide Web are identified and located through
character strings called uniform resource locators (URLs).
W3C
The World Wide Web Consortium (W3C) develops standards and guidelines to
help everyone build a web based on the principles of accessibility,
internationalization, privacy and security.
Web standards
Web standards are the building blocks of a consistent digitally connected
world. They are implemented in browsers, blogs, search engines, and other
software that power our experience on the Web.
What is a Web Application?
A web application is an interactive software application that runs on a web server and is accessed through a web browser. You can think of it as a computer program that performs specific tasks for its client using a web browser. Web-based applications are also known as web apps.
Types of web applications, the industries that commonly use them, and their key features:
Dynamic Web Apps
• Industries: Social media, IT industry, Healthcare, Retail & E-commerce, Transportation & logistics, On-demand
• Features: Manage the website directly to change and update the information; easy user management to protect your server and manage all the users on the website
Single Page Apps
• Industries: Email services, Communication industry
• Features: Allow an optimized routing and navigation experience; help keep a consistent visual structure of the web application using presentation logic
Multiple Page Apps
• Industries: Enterprise industries, E-commerce industries
• Features: Allow optimizing each page for search engines; let users access different pages with the click of their mouse
Animated Web Apps
• Industries: Animation, Education, Games
• Features: Hold people’s attention for a long time because of their unique design and attractive approach; aspect ratios, portrait and landscape orientations, as well as different pixel densities and viewing distances, are considered
Progressive Web Apps
• Industries: On-demand, Retail & E-commerce, Transportation & logistics, Social media, Healthcare, IT industry
• Features: Responsive and allow browser compatibility; easy to work in online and offline mode; update themselves without any user interaction
URI, URL, and URN
• Your home address not only gives directions to find it; it also identifies it so that you can't confuse it with another one. On the web, URIs, URLs, and URNs play these two roles of locating and identifying resources.
URI
URI stands for Uniform Resource Identifier. It identifies a logical or physical resource on the web. URL and URN are subtypes of URI: a URL locates a resource, while a URN names a resource.
URL
• URL stands for Uniform Resource Locator, a key concept of HTTP. It is the address of a unique resource on the web. It can also be used with other protocols such as FTP and JDBC.
URN
• URN stands for Uniform Resource Name. Its scope is to identify resources in a permanent way, even after the resource no longer exists. Unlike a URL, a URN doesn't provide any information about locating the resource but simply identifies it, just like a pure URI. In particular, a URN is a URI whose scheme is urn; as described by RFC 2141, it has the structure urn:<namespace identifier>:<namespace-specific string>. A simple example such as urn:isbn:0451450523 is composed of a namespace (isbn) and a namespace-specific string (the book number).
• If you would like to learn more detail on the subject, I would recommend W3C’s clarification.
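The difference between locating and naming a resource is easier to see when a URL is taken apart into its pieces. Below is a minimal sketch using Python's standard library; the URL and the ISBN URN are illustrative examples only.

```python
# Split a URL into the components discussed above (scheme, host, path, query).
from urllib.parse import urlparse, parse_qs

url = "https://www.example.com:443/docs/page.html?lang=en&id=42#section2"
parts = urlparse(url)

print(parts.scheme)           # 'https'            -> the protocol
print(parts.hostname)         # 'www.example.com'  -> locates the host
print(parts.port)             # 443
print(parts.path)             # '/docs/page.html'  -> the resource on that host
print(parse_qs(parts.query))  # {'lang': ['en'], 'id': ['42']}
print(parts.fragment)         # 'section2'

# A URN, by contrast, only names a resource; urlparse still separates the scheme.
urn = urlparse("urn:isbn:0451450523")
print(urn.scheme)  # 'urn'
print(urn.path)    # 'isbn:0451450523' -> namespace + namespace-specific string
```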
HTTP
HTTP is the foundation of data communication for the World Wide
Web, where hypertext documents include hyperlinks to other resources
that the user can easily access, for example by a mouse click or by
tapping the screen in a web browser.
Version Year introduced Current status
HTTP/0.9 1991 Obsolete
HTTP/1.0 1996 Obsolete
HTTP/1.1 1997 Standard
HTTP/2 2015 Standard
HTTP/3 2022 Standard
• Hypertext Transfer Protocol (HTTP) is a method for encoding and transporting information between a client
(such as a web browser) and a web server. HTTP is the primary protocol for transmission of information
across the Internet.
• Information is exchanged between clients and servers in the form of Hypertext documents, from which
HTTP gets its name. Hypertext is structured text that uses logical links, or hyperlinks, between nodes
containing text. Hypertext documents can be manipulated using the Hypertext Markup Language (HTML).
Using HTTP and HTML, clients can request different kinds of content (such as text, images, video, and
application data) from web and application servers that host the content.
• HTTP follows a request‑response paradigm in which the client makes a request and the server issues a
response that includes not only the requested content, but also relevant status information about the
request. This self‑contained design allows for the distributed nature of the Internet, where a request or
response might pass through many intermediate routers and proxy servers. It also allows intermediary
servers to perform value‑added functions such as load balancing, caching, encryption, and compression.
HTTP resources such as web servers are identified across the Internet using unique identifiers known as Uniform Resource Locators (URLs).
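The request-response exchange described above can be reproduced in a few lines of code. Below is a minimal sketch with Python's standard library; example.com is used only as a reachable placeholder host.

```python
# One request-response cycle: the client sends a request, the server replies
# with status information, headers, and the requested content.
import http.client

conn = http.client.HTTPSConnection("example.com")
conn.request("GET", "/")                    # the client makes a request...
response = conn.getresponse()               # ...and the server issues a response

print(response.status, response.reason)     # e.g. 200 OK (status information)
print(response.getheader("Content-Type"))   # metadata about the response
body = response.read()                      # the requested content itself
print(len(body), "bytes received")
conn.close()
```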
Hypertext Transfer Protocol
HTTP is an application protocol (layer 7 OSI and layer 4 TCP/IP) employed to create distributed hypermedia
systems. Since it operates on the application layer, it is agnostic of several networking aspects, such as
addressing and transmission.
Historically, HTTP emerged as a simple protocol to enable communication between clients and servers over
the Internet. From its release, HTTP became one of the most relevant protocols to exchange information on
the World Wide Web.
HTTP has been used on the Internet since the early 1990s.
The first release of HTTP (0.9) was very limited. This version only enabled clients to request information from a server using a single operation: GET.
The first HTTP release also only supported transmitting ASCII data. In the following releases, this support was expanded to other data types.
HTTP relies on TCP/IP to work, which means that HTTP is a connection-based protocol. An HTTP session can therefore be understood as a sequence of message exchanges between a client and a server over an established connection.
HTTP Version 1.0
In this context, version 1.0 of HTTP was released in 1996, about five years after version 0.9.
Version 1.0 of HTTP introduced several new features. Let’s see some of them:
Header: an HTTP 0.9 request consisted only of the method and the resource name. HTTP 1.0, in turn, introduced the HTTP header, allowing the transmission of metadata that made the protocol flexible and extensible
Versioning: HTTP requests now explicitly state the version in use, appending it to the request line
Status code: HTTP responses now contain a status code, enabling the receiver to check whether the request was processed successfully or failed
Content-Type: thanks to the HTTP header, specifically the Content-Type field, HTTP can transmit document types other than plain HTML files
New methods: besides GET, HTTP 1.0 provides two new methods (POST and HEAD)
In summary, HTTP 1.0 became much more robust than version 0.9. The features most responsible for these improvements are the HTTP header and the new HTTP methods.
The HTTP header enabled clients to send and receive different file types and exchange relevant metadata. The new methods, in turn, enabled clients both to retrieve only the metadata about a document (HEAD) and to transfer data from the client to the server (POST).
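These features (the request line with an explicit version, the header fields, and the status code in the reply) are easiest to see in a raw exchange. The sketch below, which assumes example.com is reachable on port 80, sends an HTTP/1.0-style request over a plain socket and prints the status line and headers of the reply.

```python
# Send a raw HTTP/1.0-style request so the protocol text itself is visible.
import socket

request = (
    "GET / HTTP/1.0\r\n"       # method, resource name, and explicit version
    "Host: example.com\r\n"    # a header field (metadata about the request)
    "Accept: text/html\r\n"    # another header: content types we can accept
    "\r\n"                     # an empty line ends the header section
)

with socket.create_connection(("example.com", 80)) as sock:
    sock.sendall(request.encode("ascii"))
    reply = b""
    while True:
        chunk = sock.recv(4096)
        if not chunk:
            break
        reply += chunk

# The first line of the reply carries the version and a status code,
# e.g. "HTTP/1.1 200 OK"; the response headers follow, then the body.
print(reply.decode("latin-1").split("\r\n\r\n", 1)[0])
```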
HTTP Version 1.1
Version 1.1 of HTTP was released in 1997, only one year after the previous version 1.0. HTTP 1.1 is an enhancement
of HTTP 1.0, providing several extensions.
Host header: HTTP 1.0 does not officially require the host header; HTTP 1.1 requires it by specification. The host header is especially important for routing messages through proxy servers, as it allows distinguishing domains that point to the same IP address
Persistent connections: in HTTP 1.0, each request/response pair requires opening a new connection. In HTTP 1.1, it is possible to execute several requests over a single connection
Continue status: to avoid servers refusing unprocessable requests, clients can now first send only the request headers and check whether they receive a continue status code (100) before sending the body
New methods: besides the methods already available in HTTP 1.0, version 1.1 added six extra methods: PUT, PATCH, DELETE, CONNECT, TRACE, and OPTIONS
In addition to the highlighted enhancements, many others were introduced in version 1.1 of HTTP, such as compression and decompression, multi-language support, and byte-range transfers.
Specifically, the new methods represented a real improvement in how HTTP is used. The PUT method replaces an already existing resource, the PATCH method applies partial updates to an existing resource, and DELETE removes an existing resource.
HTTP Version 2.0
Version 2.0 of HTTP was released in 2015 with the goal of making the protocol faster and more efficient. To do that, HTTP 2.0 implemented several features to improve connections and data exchange. Let’s see some of them:
Request multiplexing: HTTP 1.1 is a sequential protocol, so we can send only a single request at a time. HTTP 2.0, in turn, allows requests to be sent and responses to be received asynchronously. In this way, we can make multiple requests at the same time using a single connection
Request prioritization: with HTTP 2.0, we can assign numeric priorities to a batch of requests. Thus, we can be explicit about the order in which we expect the responses, such as getting a webpage's CSS before its JS files
Automatic compression: in the previous version of HTTP (1.1), we had to explicitly request compression of requests and responses. HTTP 2.0, however, applies GZip compression automatically
Connection reset: a feature that allows closing a connection between a server and a client for some reason and immediately opening a new one
Server push: to avoid a server receiving lots of requests, HTTP 2.0 introduced server push. With it, the server tries to predict the resources that will be requested soon and proactively pushes them to the client cache
Furthermore, HTTP 2.0 became a binary protocol, replacing the previous plain-text HTTP versions. In summary, we can see HTTP 2.0 as a set of enhancements that solve the problems and limitations of the previous HTTP versions.
HTTP Version 3.0
HTTP 3.0 started as an Internet-Draft, unlike the previous HTTP versions, which are Request for Comments (RFC) documents of the Internet Engineering Task Force (IETF). Its first draft was published in 2020, and it was standardized in 2022.
The main difference between HTTP 2.0 and HTTP 3.0 is the transport layer protocol employed. In HTTP 2.0, we have TCP connections with or without TLS (HTTPS and HTTP). HTTP 3.0, in turn, is designed over QUIC (Quick UDP Internet Connections).
QUIC, in short, is a transport layer protocol with native multiplexing and built-in encryption. QUIC provides a quick handshake process and is able to mitigate latency problems in lossy and slow connections.
In addition to the potential benefits inherited from QUIC, another relevant characteristic of HTTP 3.0 is that it always creates encrypted connections. So, it is similar to always employing HTTPS with HTTP 2.0.
Why choose HTTPS over HTTP?
• Security
HTTP messages are plaintext, which means unauthorized parties can easily access and read them over the
internet. In contrast, HTTPS transmits all data in encrypted form. When users submit sensitive data, they can be
confident that no third parties can intercept the data over the network. It’s better to choose HTTPS to protect
potentially sensitive information like credit card details or customers’ personal information.
• Authority
Search engines generally rank HTTP website content lower than HTTPS webpages due to HTTP being less
trustworthy. Customers also prefer HTTPS websites over HTTP. The browser makes the HTTPS connection visible
to your users by placing a padlock icon in the browser’s address bar next to the website URL. Users prefer
HTTPS websites and applications due to these additional security and trust factors.
In response to the browser’s requests, the server sends different types of HTTP responses in the form of numeric status codes and data.
Here are some examples:
200 - OK
400 - Bad request
404 - Resource not found
This request-response communication is usually invisible to your users. It’s the communication
method that the browser and web servers use, so the World Wide Web works consistently for
everyone.
How does HTTPS protocol work?
HTTP transmits unencrypted data, which means that information sent from a browser can be intercepted and read by
third parties. This wasn’t an ideal process, so it was extended into HTTPS to add another layer of security to
communication. HTTPS combines HTTP requests and responses with SSL and TLS technology.
HTTPS websites must obtain an SSL/TLS certificate from an independent certificate authority (CA). These websites share
the certificate with the browser before exchanging data to establish trust. The SSL certificate also contains cryptographic
information, so the server and web browsers can exchange encrypted or scrambled data. The process works like this:
• You visit an HTTPS website by typing the https:// URL format in your browser’s address bar.
• The browser attempts to verify the site’s authenticity by requesting the server’s SSL certificate.
• The server sends the SSL certificate that contains a public key as a reply.
• The website’s SSL certificate proves the server identity. Once the browser is satisfied, it uses the public key to encrypt
and send a message that contains a secret session key.
• The web server uses its private key to decrypt the message and retrieve the session key. It then encrypts the session
key and sends an acknowledgment message to the browser.
• Now, both browser and web server switch to using the same session key to exchange messages safely.
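The certificate exchange described above can be observed from the client side. Below is a minimal sketch with Python's standard library, using example.com as a stand-in for any HTTPS site.

```python
# Open a TLS connection and inspect the certificate the server presents.
import socket
import ssl

hostname = "example.com"
context = ssl.create_default_context()   # loads the trusted CA certificates

with socket.create_connection((hostname, 443)) as raw_sock:
    # wrap_socket performs the TLS handshake (certificate check, key exchange)
    with context.wrap_socket(raw_sock, server_hostname=hostname) as tls_sock:
        print(tls_sock.version())         # negotiated protocol, e.g. 'TLSv1.3'
        cert = tls_sock.getpeercert()     # the certificate sent by the server
        print(cert["subject"])            # who the certificate identifies
        print(cert["issuer"])             # the certificate authority (CA)
        print(cert["notAfter"])           # expiry date of the certificate
```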
1. Stateless protocol
• Stateless means each request is treated as a new request. In other words, the server doesn't recognize the user by default. Every communication in a stateless protocol is independent of the previous ones.
• Examples of stateless protocols are UDP and HTTP. HTTP is a stateless protocol: the client and server know each other only during the current request. Due to this nature of the protocol, neither the client nor the server retains information between successive requests for web pages.
2. Stateful protocol
It gives good performance to the client by keeping track of the connection information, but it requires backing storage. Unlike a stateless protocol, in a stateful protocol, when a client sends a request to the server it expects some response, and if it does not get any response, the client resends the request.
HTTP being stateless has some advantages, such as:
• Simplicity: Stateless protocols are easier to implement and maintain than stateful protocols because they do not require storing and managing any information about the communication between clients and servers.
• Scalability: Stateless protocols are more scalable than stateful protocols because they do
not require any synchronization or coordination between clients and servers. Stateless
protocols can also handle more concurrent requests and responses than stateful protocols
because they do not consume any resources or memory on the servers.
• Reliability: Stateless protocols are more reliable than stateful protocols because they do
not depend on any previous or next message. Stateless protocols can handle any failure or
interruption in the communication without affecting the outcome of the request or
response.
HTTP being stateless also has some disadvantages, such as:
• Lack of context: Stateless protocols do not have any context or history of the
communication between clients and servers. This means that each request or
response has to contain all the necessary information to complete the transaction,
such as authentication, preferences, session data, etc. This can increase the size and
complexity of the messages and reduce the performance and efficiency of the
communication.
• Lack of personalization: Stateless protocols do not have any memory or knowledge of
the communication between clients and servers. This means that each request or
response has to be treated equally and generically without any customization or
adaptation based on the user's behavior, preferences, history, etc. This can reduce the
user experience and satisfaction of the communication.
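A common way to work around this statelessness is to carry state explicitly in cookies (covered again later under the Cookie and Set-Cookie headers). Below is a minimal sketch with Python's standard library; httpbin.org is a public testing service used purely for illustration, and the cookie value is made up.

```python
# The CookieJar stores Set-Cookie values from one response and sends them back
# on the next request -- the only reason the server can "remember" this client.
import urllib.request
from http.cookiejar import CookieJar

jar = CookieJar()
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))

# First request: the server sets a cookie via the Set-Cookie response header.
opener.open("https://httpbin.org/cookies/set?session=abc123")

# Second request: the opener automatically adds the stored Cookie header.
response = opener.open("https://httpbin.org/cookies")
print(response.read().decode())                 # expected to echo the cookie back
print([f"{c.name}={c.value}" for c in jar])     # what the jar is holding locally
```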
[Figure: protocol stack of HTTP/3 compared to HTTP/1.1 and HTTP/2]
HTTP/3
HTTP/3 will be the first major upgrade to the hypertext transfer protocol since HTTP/2 was approved in
2015.
An important difference in HTTP/3 is that it runs on QUIC, a new transport protocol. QUIC is designed for
mobile-heavy Internet usage in which people carry smartphones that constantly switch from one
network to another as they move about their day. This was not the case when the first Internet protocols
were developed: devices were less portable and did not switch networks very often.
The use of QUIC means that HTTP/3 relies on the User Datagram Protocol (UDP), not the Transmission
Control Protocol (TCP). Switching to UDP will enable faster connections and faster user experience when
browsing online.
The QUIC protocol was developed by Google in 2012 and was adopted by the Internet Engineering Task
Force (IETF) — a vendor-neutral standards organization — as they started creating the new HTTP/3
standard. After consulting with experts around the world, the IETF has made a host of changes to
develop their own version of QUIC.
The history of HTTP/1.1 to HTTP/2
• In 1989, Sir Timothy John Berners-Lee invented the HTTP protocol on a NeXTcube workstation with a 25 MHz CPU and several MBs of RAM. The protocol worked on networks with connection speeds of 10 Mbit/s. Today we have dramatically faster CPUs and thousands of MBs of RAM, yet for many years the main WWW protocol remained HTTP/1.1, introduced back in 1997. So why did we keep using it for so long?
• While other protocols were updated over the years (FTP became SFTP, POP3 evolved to IMAP, and telnet became SSH), HTTP/1.1 did not change, and as a result it developed many issues with speed, security, and user-friendliness.
• Google was the first to investigate issues with HTTP/1.1. At the time, they were spending millions of dollars a year to
support their data centers, and the HTTP/1.1 protocol simply cost too much in terms of CPU resources and internet
connection capacity. They developed SPDY as an experimental alternative to HTTP/1.1—a protocol designed for
better security and improved page load times.
• The HTTP Working Group of the Internet Engineering Task Force (IETF) investigated Google’s SPDY protocol and
Microsoft’s equivalent when designing the next version of HTTP. Facebook recommended HTTP/2 to be based on
SPDY in July 2012.
• Based on the research done by Google, Microsoft, and Facebook, the IETF released the HTTP/2 protocol in 2015. This
became the second major version of the most useful internet protocol, HTTP.
Why is a new version of HTTP needed? HTTP/3…
QUIC will help fix some of HTTP/2's biggest shortcomings:
• Developing a workaround for the sluggish performance when a smartphone switches from WiFi to cellular data (such as when leaving the house or office)
• Decreasing the effects of packet loss — when one packet of information does not make it to its destination, it will no longer block all streams of information (a problem known as “head-of-line blocking”)
Other benefits include:
• Faster connection establishment: QUIC allows TLS version negotiation to happen at the same time
as the cryptographic and transport handshakes
• Zero round-trip time (0-RTT): For servers they have already connected to, clients can skip the
handshake requirement (the process of acknowledging and verifying each other to determine how
they will communicate)
• More comprehensive encryption: QUIC’s new approach to handshakes will provide encryption by
default — a huge upgrade from HTTP/2 — and will help mitigate the risk of attacks
What is the HTTP/2 Protocol
HTTP/2 is the second version of the HTTP protocol aiming to make applications faster, simpler, and more robust
by improving many of the drawbacks of the first HTTP version.
• Compression: HTTP/2 offers built-in compression of the request headers (HPACK). Modern web applications usually accept a range of different headers, such as
authorization, caching directives, and client information. While compression of these might not make much of a difference for a single request, there is a lot of
data sent over the network to be saved when compressing them in high-traffic applications. HTTP/1.1 does not compress headers by default.
• Performance: Many of the new features in HTTP/2 are aimed at improving performance for the end-user. One example of this is how external resources can be
preemptively pushed to the client's browser before they are explicitly requested. HTTP/1.1 does not have these advanced features.
• Binary protocol: HTTP/2 is binary instead of textual like HTTP/1.1. In practice, this means simplified implementation of commands that previously could be mixed up due to optional whitespace in the text format. Browsers that support HTTP/2 convert textual commands into binary before sending them over the network.
• Security: Because of the binary format used by HTTP/2, there is no longer a risk of so-called response splitting attacks, which are possible with HTTP/1.1. Such an attack lets an attacker manipulate the response headers by injecting whitespace into a textual response; this is no longer possible with the binary format of HTTP/2.
• Delivery models: While the HTTP/1.1 protocol delivers responses based on a single request, HTTP/2 uses multiplexing and server push features to increase the delivery performance.
• Buffer overflow: The buffer is the space used by the client and server to hold the requests that have not yet been processed. In HTTP/1.1, the flow control used to manage the available buffer space is implemented at the transport layer. In HTTP/2, the client and server can implement their own flow controls to communicate the available buffer space.
• Multiplexing: HTTP/2 enables full request and response multiplexing. In practice, this means a connection made to a web server from your browser can be used to send multiple requests and receive multiple responses. This gets rid of a lot of the additional time that it takes to establish a new connection for each request. HTTP/1.1 does not support multiplexing.
• Faster encrypted connections: HTTP/2 uses the new ALPN extension, which allows for faster encrypted connections since the application protocol is determined during the initial connection. Using HTTP/1.1 without ALPN requires additional round trips for the encryption handshake.
• No need for HTTP/1.1 workarounds: In order to bypass some of the drawbacks of HTTP/1.1, multiple workarounds have been invented. Two examples of these are:
• Domain sharding, a common performance workaround used with HTTP/1.1 to trick browsers into opening more simultaneous connections than would normally be allowed.
• Content concatenation, another common HTTP/1.1 workaround used to reduce the number of requests for different resources. To achieve this, web developers often combine all the CSS and JavaScript into single files.
HTTP headers are used to pass additional information between the client and the server through the request and response messages. Header names are case-insensitive; each header field is a key-value pair in clear-text string format, with the name and value separated by a colon. The end of the header section is denoted by an empty line. A few header fields can contain comments, and a few headers can contain quality (q) key-value pairs separated by an equals sign.
There are four kinds of headers, context-wise:
General Header: This type of header applies to both request and response messages but has no relation to the data in the message body.
Request Header: This type of header contains information about the resource to be fetched and about the client making the request.
Response Header: This type of header contains additional information about the response, such as the location of the requested resource or details about the server providing it.
Entity Header: This type of header contains information about the body of the resource, such as its MIME type and Content-Length.
1. Online tools
The easiest way to view your website's headers is to use a third-party
website. Just visit the site and enter your website URL. You most often
are looking for the Response headers.
• websniffer.cc
• securityheaders.com
• headers.cloxy.net
• w3.org/International/questions/qa-headers-charset.en
2. Browser tools
You can also use the developer tools built into your browser to view the headers. The steps are similar in Chrome, Firefox, and Safari.
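The same response headers those tools display can also be fetched programmatically. Below is a minimal sketch with Python's standard library; the URL is a placeholder for whatever site you want to inspect.

```python
# Fetch a page and print the status code plus every response header.
import urllib.request

with urllib.request.urlopen("https://example.com/") as response:
    print(response.status)
    for name, value in response.getheaders():   # the response headers
        print(f"{name}: {value}")
```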
General-purpose servers are required to support the HTTP GET and HEAD methods, whereas all of the other
HTTP methods are optional.
GET: The GET method requests a representation of the specified resource. Requests using GET should only
retrieve data.
HEAD: The HEAD method asks for a response identical to a GET request, but without the response body.
POST: The POST method submits an entity to the specified resource, often causing a change in state or side
effects on the server.
PUT: The PUT method replaces all current representations of the target resource with the request payload.
CONNECT: The CONNECT method establishes a tunnel to the server identified by the target resource.
OPTIONS: The OPTIONS method describes the communication options for the target resource.
TRACE: The TRACE method performs a message loop-back test along the path to the target resource (useful for debugging purposes).
DELETE: The DELETE method removes the specified resource.
PATCH: The PATCH method applies partial modifications to a resource.
GET requests: Note that the query string (name/value pairs) is sent in the URL of a GET request:
/test/demo_form.php?name1=value1&name2=value2
POST requests: The data sent to the server with POST is placed in the request body of the HTTP request:
name1=value1&name2=value2
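The difference in where the data travels (query string in the URL versus the request body) can be seen in a short sketch using Python's standard library; the httpbin.org endpoints are a public echo service used only for illustration.

```python
# Send the same name/value pairs once as a GET query string and once as a POST body.
import urllib.parse
import urllib.request

params = urllib.parse.urlencode({"name1": "value1", "name2": "value2"})

# GET: the name/value pairs become part of the URL (the query string).
get_url = "https://httpbin.org/get?" + params
with urllib.request.urlopen(get_url) as resp:
    print(resp.status, get_url)

# POST: the same pairs are placed in the request body instead of the URL.
post_req = urllib.request.Request(
    "https://httpbin.org/post",
    data=params.encode("ascii"),                        # body, not URL
    headers={"Content-Type": "application/x-www-form-urlencoded"},
    method="POST",
)
with urllib.request.urlopen(post_req) as resp:
    print(resp.status, "data sent in the request body")
```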
HTTP GET vs HTTP POST
• Data size: With GET we cannot send a large amount of data; only a limited number of characters can be sent because the request parameters are appended to the URL. With POST a large amount of data can be sent because the request parameters are carried in the body.
• Usage: GET is used more often than POST.
• Purpose: GET requests are only used to request data (not modify it). POST requests can be used to create and modify data.
• Security: A GET request is comparatively less secure because the data is exposed in the URL bar. A POST request is comparatively more secure because the data is not exposed in the URL bar.
• Browser history: Requests made through GET are stored in the browser history. Requests made through POST are not stored in the browser history.
• Bookmarks: A GET request can be saved as a bookmark in the browser. A POST request cannot be saved as a bookmark.
• Caching: Requests made through GET are stored in the browser's cache memory. Requests made through POST are not stored in the browser's cache memory.
• Sensitive data: Data passed through GET can easily be stolen by attackers because it is visible to everyone; GET requests should never be used when dealing with sensitive data. Data passed through POST cannot be stolen as easily because the data is not displayed in the URL.
• Allowed data: With GET only ASCII characters are allowed. With POST all types of data are allowed.
• HTTP headers contain metadata in key-value pairs that are sent along with HTTP requests
and responses. They can be used to define caching behavior, facilitate authentication,
and manage session state. HTTP headers help the API client and server communicate
more effectively—and enable developers to optimize and customize the API’s behavior.
• HTTP headers play a crucial role in server and client behavior throughout the request and
response cycle. Request headers are sent by the client to the server and contain
information and instructions related to the requested resource, while response headers
are sent by the server to the client and provide metadata, instructions, and additional
information about the response itself.
Headers can be grouped according to their contexts:
• Request headers: Contain more information about the resource to be fetched, or about the client requesting the
resource.
• Response headers: Hold additional information about the response, like its location or about the server providing
it.
• Representation headers: Contain information about the body of the resource, like its MIME type, or
encoding/compression applied.
• Payload headers: Contain representation-independent information about payload data, including content length
and the encoding used for transport.
• End-to-end headers: These headers must be transmitted to the final recipient of the message: the server for a
request, or the client for a response. Intermediate proxies must retransmit these headers unmodified and caches
must store them.
• Hop-by-hop headers: These headers are meaningful only for a single transport-level connection, and must not be
retransmitted by proxies or cached. Note that only hop-by-hop headers may be set using the Connection header.
Some of the most commonly used request headers are:
Accept: The Accept header defines the media types that the client is able to accept from the server. For instance, Accept:
application/json, text/html indicates that the client prefers JSON or HTML responses. This information allows the server to
send a resource representation that meets the client’s needs.
User-Agent: The User-Agent header identifies the web browser or client application that is making the request, which
enables the server to tailor its response to the client. For instance, if the User-Agent header indicates that the request is
coming from the Chrome browser, the server may include CSS prefixes for CSS properties that are compatible with
Chrome.
Authorization: The Authorization header is used to send the client’s credentials to the server when the client is attempting
to access a protected resource. For instance, the client might include a JSON Web Token (JWT) as the value of the header,
which the server will then verify before returning the requested resource.
Content-Type: The Content-Type header identifies the media type of the content in the request body. For instance,
Content-Type: application/json indicates that the request body contains JSON data. This information helps the server
successfully interpret and process the payload.
Cookie: The client can use the Cookie header to send previously stored cookies back to the server. The server then uses
these cookies to associate the request with a specific user or session. This header plays an important role in delivering
personalized experiences, as it enables the server to remember a user’s login state or language preference.
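Below is a minimal sketch of how a client attaches such request headers, using Python's standard library. The endpoint, the bearer token, and the cookie value are placeholders, not a real API or real credentials.

```python
# Build a request object carrying the request headers described above.
import urllib.request

req = urllib.request.Request(
    "https://api.example.com/resource",           # hypothetical endpoint
    headers={
        "Accept": "application/json, text/html",  # formats the client accepts
        "User-Agent": "demo-client/1.0",           # identifies the client app
        "Authorization": "Bearer <token>",         # placeholder credential (e.g. a JWT)
        "Cookie": "session=abc123",                # previously stored cookie
    },
)

# urllib.request.urlopen(req) would send these headers with the request;
# the call is omitted here because the endpoint above is fictitious.
print(req.headers)
```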
The most common response headers are:
Content-Type: The Content-Type response header is the counterpart of the Content-Type request header, as it indicates
the type of data that the server is sending to the client. The header value typically includes the media type (such as
text/html, application/json, image/jpeg, and audio/mp3), as well as any optional parameters.
Cache-Control: The Cache-Control header controls caching behavior in the client’s browser or intermediate caches. It defines how the response can be cached, when it expires, and how it should be revalidated. For example, Cache-Control: max-age=3600, public instructs the client to cache the response for a maximum of 3600 seconds (1 hour) and allows caching by public caches.
Server: The Server header includes the name and version of the server software that generated the response, as well as information about the server’s technology stack. For instance, Server: Apache/2.4.10 (Unix) indicates that the response was generated by the Apache web server version 2.4.10. It’s important to note that the Server header is informational and doesn’t affect the API’s functionality.
Set-Cookie: The Set-Cookie header instructs the client to store a cookie with the specified name, value, and additional attributes, such as expiration, domain, path, and security flags. The client will then include the cookie in subsequent requests in order to facilitate stateful communication and personalized experiences.
Content-Length: The Content-Length header, which specifies the size of the response body in bytes, can help the client anticipate how much data it is going to receive. This improves performance by allowing the client to plan in advance for more efficient memory allocation and data processing.
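On the server side, these response headers are set before the body is written. Below is a minimal sketch using Python's built-in http.server module; it is a toy handler for illustration, not a production server, and the cookie value is made up.

```python
# A tiny HTTP server that sets the response headers discussed above.
from http.server import BaseHTTPRequestHandler, HTTPServer

class DemoHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b'{"message": "hello"}'
        self.send_response(200)                                    # status line
        self.send_header("Content-Type", "application/json")       # media type
        self.send_header("Cache-Control", "max-age=3600, public")  # caching rules
        self.send_header("Set-Cookie", "session=abc123; Path=/")   # state for later requests
        self.send_header("Content-Length", str(len(body)))         # body size in bytes
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    # Serves http://localhost:8000/ until interrupted with Ctrl+C.
    HTTPServer(("localhost", 8000), DemoHandler).serve_forever()
```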
HTTP status codes
• An HTTP status code is a server response to a browser’s request. When you visit a website,
your browser sends a request to the site’s server, and the server then responds to the
browser’s request with a three-digit code: the HTTP status code.
• These status codes are the Internet equivalent of a conversation between your browser and
the server. They communicate whether things between the two are A-okay, touch-and-go, or
whether something is wrong. Understanding status codes and how to use them will help you to
diagnose site errors quickly to minimize downtime on your site. You can even use some of these status codes to help search engines and people access your site; a 301 redirect, for example, tells bots and people that a page has moved somewhere else permanently.
• The first digit of each three-digit status code begins with one of five numbers, 1 through 5; you
may see this expressed as 1xx or 5xx to indicate status codes in that range. Each of those
ranges encompasses a different class of server response.
Status codes
These status codes tell you what's happening when browsers try to contact your website. They show when things go right and when things go wrong.
An HTTP status code is a message a website's server sends to the browser to indicate whether or not that request can be fulfilled.
Status code specs are set by the W3C. Status codes are embedded in the HTTP header of a page to tell the browser the result of its request.
When everything goes according to plan, the server returns a 200 code.
However, there's a lot that could go wrong when trying to fulfill a browser's request to a server.
Common HTTP status code classes:
2xxs – Success! The request was successfully completed and the server gave the browser the expected
response.
3xxs – Redirection: You got redirected somewhere else. The request was received, but there’s a
redirect of some kind.
4xxs – Client errors: Page not found. The site or page couldn’t be reached. (The request was made, but
the page isn’t valid — this is an error on the website’s side of the conversation and often appears when
a page doesn’t exist on the site.)
5xxs – Server errors: Failure. A valid request was made by the client but the server failed to complete
the request.
HTTP response status codes indicate whether a specific HTTP request
has been successfully completed. Responses are grouped in five
classes:
1xx: Informational
2xx: Success!
3xx: Redirect. The requested page has moved somewhere else.
4xx: Client error. There's something wrong with the way the browser asked for the page.
5xx: Server error. Something went wrong with the way the server tried to send
the page.
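A client can branch on the class of a status code simply by looking at its first digit, as the small Python sketch below shows; the sample codes are only examples.

```python
# The first digit (code // 100) identifies the class described above.
from http import HTTPStatus

def status_class(code: int) -> str:
    classes = {1: "informational", 2: "success", 3: "redirection",
               4: "client error", 5: "server error"}
    return classes.get(code // 100, "unknown")

for code in (100, 200, 301, 404, 503):
    print(code, HTTPStatus(code).phrase, "->", status_class(code))
# 100 Continue -> informational
# 200 OK -> success
# 301 Moved Permanently -> redirection
# 404 Not Found -> client error
# 503 Service Unavailable -> server error
```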
1. Information responses
• 100 Continue: This interim response indicates that the client should continue
the request or ignore the response if the request is already finished.
• 102 Processing (WebDAV): This code indicates that the server has received and is processing the request, but no response is available yet.
• 103 Early Hints: This status code is primarily intended to be used with the
Link header, letting the user agent start preloading resources while the
server prepares a response or preconnect to an origin from which the page
will need resources.
2. Successful responses
• 200 OK: The request succeeded. The result meaning of "success" depends on the HTTP method:
GET: The resource has been fetched and transmitted in the message body.
HEAD: The representation headers are included in the response without any message body.
PUT or POST: The resource describing the result of the action is transmitted in the message body.
TRACE: The message body contains the request message as received by the server.
• 201 Created: The request succeeded, and a new resource was created as a result. This is typically the response sent after POST requests, or some PUT requests.
• 202 Accepted: The request has been received but not yet acted upon. It is noncommittal, since there is no way in HTTP to later send an asynchronous response indicating the outcome of the request. It is intended for cases where another process or server handles the request, or for batch processing.
• 203 Non-Authoritative Information: This response code means the returned metadata is not exactly the same as is available from the origin server, but is collected from a local or a third-party copy. This is mostly used for mirrors or backups of another resource. Except for that specific case, the 200 OK response is preferred to this status.
• 204 No Content: There is no content to send for this request, but the headers may be useful. The user agent may update its cached headers for this resource with the new ones.
• 205 Reset Content: Tells the user agent to reset the document which sent this request.
3. Redirection messages
• 300 Multiple Choices: The request has more than one possible response. The user agent or user should choose one of them. (There is no standardized way of choosing one of the responses, but HTML links to the possibilities are recommended so the user can pick.)
• 301 Moved Permanently: The URL of the requested resource has been changed permanently. The new URL is
given in the response.
• 302 Found: This response code means that the URI of requested resource has been changed temporarily. Further
changes in the URI might be made in the future. Therefore, this same URI should be used by the client in future
requests.
• 303 See Other: The server sent this response to direct the client to get the requested resource at another URI with a GET request.
• 304 Not Modified: This is used for caching purposes. It tells the client that the response has not been modified, so
the client can continue to use the same cached version of the response.
• 305 Use Proxy (deprecated): Defined in a previous version of the HTTP specification to indicate that a requested response must be accessed by a proxy. It has been deprecated due to security concerns regarding in-band configuration of a proxy.
• 306 (Unused): This response code is no longer used; it is just reserved. It was used in a previous version of the HTTP/1.1 specification.
4. Client error responses
400 Bad Request: The server cannot or will not process the request due to something that is perceived to be a client
error (e.g., malformed request syntax, invalid request message framing, or deceptive request routing).
401 Unauthorized: Although the HTTP standard specifies "unauthorized", semantically this response means
"unauthenticated". That is, the client must authenticate itself to get the requested response.
402 Payment Required (experimental): This response code is reserved for future use. The initial aim for creating this code was to use it for digital payment systems; however, this status code is used very rarely and no standard convention exists.
403 Forbidden: The client does not have access rights to the content; that is, it is unauthorized, so the server is
refusing to give the requested resource. Unlike 401 Unauthorized, the client's identity is known to the server.
404 Not Found: The server cannot find the requested resource. In the browser, this means the URL is not recognized.
In an API, this can also mean that the endpoint is valid but the resource itself does not exist. Servers may also send
this response instead of 403 Forbidden to hide the existence of a resource from an unauthorized client. This response
code is probably the most well known due to its frequent occurrence on the web.
405 Method Not Allowed: The request method is known by the server but is not supported by the target resource.
For example, an API may not allow calling DELETE to remove a resource.
406 Not Acceptable: This response is sent when the web server, after performing server-driven content negotiation,
doesn't find any content that conforms to the criteria given by the user agent.
407 Proxy Authentication Required: This is similar to 401 Unauthorized, but authentication needs to be done by a proxy.
408 Request Timeout: This response is sent on an idle connection by some servers, even without any previous request by
the client. It means that the server would like to shut down this unused connection. This response is used much more
since some browsers, like Chrome, Firefox 27+, or IE9, use HTTP pre-connection mechanisms to speed up surfing. Also
note that some servers merely shut down the connection without sending this message.
409 Conflict: This response is sent when a request conflicts with the current state of the server.
410 Gone: This response is sent when the requested content has been permanently deleted from server, with no
forwarding address. Clients are expected to remove their caches and links to the resource. The HTTP specification intends
this status code to be used for "limited-time, promotional services". APIs should not feel compelled to indicate resources
that have been deleted with this status code.
411 Length Required: The server rejected the request because the Content-Length header field is not defined and the server requires it.
412 Precondition Failed: The client has indicated preconditions in its headers which the server does not meet.
413 Payload Too Large: The request entity is larger than the limits defined by the server. The server might close the connection or return a Retry-After header field.
414 URI Too Long: The URI requested by the client is longer than the server is willing to interpret.
5. Server error responses
• 500 Internal Server Error: The server has encountered a situation it does not know how to handle.
• 501 Not Implemented: The request method is not supported by the server and cannot be handled. The only methods that servers are required to support (and therefore that must not return this code) are GET and HEAD.
• 502 Bad Gateway: This error response means that the server, while working as a gateway to get a response needed to handle the request, got an invalid response.
• 503 Service Unavailable: The server is not ready to handle the request. Common causes are a server that is down for maintenance or that is overloaded. Note that together with this response, a user-friendly page explaining the problem should be sent. This response should be used for temporary conditions, and the Retry-After HTTP header should, if possible, contain the estimated time before the recovery of the service. The webmaster must also take care about the caching-related headers that are sent along with this response, as these temporary condition responses should usually not be cached.
• 504 Gateway Timeout: This error response is given when the server is acting as a gateway and cannot get a response in time.
• 505 HTTP Version Not Supported: The HTTP version used in the request is not supported by the server.
• 506 Variant Also Negotiates: The server has an internal configuration error: the chosen variant resource is configured to engage in transparent content negotiation itself, and is therefore not a proper end point in the negotiation process.
• 507 Insufficient Storage (WebDAV): The method could not be performed on the resource because the server is unable to store the representation needed to successfully complete the request.
• 508 Loop Detected (WebDAV): The server detected an infinite loop while processing the request.
• 510 Not Extended: Further extensions to the request are required for the server to fulfill it.
• 511 Network Authentication Required: Indicates that the client needs to authenticate to gain network access.
HTTP Non-Persistent & Persistent
Connection
HTTP (Hypertext Transfer Protocol) is an application layer protocol that
is used to establish a connection between a client and a server so that
the client can transfer data to the server and vice versa. HTTP is divided
into two categories i.e. Non-Persistent connection HTTP and Persistent
connection HTTP
• What happens when we enter a URL (Uniform Resource Locator) in the browser to visit a website? The browser fetches the IP address corresponding to the entered URL using DNS (Domain Name System).
• Once the browser gets the IP address corresponding
to the entered URL, the browser sends a request to
the server at the backend (along with the IP address
of the website) to fetch the webpage of the website.
• In return, the browser receives a response from the
server and this response contains the HTML
(Hypertext Markup Language) information of the
webpage.
• This exchange of information between the browser
and the server takes place on an HTTP
connection.
• The HTTP connection is a connection that follows
HTTP over which, a client and a server can exchange
data. An HTTP connection is a TCP (Transmission
Control Protocol) oriented connection and it works on
PORT number 80.
Round Trip Time (RTT): Round Trip Time is defined as the time taken by a packet of
information to travel from client to server and then come back (i.e. server to client).
TCP 3-Way Handshake: The establishment of a TCP connection takes place in 3 steps, which are referred to as the 3-Way Handshake:
1. The client sends a connection request to the server.
2. The server sends confirmation of whether the connection can be established or not. After receiving the request from the client, the server sends an acknowledgment telling the client that it may establish a connection.
3. The client acknowledges the server's response, and the connection is established.
1. Non-Persistent
• In non-persistent connection HTTP, there can be at most one object that can be sent over a single TCP
connection. This means that for each object that is to be sent from source to destination, a new connection
will be created. HTTP/1.0 is the version of HTTP that uses a non-persistent connection.
• Non-persistent HTTP is used in fetching those objects which are not needed that frequently.
• Non-persistent connection HTTP requires 2 RTT (round trip time) for each object that is to be transmitted (1
RTT to open the connection and 1 RTT for transmission of data).
2. Persistent
• In persistent connection HTTP, multiple objects can be sent over a single TCP connection. This means that
multiple objects can be transmitted from source to destination on a single HTTP connection. HTTP/1.1 is the
version of HTTP that uses a persistent connection.
• All modern web browsers like Mozilla Firefox and Google Chrome use persistent HTTP connections.
• Persistent HTTP does not require 2 RTT (round trip time) for each object that is to be transmitted. After a
successful opening of the TCP connection (opening of TCP connection is done by 3-Way Handshaking which
takes 1 RTT), each object will require only 1 RTT to be transmitted.
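The reuse of a single connection is easy to demonstrate with Python's standard library. The sketch below assumes example.com keeps the connection open between requests, as HTTP/1.1 servers normally do.

```python
# Persistent connection: one TCP connection (one handshake) reused for several requests.
import http.client

conn = http.client.HTTPSConnection("example.com")   # connection opened once

for path in ("/", "/", "/"):
    conn.request("GET", path)          # each request reuses the same connection
    response = conn.getresponse()
    data = response.read()             # must read the body before the next request
    print(path, response.status, len(data), "bytes")

conn.close()                           # connection closed only at the end
```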
Difference between Persistent and Non-Persistent Connections
Non-Persistent HTTP:
• Requires 2 RTTs per object.
• Browsers often open parallel TCP connections to fetch referenced objects.
• At most one object can be sent over one TCP connection.
Persistent HTTP:
• The server leaves the connection open after sending a response.
• The client sends requests as soon as it encounters a referenced object.
• As little as one RTT may be needed for all the referenced objects.
1. Non-Persistent Connection
Non-Persistent Connections are those connections in which for each object we have to
create a new connection for sending that object from source to destination. Here, we can
send a maximum of one object from one TCP connection.
Advantages:
• It does not lead to wastage of resources, since the connection is opened only when some data needs to be sent over it.
• It is more secure than persistent HTTP since, after sending data over the connection, the connection gets terminated and nothing can be transmitted over it once it gets terminated.
Disadvantages:
• It needs to maintain an extra overhead to open a TCP connection each time some data needs to be transmitted over it.
• It has a slow start because of the opening of a TCP connection on every data transmission.
2. Persistent Connection
a. Non-Pipelined Persistent Connection: In a non-pipelined connection, we first establish a connection, which takes 2 RTTs; then we send each of the referenced objects (images/text files), which takes 1 RTT per object (a new TCP connection for each object is not required).
b. Pipelined Persistent Connection: In a pipelined connection, after the connection is established the client sends the requests for all the referenced objects back to back without waiting for each response, so all the objects can be retrieved in as little as 1 RTT in total.
• Persistent HTTP saves CPU resources and time since the opening of the connection takes place only once.
• It gives a fast start to send any object from a source to a destination. Also, it results in relatively less network congestion and latency on subsequent HTTP requests compared to non-persistent HTTP.